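Linux bpfjit (2): Usage Analysis and a mini-tcpdump Implementation
Author: Once Day Date: April 13, 2024
A novice passionate about learning and developing on Linux, trying to compose an adventure whose end may turn out to be just a daydream…
On this long road, has anyone ever smiled at you…
For the full series, see the column: Linux基础知识_Once-Day的博客-CSDN博客 (Linux Fundamentals, Once-Day's blog on CSDN).
References: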
- rmind/npf: NPF: packet filter with stateful inspection, NAT, IP sets, etc. (github.com)
- sljit - stack-less jit compiler (zherczeg.github.io)
- alnsn/bpfjit: Just-in-Time compilation of bpf (github.com)
- bpfjit/src at master · rmind/bpfjit (github.com)
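1. Overview
1.1 BPF (Berkeley Packet Filter)
BPF (Berkeley Packet Filter) was originally designed for packet filtering, that is, capturing and selecting network packets. Over time it has evolved into a more general and more powerful technology; in the Linux kernel in particular it has been extended into eBPF (extended Berkeley Packet Filter).
Basic concepts:
- BPF: the original design goal was to make packet processing more efficient by providing a flexible packet filtering mechanism between user space and kernel space.
- eBPF: an extended version of BPF that goes beyond packet filtering to cover performance monitoring, network monitoring, security auditing, and more. eBPF makes it possible to load custom code (mainly monitoring and tracing code) into the Linux kernel dynamically, without changing the kernel source.
In FastPath, the packet filtering technology currently in use is mainly BPF. eBPF requires kernel support, and for now no suitable user-space implementation is available.
BPF is essentially a set of predefined bytecodes that can model arithmetic (add, subtract, multiply, divide), conditional branches, jumps, and register loads and stores. Combined, these basic operations can express complex filtering logic.
For example, here are the definitions of some of the BPF arithmetic and logic operators: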
- //(NetBSD) - sys/net/bpf.h
- /* alu/jmp fields */
- #define BPF_OP(code) ((code) & 0xf0)
- #define BPF_ADD 0x00
- #define BPF_SUB 0x10
- #define BPF_MUL 0x20
- #define BPF_DIV 0x30
- #define BPF_OR 0x40
- #define BPF_AND 0x50
- #define BPF_LSH 0x60
- #define BPF_RSH 0x70
- #define BPF_NEG 0x80
- #define BPF_MOD 0x90
- #define BPF_XOR 0xa0
- ......
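To make the opcode layout concrete, here is a minimal sketch of how an ALU opcode splits into fields. The mask values mirror the definitions above plus the standard BPF_CLASS and BPF_SRC masks from bpf.h; the helper name alu_op_name is hypothetical and exists only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Field masks and values as defined in bpf.h. */
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_ALU         0x04
#define BPF_OP(code)    ((code) & 0xf0)
#define BPF_SRC(code)   ((code) & 0x08)
#define BPF_K           0x00   /* operand is the immediate k */
#define BPF_X           0x08   /* operand is the index register X */

/* Hypothetical helper: name the ALU operation encoded in `code`,
 * or return NULL if `code` is not an ALU instruction. */
static const char *alu_op_name(uint16_t code)
{
    static const char *const names[] = {
        "add", "sub", "mul", "div", "or", "and",
        "lsh", "rsh", "neg", "mod", "xor"
    };
    unsigned idx = BPF_OP(code) >> 4;

    if (BPF_CLASS(code) != BPF_ALU || idx >= sizeof(names) / sizeof(names[0]))
        return NULL;
    return names[idx];
}
```

For instance, the opcode 0x5c that shows up in the raw dumps later in this article has class BPF_ALU, operation BPF_AND and source BPF_X, which is the `and x` line of the disassembly.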
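1.2 BPFJIT (Berkeley Packet Filter Just In Time)
BPFJIT is JIT compilation applied to BPF (Berkeley Packet Filter). BPF was designed for efficient packet filtering: filter rules are written in user space and then executed in kernel space, which greatly improves packet processing efficiency.
BPFJIT optimizes this further by compiling BPF bytecode into machine code just in time, so that it can be executed directly and filtering becomes even faster.
BPF tooling produces a byte sequence of BPF instructions, but that sequence can be executed in more than one way. Take the following program:
- // Typical bpf_filter bytecode generated for (tcpdump tcp), used to match TCP packets.
- // It checks the IPv4 and IPv6 protocol types, and for IPv6 it also handles the fragment header case.
- // If the packet matches it returns 8192; otherwise it returns 0.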
- (000) ldh [12]
- (001) jeq #0x86dd jt 2 jf 8
- (002) ldb [20]
- (003) jeq #0x6 jt 12 jf 4
- (004) ldb [20]
- (005) jeq #0x2c jt 6 jf 8
- (006) ldb [54]
- (007) jeq #0x6 jt 12 jf 8
- (008) ldh [12]
- (009) jeq #0x800 jt 10 jf 13
- (010) ldb [23]
- (011) jeq #0x6 jt 12 jf 13
- (012) ret #8192
- (013) ret #0
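For the BPF instructions above, the BSD kernel interprets them directly in a C function that iterates over the program and processes the packet in place in its mbuf.
- //(NetBSD) - sys/net/bpf_filter.c
- // (user-space signature shown; the _KERNEL variant that precedes it in the #ifdef is omitted)
- u_int
- bpf_filter(const struct bpf_insn *pc, const u_char *p, u_int wirelen,
- u_int buflen)
- #endif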
- {
- uint32_t A, X, k;
- #ifndef _KERNEL
- uint32_t mem[BPF_MEMWORDS];
- bpf_args_t args_store = {
- .pkt = p,
- .wirelen = wirelen,
- .buflen = buflen,
- .mem = mem,
- .arg = NULL
- };
- bpf_args_t * const args = &args_store;
- #else
- const uint8_t * const p = args->pkt;
- #endif
- if (pc == 0) {
- /*
- * No filter means accept all.
- */
- return (u_int)-1;
- }
- /*
- * Note: safe to leave memwords uninitialised, as the validation
- * step ensures that it will not be read, if it was not written.
- */
- A = 0;
- X = 0;
- --pc;
- for (;;) {
- ++pc;
- switch (pc->code) {
- default:
- #ifdef _KERNEL
- return 0;
- #else
- abort();
- /*NOTREACHED*/
- #endif
- case BPF_RET|BPF_K:
- return (u_int)pc->k;
- case BPF_RET|BPF_A:
- return (u_int)A;
- case BPF_LD|BPF_W|BPF_ABS:
- k = pc->k;
- if (k > args->buflen ||
- sizeof(int32_t) > args->buflen - k) {
- #ifdef _KERNEL
- int merr;
- if (args->buflen != 0)
- return 0;
- A = xword(args->pkt, k, &merr);
- if (merr != 0)
- return 0;
- continue;
- #else
- return 0;
- #endif
- }
- A = EXTRACT_LONG(&p[k]);
- continue;
- case BPF_LD|BPF_H|BPF_ABS:
- //...(a large amount of code omitted)...
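The interpreter loop above can be boiled down to a few lines. The following is a minimal sketch, not the NetBSD code: it supports only the four opcodes needed by the `tcp` filter listed earlier (halfword and byte absolute loads, jeq against a constant, and ret with a constant). The names mini_insn, mini_bpf_run and tcp_filter are hypothetical; the opcode values are the standard classic BPF encodings, and jt/jf are stored as offsets relative to the next instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as a classic BPF instruction. */
struct mini_insn { uint16_t code; uint8_t jt; uint8_t jf; uint32_t k; };

enum {
    OP_LDH_ABS = 0x28,  /* BPF_LD  | BPF_H   | BPF_ABS */
    OP_LDB_ABS = 0x30,  /* BPF_LD  | BPF_B   | BPF_ABS */
    OP_JEQ_K   = 0x15,  /* BPF_JMP | BPF_JEQ | BPF_K   */
    OP_RET_K   = 0x06   /* BPF_RET | BPF_K             */
};

/* The "tcp" filter from the listing above, with jt/jf as relative offsets. */
static const struct mini_insn tcp_filter[14] = {
    { OP_LDH_ABS, 0, 0, 12 },     { OP_JEQ_K, 0, 6, 0x86dd },
    { OP_LDB_ABS, 0, 0, 20 },     { OP_JEQ_K, 8, 0, 0x6 },
    { OP_LDB_ABS, 0, 0, 20 },     { OP_JEQ_K, 0, 2, 0x2c },
    { OP_LDB_ABS, 0, 0, 54 },     { OP_JEQ_K, 4, 0, 0x6 },
    { OP_LDH_ABS, 0, 0, 12 },     { OP_JEQ_K, 0, 3, 0x800 },
    { OP_LDB_ABS, 0, 0, 23 },     { OP_JEQ_K, 0, 1, 0x6 },
    { OP_RET_K,   0, 0, 8192 },   { OP_RET_K, 0, 0, 0 },
};

/* Interpret `prog` against `pkt`; returns the filter result (0 = reject). */
static uint32_t mini_bpf_run(const struct mini_insn *prog, size_t n,
                             const uint8_t *pkt, size_t len)
{
    uint32_t A = 0;  /* accumulator */

    for (size_t pc = 0; pc < n; pc++) {
        const struct mini_insn *in = &prog[pc];

        switch (in->code) {
        case OP_LDH_ABS:  /* A <- big-endian halfword at pkt[k] */
            if (in->k > len || len - in->k < 2)
                return 0;
            A = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case OP_LDB_ABS:  /* A <- byte at pkt[k] */
            if (in->k >= len)
                return 0;
            A = pkt[in->k];
            break;
        case OP_JEQ_K:    /* skip jt or jf instructions past the next one */
            pc += (A == in->k) ? in->jt : in->jf;
            break;
        case OP_RET_K:
            return in->k;
        default:          /* unsupported opcode: reject */
            return 0;
        }
    }
    return 0;
}
```

Calling `mini_bpf_run(tcp_filter, 14, frame, frame_len)` on an Ethernet frame returns 8192 for IPv4 or IPv6 TCP and 0 otherwise. An out-of-range load simply rejects the packet, which is what the user-space branch of bpf_filter does as well.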
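The function that executes BPF bytecode for packet filtering in FastPath uses the same implementation approach as this function.
Besides interpreting the program in a C loop like this, JIT (just-in-time) compilation can be used to make processing more efficient.
The NetBSD implementation uses SLJIT: each BPF instruction is translated into the corresponding SLJIT instructions, which are then JIT-compiled into machine-specific code that can finally be called directly through a C function pointer.
- //(NetBSD) - sys/net/bpfjit.c - generate_insn_code
- // ...(a large amount of code omitted)...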
- case BPF_LD:
- /* BPF_LD+BPF_IMM A <- k */
- if (pc->code == (BPF_LD|BPF_IMM)) {
- status = sljit_emit_op1(compiler,
- SLJIT_MOV,
- BJ_AREG, 0,
- SLJIT_IMM, (uint32_t)pc->k);
- if (status != SLJIT_SUCCESS)
- goto fail;
- continue;
- }
- /* BPF_LD+BPF_MEM A <- M[k] */
- if (pc->code == (BPF_LD|BPF_MEM)) {
- if ((uint32_t)pc->k >= memwords)
- goto fail;
- status = emit_memload(compiler,
- BJ_AREG, pc->k, extwords);
- if (status != SLJIT_SUCCESS)
- goto fail;
- continue;
- }
- /* BPF_LD+BPF_W+BPF_LEN A <- len */
- if (pc->code == (BPF_LD|BPF_W|BPF_LEN)) {
- status = sljit_emit_op1(compiler,
- SLJIT_MOV, /* size_t source */
- BJ_AREG, 0,
- SLJIT_MEM1(BJ_ARGS),
- offsetof(struct bpf_args, wirelen));
- if (status != SLJIT_SUCCESS)
- goto fail;
- continue;
- }
- mode = BPF_MODE(pc->code);
- if (mode != BPF_ABS && mode != BPF_IND)
- goto fail;
- if (unconditional_ret)
- continue;
- status = emit_pkt_read(compiler, hints, pc,
- to_mchain_jump, &ret0, &ret0_size, &ret0_maxsize);
- if (status != SLJIT_SUCCESS)
- goto fail;
- continue;
- // ...(a large amount of code omitted)...
SLJIT's JIT compilation normally yields a function; BPFJIT fixes the signature of that function as follows:
- /*
- * Return value of a function generated by sljit have sljit_uw type
- * which can have a greater width. In such cases, we rely on the fact
- * that calling conventions use same registers for smaller types.
- * SLJIT_MOV_UI is passed to sljit_emit_return() to make sure that the
- * return value is truncated to unsigned int.
- */
- typedef unsigned int (*bpfjit_func_t)(const bpf_ctx_t *, bpf_args_t *);
Through this function, BPF can perform more complex operations while remaining efficient.
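As a rough illustration of how a function pointer of this shape is invoked, the sketch below fills in an argument structure and calls through the pointer. So that it compiles without bpfjit, it uses stand-in types (demo_ctx_t, demo_args_t, demo_func_t) whose fields follow the args_store initializer in the bpf_filter excerpt; accept_ipv4 is a hand-written placeholder for JIT output, and demo_call is a hypothetical wrapper, not part of the library.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for bpf_ctx_t and bpf_args_t (assumed field layout). */
typedef struct demo_ctx demo_ctx_t;
typedef struct {
    const uint8_t *pkt;   /* start of the packet buffer       */
    size_t wirelen;       /* length of the packet on the wire */
    size_t buflen;        /* bytes actually available in pkt  */
    uint32_t *mem;        /* external scratch memory, if any  */
    void *arg;            /* opaque user argument             */
} demo_args_t;

/* Same shape as bpfjit_func_t. */
typedef unsigned int (*demo_func_t)(const demo_ctx_t *, demo_args_t *);

/* Placeholder for generated code: accept IPv4 frames only. */
static unsigned int accept_ipv4(const demo_ctx_t *ctx, demo_args_t *args)
{
    (void)ctx;
    if (args->buflen < 14)
        return 0;
    return (args->pkt[12] == 0x08 && args->pkt[13] == 0x00) ? 8192u : 0u;
}

/* Hypothetical wrapper: pack the arguments and call the filter. */
static unsigned int demo_call(demo_func_t fn, const uint8_t *pkt,
                              size_t wirelen, size_t buflen)
{
    demo_args_t args = {
        .pkt = pkt, .wirelen = wirelen, .buflen = buflen,
        .mem = NULL, .arg = NULL
    };
    return fn(NULL, &args);
}
```

With real bpfjit the pointer would come from bpfjit_generate_code instead of being a plain C function, but the call site looks the same: the filter sees the packet only through the argument structure.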
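1.3 SLJIT (Stack-Less Just-In-Time)
SLJIT is a standalone, general-purpose JIT compilation library. It is not tied to any particular domain and can be used wherever JIT functionality is needed. Its design philosophy is simplicity and generality: it provides a low-level API that lets developers generate machine code to suit their own needs. For example, SLJIT can be used for fast regular-expression matching and for JIT compilation of scripting languages.
SLJIT belongs to the same family of tools as the following JIT technologies:
- LibJIT/liblightning: the backend of GNU.net.
- Libgccjit: introduced in GCC 5.0. Unlike the other JIT libraries, it works more like constructing C code, and it uses the GCC backend.
- AsmJit: a C++ JIT assembler library, historically focused on x86/x86_64.
- DynASM: used in LuaJIT.
SLJIT supports the following CPU architectures: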
- Intel-x86 32
- AMD-x86 64
- ARM 32 (ARM-v5, ARM-v7 and Thumb2 instruction sets)
- ARM 64
- PowerPC 32
- PowerPC 64
- MIPS 32 (III, R1)
- MIPS 64 (III, R1)
- RISC-V 32
- RISC-V 64
- s390x (64)
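- LoongArch # commits adding LoongArch support are visible in the repository
Using SLJIT resembles assembly programming; its intermediate layer hides the complex, architecture-specific details. Below is an example written against the raw API: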
- typedef sljit_sw (*func3_t)(sljit_sw a, sljit_sw b, sljit_sw c);
- static int branch(sljit_sw a, sljit_sw b, sljit_sw c)
- {
- void *code;
- sljit_uw len;
- func3_t func;
- struct sljit_jump *ret_c;
- struct sljit_jump *out;
- /* Create a SLJIT compiler */
- struct sljit_compiler *C = sljit_create_compiler(NULL);
- /* 3 arg, 1 temp reg, 3 save reg */
- sljit_emit_enter(C, 0, SLJIT_ARG1(SW) | SLJIT_ARG2(SW) | SLJIT_ARG3(SW), 1, 3, 0, 0, 0);
- /* R0 = a & 1, S0 is argument a */
- sljit_emit_op2(C, SLJIT_AND, SLJIT_R0, 0, SLJIT_S0, 0, SLJIT_IMM, 1);
- /* if R0 == 0 then jump to ret_c, where is ret_c? we assign it later */
- ret_c = sljit_emit_cmp(C, SLJIT_EQUAL, SLJIT_R0, 0, SLJIT_IMM, 0);
- /* R0 = b, S1 is argument b */
- sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S1, 0);
- /* jump to out */
- out = sljit_emit_jump(C, SLJIT_JUMP);
- /* here is the 'ret_c' should jump, we emit a label and set it to ret_c */
- sljit_set_label(ret_c, sljit_emit_label(C));
- /* R0 = c, S2 is argument c */
- sljit_emit_op1(C, SLJIT_MOV, SLJIT_RETURN_REG, 0, SLJIT_S2, 0);
- /* here is the 'out' should jump */
- sljit_set_label(out, sljit_emit_label(C));
- /* end of function */
- sljit_emit_return(C, SLJIT_MOV, SLJIT_RETURN_REG, 0);
- /* Generate machine code */
- code = sljit_generate_code(C);
- len = sljit_get_generated_code_size(C);
- /* Execute code */
- func = (func3_t)code;
- printf("func return %ld\n", func(a, b, c));
- dump_code(code, len);
- /* Clean up */
- sljit_free_compiler(C);
- sljit_free_code(code);
- return 0;
- }
This SLJIT code generates a simple function whose C equivalent is:
- sljit_sw func(sljit_sw a, sljit_sw b, sljit_sw c)
- {
- if ((a & 1) == 0)
- return c;
- return b;
- }
Disassembling the machine code that SLJIT generated gives the following output:
- Disassembly of section .data:
- 0000000000000000 <.data>:
- 0: f3 0f 1e fa endbr64
- 4: 53 push %rbx
- 5: 41 57 push %r15
- 7: 41 56 push %r14
- 9: 48 8b df mov %rdi,%rbx
- c: 4c 8b fe mov %rsi,%r15
- f: 4c 8b f2 mov %rdx,%r14
- 12: 48 89 d8 mov %rbx,%rax
- 15: 48 83 e0 01 and $0x1,%rax
- 19: 48 83 f8 00 cmp $0x0,%rax
- 1d: 74 05 je 0x24
- 1f: 4c 89 f8 mov %r15,%rax
- 22: eb 03 jmp 0x27
- 24: 4c 89 f0 mov %r14,%rax
- 27: 41 5e pop %r14
- 29: 41 5f pop %r15
- 2b: 5b pop %rbx
- 2c: c3 ret
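This assembly is not particularly efficient, because it contains many unnecessary stack save operations (the three arguments are copied into callee-saved registers that must be pushed and popped). This is a general pain point of compiler optimization, though: even GCC, when compiling C without effective optimization and careful coding, produces quite verbose assembly.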
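1.4 Compatibility between BPF and eBPF
eBPF (extended Berkeley Packet Filter) is an extension of BPF (Berkeley Packet Filter). The two share the same core concepts, but eBPF offers more functionality and more flexibility. Some details of their relationship and compatibility:
Basic compatibility:
- eBPF is a functional superset of classic BPF: anything a classic BPF filter can express can also be expressed in eBPF. The instruction encodings differ, however, so classic BPF bytecode is not itself valid eBPF bytecode.
- eBPF extends the BPF instruction set with new instructions and features. On Linux, compatibility with the classic instruction set is preserved by translating classic programs into eBPF inside the kernel.
Instruction set and features:
- eBPF introduces more registers (ten general-purpose 64-bit registers plus a read-only frame pointer), whereas classic BPF is 32-bit, with an accumulator A and an index register X.
- eBPF supports more complex data structures (such as maps), whereas classic BPF is mostly limited to packet filtering.
- eBPF programs can attach to many kinds of kernel hook points, whereas BPF is used mainly for packet capture and filtering.
Backward compatibility:
- The Linux kernel provides backward compatibility for eBPF, meaning old BPF programs still run on new kernels, although they may not be able to use all the new features eBPF provides.
eBPF programs are usually written in a restricted C dialect, which is far easier to use than raw BPF bytecode. The overall framework is more complex, though, and requires dedicated tooling such as Clang for compilation and development.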
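The difference in encoding is easy to see by putting the two instruction layouts side by side. The sketch below declares both (the struct names are local stand-ins; the field layouts follow the classic bpf_insn and the Linux eBPF instruction format) and rewrites one classic instruction, `ld #k`, as the eBPF 32-bit `mov` with an immediate. It is only an illustration of why a translation step is needed, not the kernel's converter, and the destination register is left to the caller.

```c
#include <stdint.h>

/* Classic BPF instruction: opcode, two jump offsets, one 32-bit operand. */
struct cbpf_insn {
    uint16_t code;
    uint8_t  jt;
    uint8_t  jf;
    uint32_t k;
};

/* eBPF instruction: also 8 bytes, but with register fields and one offset. */
struct ebpf_insn {
    uint8_t code;
    uint8_t dst_reg : 4;
    uint8_t src_reg : 4;
    int16_t off;
    int32_t imm;
};

#define CBPF_LD_IMM  0x00  /* BPF_LD  | BPF_IMM         (classic) */
#define EBPF_MOV32_K 0xb4  /* BPF_ALU | BPF_MOV | BPF_K (eBPF)    */

/* Illustrative only: rewrite classic "ld #k" as eBPF "mov32 dst, k".
 * Returns 0 on success, -1 if `in` is not "ld #k" or dst_reg is invalid. */
static int translate_ld_imm(const struct cbpf_insn *in, uint8_t dst_reg,
                            struct ebpf_insn *out)
{
    if (in->code != CBPF_LD_IMM || dst_reg > 10)
        return -1;
    out->code = EBPF_MOV32_K;
    out->dst_reg = dst_reg & 0x0f;
    out->src_reg = 0;
    out->off = 0;
    out->imm = (int32_t)in->k;
    return 0;
}
```

Both structures occupy 8 bytes, yet the same bytes mean different things in each format, which is why a classic filter has to be rewritten instruction by instruction rather than loaded as-is.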
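1.5 Differences and connections among common BPF technologies
BPFJIT is typically implemented in the kernel and involves three kinds of operations:
- bpf_filter: the standard Unix packet filtering operation, which interprets BPF bytecode directly.
- bpf_validate: verifies that BPF bytecode is well formed, guarding against infinite loops and invalid logic.
- bpfjit_generate_code: uses SLJIT to generate machine code for the BPF bytecode just in time.
The overall flow is: a filter expression is compiled into BPF bytecode, the bytecode is validated, and it is then either interpreted by bpf_filter or compiled to machine code by bpfjit_generate_code and called directly.
The libpcap library can translate common tcpdump expressions into BPF bytecode, enabling flexible packet capture. Programs developed in user space can use the same technique. Chapter 2 uses BPF to build a Tcpdump-mini program that performs the three BPF operations above and reports comparison data.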
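To give a feel for what validation involves, here is a minimal sketch of two of the structural checks. Classic BPF jump offsets are unsigned, so every jump goes forward; if each target also stays inside the program and the last instruction is a return, the program cannot loop and cannot run off the end. The names mini_insn and mini_validate are hypothetical, and the real bpf_validate checks considerably more (opcode validity, scratch memory indices, division by a zero constant, and so on).

```c
#include <stddef.h>
#include <stdint.h>

struct mini_insn { uint16_t code; uint8_t jt; uint8_t jf; uint32_t k; };

#define MINI_CLASS(code) ((code) & 0x07)
#define MINI_JMP 0x05   /* BPF_JMP class */
#define MINI_RET 0x06   /* BPF_RET class */
#define MINI_JA  0x00   /* BPF_OP value of the unconditional jump */

/* The "icmp" filter shown in the capture demo, as a sample input. */
static const struct mini_insn icmp_filter[6] = {
    { 0x28, 0, 0, 0x0c },  { 0x15, 0, 3, 0x800 },
    { 0x30, 0, 0, 0x17 },  { 0x15, 0, 1, 0x01 },
    { 0x06, 0, 0, 0x2000 },{ 0x06, 0, 0, 0x00 },
};

/* Returns 1 if every jump lands inside the program and the program
 * ends with a return instruction, otherwise 0. */
static int mini_validate(const struct mini_insn *prog, size_t n)
{
    if (prog == NULL || n == 0)
        return 0;

    for (size_t pc = 0; pc < n; pc++) {
        const struct mini_insn *in = &prog[pc];
        size_t room = n - pc - 1;  /* instructions after this one */

        if (MINI_CLASS(in->code) != MINI_JMP)
            continue;
        if ((in->code & 0xf0) == MINI_JA) {
            if (in->k >= room)     /* target is pc + 1 + k */
                return 0;
        } else if (in->jt >= room || in->jf >= room) {
            return 0;              /* targets are pc + 1 + jt / jf */
        }
    }
    return MINI_CLASS(prog[n - 1].code) == MINI_RET;
}
```

`mini_validate(icmp_filter, 6)` accepts the sample program; changing any jf so that it points past the last instruction makes it fail.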
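2. A simple packet capture program (Tcpdump-mini)
2.1 Getting the SLJIT and BPFJIT sources
SLJIT source: zherczeg/sljit: Platform independent low-level JIT compiler (github.com)
BPFJIT source: alnsn/bpfjit: Just-in-Time compilation of bpf (github.com)
On Ubuntu, create a clean directory. The GNU development toolchain must be installed (install whatever is missing with apt):
- # For example, install mk-configure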
- sudo apt install mk-configure
Download the BPFJIT source first, then the SLJIT source:
- ubuntu->bpf-sop:$ git clone https://github.com/alnsn/bpfjit.git
- Cloning into 'bpfjit'...
- remote: Enumerating objects: 1092, done.
- remote: Total 1092 (delta 0), reused 0 (delta 0), pack-reused 1092
- Receiving objects: 100% (1092/1092), 215.03 KiB | 78.00 KiB/s, done.
- Resolving deltas: 100% (666/666), done.
- ubuntu->bpf-sop:$ git clone https://github.com/zherczeg/sljit.git
- Cloning into 'sljit'...
- remote: Enumerating objects: 6679, done.
- remote: Counting objects: 100% (6679/6679), done.
- remote: Compressing objects: 100% (1310/1310), done.
- remote: Total 6679 (delta 5411), reused 6545 (delta 5330), pack-reused 0
- Receiving objects: 100% (6679/6679), 3.99 MiB | 58.00 KiB/s, done.
- Resolving deltas: 100% (5411/5411), done.
- ubuntu->bpf-sop:$ ll
- drwxrwxr-x 9 ubuntu ubuntu 4096 Mar 28 22:45 bpfjit/
- drwxrwxr-x 7 ubuntu ubuntu 4096 Mar 28 22:50 sljit/
SLJIT keeps evolving and its API has had incompatible changes, so a matching SLJIT version has to be found first. The version pinned by BPFJIT can be used directly.
- ubuntu->bpf-sop:$ cd sljit/
- ubuntu->sljit:$ git checkout 8d536bf7c334f9e31a5cc366e5d5d8cd1cd431b1
- Note: switching to '8d536bf7c334f9e31a5cc366e5d5d8cd1cd431b1'.
- # A version from 2020
- ubuntu->sljit:$ git log
- commit 8d536bf7c334f9e31a5cc366e5d5d8cd1cd431b1 (HEAD)
- Author: Carlo Marcelo Arenas Belón <carenas@gmail.com>
- Date: Tue Aug 11 03:17:47 2020 -0700
- config: detect gcc support for fastcall (#75)
-
- haiku x86 uses gcc 2.95.2 as the system compiler and fails to build,
- because support for the fastcall calling convention was added with 3.4.
-
- detect the gcc version before enabling the attribute and while at it
- reverse the condition and refactor the surrrounding code.
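Archive the sljit source at this version and extract it into the bpfjit directory. Take care not to overwrite the existing Makefile there, otherwise the build will fail.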
- ubuntu->sljit:$ git archive --format=tar --output=sljit.tar HEAD
- ubuntu->sljit:$ cd ../bpfjit/sljit/
- ubuntu->sljit:$ mv Makefile bpf-sljit.mk
- ubuntu->sljit:$ tar -xf ../../sljit/sljit.tar
- ubuntu->sljit:$ ll
- total 56
- drwxrwxr-x 6 ubuntu ubuntu 4096 Mar 28 23:01 ./
- drwxrwxr-x 9 ubuntu ubuntu 4096 Mar 28 22:45 ../
- -rw-rw-r-- 1 ubuntu ubuntu 5627 Aug 11 2020 API_CHANGES
- -rw-rw-r-- 1 ubuntu ubuntu 44 Mar 28 22:45 bpf-sljit.mk
- drwxrwxr-x 3 ubuntu ubuntu 4096 Aug 11 2020 doc/
- -rw-rw-r-- 1 ubuntu ubuntu 11 Aug 11 2020 .gitignore
- -rw-rw-r-- 1 ubuntu ubuntu 245 Aug 11 2020 INTERNAL_CHANGES
- -rw-rw-r-- 1 ubuntu ubuntu 4290 Aug 11 2020 Makefile
- -rw-rw-r-- 1 ubuntu ubuntu 1033 Aug 11 2020 README
- drwxrwxr-x 2 ubuntu ubuntu 4096 Aug 11 2020 regex_src/
- drwxrwxr-x 2 ubuntu ubuntu 4096 Aug 11 2020 sljit_src/
- drwxrwxr-x 2 ubuntu ubuntu 4096 Aug 11 2020 test_src/
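Swap the names of Makefile and bpf-sljit.mk: sljit's own Makefile becomes sljit-self.mk, and bpf-sljit.mk is restored as Makefile. bpfjit has its own build flow, so the two are built separately.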
- ubuntu->sljit:$ mv Makefile sljit-self.mk
- ubuntu->sljit:$ mv bpf-sljit.mk Makefile
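Adjust the Makefile so that the test programs are built separately through a sub-make. SLJIT is distributed as source by default and does not build shared or static libraries.
Build SLJIT first and run its tests: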
- ubuntu->sljit:$ make -f sljit-self.mk
- mkdir -p bin
- cc -DSLJIT_CONFIG_AUTO=1 -Isljit_src -O2 -Wall -c -o bin/sljitMain.o test_src/sljitMain.c
- cc -DSLJIT_CONFIG_AUTO=1 -Isljit_src -O2 -Wall -c -o bin/sljitTest.o test_src/sljitTest.c
- cc -DSLJIT_CONFIG_AUTO=1 -Isljit_src -O2 -Wall -c -o bin/sljitLir.o sljit_src/sljitLir.c
- cc -O2 -Wall bin/sljitMain.o bin/sljitTest.o bin/sljitLir.o -o bin/sljit_test -lm -lpthread
- cc -DSLJIT_CONFIG_AUTO=1 -Isljit_src -O2 -Wall -fshort-wchar -c -o bin/regexMain.o regex_src/regexMain.c
- cc -DSLJIT_CONFIG_AUTO=1 -Isljit_src -O2 -Wall -fshort-wchar -c -o bin/regexJIT.o regex_src/regexJIT.c
- cc -O2 -Wall bin/regexMain.o bin/regexJIT.o bin/sljitLir.o -o bin/regex_test -lm -lpthread
- ubuntu->sljit:$ export LD_LIBRARY_PATH=./bin
- ubuntu->sljit:$ ./bin/sljit_test
- Pass -v to enable verbose, -s to disable this hint.
- SLJIT tests: all tests are PASSED on x86 64bit (little endian + unaligned) (with fpu)
- ubuntu->sljit:$ ./bin/regex_test
- Pass -v to enable verbose, -s to disable this hint.
- REGEX tests: all tests are PASSED on x86 64bit (little endian + unaligned)
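The tests pass, so the functionality is fine; continue with building bpfjit. The following code needs a small change to avoid a build error that aborts the build:
- # bpfjit/test/test_empty.c, line 39: add an initializer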
- struct bpf_insn dummy = {0};
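Build directly with mkcmake. If all is well the build simply succeeds; if there are problems, fix them following the compiler messages (the build uses -Werror, so a different compiler version may report new errors, which is normal).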
- ubuntu->bpfjit:$ mkcmake
- ==================================================
- all ===> sljit
- ==================================================
- all ===> sljit/sljit_src
- ==================================================
- all ===> src
- ==================================================
- all ===> test
- ==================================================
- all ===> benchmark
- cc -I ../src -I ../sljit/sljit_src/ -DSLJIT_CONFIG_AUTO=1 -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wswitch -Wshadow -Wcast-qual -Wwrite-strings -Wno-unused-parameter -Werror -c -o benchmark.o -O2 -g benchmark.c
- cc -I ../src -I ../sljit/sljit_src/ -DSLJIT_CONFIG_AUTO=1 -Wall -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wswitch -Wshadow -Wcast-qual -Wwrite-strings -Wno-unused-parameter -Werror -c -o c.o -O2 -g c.c
- cc -o bpfjit_benchmark benchmark.o c.o -L /home/ubuntu/NetBSD/bpf-sop/bpfjit/benchmark/../src -L /home/ubuntu/NetBSD/bpf-sop/bpfjit/benchmark/../sljit/sljit_src -lpcap -lbpfjit -lsljit
Install into a fixed directory:
- ubuntu->bpfjit:$ export DESTDIR=/home/ubuntu/NetBSD/bpf-sop
- ubuntu->bpfjit:$ env PREFIX=/ mkcmake install
- ==================================================
- install ===> sljit
- ==================================================
- install ===> sljit/sljit_src
- if test -n "/home/ubuntu/NetBSD/bpf-sop//lib"; then mkc_install -c -d -m 755 /home/ubuntu/NetBSD/bpf-sop//lib; fi
- mkc_install -c -o ubuntu -g ubuntu -m 644 libsljit.a /home/ubuntu/NetBSD/bpf-sop//lib/libsljit.a
- mkc_install -c -o ubuntu -g ubuntu -m 644 libsljit_pic.a /home/ubuntu/NetBSD/bpf-sop//lib/libsljit_pic.a
- mkc_install -c -o ubuntu -g ubuntu -m 644 libsljit.so.1.0.0 /home/ubuntu/NetBSD/bpf-sop//lib/libsljit.so.1.0.0
- ln -s -f libsljit.so.1.0.0 /home/ubuntu/NetBSD/bpf-sop//lib/libsljit.so.1
- ln -s -f libsljit.so.1.0.0 /home/ubuntu/NetBSD/bpf-sop//lib/libsljit.so
- ==================================================
- ......
The bpfjit unit test program can then be run:
- ubuntu->bpf-sop:$ export LD_LIBRARY_PATH=./lib
- ubuntu->bpf-sop:$ export PATH=$PATH:./bin
- ubuntu->bpf-sop:$ ./bin/bpfjit_test
- bpfjit_test: test_copx.c:139 (in test_copx_ret_A): code(&ctx, &args) == 13
- bpfjit_test: test_copx_extmem.c:96 (in test_copx_ret_mem): code(&ctx, &args) == 13
- bpfjit_test: test_copx_extmem.c:138 (in test_copx_ret_preinited_mem): code(&ctx, &args) == 3
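The three messages printed here indicate that three bpfjit unit test cases fail; they can be ignored for now.
The lib directory now contains the complete shared and static libraries. The corresponding header files are needed as well, but they were not installed here, so the installation is incomplete.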
- ubuntu->bpf-sop:$ ll lib/ -h
- -rw-r--r-- 1 ubuntu ubuntu 65K Mar 28 23:30 libbpfjit.a
- -rw-r--r-- 1 ubuntu ubuntu 65K Mar 28 23:30 libbpfjit_pic.a
- lrwxrwxrwx 1 ubuntu ubuntu 18 Mar 28 23:30 libbpfjit.so -> libbpfjit.so.1.0.0
- lrwxrwxrwx 1 ubuntu ubuntu 18 Mar 28 23:30 libbpfjit.so.1 -> libbpfjit.so.1.0.0
- -rw-r--r-- 1 ubuntu ubuntu 55K Mar 28 23:30 libbpfjit.so.1.0.0
- -rw-r--r-- 1 ubuntu ubuntu 471K Mar 28 23:30 libsljit.a
- -rw-r--r-- 1 ubuntu ubuntu 474K Mar 28 23:30 libsljit_pic.a
- lrwxrwxrwx 1 ubuntu ubuntu 17 Mar 28 23:30 libsljit.so -> libsljit.so.1.0.0
- lrwxrwxrwx 1 ubuntu ubuntu 17 Mar 28 23:30 libsljit.so.1 -> libsljit.so.1.0.0
- -rw-r--r-- 1 ubuntu ubuntu 275K Mar 28 23:30 libsljit.so.1.0.0
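2.2 Writing the tcpdump-mini program
The first file, ether-input.c, initializes a raw socket and reads raw Ethernet frames from the network interface.
- // ether-input.c: initialize the raw socket and receive raw frames.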
- extern int32_t ether_sock_init(const char *if_name);
- extern int32_t ether_recv_packet(int32_t sock, char *buffer, int32_t len);
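The bodies of these two functions are not shown in the article. One plausible way to implement them on Linux is an AF_PACKET raw socket bound to the interface, as sketched below. The signatures come from the declarations above; everything inside the functions is an assumption, and opening such a socket requires root or CAP_NET_RAW.

```c
#include <arpa/inet.h>        /* htons */
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/ethernet.h>     /* ETH_P_ALL */
#include <net/if.h>           /* if_nametoindex */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a raw socket bound to if_name; returns the descriptor or -1. */
int32_t ether_sock_init(const char *if_name)
{
    struct sockaddr_ll addr;
    unsigned int if_index;
    int sock;

    /* Resolve the interface first so a bad name fails without privileges. */
    if_index = if_nametoindex(if_name);
    if (if_index == 0)
        return -1;

    /* ETH_P_ALL: receive frames of every protocol. */
    sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock < 0)
        return -1;

    /* Bind so that only frames from this interface are delivered. */
    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ETH_P_ALL);
    addr.sll_ifindex = (int)if_index;
    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Receive one frame into buffer; returns its length or -1 on error. */
int32_t ether_recv_packet(int32_t sock, char *buffer, int32_t len)
{
    if (buffer == NULL || len <= 0)
        return -1;
    return (int32_t)recv(sock, buffer, (size_t)len, 0);
}
```

Each successful recv on such a socket returns one complete frame starting at the Ethernet header, which is the layout the BPF offsets in this article (ethertype at byte 12, IPv4 protocol at byte 23) assume.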
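The second file, mini-tcpdump.c, implements the capture and filtering logic: it converts the tcpdump filter argument into BPF bytecode and machine instructions, and prints information about the packets that match.
- // mini-tcpdump.c: handle arguments and filtering, print matching packet info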
- int32_t deal_tcpdump_code(pcap_t **handle, struct bpf_program *fp, const char *filter_exp);
- void print_packet_info(const char *packet);
- int32_t capture_packets(int32_t sock, struct bpf_program *fp, bpfjit_func_t code);
- int32_t main(int32_t argc, char *argv[])
- {
- int32_t sock;
- pcap_t *handle;
- struct bpf_program fp;
- bpfjit_func_t code;
- if (argc != 3) {
- fprintf(stderr, "Usage: %s <interface> <tcpdump code>\n", argv[0]);
- return -1;
- }
- sock = -1; /* -1 means "not opened"; 0 would be a valid descriptor */
- handle = NULL;
- code = NULL;
- memset(&fp, 0, sizeof(struct bpf_program));
- /* Perhaps use a pseudo-lambda function here? At least it avoids global variables */
- lambda (void, free_source, int32_t signo) {
- /* Print a notice and release resources */
- if (signo != -1) {
- printf("\nCtrl+C is pressed(sig %d), exit with 0.\n", signo);
- }
- if (fp.bf_insns) {
- pcap_freecode(&fp);
- }
- if (handle) {
- pcap_close(handle);
- }
- if (code) {
- bpfjit_free_code(code);
- }
- if (sock >= 0) {
- close(sock);
- }
- /* signo == -1 marks an error path: report failure to the shell */
- exit(signo == -1 ? 1 : 0);
- }
- /* Register the Ctrl+C signal handler */
- signal(SIGINT, free_source);
- printf("Try to dump packet from interface(%s) with filter(%s)\n", argv[1], argv[2]);
- sock = ether_sock_init(argv[1]);
- if (sock < 0) {
- fprintf(stderr, "Failed to init socket\n");
- free_source(-1);
- return -1;
- }
- /* Compile the tcpdump expression into BPF bytecode */
- if (deal_tcpdump_code(&handle, &fp, argv[2]) != 0) {
- fprintf(stderr, "Failed to deal tcpdump code\n");
- free_source(-1);
- return -1;
- }
- /* Validate the BPF bytecode */
- if (bpf_validate(fp.bf_insns, fp.bf_len) == 0) {
- fprintf(stderr, "Failed to validate bpf code\n");
- free_source(-1);
- return -1;
- }
- /* Compile the BPF bytecode into machine code */
- code = bpfjit_generate_code(NULL, fp.bf_insns, fp.bf_len);
- if (code == 0) {
- fprintf(stderr, "Failed to compile bpf code\n");
- free_source(-1);
- return -1;
- }
- /* Capture packets in a loop until finished */
- if (capture_packets(sock, &fp, code) != 0) {
- fprintf(stderr, "Failed to capture packets\n");
- free_source(-1);
- return -1;
- }
- return 0;
- }
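The main function of mini-tcpdump performs the following steps:
- It takes two parameters, argc and argv: argc is the number of command-line arguments and argv is an array of pointers to the argument strings.
- It checks that exactly 3 command-line arguments were given. If not, it prints the usage message and returns -1 to indicate failure.
- It initializes its variables: the integer sock, the pcap_t pointer handle, the struct bpf_program variable fp, and code, a function pointer of type bpfjit_func_t. These are used in the steps that follow.
- It defines a pseudo-lambda function, free_source, that releases resources. The function is invoked when the program receives SIGINT (the user pressed Ctrl+C). It prints a notice and releases everything allocated so far: the bytecode in fp, the resources behind handle, the machine code in code, and the file descriptor sock.
- It registers free_source as the SIGINT handler, so that resources are released when the user presses Ctrl+C.
- It prints a message stating that packets will be captured from the given interface and filtered with the given filter.
- It calls ether_sock_init to initialize a socket and stores the returned file descriptor in sock. On failure it prints an error, calls free_source to release resources, and returns -1.
- It calls deal_tcpdump_code to compile the given tcpdump expression into BPF bytecode, storing the results in handle and fp. Failure is handled the same way.
- It calls bpf_validate to check that the BPF bytecode is valid. Failure is handled the same way.
- It calls bpfjit_generate_code to compile the BPF bytecode into machine code and stores the result in code. Failure is handled the same way.
- Finally, it calls capture_packets to capture packets in a loop until the end. If capturing fails, it prints an error, calls free_source, and returns -1.
- If every step succeeds, the function returns 0.
The purpose of this function is to capture packets from the given network interface and filter them according to the given tcpdump expression. It combines library functions with a few custom helpers and handles the error cases along the way. Note that free_source itself ends with exit(), so the return -1 that follows each call is only a fallback.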
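2.3 Comparing the overhead of bpf_filter and bpfjit
Besides verifying that BPF filtering works, this section also makes a simple comparison between interpreting the BPF bytecode in a function and executing the JIT-compiled code:
- /* Return a monotonic timestamp in nanoseconds */
- static inline int64_t get_current_time(void)
- {
- struct timespec ts;
- clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
- return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
- }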
- int32_t capture_packets(int32_t sock, struct bpf_program *fp, bpfjit_func_t code)
- {
- int32_t ret, temp;
- int64_t start_time, end_time;
- int64_t filtered_count, captured_count;
- int64_t func_cost_time, jit_cost_time;
- char buffer[PACKET_SIZE];
- func_cost_time = jit_cost_time = 0;
- filtered_count = captured_count = 0;
- printf("Start to capture packets...\n");
- while (1) {
- int32_t len = ether_recv_packet(sock, buffer, PACKET_SIZE);
- if (len < 0) {
- perror("Failed to receive packet");
- break;
- }
- /* Run the BPF filter function (interpreter) */
- start_time = get_current_time();
- ret = bpf_filter(fp->bf_insns, (const u_char *)buffer, len, len);
- end_time = get_current_time();
- func_cost_time += end_time - start_time;
- /* Run the JIT-compiled BPF filter */
- start_time = get_current_time();
- temp = jitcall(code, (const u_char *)buffer, len, len);
- end_time = get_current_time();
- jit_cost_time += end_time - start_time;
- if (temp != ret) {
- fprintf(stderr,
- "Warning, Result of executing bpf jit code is not equal to filter func: %d -> "
- "%d.\n",
- ret, temp);
- return -1;
- }
- if (ret == 0) {
- filtered_count++;
- continue;
- }
- captured_count++;
- /* Print information about the captured packet */
- printf("[%ld]Packet captured ! Bypass %ld, Time avg cost: %ld ns(func) - %ld ns(jit).\n",
- captured_count, filtered_count, func_cost_time / (captured_count + filtered_count),
- jit_cost_time / (captured_count + filtered_count));
- print_packet_info(buffer);
- }
- return -1;
- }
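This function captures network packets and runs both the BPF filter function and the JIT-compiled code on each of them:
- It declares its variables: ret, temp, start_time, end_time, filtered_count, captured_count, func_cost_time, jit_cost_time, and buffer.
- It initializes func_cost_time and jit_cost_time to 0; they accumulate the time spent in the BPF filter function and in the JIT-compiled code. The two packet counters are zeroed as well.
- It prints a message saying that capturing has started.
- It enters an infinite loop that keeps receiving and processing packets.
- Inside the loop it calls ether_recv_packet to receive one packet into buffer. If receiving fails, it prints an error and leaves the loop.
- It runs the BPF filter function by calling bpf_filter with the filter instructions, the packet, and the packet length, recording the start and end times to compute the execution time.
- It runs the JIT-compiled code by calling jitcall with the compiled code, the packet, and the packet length, again recording the start and end times.
- It compares the two return values. If they differ, it prints a warning and returns -1.
- If the return value is 0, the packet was filtered out: filtered_count is incremented and the loop continues.
- If the return value is not 0, the packet matches the filter: captured_count is incremented and the packet information is printed.
- The loop runs until an error occurs or it is stopped manually.
- At the end the function returns -1, indicating that capturing ended with an error.
The purpose of this function is to capture packets and run them through both the interpreted and the JIT-compiled BPF filter, which provides both the filtering itself and the timing comparison.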
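One caveat when reading the averages: each measurement brackets a single call with two clock reads, so the reported figures include the cost of reading the clock. A minimal sketch for estimating that cost is shown below. The function names are hypothetical, and it uses CLOCK_MONOTONIC rather than CLOCK_MONOTONIC_RAW purely for portability, which is an assumption that differs from the code above.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic timestamp in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Average gap between two back-to-back clock reads, in nanoseconds.
 * This approximates the fixed cost added to every timed call. */
static int64_t timer_overhead_ns(int iters)
{
    int64_t total = 0;

    if (iters <= 0)
        return 0;
    for (int i = 0; i < iters; i++) {
        int64_t t0 = now_ns();
        int64_t t1 = now_ns();

        total += t1 - t0;
    }
    return total / iters;
}
```

Subtracting `timer_overhead_ns(100000)` from both averages gives a fairer picture of the ratio between the interpreter and the JIT-compiled code, since the fixed cost weighs more heavily on the faster of the two.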
2.4 Demonstration
First, capture some ICMP packets:
- onceday->bpf-sop:# export LD_LIBRARY_PATH=./lib
- onceday->bpf-sop:# ./mini-tcpdump eth0 "icmp"
- Try to dump packet from interface(eth0) with filter(icmp)
- BPF bytecode length: 6
- BPF bytecode:
- 28, 00, 00, 0c
- 15, 00, 03, 800
- 30, 00, 00, 17
- 15, 00, 01, 01
- 06, 00, 00, 2000
- 06, 00, 00, 00
- (000) ldh [12]
- (001) jeq #0x800 jt 2 jf 5
- (002) ldb [23]
- (003) jeq #0x1 jt 4 jf 5
- (004) ret #8192
- (005) ret #0
- Start to capture packets...
- [1]Packet captured ! Bypass 20, Time avg cost: 932 ns(func) - 204 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.128.17 -> 10.0.4.17, Proto: 1, Total Length: 28.
- [2]Packet captured ! Bypass 20, Time avg cost: 895 ns(func) - 197 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17 -> 169.254.128.17, Proto: 1, Total Length: 28.
- ^C
- Ctrl+C is pressed(sig 2), exit with 0.
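The raw dump and the disassembly in this output describe the same six instructions. Each raw row is code, jt, jf, k in hexadecimal, and jt/jf are offsets relative to the next instruction, while the disassembly prints absolute targets. The sketch below reproduces that mapping; raw_insn and jump_target are hypothetical names used only here.

```c
#include <stddef.h>
#include <stdint.h>

struct raw_insn { uint16_t code; uint8_t jt; uint8_t jf; uint32_t k; };

/* The "icmp" filter exactly as printed in the raw dump (hex values). */
static const struct raw_insn icmp_filter[6] = {
    { 0x28, 0x00, 0x00, 0x0c },    /* (000) ldh [12]            */
    { 0x15, 0x00, 0x03, 0x800 },   /* (001) jeq #0x800          */
    { 0x30, 0x00, 0x00, 0x17 },    /* (002) ldb [23]            */
    { 0x15, 0x00, 0x01, 0x01 },    /* (003) jeq #0x1            */
    { 0x06, 0x00, 0x00, 0x2000 },  /* (004) ret #8192 (0x2000)  */
    { 0x06, 0x00, 0x00, 0x00 },    /* (005) ret #0              */
};

/* Absolute index reached from instruction `pc` when the condition
 * is true (taken != 0) or false (taken == 0). */
static size_t jump_target(const struct raw_insn *prog, size_t pc, int taken)
{
    return pc + 1 + (taken ? prog[pc].jt : prog[pc].jf);
}
```

For the row `15, 00, 03, 800` at index 1 this yields targets 2 and 5, matching `jt 2 jf 5` in the disassembly.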
The expression can be more complex (capture TCP packets on port 80 to or from host 169.254.0.4, or ICMP packets):
- onceday->bpf-sop:# export LD_LIBRARY_PATH=./lib
- onceday->bpf-sop:# ./mini-tcpdump eth0 "(tcp port 80 and host 169.254.0.4) or icmp"
- Try to dump packet from interface(eth0) with filter((tcp port 80 and host 169.254.0.4) or icmp)
- BPF bytecode length: 50
- BPF bytecode:
- 28, 00, 00, 0c
- 15, 00, 06, 86dd
- 30, 00, 00, 14
- 15, 00, 04, 06
- 28, 00, 00, 36
- 15, 0e, 00, 50
- 28, 00, 00, 38
- 15, 0c, 00, 50
- 28, 00, 00, 0c
- 15, 00, 22, 800
- 30, 00, 00, 17
- 15, 00, 20, 06
- 28, 00, 00, 14
- 45, 1e, 00, 1fff
- b1, 00, 00, 0e
- 48, 00, 00, 0e
- 15, 03, 00, 50
- b1, 00, 00, 0e
- 48, 00, 00, 10
- 15, 00, 18, 50
- 28, 00, 00, 0c
- 15, 00, 02, 800
- 20, 00, 00, 1a
- 15, 18, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 02, 800
- 20, 00, 00, 1e
- 15, 14, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 02, 806
- 20, 00, 00, 1c
- 15, 10, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 02, 806
- 20, 00, 00, 26
- 15, 0c, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 02, 8035
- 20, 00, 00, 1c
- 15, 08, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 02, 8035
- 20, 00, 00, 26
- 15, 04, 00, a9fe0004
- 28, 00, 00, 0c
- 15, 00, 03, 800
- 30, 00, 00, 17
- 15, 00, 01, 01
- 06, 00, 00, 2000
- 06, 00, 00, 00
- (000) ldh [12]
- (001) jeq #0x86dd jt 2 jf 8
- (002) ldb [20]
- (003) jeq #0x6 jt 4 jf 8
- (004) ldh [54]
- (005) jeq #0x50 jt 20 jf 6
- (006) ldh [56]
- (007) jeq #0x50 jt 20 jf 8
- (008) ldh [12]
- (009) jeq #0x800 jt 10 jf 44
- (010) ldb [23]
- (011) jeq #0x6 jt 12 jf 44
- (012) ldh [20]
- (013) jset #0x1fff jt 44 jf 14
- (014) ldxb 4*([14]&0xf)
- (015) ldh [x + 14]
- (016) jeq #0x50 jt 20 jf 17
- (017) ldxb 4*([14]&0xf)
- (018) ldh [x + 16]
- (019) jeq #0x50 jt 20 jf 44
- (020) ldh [12]
- (021) jeq #0x800 jt 22 jf 24
- (022) ld [26]
- (023) jeq #0xa9fe0004 jt 48 jf 24
- (024) ldh [12]
- (025) jeq #0x800 jt 26 jf 28
- (026) ld [30]
- (027) jeq #0xa9fe0004 jt 48 jf 28
- (028) ldh [12]
- (029) jeq #0x806 jt 30 jf 32
- (030) ld [28]
- (031) jeq #0xa9fe0004 jt 48 jf 32
- (032) ldh [12]
- (033) jeq #0x806 jt 34 jf 36
- (034) ld [38]
- (035) jeq #0xa9fe0004 jt 48 jf 36
- (036) ldh [12]
- (037) jeq #0x8035 jt 38 jf 40
- (038) ld [28]
- (039) jeq #0xa9fe0004 jt 48 jf 40
- (040) ldh [12]
- (041) jeq #0x8035 jt 42 jf 44
- (042) ld [38]
- (043) jeq #0xa9fe0004 jt 48 jf 44
- (044) ldh [12]
- (045) jeq #0x800 jt 46 jf 49
- (046) ldb [23]
- (047) jeq #0x1 jt 48 jf 49
- (048) ret #8192
- (049) ret #0
- Start to capture packets...
- [1]Packet captured ! Bypass 7, Time avg cost: 368 ns(func) - 95 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.128.6 -> 10.0.4.17, Proto: 1, Total Length: 28.
- [2]Packet captured ! Bypass 7, Time avg cost: 353 ns(func) - 94 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17 -> 169.254.128.6, Proto: 1, Total Length: 28.
- [3]Packet captured ! Bypass 27, Time avg cost: 1170 ns(func) - 284 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.128.17 -> 10.0.4.17, Proto: 1, Total Length: 28.
- [4]Packet captured ! Bypass 27, Time avg cost: 1138 ns(func) - 277 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17 -> 169.254.128.17, Proto: 1, Total Length: 28.
- [5]Packet captured ! Bypass 55, Time avg cost: 1283 ns(func) - 302 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.128.6 -> 10.0.4.17, Proto: 1, Total Length: 28.
- [6]Packet captured ! Bypass 55, Time avg cost: 1264 ns(func) - 298 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17 -> 169.254.128.6, Proto: 1, Total Length: 28.
- [7]Packet captured ! Bypass 68, Time avg cost: 1345 ns(func) - 324 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [8]Packet captured ! Bypass 69, Time avg cost: 1338 ns(func) - 322 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 52.
- [9]Packet captured ! Bypass 69, Time avg cost: 1325 ns(func) - 319 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 40.
- [10]Packet captured ! Bypass 69, Time avg cost: 1312 ns(func) - 315 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 221.
- [11]Packet captured ! Bypass 70, Time avg cost: 1302 ns(func) - 312 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 40.
- [12]Packet captured ! Bypass 70, Time avg cost: 1290 ns(func) - 310 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 835.
- [13]Packet captured ! Bypass 71, Time avg cost: 1278 ns(func) - 308 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 40.
- [14]Packet captured ! Bypass 72, Time avg cost: 1272 ns(func) - 306 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 256.
- [15]Packet captured ! Bypass 72, Time avg cost: 1261 ns(func) - 304 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 40.
- [16]Packet captured ! Bypass 72, Time avg cost: 1249 ns(func) - 301 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 40.
- [17]Packet captured ! Bypass 73, Time avg cost: 1227 ns(func) - 296 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/50252 -> 169.254.0.4/80, TCP(6), Total Length: 40.
- [18]Packet captured ! Bypass 74, Time avg cost: 1220 ns(func) - 295 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/50252, TCP(6), Total Length: 40.
- [19]Packet captured ! Bypass 84, Time avg cost: 1248 ns(func) - 302 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.128.17 -> 10.0.4.17, Proto: 1, Total Length: 28.
- [20]Packet captured ! Bypass 84, Time avg cost: 1238 ns(func) - 299 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17 -> 169.254.128.17, Proto: 1, Total Length: 28.
- ^C
- Ctrl+C is pressed(sig 2), exit with 0.
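Instructions (008) to (019) of the disassembly above are the IPv4 half of `tcp port 80`: they check the ethertype and protocol, reject fragments with a non-zero offset, load the IP header length into X with `ldxb 4*([14]&0xf)`, and then read the two ports relative to X. A C rendering of the same steps might look like this; the function name and the explicit length checks are additions for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14  /* Ethernet header length in bytes */

/* Returns 1 and fills both ports if pkt is an IPv4/TCP frame whose
 * fragment offset is 0; returns 0 otherwise. */
static int ipv4_tcp_ports(const uint8_t *pkt, size_t len,
                          uint16_t *sport, uint16_t *dport)
{
    size_t x;

    if (len < ETH_HDR_LEN + 20)
        return 0;
    if (pkt[12] != 0x08 || pkt[13] != 0x00)  /* (008)-(009) ethertype 0x800 */
        return 0;
    if (pkt[23] != 6)                        /* (010)-(011) protocol == TCP */
        return 0;
    if ((((unsigned)pkt[20] << 8) | pkt[21]) & 0x1fff)
        return 0;                            /* (012)-(013) fragment offset */

    x = 4u * (pkt[14] & 0x0f);               /* (014) X = IP header length */
    if (len < ETH_HDR_LEN + x + 4)
        return 0;

    *sport = (uint16_t)((pkt[x + 14] << 8) | pkt[x + 15]);  /* (015) ldh [x + 14] */
    *dport = (uint16_t)((pkt[x + 16] << 8) | pkt[x + 17]);  /* (018) ldh [x + 16] */
    return 1;
}
```

The filter then compares each port against 0x50 (80). Because the header length is read from the packet, the same code works whether or not the IP header carries options.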
Going one step further, the following tcpdump expression captures every TCP packet that has the SYN flag set:
- onceday->bpf-sop:# export LD_LIBRARY_PATH=./lib
- onceday->bpf-sop:# ./mini-tcpdump eth0 "tcp[tcpflags] & (tcp-syn) != 0"
- Try to dump packet from interface(eth0) with filter(tcp[tcpflags] & (tcp-syn) != 0)
- BPF bytecode length: 38
- BPF bytecode:
- 28, 00, 00, 0c
- 15, 00, 23, 800
- 28, 00, 00, 0c
- 15, 00, 06, 86dd
- 30, 00, 00, 14
- 15, 08, 00, 06
- 30, 00, 00, 14
- 15, 00, 02, 2c
- 30, 00, 00, 36
- 15, 04, 00, 06
- 28, 00, 00, 0c
- 15, 00, 19, 800
- 30, 00, 00, 17
- 15, 00, 17, 06
- 28, 00, 00, 14
- 45, 15, 00, 1fff
- 00, 00, 00, 0d
- 02, 00, 00, 00
- b1, 00, 00, 0e
- 60, 00, 00, 00
- 0c, 00, 00, 00
- 07, 00, 00, 00
- 50, 00, 00, 0e
- 02, 00, 00, 01
- 00, 00, 00, 02
- 02, 00, 00, 02
- 61, 00, 00, 02
- 60, 00, 00, 01
- 5c, 00, 00, 00
- 02, 00, 00, 02
- 00, 00, 00, 00
- 02, 00, 00, 03
- 61, 00, 00, 03
- 60, 00, 00, 02
- 1c, 00, 00, 00
- 15, 01, 00, 00
- 06, 00, 00, 2000
- 06, 00, 00, 00
- (000) ldh [12]
- (001) jeq #0x800 jt 2 jf 37
- (002) ldh [12]
- (003) jeq #0x86dd jt 4 jf 10
- (004) ldb [20]
- (005) jeq #0x6 jt 14 jf 6
- (006) ldb [20]
- (007) jeq #0x2c jt 8 jf 10
- (008) ldb [54]
- (009) jeq #0x6 jt 14 jf 10
- (010) ldh [12]
- (011) jeq #0x800 jt 12 jf 37
- (012) ldb [23]
- (013) jeq #0x6 jt 14 jf 37
- (014) ldh [20]
- (015) jset #0x1fff jt 37 jf 16
- (016) ld #0xd
- (017) st M[0]
- (018) ldxb 4*([14]&0xf)
- (019) ld M[0]
- (020) add x
- (021) tax
- (022) ldb [x + 14]
- (023) st M[1]
- (024) ld #0x2
- (025) st M[2]
- (026) ldx M[2]
- (027) ld M[1]
- (028) and x
- (029) st M[2]
- (030) ld #0x0
- (031) st M[3]
- (032) ldx M[3]
- (033) ld M[2]
- (034) sub x
- (035) jeq #0x0 jt 37 jf 36
- (036) ret #8192
- (037) ret #0
- Start to capture packets...
- [1]Packet captured ! Bypass 33, Time avg cost: 990 ns(func) - 208 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/33886 -> 13.107.5.93/443, TCP(6), Total Length: 60.
- [2]Packet captured ! Bypass 34, Time avg cost: 1009 ns(func) - 211 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/41920 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [3]Packet captured ! Bypass 35, Time avg cost: 1035 ns(func) - 216 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/41920, TCP(6), Total Length: 52.
- [4]Packet captured ! Bypass 41, Time avg cost: 1183 ns(func) - 214 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 13.107.5.93/443 -> 10.0.4.17/33886, TCP(6), Total Length: 52.
- [5]Packet captured ! Bypass 85, Time avg cost: 1273 ns(func) - 236 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/41926 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [6]Packet captured ! Bypass 86, Time avg cost: 1274 ns(func) - 236 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/41926, TCP(6), Total Length: 52.
- [7]Packet captured ! Bypass 311, Time avg cost: 610 ns(func) - 134 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/33902 -> 13.107.5.93/443, TCP(6), Total Length: 60.
- [8]Packet captured ! Bypass 312, Time avg cost: 612 ns(func) - 134 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 13.107.5.93/443 -> 10.0.4.17/33902, TCP(6), Total Length: 52.
- [9]Packet captured ! Bypass 343, Time avg cost: 673 ns(func) - 145 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/33906 -> 13.107.5.93/443, TCP(6), Total Length: 60.
- [10]Packet captured ! Bypass 344, Time avg cost: 677 ns(func) - 145 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 13.107.5.93/443 -> 10.0.4.17/33906, TCP(6), Total Length: 52.
- [11]Packet captured ! Bypass 400, Time avg cost: 775 ns(func) - 162 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/41930 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [12]Packet captured ! Bypass 401, Time avg cost: 779 ns(func) - 163 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/41930, TCP(6), Total Length: 52.
- [13]Packet captured ! Bypass 422, Time avg cost: 807 ns(func) - 168 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/42400 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [14]Packet captured ! Bypass 423, Time avg cost: 808 ns(func) - 168 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/42400, TCP(6), Total Length: 52.
- [15]Packet captured ! Bypass 447, Time avg cost: 835 ns(func) - 173 ns(jit).
- Ethernet: 52:54:00:85:f0:22 -> fe:ee:8f:bf:86:99, Type : 0x0800
- IP: 10.0.4.17/42402 -> 169.254.0.4/80, TCP(6), Total Length: 60.
- [16]Packet captured ! Bypass 448, Time avg cost: 837 ns(func) - 173 ns(jit).
- Ethernet: fe:ee:8f:bf:86:99 -> 52:54:00:85:f0:22, Type : 0x0800
- IP: 169.254.0.4/80 -> 10.0.4.17/42402, TCP(6), Total Length: 52.
- ^C
- Ctrl+C is pressed(sig 2), exit with 0.
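To make the disassembly above easier to follow, here is a hand-written C sketch of what the bytecode appears to check for an IPv4 Ethernet frame: the EtherType, the IP protocol, the fragment offset, and finally the SYN bit in the TCP flags byte. The function name `match_tcp_syn` is made up for illustration and is not part of libpcap or bpfjit. The instruction numbers in the comments refer to the listing above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hand-written equivalent of "tcp[tcpflags] & (tcp-syn) != 0".
 * Returns 8192 on a match and 0 otherwise, like the bytecode does. */
uint32_t match_tcp_syn(const uint8_t *pkt, size_t len)
{
    size_t ihl, flags_off;

    /* (000)-(001): the EtherType at offset 12 must be IPv4 (0x0800). */
    if (len < 14 || ((pkt[12] << 8) | pkt[13]) != 0x0800)
        return 0;
    /* (012)-(013): the IPv4 protocol field at offset 23 must be TCP (6). */
    if (len < 24 || pkt[23] != 6)
        return 0;
    /* (014)-(015): the low 13 bits at offset 20 are the fragment offset;
     * only the first fragment contains the TCP header. */
    if ((((pkt[20] << 8) | pkt[21]) & 0x1fff) != 0)
        return 0;
    /* (018): X = 4 * (pkt[14] & 0xf), the IPv4 header length in bytes. */
    ihl = 4u * (pkt[14] & 0x0f);
    /* (016)-(022): load the byte at 14 + X + 13, the TCP flags.
     * An out-of-range load makes a BPF program return 0. */
    flags_off = 14 + ihl + 13;
    if (flags_off >= len)
        return 0;
    /* (024)-(035): (flags & 0x02) must be non-zero, i.e. SYN is set. */
    return (pkt[flags_off] & 0x02) ? 8192 : 0;
}
```

Both a plain SYN (flags 0x02) and a SYN+ACK (flags 0x12) match, which agrees with the capture above, where both directions of each handshake are printed.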
2.5 mini-tcpdump source files
2.5.1 The ether-input.c file
- #define _GNU_SOURCE
- #include <stdio.h>
- #include <stdint.h>
- #include <stdlib.h>
- #include <string.h>
- #include <errno.h>
- #include <unistd.h>
- #include <sys/socket.h>
- #include <sys/ioctl.h>
- #include <net/if.h>
- #include <netinet/in.h>
- #include <linux/if_packet.h>
- #include <linux/if_ether.h>
- #define BUFFER_SIZE 65536
- extern int32_t ether_sock_init(const char *if_name);
- extern int32_t ether_recv_packet(int32_t sock, char *buffer, int32_t len);
- /**
- * @description: Initialize a raw packet socket
- * @param {char} *if_name interface name
- * @return {sock} the socket descriptor
- */
- int32_t ether_sock_init(const char *if_name)
- {
- int32_t sock;
- struct ifreq ifr;
- struct sockaddr_ll sll;
- /* Create a raw socket that receives frames of every protocol (ETH_P_ALL), not only IP */
- sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
- if (sock < 0) {
- perror("Failed to create socket");
- exit(1);
- }
- /* Look up the index of the network interface */
- memset(&ifr, 0, sizeof(ifr));
- strncpy(ifr.ifr_name, if_name, IFNAMSIZ - 1);
- if (ioctl(sock, SIOCGIFINDEX, &ifr) < 0) {
- perror("Failed to get interface index by ioctl");
- close(sock);
- exit(1);
- }
- /* Bind to the given network interface */
- memset(&sll, 0, sizeof(sll));
- sll.sll_family = AF_PACKET;
- sll.sll_ifindex = ifr.ifr_ifindex;
- sll.sll_protocol = htons(ETH_P_ALL);
- if (bind(sock, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
- perror("Failed to bind to interface");
- close(sock);
- exit(1);
- }
- return sock;
- }
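- /**
- * @description: Receive one packet
- * @param {int32_t} sock socket descriptor
- * @param {char} *buffer receive buffer
- * @param {int32_t} len size of the receive buffer
- * @return {int32_t} number of bytes received, or -1 on error
- */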
- int32_t ether_recv_packet(int32_t sock, char *buffer, int32_t len)
- {
- int32_t ret;
- ret = recvfrom(sock, buffer, len, 0, NULL, NULL);
- if (ret < 0) {
- perror("Failed to receive packet");
- return -1;
- }
- return ret;
- }
2.5.2 The mini-tcpdump.c file
- #define _GNU_SOURCE
- #include <pcap.h>
- #include <stdio.h>
- #include <stdint.h>
- #include <stdlib.h>
- #include <string.h>
- #include <signal.h>
- #include <time.h>
- #include <unistd.h>
- #include <arpa/inet.h>
- #include <netpacket/packet.h>
- #include <net/ethernet.h>
- #include <netinet/ip.h>
- #include <netinet/tcp.h>
- #include <netinet/udp.h>
- #include "bpfjit.h"
- #include "bpf-compat.h"
- /* clang-format off */
- #ifndef __COMPILING
- /* Keeps the IDE quiet; it may not recognize the nested-function syntax */
- #define lambda(ret, name, arg) ret (*name)(arg); name = NULL ; for (arg; 0;)
- #else
- #define lambda(ret, name, ...) ret name(__VA_ARGS__)
- #endif
- /* clang-format on */
- /* Invoke the JIT-compiled machine code */
- #define jitcall(func, _pkt, _wirelen, _buflen) \
- (func(NULL, &((bpf_args_t){.pkt = _pkt, .wirelen = _wirelen, .buflen = _buflen})))
- extern int32_t ether_sock_init(const char *if_name);
- extern int32_t ether_recv_packet(int32_t sock, char *buffer, int32_t len);
- #define PACKET_SIZE 65536
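- /**
- * @description: Generate BPF bytecode from a tcpdump filter expression
- * @param {pcap_t} **handle receives the pcap handle
- * @param {bpf_program *} fp receives the BPF program
- * @param {char} *filter_exp filter expression
- * @return {int32_t} 0 on success, -1 on failure
- */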
- int32_t deal_tcpdump_code(pcap_t **handle, struct bpf_program *fp, const char *filter_exp)
- {
- int i;
- bpf_u_int32 net;
- pcap_t *temp_handle;
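- /* Netmask of the capture network; 0 (unknown) is fine unless the filter tests for IPv4 broadcast */
- net = 0;
- /* Use pcap_open_dead() to create a dead pcap handle that only serves to compile the filter */
- temp_handle = pcap_open_dead(DLT_EN10MB, BUFSIZ);
- if (temp_handle == NULL) {
- /* pcap_open_dead() takes no error buffer, so there is no message to print */
- fprintf(stderr, "Couldn't create dead pcap session\n");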
- return -1;
- }
- /* Compile the BPF filter without attaching it to any capture session */
- if (pcap_compile(temp_handle, fp, filter_exp, 0, net) == -1) {
- fprintf(stderr, "Couldn't parse filter %s: %s\n", filter_exp, pcap_geterr(temp_handle));
- pcap_close(temp_handle);
- return -1;
- }
- printf("BPF bytecode length: %d\n", fp->bf_len);
- printf("BPF bytecode: \n");
- for (i = 0; i < fp->bf_len; i++) {
- printf("%02x, %02x, %02x, %02x\n", fp->bf_insns[i].code, fp->bf_insns[i].jt,
- fp->bf_insns[i].jf, fp->bf_insns[i].k);
- }
- /* Dump the BPF bytecode in readable form */
- bpf_dump(fp, 1);
- *handle = temp_handle;
- return 0;
- }
- /* Current monotonic clock reading in nanoseconds */
- static inline int64_t get_current_time(void)
- {
- struct timespec ts;
- clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
- return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
- }
- /**
- * @description: Print packet information
- * @param {char} *packet raw frame, starting at the Ethernet header
- * @return {void}
- */
- void print_packet_info(const char *packet)
- {
- struct ether_header *eth_header;
- struct ip *ip_header;
- char src_ip[INET_ADDRSTRLEN], dst_ip[INET_ADDRSTRLEN];
- struct tcphdr *tcp_header;
- struct udphdr *udp_header;
- /* Ethernet header */
- eth_header = (struct ether_header *)packet;
- printf(
- "\tEthernet: %02x:%02x:%02x:%02x:%02x:%02x -> %02x:%02x:%02x:%02x:%02x:%02x, Type : 0x%04x "
- "\n",
- eth_header->ether_shost[0], eth_header->ether_shost[1], eth_header->ether_shost[2],
- eth_header->ether_shost[3], eth_header->ether_shost[4], eth_header->ether_shost[5],
- eth_header->ether_dhost[0], eth_header->ether_dhost[1], eth_header->ether_dhost[2],
- eth_header->ether_dhost[3], eth_header->ether_dhost[4], eth_header->ether_dhost[5],
- ntohs(eth_header->ether_type));
- /* Skip anything that is not IPv4 */
- if (ntohs(eth_header->ether_type) != ETHERTYPE_IP) {
- printf("Not an IP packet\n");
- return;
- }
- ip_header = (struct ip *)(packet + sizeof(struct ether_header));
- inet_ntop(AF_INET, &ip_header->ip_src, src_ip, INET_ADDRSTRLEN);
- inet_ntop(AF_INET, &ip_header->ip_dst, dst_ip, INET_ADDRSTRLEN);
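- /* Print details per IP protocol; the L4 header starts ip_hl 32-bit words into the IP header, so IP options are skipped */
- switch (ip_header->ip_p) {
- case IPPROTO_TCP:
- tcp_header = (struct tcphdr *)((char *)ip_header + ip_header->ip_hl * 4);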
- printf("\tIP: %s/%d -> %s/%d, TCP(%d), Total Length: %d.\n", src_ip,
- ntohs(tcp_header->th_sport), dst_ip, ntohs(tcp_header->th_dport), ip_header->ip_p,
- ntohs(ip_header->ip_len));
- break;
- case IPPROTO_UDP:
- udp_header = (struct udphdr *)((char *)ip_header + ip_header->ip_hl * 4);
- printf("\tIP: %s/%d -> %s/%d, UDP(%d), Total Length: %d.\n", src_ip,
- ntohs(udp_header->uh_sport), dst_ip, ntohs(udp_header->uh_dport), ip_header->ip_p,
- ntohs(ip_header->ip_len));
- break;
- default:
- printf("\tIP: %s -> %s, Proto: %d, Total Length: %d.\n", src_ip, dst_ip, ip_header->ip_p,
- ntohs(ip_header->ip_len));
- break;
- }
- return;
- }
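- /**
- * @description: Capture loop
- * @param {int32_t} sock socket descriptor
- * @param {struct bpf_program} *fp BPF program for the interpreted filter
- * @param {bpfjit_func_t} code JIT-compiled filter function
- * @return {int32_t} -1 when the loop ends
- */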
- int32_t capture_packets(int32_t sock, struct bpf_program *fp, bpfjit_func_t code)
- {
- int32_t ret, temp;
- int64_t start_time, end_time;
- int64_t filtered_count, captured_count;
- int64_t func_cost_time, jit_cost_time;
- char buffer[PACKET_SIZE];
- func_cost_time = jit_cost_time = 0;
- filtered_count = captured_count = 0;
- printf("Start to capture packets...\n");
- while (1) {
- int32_t len = ether_recv_packet(sock, buffer, PACKET_SIZE);
- if (len < 0) {
- perror("Failed to receive packet");
- break;
- }
- /* Run the interpreted bpf_filter() and time it */
- start_time = get_current_time();
- ret = bpf_filter(fp->bf_insns, (const u_char *)buffer, len, len);
- end_time = get_current_time();
- func_cost_time += end_time - start_time;
- /* Run the JIT-compiled filter and time it */
- start_time = get_current_time();
- temp = jitcall(code, (const u_char *)buffer, len, len);
- end_time = get_current_time();
- jit_cost_time += end_time - start_time;
- if (temp != ret) {
- fprintf(stderr,
- "Warning, Result of executing bpf jit code is not equal to filter func: %d -> "
- "%d.\n",
- ret, temp);
- return -1;
- }
- if (ret == 0) {
- filtered_count++;
- continue;
- }
- captured_count++;
- /* Print information about the captured packet */
- printf("[%ld]Packet captured ! Bypass %ld, Time avg cost: %ld ns(func) - %ld ns(jit).\n",
- captured_count, filtered_count, func_cost_time / (captured_count + filtered_count),
- jit_cost_time / (captured_count + filtered_count));
- print_packet_info(buffer);
- }
- return -1;
- }
- int32_t main(int32_t argc, char *argv[])
- {
- int32_t sock;
- pcap_t *handle;
- struct bpf_program fp;
- bpfjit_func_t code;
- if (argc != 3) {
- fprintf(stderr, "Usage: %s <interface> <tcpdump code>\n", argv[0]);
- return -1;
- }
- sock = 0;
- handle = NULL;
- code = NULL;
- memset(&fp, 0, sizeof(struct bpf_program));
- /* A pseudo-lambda (GCC nested function) at least avoids global variables */
- lambda (void, free_source, int32_t signo) {
- /* Print a notice and release resources */
- if (signo != -1) {
- printf("\nCtrl+C is pressed(sig %d), exit with 0.\n", signo);
- }
- if (fp.bf_insns) {
- pcap_freecode(&fp);
- }
- if (handle) {
- pcap_close(handle);
- }
- if (code) {
- bpfjit_free_code(code);
- }
- if (sock) {
- close(sock);
- }
- exit(0);
- }
- /* Register the Ctrl+C (SIGINT) handler */
- signal(SIGINT, free_source);
- printf("Try to dump packet from interface(%s) with filter(%s)\n", argv[1], argv[2]);
- sock = ether_sock_init(argv[1]);
- if (sock < 0) {
- fprintf(stderr, "Failed to init socket\n");
- free_source(-1);
- return -1;
- }
- /* Compile the tcpdump expression into BPF bytecode */
- if (deal_tcpdump_code(&handle, &fp, argv[2]) != 0) {
- fprintf(stderr, "Failed to deal tcpdump code\n");
- free_source(-1);
- return -1;
- }
- /* Validate the BPF bytecode */
- if (bpf_validate(fp.bf_insns, fp.bf_len) == 0) {
- fprintf(stderr, "Failed to validate bpf code\n");
- free_source(-1);
- return -1;
- }
- /* Compile the BPF bytecode into machine code */
- code = bpfjit_generate_code(NULL, fp.bf_insns, fp.bf_len);
- if (code == NULL) {
- fprintf(stderr, "Failed to compile bpf code\n");
- free_source(-1);
- return -1;
- }
- /* Capture packets in a loop until interrupted */
- if (capture_packets(sock, &fp, code) != 0) {
- fprintf(stderr, "Failed to capture packets\n");
- free_source(-1);
- return -1;
- }
- return 0;
- }
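The hex dump printed by deal_tcpdump_code() shows the four fields of each instruction (code, jt, jf, k), while bpf_dump() prints absolute jump targets. As a small sketch of how the two relate: for a conditional jump, jt and jf are offsets relative to the next instruction, so the absolute target is pc + 1 + offset. The struct and helper names below are hypothetical stand-ins so that the example builds without libpcap; the field order and the constants are assumed to mirror struct bpf_insn, BPF_CLASS(), BPF_OP(), BPF_JMP and BPF_JA from the BPF headers.

```c
#include <stdint.h>

/* Stand-in with the same field order as struct bpf_insn. */
struct insn {
    uint16_t code; /* opcode */
    uint8_t jt;    /* offset taken when the condition is true */
    uint8_t jf;    /* offset taken when the condition is false */
    uint32_t k;    /* immediate operand */
};

/* Non-zero for conditional jumps: class BPF_JMP (0x05) with an
 * operation other than BPF_JA (0x00), which jumps by k instead. */
int is_cond_jump(const struct insn *in)
{
    return (in->code & 0x07) == 0x05 && (in->code & 0xf0) != 0x00;
}

/* Absolute target when the condition holds. */
unsigned jump_true(unsigned pc, const struct insn *in)
{
    return pc + 1 + in->jt;
}

/* Absolute target when the condition does not hold. */
unsigned jump_false(unsigned pc, const struct insn *in)
{
    return pc + 1 + in->jf;
}
```

For the dump line `15, 00, 23, 800` at index 1, this gives a true target of 2 and a false target of 1 + 1 + 0x23 = 37, matching `(001) jeq #0x800 jt 2 jf 37` in the SYN filter output.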
2.5.3 The Makefile
- CC=gcc
- TARGET=mini-tcpdump
- .PHONY: all clean build
- all: clean build
- clean:
- rm -f $(TARGET)
- build: $(TARGET)
- SOURCE=mini-tcpdump.c ether-input.c
- CFLAGS=-Wall -Werror -O0 -g -D__COMPILING=1
- INCLUDE=-I./bpfjit/src -I./bpfjit/sljit/sljit_src
- LDFLAGS= -L./lib -lsljit -lbpfjit -lpcap
- mini-tcpdump: $(SOURCE)
- $(CC) -o $@ $^ $(CFLAGS) $(INCLUDE) $(LDFLAGS)
3. Summary (mini-tcpdump demo GIF)
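Judging by the results from this minimum viable product (mini-tcpdump), filtering with bpfjit performs quite well: even for the more complex expressions, the JIT-compiled filter costs well under 1 µs per packet. The measurements are as follows: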
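| Scenario | bpf filter | bpf jit |
| --- | --- | --- |
| ICMP filter | 900 ns | 200 ns |
| ICMP and TCP-80 filter | 1200 ns | 300 ns |
| TCP SYN filter | 800 ns | 160 ns |

During actual testing, the average cost per packet dropped as more packets were processed, so these timings serve as reference values only and should not be used directly as benchmark results.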
The data clearly shows that bpfjit is more efficient than bpf-filter: it takes only about 20% to 30% of the time that bpf-filter needs.
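The ratio can be checked directly from the rounded figures in the summary. The helper below is a hypothetical illustration of that arithmetic, not part of mini-tcpdump.

```c
/* Share of the interpreter's time that the JIT-compiled filter needs,
 * as a percentage. Both arguments are average costs in nanoseconds. */
double jit_cost_percent(double filter_ns, double jit_ns)
{
    return 100.0 * jit_ns / filter_ns;
}
```

With the values above, the ICMP case gives 200 / 900, about 22%; ICMP plus TCP-80 gives 300 / 1200 = 25%; and TCP SYN gives 160 / 800 = 20%.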
Once Day
I too believe that beauty returns to dust in the end, yet I cannot bear how swiftly the quiet dream passed......