HarrisIce
207 days ago
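Thanks to a reply from an expert in another thread of mine, the NIC testing problem is solved, so I'm sharing the data here for everyone's reference.
**Test environment**
One machine with two NICs installed at the same time: an Intel X520-DA2 (10G, dual-port) and a Mellanox MCX512A (25G, dual-port), plus 32 GB of RAM (8 GB of it as hugepages). Linux is configured at boot not to schedule anything onto cores 2-7; those cores are reserved for VPP and Pktgen (and their worker cores do not overlap), to keep the measurements accurate.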
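The core reservation described above can be sanity-checked with a few lines of Python. This is only a sketch: the isolated range `2-7` and the VPP worker lcores 3 and 4 come from this post, while the Pktgen core numbers are hypothetical placeholders because the post does not list them. On a live system the isolated list can usually be read from `/sys/devices/system/cpu/isolated`; here it is hard-coded so the snippet runs anywhere.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "2-7" or "0,2-3" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_pinning(isolated, groups):
    """Return a list of human-readable problems; an empty list means the plan is consistent."""
    problems = []
    names = sorted(groups)
    for name in names:
        outside = groups[name] - isolated
        if outside:
            problems.append(f"{name} uses non-isolated cores {sorted(outside)}")
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = groups[a] & groups[b]
            if shared:
                problems.append(f"{a} and {b} overlap on cores {sorted(shared)}")
    return problems


isolated = parse_cpu_list("2-7")      # range stated in the post
groups = {
    "vpp_workers": {3, 4},            # lcores shown in the VPP output of this post
    "pktgen_workers": {5, 6},         # hypothetical: not stated in the post
}
print(check_pinning(isolated, groups) or "pinning plan looks consistent")
```

An empty result only says the plan is self-consistent on paper; it does not prove the kernel actually kept other tasks off those cores.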
The CPU is as follows.
```
Intel(R) Core(TM) i3-10105 CPU @ 3.70GHz
analyzing CPU 6: (all cores are configured identically)
driver: intel_pstate
CPUs which run at the same hardware frequency: 6
CPUs which need to have their frequency coordinated by software: 6
maximum transition latency: 4294.55 ms.
hardware limits: 800 MHz - 4.40 GHz
available cpufreq governors: performance, powersave
current policy: frequency should be within 3.20 GHz and 4.40 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency is 4.20 GHz.
# lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
0 0 0 0 0:0:0:0 yes 4400.0000 800.0000 4199.541
1 0 0 1 1:1:1:0 yes 4400.0000 800.0000 3700.000
2 0 0 2 2:2:2:0 yes 4400.0000 800.0000 3700.000
3 0 0 3 3:3:3:0 yes 4400.0000 800.0000 4199.999
4 0 0 0 0:0:0:0 yes 4400.0000 800.0000 4200.001
5 0 0 1 1:1:1:0 yes 4400.0000 800.0000 4200.000
6 0 0 2 2:2:2:0 yes 4400.0000 800.0000 4199.999
7 0 0 3 3:3:3:0 yes 4400.0000 800.0000 4200.000
```
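Since hyperthreading is left on in this test, it helps to know which logical CPUs share a physical core. The sketch below groups the `CPU` and `CORE` columns of the `lscpu -e` table above; the pairs are copied from that table, and the helper names are made up for illustration.

```python
from collections import defaultdict

# (logical CPU, physical CORE) pairs copied from the lscpu -e output above.
CPU_TO_CORE = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 0), (5, 1), (6, 2), (7, 3)]


def sibling_map(pairs):
    """Group logical CPUs by the physical core they belong to."""
    by_core = defaultdict(list)
    for cpu, core in pairs:
        by_core[core].append(cpu)
    return {core: sorted(cpus) for core, cpus in by_core.items()}


def siblings_of(cpu, pairs):
    """Return the other logical CPUs that share a physical core with `cpu`."""
    core = dict(pairs)[cpu]
    return [c for c in sibling_map(pairs)[core] if c != cpu]


for lcore in (3, 4):  # the two VPP worker lcores reported later in this post
    print(f"lcore {lcore} shares a physical core with CPU(s) {siblings_of(lcore, CPU_TO_CORE)}")
```

Read this way, lcore 4 is the hyperthread sibling of CPU 0, which is outside the reserved 2-7 range, so that worker can still compete with ordinary Linux tasks for the same physical core.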
The OS and test platform are as follows.
```
Operating System: Ubuntu 22.04.4 LTS
Kernel: Linux 5.15.0-105-generic
Architecture: x86-64
Hardware Vendor: HP
Hardware Model: HP EliteDesk 880 G6 Tower PC
```
**Traffic path**
Pktgen transmits -> Mellanox port 1 -> Intel port 1 -> VPP userspace forwarding -> Intel port 2 -> Mellanox port 2 -> Pktgen receives and counts
**Test data**
The Pktgen statistics are as follows.
```
\ Ports 0-1 of 2 <Main Page> Copyright(c) <2010-2023>, Intel Corporation
Port:Flags : 0:P------ Single 1:P------ Single
Link State : <UP-10000-FD> <UP-10000-FD> ---Total Rate---
Pkts/s Rx : 0 9,091,196 9,091,196
Tx : 14,911,104 0 14,911,104
MBits/s Rx/Tx : 0/9,543 5,818/0 5,818/9,543
Pkts/s Rx Max : 0 9,408,878 9,408,878
Tx Max : 15,058,304 0 15,058,304
Broadcast : 0 0
Multicast : 0 0
Sizes 64 : 0 575,707,178,176
65-127 : 0 0
128-255 : 0 0
256-511 : 0 0
512-1023 : 0 0
1024-1518 : 0 0
Runts/Jumbos : 0/0 0/0
ARP/ICMP Pkts : 0/0 0/0
Errors Rx/Tx : 0/0 0/0
Total Rx Pkts : 0 8,993,964,327
Tx Pkts : 14,541,116,160 0
Rx/Tx MBs : 0/9,306,314 5,756,137/0
TCP Flags : .A.... .A....
TCP Seq/Ack : 74616/74640 74616/74640
Pattern Type : abcd... abcd...
Tx Count/% Rate : Forever /100% Forever /100%
Pkt Size/Rx:Tx Burst: 64 / 64: 64 64 / 64: 64
TTL/Port Src/Dest : 64/ 1234/ 5678 64/ 1234/ 5678
Pkt Type:VLAN ID : IPv4 / UDP:0001 IPv4 / TCP:0001
802.1p CoS/DSCP/IPP : 0/ 0/ 0 0/ 0/ 0
VxLAN Flg/Grp/vid : 0000/ 0/ 0 0000/ 0/ 0
IP Destination : 1.1.1.1 192.168.0.1
Source : 192.168.1.2/24 192.168.2.2/24
MAC Destination : 94:94:26:00:00:01 ab:cd:ef:aa:bb:52
Source : ab:cd:ef:aa:bb:52 ab:cd:ef:aa:bb:53
NUMA/Vend:ID/PCI :-1/15b3:1017/0000:01:0-1/15b3:1017/0000:01:00.1
-- Pktgen 24.03.1 (DPDK 24.03.0) Powered by DPDK (pid:1690) -----------------
```
The VPP performance data is as follows.
```
Thread 1 vpp_wk_0 (lcore 3)
Time 4312.5, 10 sec internal node vector rate 0.00 loops/sec 6899106.25
vector rates in 8.2346e3, out 0.0000e0, drop 8.2346e3, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
abf-input-ip4 active 138720 35511313 0 1.24e2 255.99
dpdk-input polling 30192345449 35511318 0 2.86e5 0.00
drop active 138725 35511318 0 1.52e1 255.98
error-drop active 138725 35511318 0 6.71e0 255.98
ethernet-input active 138725 35511318 0 3.62e1 255.98
ip4-drop active 138720 35511313 0 6.85e0 255.99
ip4-input-no-checksum active 138720 35511313 0 4.15e1 255.99
ip4-lookup active 138720 35511313 0 3.07e1 255.99
ip6-input active 5 5 0 5.31e3 1.00
ip6-not-enabled active 5 5 0 1.99e3 1.00
unix-epoll-input polling 29455958 0 0 1.77e3 0.00
---------------
Thread 2 vpp_wk_1 (lcore 4)
Time 4312.5, 10 sec internal node vector rate 256.00 loops/sec 45554.75
vector rates in 9.6155e6, out 6.5761e6, drop 3.8843e5, punt 0.0000e0
Name State Calls Vectors Suspends Clocks Vectors/Call
TenGigabitEthernet3/0/1-output active 155558090 39791416116 0 4.95e0 255.79
TenGigabitEthernet3/0/1-tx active 155558090 28359013714 0 8.19e1 182.30
abf-input-ip4 active 176833510 41466399484 0 8.90e1 234.49
dpdk-input polling 13167973917 41466399488 0 9.99e1 3.15
drop active 21398772 1675106720 0 1.15e1 78.28
error-drop active 21398772 1675106720 0 7.87e0 78.28
ethernet-input active 176833513 41466399488 0 3.49e1 234.49
ip4-arp active 3708505 949377021 0 2.51e2 255.99
ip4-drop active 21398768 1675106716 0 5.87e0 78.28
ip4-input-no-checksum active 176833510 41466399484 0 2.48e1 234.49
ip4-load-balance active 159143247 40740669789 0 1.19e1 255.99
ip4-rewrite active 155434742 39791292768 0 2.10e1 255.99
ip6-input active 4 4 0 7.03e3 1.00
ip6-not-enabled active 4 4 0 1.95e3 1.00
unix-epoll-input polling 12846814 0 0 9.99e2 0.00
```
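A rough cross-check of the busy worker (`vpp_wk_1`) is to add up the `Clocks` column along the forwarding path and compare it with the cycle budget per packet. This is a back-of-the-envelope sketch under two assumptions: that `Clocks` is the average number of CPU cycles per packet for each node, and that the core runs at about 4.2 GHz as the `cpufreq` output suggests. The choice of which nodes count as "the forwarding path" is also mine.

```python
# Per-node Clocks values copied from the vpp_wk_1 table above.
FORWARDING_PATH_CLOCKS = {
    "dpdk-input": 9.99e1,
    "ethernet-input": 3.49e1,
    "ip4-input-no-checksum": 2.48e1,
    "abf-input-ip4": 8.90e1,
    "ip4-load-balance": 1.19e1,
    "ip4-rewrite": 2.10e1,
    "TenGigabitEthernet3/0/1-output": 4.95e0,
    "TenGigabitEthernet3/0/1-tx": 8.19e1,
}

CPU_HZ = 4.2e9           # assumption: clock speed taken from the cpufreq output
INPUT_RATE_PPS = 9.6155e6  # "vector rates in" reported for vpp_wk_1

# Approximate: the tx node saw fewer packets than the other nodes,
# so a plain sum slightly overstates the cost of an average packet.
cycles_per_packet = sum(FORWARDING_PATH_CLOCKS.values())
budget_per_packet = CPU_HZ / INPUT_RATE_PPS

print(f"measured path cost : {cycles_per_packet:.1f} cycles/packet")
print(f"budget at input rate: {budget_per_packet:.1f} cycles/packet")
print(f"estimated ceiling  : {CPU_HZ / cycles_per_packet / 1e6:.1f} Mpps on one core")
```

The path cost lands below the per-packet budget, which fits a worker that is close to, but not completely at, its limit. Treat the ceiling as an order-of-magnitude figure only.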
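**What was tested**
I did not reproduce the configuration of the scenario discussed above; I only applied a few simple settings to get a feel for the forwarding rate.
VPP uses DPDK input, with 6 routes, 3 ACL rules, and 1 ABF rule (equivalent to a PBR rule) configured. With this setup the sender saturated 10G line rate and the receiver saw close to 6G. Note that these are 64-byte small packets; large-packet results would not tell you much.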
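For reference, the 10G line rate for 64-byte frames and the share of traffic that made it through can be worked out as follows. The 20 bytes of per-frame overhead (preamble, start-of-frame delimiter, inter-frame gap) are standard Ethernet figures; the packet rates are the `Pkts/s` values from the Pktgen screen above, and the function name is just illustrative.

```python
ETH_OVERHEAD_BYTES = 20  # 7 B preamble + 1 B start-of-frame delimiter + 12 B inter-frame gap


def line_rate_pps(link_bps, frame_bytes):
    """Theoretical maximum packets per second for a given link speed and frame size."""
    return link_bps / ((frame_bytes + ETH_OVERHEAD_BYTES) * 8)


TX_PPS = 14_911_104  # Pktgen "Pkts/s Tx" on port 0
RX_PPS = 9_091_196   # Pktgen "Pkts/s Rx" on port 1

line = line_rate_pps(10e9, 64)
forwarded = RX_PPS / TX_PPS

print(f"10G line rate at 64 B: {line / 1e6:.2f} Mpps")
print(f"offered load         : {TX_PPS / 1e6:.2f} Mpps")
print(f"forwarded by VPP     : {RX_PPS / 1e6:.2f} Mpps ({forwarded:.0%} of offered)")
```

The instantaneous Tx counter sits a hair above the theoretical figure, which is ordinary sampling jitter in the per-second display rather than the link exceeding line rate.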
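**Comparison**
Compared with the CCR2004-16G-2S+ at 8260.6 kpps (routing, fast path), i.e. about 8.26 Mpps (figure from the vendor's official data), VPP reached about 9 Mpps. And that is pure software processing on just 2 CPU cores, with hyperthreading still enabled; with full tuning it should go a bit higher (I couldn't be bothered). This machine was just something I had on hand. Bought normally it should cost a little over 1,000 yuan, and a brand-new X520-DA2 is a little over 300 yuan, so you can get higher performance than a CCR2004 (L3 only).
**Conclusion**
Life is short and my wallet is thin, so I'm going with the RB5009.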