CPU: Intel Xeon E5-2640 v3, 2.6 GHz, 8 cores, x2
Memory: 8 GB DDR4-2133 RDIMM x32, 256 GB total
Disk 1: 1.2 TB 10K RPM SAS, used as data disks, x24
Disk 2: 600 GB 10K RPM SAS, used as system disks, x2
RAID card: 2 GB cache
NIC: 2 x 10GE (SFP+), OEM
OS: SUSE 11 SP4
Linux hebda_data_33 3.0.101-77-default #1 SMP Tue Jun 14 20:33:58 UTC 2016 (a082ea6) x86_64 x86_64 x86_64 GNU/Linux
Uplink switch: Huawei 12812
NIC information:
ethtool -i p4p2
driver: bnx2x
version: 1.710.51-0
firmware-version: FFV08.07.25 bc 7.13.54
bus-info: 0000:83:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
hebda_data_33:~ # ethtool -i em1
driver: bnx2x
version: 1.710.51-0
firmware-version: FFV08.07.25 bc 7.13.54
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
hebda_data_33:~ # lspci -s 0000:83:00.1 -vvv
83:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
Subsystem: Broadcom Corporation Device 1006
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 60
Region 0: Memory at c8000000 (64-bit, prefetchable) [size=8M]
Region 2: Memory at c8800000 (64-bit, prefetchable) [size=8M]
Region 4: Memory at ca000000 (64-bit, prefetchable) [size=64K]
Expansion ROM at ca500000 [disabled] [size=512K]
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data
Not readable
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00001000
Capabilities: [ac] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr+ NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <1us, L1 <2us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
Capabilities: [13c v1] Device Serial Number f4-e9-d4-ff-fe-9d-ba-10
Capabilities: [150 v1] Power Budgeting <?>
Capabilities: [160 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [220 v1] #15
Kernel driver in use: bnx2x
Kernel modules: bnx2x
hebda_data_33:~ # lspci -s 0000:01:00.0 -vvv
01:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10)
Subsystem: Dell BCM57800 10-Gigabit Ethernet
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 40
Region 0: Memory at 95000000 (64-bit, prefetchable) [size=8M]
Region 2: Memory at 95800000 (64-bit, prefetchable) [size=8M]
Region 4: Memory at 96030000 (64-bit, prefetchable) [size=64K]
Expansion ROM at 96080000 [disabled] [size=512K]
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=8 DScale=1 PME-
Capabilities: [50] Vital Product Data
Not readable
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [a0] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00001000
Capabilities: [ac] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr+ NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <1us, L1 <2us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn+ ChkCap+ ChkEn+
Capabilities: [13c v1] Device Serial Number 18-66-da-ff-fe-65-77-0b
Capabilities: [150 v1] Power Budgeting <?>
Capabilities: [160 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 1
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [220 v1] #15
Capabilities: [300 v1] #19
Kernel driver in use: bnx2x
Kernel modules: bnx2x
hebda_data_33:~ # ethtool -S p4p2|grep dis
[0]: rx_discards: 79516
[0]: rx_phy_ip_err_discards: 0
[0]: rx_skb_alloc_discard: 28517
[1]: rx_discards: 88484
[1]: rx_phy_ip_err_discards: 0
[1]: rx_skb_alloc_discard: 27102
[2]: rx_discards: 13667973
[2]: rx_phy_ip_err_discards: 0
[2]: rx_skb_alloc_discard: 35207
[3]: rx_discards: 33056205
[3]: rx_phy_ip_err_discards: 0
[3]: rx_skb_alloc_discard: 33533
[4]: rx_discards: 13263091
[4]: rx_phy_ip_err_discards: 0
[4]: rx_skb_alloc_discard: 34748
[5]: rx_discards: 7583294
[5]: rx_phy_ip_err_discards: 0
[5]: rx_skb_alloc_discard: 32756
[6]: rx_discards: 3703892
[6]: rx_phy_ip_err_discards: 0
[6]: rx_skb_alloc_discard: 28380
[7]: rx_discards: 31746726
[7]: rx_phy_ip_err_discards: 0
[7]: rx_skb_alloc_discard: 32609
rx_discards: 103189181
rx_mf_tag_discard: 0
rx_brb_discard: 90068
rx_phy_ip_err_discards: 0
rx_skb_alloc_discard: 252852
There are no other errors.
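The per-queue counters above are easier to read as shares of the total. The following is a minimal sketch, not part of the original diagnosis: `queue_discards` is a hypothetical helper name, and it assumes the bnx2x line layout shown above (`[N]: rx_discards: VALUE` per queue, plus a bare `rx_discards: VALUE` line for the device total).

```shell
# Summarise per-queue rx_discards from `ethtool -S <iface>` output on stdin.
# Assumption: per-queue lines look like "[3]: rx_discards: 33056205" and the
# device-wide total looks like "rx_discards: 103189181", as in the paste above.
queue_discards() {
    awk '
        # Per-queue line: remember the value and add it to the running sum.
        $1 ~ /^\[[0-9]+\]:$/ && $2 == "rx_discards:" {
            q = $1; gsub(/[^0-9]/, "", q)
            per[q + 0] = $3; sum += $3
            if (q + 0 > maxq) maxq = q + 0
            next
        }
        # Device-wide total line.
        $1 == "rx_discards:" { total = $2 }
        END {
            for (i = 0; i <= maxq; i++)
                if (i in per)
                    printf "queue %d: %d (%.1f%%)\n", i, per[i], (sum ? 100 * per[i] / sum : 0)
            printf "sum of queues: %d, reported total: %d\n", sum, total
        }
    '
}
```

It would be used as `ethtool -S p4p2 | queue_discards`. With the numbers pasted above, the eight per-queue values add up to exactly the reported total of 103189181, and queues 3 and 7 together account for more than 60% of the discards, while queues 0 and 1 contribute almost nothing.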
hebda_data_23:~ # for i in `seq 1 10`; do ifconfig p4p2 | grep RX | grep overruns; sleep 1; done
RX packets:253639505018 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639552428 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639566818 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639585722 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639597202 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639610209 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639622800 errors:305619311 dropped:0 overruns:305375168 frame:244143
RX packets:253639642350 errors:305620450 dropped:0 overruns:305376307 frame:244143
RX packets:253639675509 errors:305620450 dropped:0 overruns:305376307 frame:244143
RX packets:253639723772 errors:305620471 dropped:0 overruns:305376328 frame:244143
hebda_data_23:~ # for i in `seq 1 10`; do ifconfig p4p2 | grep RX | grep overruns; sleep 1; done
RX packets:253639788669 errors:305620773 dropped:0 overruns:305376630 frame:244143
RX packets:253639812355 errors:305621201 dropped:0 overruns:305377058 frame:244143
RX packets:253639834600 errors:305621201 dropped:0 overruns:305377058 frame:244143
RX packets:253639892990 errors:305621455 dropped:0 overruns:305377312 frame:244143
RX packets:253639913026 errors:305621455 dropped:0 overruns:305377312 frame:244143
RX packets:253639919136 errors:305621455 dropped:0 overruns:305377312 frame:244143
RX packets:253639935095 errors:305622380 dropped:0 overruns:305378237 frame:244143
RX packets:253639954560 errors:305623012 dropped:0 overruns:305378869 frame:244143
RX packets:253639961150 errors:305623012 dropped:0 overruns:305378869 frame:244143
RX packets:253639971680 errors:305623012 dropped:0 overruns:305378869 frame:244143
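In both loops `frame` stays at 244143 and `errors` always equals `overruns + frame` (for example 305375168 + 244143 = 305619311), so all of the growth in `errors` comes from overruns. To see how bursty that growth is, the samples can be turned into per-interval increases. This is only a sketch: `overrun_deltas` is a made-up helper name, and it assumes the net-tools `ifconfig` line format shown above.

```shell
# Print the increase in RX overruns between consecutive ifconfig samples.
# Input on stdin: lines such as
#   RX packets:... errors:... dropped:0 overruns:305375168 frame:244143
overrun_deltas() {
    awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^overruns:/) {
                    split($i, kv, ":")            # kv[2] is the counter value
                    if (seen) print kv[2] - prev  # skip the very first sample
                    prev = kv[2]; seen = 1
                }
        }
    '
}
```

A possible invocation is `for i in $(seq 1 10); do ifconfig p4p2 | grep overruns; sleep 1; done | overrun_deltas`. For the first loop above this yields 0 for six intervals, then 1139, 0 and 21, a total of 1160 new overruns in about nine seconds; the second loop adds 2239.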
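Application: Gp DB 4.3
After the application was installed, NIC usage looked as shown in the figure below: [figure]
However, at peak times nagios reports the following error on every node in the cluster. Similar errors also showed up when the machines were running bare (without the application), but I did not get a chance to capture packets on the NIC: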
Interface 11
Active checks of the service have been disabled - only passive checks are being accepted Perform Extra Service Actions
CRITICAL 09-20-2016 10:47:51 0d 0h 11m 46s 1/1 CRIT - [p4p2] (up) MAC: f4:e9:d4:9d:cb:92, 10.00 Gbit/s, in: 262.67 MB/s, in-errors: 0.16%(!!) >= 0.1, out: 237.76 MB/s
The command the check actually runs is:
echo '<<<lnx_if:sep(58)>>>'
sed 1,2d /proc/net/dev
Overall, the error rate stays between 0.1% and 0.6% and only very rarely reaches 1%; traffic at those times ranged from roughly 20 MB/s to 200 MB/s.
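The percentage can be reproduced by hand from two snapshots of `/proc/net/dev`. The sketch below makes several assumptions: the name `rx_error_pct` is made up, and the formula (new errors divided by new packets plus new errors) is only a guess at how the nagios check defines `in-errors`. The check's own formula may differ.

```shell
# rx_error_pct IFACE BEFORE AFTER
# BEFORE and AFTER are two saved copies of /proc/net/dev taken a few seconds
# apart. Prints the RX error percentage over that interval.
# Assumed formula: 100 * d_errs / (d_packets + d_errs).
rx_error_pct() {
    awk -v ifc="$1" '
        {
            pos = index($0, ":")            # "iface: counters"; header lines have no colon
            if (pos == 0) next
            name = substr($0, 1, pos - 1); gsub(/[ \t]/, "", name)
            if (name != ifc) next
            split(substr($0, pos + 1), f, " ")
            # f[2] = RX packets, f[3] = RX errs
            if (NR == FNR) { p0 = f[2]; e0 = f[3] }            # first file
            else           { dp = f[2] - p0; de = f[3] - e0 }  # second file
        }
        END {
            if (dp + de > 0) printf "%.2f\n", 100 * de / (dp + de)
            else print "0.00"
        }
    ' "$2" "$3"
}
```

For example: `cat /proc/net/dev > before; sleep 10; cat /proc/net/dev > after; rx_error_pct p4p2 before after`. Plugging in the first and last samples of the first `ifconfig` loop above (218754 new packets, 1160 new errors) gives about 0.53% under this formula, which falls inside the range nagios reports.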