New Woodcrest Node produces rx_crc_errors with Cisco W3750G-48TS-S

tluszczy
Posts: 1
Joined: 2006/12/21 12:39:12

Post by tluszczy » 2006/12/21 13:02:15

Hi all,

I just discovered a network problem with a brand-new Woodcrest node.
I did a fresh install of CentOS 4.4 on the new node, but...

During the node installation we discovered issues that cause a major performance drop on the onboard GbE controller.
Even a simple ping to another node shows a packet loss of 50% - 70%.
Further investigation shows that the NS R440 nodes report errors on eth0 (the onboard Gigabit Ethernet controller): the rx_crc_errors counter increases continually on eth0 on all 36 nodes whenever the interface speed is 1000 Mbit/s.
The ports on the switch are in auto mode, but even forcing them to a fixed 1000 Mbit/s doesn't solve the issue. Only when we set the speed on the switch to 100BaseTX do the errors stop increasing.
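
To put numbers on "continually increases", the counter can be sampled at a fixed interval. The following is only a minimal sketch: it assumes a Python 3.7+ interpreter is available on the node (newer than what CentOS 4.4 ships) and that `ethtool -S` prints the usual `name: value` lines. The helper names `parse_ethtool_stats`, `read_counter` and `watch_crc` are made up for illustration.

```python
import re
import subprocess
import time


def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     rx_crc_errors: 42"; the
        # "NIC statistics:" header has no number and is skipped.
        match = re.match(r"\s*([^:]+):\s*(-?\d+)\s*$", line)
        if match:
            stats[match.group(1).strip()] = int(match.group(2))
    return stats


def read_counter(iface, name):
    """Run `ethtool -S <iface>` and return one counter (0 if absent)."""
    out = subprocess.run(
        ["ethtool", "-S", iface], capture_output=True, text=True, check=True
    ).stdout
    return parse_ethtool_stats(out).get(name, 0)


def watch_crc(iface="eth0", interval=5.0, rounds=6):
    """Print how much rx_crc_errors grows during each interval."""
    previous = read_counter(iface, "rx_crc_errors")
    for _ in range(rounds):
        time.sleep(interval)
        current = read_counter(iface, "rx_crc_errors")
        print(f"{iface}: rx_crc_errors +{current - previous} "
              f"in {interval:.0f}s (total {current})")
        previous = current
```

Running `watch_crc("eth0")` once with the port at 1000 Mbit/s and once at 100 Mbit/s would show whether the growth really stops at the lower speed.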

Also, a quick test with another GbE switch shows that the nodes don't have this problem there. And another test with a Knoppix (Debian-based) live CD shows that a newer kernel, 2.6.18, doesn't have this issue either, even with the nodes still connected to the Cisco W3750G-48TS-S.

So it looks like either the new kernel or the e1000 driver included with it solves the problem.
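
One way to narrow this down is to record which driver version each boot environment actually loads (CentOS 4.4 versus the Knoppix live CD). This is a rough sketch, assuming Python 3.7+ and the standard `key: value` output of `ethtool -i`; the function names are hypothetical.

```python
import subprocess


def parse_driver_info(text):
    """Parse `ethtool -i` output ("key: value" lines) into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI
        # bus addresses keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_summary(iface="eth0"):
    """Return (driver, version) for an interface as reported by ethtool -i."""
    out = subprocess.run(
        ["ethtool", "-i", iface], capture_output=True, text=True, check=True
    ).stdout
    info = parse_driver_info(out)
    return info.get("driver", "unknown"), info.get("version", "unknown")
```

Comparing the result of `driver_summary("eth0")` under both kernels would show whether the e1000 version differs at all, or whether only the kernel does.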

Please provide us with either the newer e1000 driver that is included in kernel 2.6.18/2.6.19, or a complete new package that includes all relevant modules with kernel 2.6.18.

As this problem also appears during PXE boot, we also need a new vanilla PXE kernel to be able to install the nodes remotely via PXE.

Here is a log of some of the tests I have done:

Context:
- Ethernet Cisco switch WS-C3750G-48TS
- an administrative node connected from eth0 to the switch's interface gi1/0/1
- a compute node connected from eth0 (Intel Corp 631xESB/632xESB DPT LAN Controller Copper)
to the switch's interface gi1/0/7

Tests:
* step 1:
- ping from administrative node => many packets get no reply
- ping from the cisco switch => many packets get no reply
* step 2:
- show counters error gi1/0/7 => no error seen
- ethtool -S eth0 => many CRC errors on received packets
* step 3:
- set speed to 100 Mb, set duplex to full
- ping from administrative node => all packets get a reply
* step 4:
- set speed to 1 Gb, set duplex to full
- ping from administrative node => many packets get no reply
* step 5:
- set speed to auto, set duplex to auto
- ping from administrative node => many packets get no reply
* step 6:
- set no mdix
- ping from administrative node => many packets get no reply
- set mdix to auto
* step 7:
- set down_when_loop
- show interfaces status => interface gi1/0/7 is down
- set no down_when_loop
- show interfaces status => interface gi1/0/7 is up
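
The ping checks in the steps above could be compared numerically instead of as "many" versus "all". The sketch below assumes Python 3.7+ on the administrative node and the summary line printed by the Linux `ping` command; `parse_ping_loss` and `ping_loss` are illustrative names, and the host passed in is whatever compute node is under test.

```python
import re
import subprocess


def parse_ping_loss(text):
    """Return (sent, received, loss_percent) from a Linux ping summary."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", text
    )
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    # Compute the loss from the raw counts rather than trusting the
    # rounded percentage that ping prints.
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    return sent, received, loss


def ping_loss(host, count=20):
    """Ping `host` `count` times and return the parsed summary."""
    # No check=True: ping exits non-zero when replies are missing,
    # which is exactly the case of interest here.
    proc = subprocess.run(
        ["ping", "-c", str(count), host], capture_output=True, text=True
    )
    return parse_ping_loss(proc.stdout)
```

Calling `ping_loss()` after each switch setting (100 Mb, 1 Gb, auto) gives one loss figure per step that can be set side by side.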

I searched the web and this forum for this kind of issue but didn't find anything. To me it looks like a switch problem, but it can be solved by using a new kernel or a new e1000 driver.

The question is: what to do?

Any idea how to get a new e1000 driver or a new kernel?
Perhaps from the security patches?

Regards,
Thomas

tgross
Posts: 1
Joined: 2007/03/23 19:40:07

Re: New Woodcrest Node produces rx_crc_errors with Cisco W3750G-48TS-S

Post by tgross » 2007/03/23 19:49:42

Did you ever reach a resolution on this problem? My company is using Supermicro machines with the 631xESB/632xESB DPT and CentOS 4.4. eth0 always has a problem where ypmatch replies are dropped, but eth1 does not. Also, they're only dropped when a superuser process is requesting them; i.e., if you type ypmatch on the command line as a normal user, the rate they are sent at is slower. Any ideas? Thanks - Tristan
