rsync hang my backupserver

Issues related to configuring your network
teomatto
Posts: 4
Joined: 2006/12/28 12:24:45

rsync hang my backupserver

Post by teomatto » 2006/12/28 12:37:57

Hi all,
I've set up two CentOS 4.4 servers (kernel 2.6.9-42.0.3.EL).
The first is a Samba PDC server for 20 users with 70 GB of data.
The second is a backup and FTP server. I've set up a cron job that backs up all user data via rsync over ssh.

When the job runs, the system hangs after a few gigabytes have been transferred. The only thing I can do is restart it with the power button.
I found http://bugs.centos.org/view.php?id=1149 and tried adding noapic noacpi to grub.conf, but nothing changed.

Can you help me?
Thanks, Matteo, Italy

My dmesg output is:
Linux version 2.6.9-42.0.3.EL (buildsvn@build-i386) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)) #1 Fri Oct 6 05:59:54 CDT 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001bef0000 (usable)
BIOS-e820: 000000001bef0000 - 000000001bef3000 (ACPI NVS)
BIOS-e820: 000000001bef3000 - 000000001bf00000 (ACPI data)
BIOS-e820: 000000001c000000 - 0000000020000000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
446MB LOWMEM available.
found SMP MP-table at 000f3a50
Using x86 segment limits to approximate NX protection
zapping low mappings.
On node 0 totalpages: 114416
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 110320 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.2 present.
ACPI: RSDP (v000 Nvidia ) @ 0x000f7b70
ACPI: RSDT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1bef3040
ACPI: FADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1bef30c0
ACPI: SSDT (v001 PTLTD POWERNOW 0x00000001 LTP 0x00000001) @ 0x1bef9680
ACPI: MCFG (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1bef9780
ACPI: MADT (v001 Nvidia AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x1bef95c0
ACPI: DSDT (v001 NVIDIA AWRDACPI 0x00001000 MSFT 0x0100000e) @ 0x00000000
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 15:15 APIC version 16
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
Enabling APIC mode: Flat. Using 0 I/O APICs
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
ACPI: IRQ9 used by override.
ACPI: IRQ14 used by override.
ACPI: IRQ15 used by override.
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 30000000 (gap: 20000000:c0000000)
Built 1 zonelists
Kernel command line: ro root=/dev/VolGroup00/LogVol00
mapped APIC to ffffd000 (fee00000)
Initializing CPU#0
CPU 0 irqstacks, hard=c0400000 soft=c03ff000
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 1808.214 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 448440k/457664k available (2150k kernel code, 8660k reserved, 716k data, 164k init, 0k highmem)
Calibrating delay using timer specific routine.. 3620.21 BogoMIPS (lpj=1810107)
Security Scaffold v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
There is already a security framework initialized, register_security failed.
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 078bfbff e3d3fbff 00000000 00000000
CPU: After vendor identify, caps: 078bfbff e3d3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 078bf3ff e3d3fbff 00000000 00000010
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Sempron(tm) Processor 3200+ stepping 02
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=0 pin2=-1
checking if image is initramfs... it is
Freeing initrd memory: 1120k freed
NET: Registered protocol family 16
PCI: PCI BIOS revision 3.00 entry at 0xfa7c0, last bus=3
PCI: Using MMCONFIG
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
PCI: Transparent bridge - 0000:00:10.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
ACPI: PCI Interrupt Link [LNK1] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK2] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK4] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK5] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK6] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNK7] (IRQs *5 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LNK8] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LUBA] (IRQs 5 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LUBB] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LMAC] (IRQs 5 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [LACI] (IRQs 5 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LAZA] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LPMU] (IRQs *5 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LMCI] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSMB] (IRQs 5 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [LUB2] (IRQs *5 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LIDE] (IRQs 5 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LSID] (IRQs 5 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [LFID] (IRQs 5 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
ACPI: PCI Interrupt Link [APC5] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC6] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC7] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APC8] (IRQs 16) *0, disabled.
ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APMU] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [AAZA] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0, disabled.
ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0, disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
ACPI: PCI Interrupt Link [APC7] enabled at IRQ 16
ACPI: PCI interrupt 0000:00:05.0[A] -> GSI 16 (level, low) -> IRQ 177
ACPI: PCI Interrupt Link [APCS] enabled at IRQ 23
ACPI: PCI interrupt 0000:00:0a.1[A] -> GSI 23 (level, low) -> IRQ 185
ACPI: PCI Interrupt Link [APCF] enabled at IRQ 22
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 22 (level, low) -> IRQ 193
ACPI: PCI Interrupt Link [APCL] enabled at IRQ 21
ACPI: PCI interrupt 0000:00:0b.1[B] -> GSI 21 (level, low) -> IRQ 201
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 20
ACPI: PCI interrupt 0000:00:0e.0[A] -> GSI 20 (level, low) -> IRQ 209
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 23
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 23 (level, low) -> IRQ 185
ACPI: PCI Interrupt Link [APCJ] enabled at IRQ 22
ACPI: PCI interrupt 0000:00:10.2[C] -> GSI 22 (level, low) -> IRQ 193
ACPI: PCI Interrupt Link [APCH] enabled at IRQ 21
ACPI: PCI interrupt 0000:00:14.0[A] -> GSI 21 (level, low) -> IRQ 201
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
apm: overridden by ACPI.
audit: initializing netlink socket (disabled)
audit(1167251929.368:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
Initializing Cryptographic API
ksign: Installing public key data
Loading keyring
- Added public key 43FDDC2AA0FF41CC
- User ID: CentOS (Kernel Module GPG key)
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Fan [FAN] (on)
ACPI: Processor [CPU0] (supports C1)
ACPI: Thermal Zone [THRM] (-11 C)
Real Time Clock Driver v1.12
Linux agpgart interface v0.100 (c) Dave Jones
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
divert: not allocating divert_blk for non-ethernet device lo
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
NFORCE-MCP51: IDE controller at PCI slot 0000:00:0d.0
NFORCE-MCP51: chipset revision 161
NFORCE-MCP51: not 100% native mode: will probe irqs later
NFORCE-MCP51: 0000:00:0d.0 (rev a1) UDMA133 controller
ide0: BM-DMA at 0xf400-0xf407, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xf408-0xf40f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: HL-DT-STDVD-ROM GDR8164B, ATAPI CD/DVD-ROM drive
Using cfq io scheduler
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide0...
Probing IDE interface ide2...
Probing IDE interface ide3...
Probing IDE interface ide4...
Probing IDE interface ide5...
hdc: ATAPI 48X DVD-ROM drive, 256kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
usbcore: registered new driver hiddev
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 6, 458752 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
Initializing IPsec netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
ACPI: (supports S0 S3 S4 S5)
ACPI wakeup devices:
HUB0 XVRA XVRB XVRC USB0 USB2 AZAD MMAC MMCI UAR1 UAR2
Freeing unused kernel memory: 164k freed
SCSI subsystem initialized
libata version 1.20 loaded.
sata_nv 0000:00:0e.0: version 0.8
ACPI: PCI interrupt 0000:00:0e.0[A] -> GSI 20 (level, low) -> IRQ 209
PCI: Setting latency timer of device 0000:00:0e.0 to 64
ata1: SATA max UDMA/133 cmd 0x9F0 ctl 0xBF2 bmdma 0xE000 irq 209
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xB72 bmdma 0xE008 irq 209
ata1: SATA link up 3.0 Gbps (SStatus 123)
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3468 86:3c01 87:4023 88:407f
ata1: dev 0 ATA-7, max UDMA/133, 156301488 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : sata_nv
ata2: SATA link up 3.0 Gbps (SStatus 123)
ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3468 86:3c01 87:4023 88:407f
ata2: dev 0 ATA-7, max UDMA/133, 156301488 sectors: LBA48
ata2: dev 0 configured for UDMA/133
scsi1 : sata_nv
Vendor: ATA Model: ST3808110AS Rev: 3.AA
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sda: drive cache: write back
sda: sda1 sda2
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Vendor: ATA Model: ST3808110AS Rev: 3.AA
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
SCSI device sdb: drive cache: write back
sdb: sdb1
Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 23 (level, low) -> IRQ 185
PCI: Setting latency timer of device 0000:00:0f.0 to 64
ata3: SATA max UDMA/133 cmd 0x9E0 ctl 0xBE2 bmdma 0xCC00 irq 185
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xB62 bmdma 0xCC08 irq 185
ata3: SATA link down (SStatus 0)
scsi2 : sata_nv
ata4: SATA link down (SStatus 0)
scsi3 : sata_nv
device-mapper: 4.5.0-ioctl (2005-10-04) initialised: dm-devel@redhat.com
cdrom: open failed.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
inserting floppy driver for 2.6.9-42.0.3.EL
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
forcedeth.c: Reverse Engineered nForce ethernet driver. Version 0.41.
ACPI: PCI interrupt 0000:00:14.0[A] -> GSI 21 (level, low) -> IRQ 201
PCI: Setting latency timer of device 0000:00:14.0 to 64
divert: allocating divert_blk for eth0
eth0: forcedeth.c: subsystem: 0105b:0caf bound to 0000:00:14.0
ACPI: PCI interrupt 0000:00:0b.1[B] -> GSI 21 (level, low) -> IRQ 201
ehci_hcd 0000:00:0b.1: EHCI Host Controller
ehci_hcd 0000:00:0b.1: BIOS handoff failed (160, 1010001)
ehci_hcd 0000:00:0b.1: continuing after BIOS bug...
PCI: Setting latency timer of device 0000:00:0b.1 to 64
ehci_hcd 0000:00:0b.1: irq 201, pci mem dc836000
ehci_hcd 0000:00:0b.1: new USB bus registered, assigned bus number 1
PCI: cache line size of 64 is not supported by device 0000:00:0b.1
ehci_hcd 0000:00:0b.1: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 22 (level, low) -> IRQ 193
ohci_hcd 0000:00:0b.0: OHCI Host Controller
PCI: Setting latency timer of device 0000:00:0b.0 to 64
ohci_hcd 0000:00:0b.0: irq 193, pci mem dc842000
ohci_hcd 0000:00:0b.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 8 ports detected
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
ACPI: Power Button (FF) [PWRF]
EXT3 FS on dm-0, internal journal
cdrom: open failed.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 1966072k swap on /dev/VolGroup00/LogVol01. Priority:-1 extents:1
parport0: PC-style at 0x378 [PCSPP,TRISTATE]
ip_tables: (C) 2000-2002 Netfilter core team
ip_tables: (C) 2000-2002 Netfilter core team
i2c /dev entries driver
NET: Registered protocol family 10
Disabled Privacy Extensions on device c0384d60(lo)
IPv6 over IPv4 tunneling driver
divert: not allocating divert_blk for non-ethernet device sit0
eth0: no IPv6 routers present

NedSlider
Forum Moderator
Posts: 2896
Joined: 2005/10/28 13:11:50
Location: UK

Re: rsync hang my backupserver

Post by NedSlider » 2006/12/28 15:21:30

I remember seeing that bug when it was first posted.

Something to try... not sure it will solve your problem, but has always worked well for me...

I used rsync to do backups, but rather than backing up over an ssh link (or to a remote rsync server), I simply mounted smb (samba) shares on my backup server of the remote systems to be backed up, then rsync'd locally with something like:

[code]rsync -av --modify-window=10 /mnt/data/ /backups/data[/code]

The primary reason for taking this approach was that some of the machines to be backed up were Windows boxes, so mounting smb shares locally seemed an easier solution than attempting to implement ssh or rsync daemons on the Windows platform. Of course this is only really practical if your target and server boxes are on the same lan.

Anyway, always worked flawlessly for me, so maybe worth a try?
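
The approach described above (mount the share, then rsync locally) could be scripted roughly as follows. This is only a sketch: the share name `//svr1/data`, the credentials file, and the `run` dry-run helper are assumptions made for illustration, and the `cifs` filesystem type assumes the backup server's kernel supports it. By default the function only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Hypothetical names: adjust the share, mount point, destination and credentials file.
SHARE="//svr1/data"              # assumed share name on the Samba server
MOUNTPOINT="/mnt/data"
DEST="/backups/data"
CREDS="/root/.smbcredentials"    # assumed file holding username=/password= lines

# Print the command instead of running it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Mount the share read-only, rsync it to local disk, then unmount again.
backup_share() {
    run mount -t cifs "$SHARE" "$MOUNTPOINT" -o "ro,credentials=$CREDS" || return 1
    run rsync -av --modify-window=10 "$MOUNTPOINT/" "$DEST"
    status=$?
    run umount "$MOUNTPOINT"
    return $status
}
```

Calling `backup_share` with RUN unset shows the three commands that would be run, which makes it easy to check the paths before putting the script into cron.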

teomatto
Posts: 4
Joined: 2006/12/28 12:24:45

Re: rsync hang my backupserver

Post by teomatto » 2006/12/28 15:56:28

Thanks for your help,
I will try mounting the Samba share and post the result.
I want to add some new information:

svr1 is the Samba PDC server, svr2 is the backup server. Both are identical hardware running the same OS (CentOS 4.4).

If I rsync from svr2 to pull my data from svr1, svr2 HANGS.
If I scp from svr1 to push data to svr2, svr2 HANGS.

I'm managing these machines over a VPN, and I can't find anything in the logs after svr2 is powered back up.

I have used rsync for years with other distros without problems.

svr1 connects to svr2 via ssh with a public key, so no password is needed.

Thanks again,
Matteo

NedSlider
Forum Moderator
Posts: 2896
Joined: 2005/10/28 13:11:50
Location: UK

Re: rsync hang my backupserver

Post by NedSlider » 2006/12/28 22:56:16

[quote]
teomatto wrote:
Thanks for your help,
I will try mounting the Samba share and post the result.
[/quote]

I will be interested to see how you get on :-)

I'm not convinced the fault lies with rsync, but you could always try uninstalling the rsync RPM and then compiling from the latest source. This may at least help establish if it's an rsync problem, and it's relatively quick and easy to test.

I'd also be tempted to maybe try a non-CentOS kernel like the latest vanilla kernel source.

Anyway, just some more thoughts to help you troubleshoot.

teomatto
Posts: 4
Joined: 2006/12/28 12:24:45

Re: rsync hang my backupserver

Post by teomatto » 2006/12/29 13:35:07

No luck: the system also hangs with rsync over the Samba filesystem :-( after 20 GB of transfer.
The only thing I can do now is install another distro... but I'm very worried about this because I have a lot of CentOS servers.
I can't understand it, and I'm very frustrated with this problem.
Matteo

NedSlider
Forum Moderator
Posts: 2896
Joined: 2005/10/28 13:11:50
Location: UK

Re: rsync hang my backupserver

Post by NedSlider » 2006/12/29 14:07:44

Hmm... I can't replicate that, but then I'm not transferring as much data as you. The most I'm pulling in one go is about 10GB.

[quote]
... but I'm very worried about this because I have a lot of CentOS servers.
[/quote]

Maybe you can get away with just changing the rsync backup server that is hanging.

midair77
Posts: 19
Joined: 2006/10/03 19:04:39

Re: rsync hang my backupserver

Post by midair77 » 2007/01/06 02:19:20

Ah, I've used rsync to back up hundreds of gigabytes of data, and I have to say I have seen similar problems. You might want to look at the point at which rsync died, because the file being transferred at that moment could be too big. One time I tried to rsync a file of around 8 GB and rsync stalled; I had to press Ctrl-C to stop it, though that was not as bad as your situation, where you had to reset the machine. I thought it must be something weird on my LAN, but I tried the rsync a few more times and it failed badly each time. I had to sit there and watch to see why it stopped, and found it was because of a huge file that my users had created when they ran jobs.
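
To check whether unusually large files are present in the tree being backed up, something like the following sketch could be used. The function name and both parameters are made up for illustration; the size threshold is given in KiB so that it works with older versions of find.

```shell
# List regular files under DIR that are larger than MIN_KB kibibytes,
# biggest first. Output is "size-in-KiB<TAB>path", as printed by du -k.
find_large_files() {
    dir="$1"
    min_kb="$2"
    find "$dir" -type f -size +"${min_kb}k" -exec du -k {} \; | sort -rn
}
```

For example, `find_large_files /home 4194304` would list files over 4 GiB under /home (the path is only an example).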

Another thing I would suggest looking into is the Ethernet card: could the driver for your Ethernet card be failing and hanging the whole system? What about your disks? Are there any disk errors in the log files?

You might want to look at /var/log/messages and run dmesg RIGHT AFTER THE ERROR to see whether anything weird happened. At times, dmesg at the command line shows interesting things, such as a component failing. In my experience, dmesg sometimes gives very good information about such abrupt errors.
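
When the machine has to be hard-reset, the dmesg buffer from before the hang is gone after the reboot, so the syslog file is the place to look. Below is a rough sketch that prints the lines written just before each boot. It assumes the syslog daemon logs a line containing `syslogd` and `restart` when it starts (check what your own log shows and adjust the pattern); the function name and the default of 20 lines are arbitrary.

```shell
# Print the COUNT lines that precede each syslog start-up marker in LOGFILE,
# i.e. the last messages that reached the disk before the previous shutdown or hang.
last_lines_before_boot() {
    logfile="${1:-/var/log/messages}"
    count="${2:-20}"
    grep -B "$count" 'syslogd.*restart' "$logfile"
}
```

Running `last_lines_before_boot /var/log/messages 50` shortly after the forced reboot shows whatever the kernel managed to log before the system froze, if anything.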


Cheers.

teomatto
Posts: 4
Joined: 2006/12/28 12:24:45

Re: rsync hang my backupserver MCP51

Post by teomatto » 2007/01/11 11:41:57

Hi all,
bad news...
I installed a new distribution (Ubuntu 6.10 "Edgy" Server) on the backup server SVR002 in place of CentOS 4.4.
Now I can rsync from SVR002 to SVR001 without SVR002 crashing, BUT

after 50 GB of transfer, SVR001 (CentOS) crashed.
This is the Samba PDC server on my LAN, and I can only hard-reboot it.

I believe this is a problem with the Ethernet controller (BUT ONLY ON THE CENTOS DISTRO), because I found these entries in the messages log:

Jan 11 07:01:03 svr001 kernel: nv_stop_tx: TransmitterStatus remained busyeth0: tx_timeout: dead entries!
Jan 11 08:01:48 svr001 kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 11 08:01:48 svr001 kernel: eth0: Got tx_timeout. irq: 00000000
Jan 11 08:01:48 svr001 kernel: eth0: Ring at 1af68000: next 63 nic 0
Jan 11 08:01:48 svr001 kernel: eth0: Dumping tx registers
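
To pull entries like these out of a long log, a filter along the following lines may help. It is only a sketch: the three patterns are taken from the messages quoted above and are not a complete list of what the driver can print, and the function name is made up.

```shell
# Show transmit-timeout messages of the kind quoted above.
# LOGFILE defaults to /var/log/messages.
tx_timeouts() {
    logfile="${1:-/var/log/messages}"
    grep -E 'NETDEV WATCHDOG|tx_timeout|nv_stop_tx' "$logfile"
}
```

Comparing the timestamps of the matching lines with the time the backup job started shows whether the timeouts line up with the transfer.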

The Ethernet controller is the last device (00:14.0) in this lspci output:
00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1) (prog-if 85 [Master SecO PriO])
Subsystem: nVidia Corporation: Unknown device cb84
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 209
I/O ports at 09f0 [size=8]
I/O ports at 0bf0 [size=4]
I/O ports at 0970 [size=8]
I/O ports at 0b70 [size=4]
I/O ports at e000 [size=16]
Memory at fe02d000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Capabilities: [b0] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
Capabilities: [cc] HyperTransport: MSI Mapping

00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1) (prog-if 85 [Master SecO PriO])
Subsystem: nVidia Corporation: Unknown device cb84
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 185
I/O ports at 09e0 [size=8]
I/O ports at 0be0 [size=4]
I/O ports at 0960 [size=8]
I/O ports at 0b60 [size=4]
I/O ports at cc00 [size=16]
Memory at fe02c000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2
Capabilities: [b0] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable-
Capabilities: [cc] HyperTransport: MSI Mapping

00:10.0 PCI bridge: nVidia Corporation MCP51 PCI Bridge (rev a2) (prog-if 01 [Subtractive decode])
Flags: bus master, 66Mhz, fast devsel, latency 0
Bus: primary=00, secondary=03, subordinate=03, sec-latency=32
I/O behind bridge: 00009000-00009fff
Memory behind bridge: fdd00000-fddfffff
Prefetchable memory behind bridge: fdc00000-fdcfffff
Capabilities: [b8] #0d [0000]
Capabilities: [8c] HyperTransport: MSI Mapping

00:10.2 Multimedia audio controller: nVidia Corporation MCP51 AC97 Audio Controller (rev a2)
Subsystem: Foxconn International, Inc.: Unknown device 0caf
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 193
I/O ports at c800 [size=256]
I/O ports at c400 [size=256]
Memory at fe02b000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] Power Management version 2

00:14.0 Bridge: nVidia Corporation MCP51 Ethernet Controller (rev a1)
Subsystem: Foxconn International, Inc.: Unknown device 0caf
Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 201
Memory at fe02a000 (32-bit, non-prefetchable) [size=4K]
I/O ports at c000 [size=8]
Capabilities: [44] Power Management version 2

I'll take these steps:
1) Add a dedicated D-Link network adapter for rsyncing my 80 GB of data.
2) If that doesn't solve it, I'll reinstall SVR001 with Ubuntu Server as well, replacing CentOS 4.4.

I hope this information helps someone.
Matteo

SDGathman
Posts: 5
Joined: 2007/03/22 03:38:58
Contact:

Re: ssh hangs, not just with rsync

Post by SDGathman » 2007/03/22 03:47:27

I use ssh extensively for management of remote systems. On Centos-4 (and not on RH73, RH9, Centos-3, FC4, or AIX-4.3), ssh hangs in the middle of my command session. Sometimes it is when I'm idle. Sometimes, in the middle of a directory listing. So it is no surprise that rsync would hang when used over ssh.

pjwelsh
Posts: 2618
Joined: 2007/01/07 02:18:02
Location: Central IL USA

Re: ssh hangs, not just with rsync

Post by pjwelsh » 2007/03/23 03:10:36

It sounds like you have the Nvidia forcedeth issue. Many people have better results with the official Nvidia drivers for RHEL 4u4.
