ext4 file system inconsistency

owl102
Posts: 413
Joined: 2014/06/10 19:13:41

Re: ext4 file system inconsistency

Post by owl102 » 2014/06/10 19:44:27

drk wrote: Yes, quit resetting the running VMs. Shut them down first.
Unfortunately this does not help. Even if the VMs have a partition all to themselves, the following is reproducible: unmount the partition, check it (with fsck -f, to make sure it's really clean), mount the partition again, start a VM, play around with it for a few minutes, shut down the VM properly, unmount the partition, check it again with fsck -f, and you will get inconsistency errors.
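
For anyone who wants to retrace that sequence, here is a minimal sketch as a shell script. The device /dev/sda6 is the one from the fsck output further down in this thread; the mount point /vm and the names run and repro_steps are pure assumptions for illustration, and starting and shutting down the VM stays a manual step. By default the script only prints the commands (DRY_RUN=1); it would have to be run as root with DRY_RUN=0 to actually touch the partition.

```shell
#!/bin/sh
# Sketch of the reproduction sequence. DEV and MNT are assumptions; adjust them.
DEV="${DEV:-/dev/sda6}"
MNT="${MNT:-/vm}"          # hypothetical mount point, not from the thread
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    run umount "$DEV"
    run fsck -f "$DEV"        # should come back clean before the test starts
    run mount "$DEV" "$MNT"
    echo "# now start a VM, use it for a few minutes, shut it down properly"
    if [ "$DRY_RUN" != "1" ]; then
        echo "# press Enter when the VM has been shut down"
        read -r reply
    fi
    run umount "$DEV"
    run fsck -f "$DEV"        # the inconsistencies show up in this second check
}

repro_steps
```

Printing the commands first is deliberate: fsck on the wrong device is not something one wants to run by accident.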

It's definitely not a hardware problem: it was reproducible on two different PCs using three different hard disks. Furthermore, it's independent of the guest OS: Windows, Debian, Fedora, ..., all the same.

But it's good to know that using an ext3 partition (instead of ext4) helps. However, this information comes too late for me: I have already solved the problem permanently by switching from VMware vSphere 5.1 + Workstation 9 to oVirt + Virt Manager (KVM based).
German speaking forum for Fedora and CentOS: https://www.fedoraforum.de/

TrevorH
Site Admin
Posts: 33219
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: ext4 file system inconsistency

Post by TrevorH » 2014/06/10 20:57:28

Which makes it sound like a VMware problem to me, since that was the only variable you changed.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

drk
Posts: 405
Joined: 2014/01/30 20:38:28

Re: ext4 file system inconsistency

Post by drk » 2014/06/11 06:45:39

Sure sounds like a VMware problem. Can you repeat your test using KVM instead?

owl102
Posts: 413
Joined: 2014/06/10 19:13:41

Re: ext4 file system inconsistency

Post by owl102 » 2014/06/11 13:00:37

It's definitely a VMware problem.

It took us weeks to figure that out. We even bought a completely new PC, since we first suspected a hardware problem, and were very surprised to see the same problems on the new PC, too. As already written, we were finally able to isolate the problem to VMware. If a partition that is used only for the VMs and nothing else gets corrupted after a simple "Start Debian 7 VM", "apt-get update && apt-get dist-upgrade", "Shut down Debian 7 VM" (or something similar done with a different VM), and this happens reproducibly on two different PCs (and on three different hard disks), then VMware must be the problem:

[root@xxx yyy]# fsck -f /dev/sda6
    fsck from util-linux-ng 2.17.2
    e2fsck 1.41.12 (17-May-2010)
    Pass 1: Checking inodes, blocks, and sizes
    Inode 393222, end of extent exceeds allowed value
            (logical block 1483744, physical block 537599, len 7296)
    Clear<y>? yes

    Inode 393222, i_blocks is 11928328, should be 11869960.  Fix<y>? yes

    Pass 2: Checking directory structure
    Pass 3: Checking directory connectivity
    Pass 4: Checking reference counts
    Pass 5: Checking group summary information
    Block bitmap differences:  -(537599--544894)
    Fix<y>? yes

    Free blocks count wrong for group #16 (12161, counted=19457).
    Fix<y>? yes

    Free blocks count wrong (38323887, counted=38331183).
    Fix<y>? yes

    /dev/sda6: ***** FILE SYSTEM WAS MODIFIED *****
    /dev/sda6: 1737/14508032 files (0.1% non-contiguous), 19668945/58000128 blocks
    [root@xxx yyy]#
(This was tested with RHEL 6.4 and CentOS 6.4 and VMware Workstation 9, with latest updates.)
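
The check above had to be answered by hand. For a scripted variant, the following is a minimal sketch and not something that was used in this test: e2fsck can be run read-only with -n (answer "no" to every question), and its exit status is a bit mask as documented in the e2fsck man page. The function name describe_fsck_status is made up for illustration.

```shell
#!/bin/sh
# Translate an e2fsck exit status (a bit mask) into readable text.
describe_fsck_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msg=""
    [ $(( status & 1 ))   -ne 0 ] && msg="$msg, errors corrected"
    [ $(( status & 2 ))   -ne 0 ] && msg="$msg, system should be rebooted"
    [ $(( status & 4 ))   -ne 0 ] && msg="$msg, errors left uncorrected"
    [ $(( status & 8 ))   -ne 0 ] && msg="$msg, operational error"
    [ $(( status & 16 ))  -ne 0 ] && msg="$msg, usage or syntax error"
    [ $(( status & 32 ))  -ne 0 ] && msg="$msg, canceled by user request"
    [ $(( status & 128 )) -ne 0 ] && msg="$msg, shared library error"
    # Strip the leading ", " left over from the first appended message.
    echo "${msg#, }"
}

# Usage (as root, on an unmounted partition):
#   e2fsck -f -n /dev/sda6; describe_fsck_status $?
```

With -n nothing is repaired, so a corrupted partition like the one above would be expected to report uncorrected errors rather than a modified filesystem.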

I guess this problem description (and solution) is so hard to find on the net because the CentOS installer formats ext4 partitions with a maximum mount count of 0, i.e. the filesystem will never be checked at boot time unless the clean bit is not set (and usually it is set if you shut down your CentOS 6 properly). So unless you change the maximum mount count or force a filesystem check with "fsck -f", you may never notice these problems.
drk wrote: Can you repeat your test using KVM instead?
After switching to KVM the problem went away. We still check the filesystem on every boot (since "tune2fs -c 1 /dev/sda6" was set), and the PC is booted every working day, but no problem has been reported since we switched to KVM in November 2013.
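
To see whether a partition has the periodic check disabled, the maximum mount count can be read from the superblock with tune2fs -l. A small sketch, assuming the output contains a line starting with "Maximum mount count:" as e2fsprogs prints it; the helper name max_mount_count is hypothetical. According to the tune2fs man page, a value of 0 or -1 means the mount count is disregarded.

```shell
#!/bin/sh
# Extract the maximum mount count from `tune2fs -l` output read on stdin.
max_mount_count() {
    awk -F: '/^Maximum mount count/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Usage (as root):
#   tune2fs -l /dev/sda6 | max_mount_count
#   tune2fs -c 1 /dev/sda6     # check after every mount, as set in this post
```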

Super Jamie
Posts: 310
Joined: 2014/01/10 23:44:51

Re: ext4 file system inconsistency

Post by Super Jamie » 2014/06/12 13:29:52

Is the VMware hypervisor using LSI MegaRAID SAS? There is apparently a known issue with that causing guest ext4 filesystem corruption, though I'm sorry I don't have any more specific details than that. I presume if you trawled the VMware driver changelogs you might find something.

owl102
Posts: 413
Joined: 2014/06/10 19:13:41

Re: ext4 file system inconsistency

Post by owl102 » 2014/06/12 13:53:39

Super Jamie wrote: Is the VMware hypervisor using LSI MegaRAID SAS?
No, both PCs were regular Fujitsu Esprimo PCs without a RAID controller, SAS or anything similar. Just plain SATA with a single hard drive or SSD. One PC was about 5 years old, the other one brand new.
