XFS zero files issue

IgorKorotin
Posts: 3
Joined: 2021/07/12 09:03:56

Post by IgorKorotin » 2021/07/12 09:28:52

Hi Guys

We constantly find zero-length files on our target machines. During development these machines crash a lot, and we also test them with power cycles. At some point we started to encounter zero-length files.

Googling this issue, I found that it is a widely known issue:
https://bugzilla.redhat.com/show_bug.cgi?id=845233
https://unix.stackexchange.com/question ... ytes-files
https://access.redhat.com/solutions/272673

Someone has written a script that can fix these files because, as far as he knows, only the file's metadata is corrupted while the file data is intact. He also says that this zero-length file issue is fixed in some "newer" kernel version. https://github.com/pedmon/xfs_recover

I've tried to go through the XFS changes in the kernel.org sources, but there are too many commits and I'm not sure I would recognize the relevant fix. I also suspect that this change is not in the kernel.org sources at all, because the CentOS kernel seems to be developed on a separate branch.
I tried to find the CentOS kernel repository, without success.

So the questions are the following:
1. Where can I find the CentOS kernel repo, please?
2. Is it known in which XFS and kernel version this issue is fixed?

We're using the following version of CentOS:
[root@localhost ~]# hostnamectl
Static hostname: localhost.localdomain
Pretty hostname: E59AAAAAAAA2
Icon name: computer-desktop
Chassis: desktop
Machine ID: 7bcd3f50e50a4c19ad9ae9be44a718d2
Boot ID: ca06204b2dc34c72b8ea20c427a661c6
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1062.18.1.rt56.1044.19.el7.x86_64
Architecture: x86-64

XFS version is:
[root@localhost ~]# xfs_db -r /dev/sda2
xfs_db> version
versionnum [0xb4b5+0x18a] = V5,NLINK,DIRV2,ATTR,ALIGN,LOGV2,EXTFLG,MOREBITS,ATTR2,LAZYSBCOUNT,PROJID32BIT,CRC,FTYPE

Please advise.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Post by TrevorH » 2021/07/12 14:07:02

Kernel: Linux 3.10.0-1062.18.1.rt56.1044.19.el7.x86_64
Start by running something a little less ancient. Do you even need the RT kernel? It's non-standard and gets updates far less frequently than the real one. The current CentOS 7 kernel is 3.10.0-1160.31.1.el7.x86_64 and the RT version is stuck at

[ ] kernel-rt-3.10.0-1160.6.1.rt56.1139.el7.x86_64.rpm 2020-11-17 14:19 52M

Yours is at least 2 point releases behind. In fact, it doesn't even appear to be an official CentOS RT kernel, since there is nothing with that name/version on vault.centos.org.
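
As a rough way to check how far behind a machine is, the build number after "3.10.0-" can be compared against the current kernel named above. This is only a sketch: the function names el7_build and is_older_build are made up for illustration, and it compares just the first build number (1062 vs 1160), ignoring the later fields.

```shell
# Print the build number that follows "3.10.0-" in an el7 kernel release
# string. Prints nothing if the string does not start with "3.10.0-".
el7_build() {
    printf '%s\n' "$1" | sed -n 's/^3\.10\.0-\([0-9][0-9]*\).*/\1/p'
}

# Succeed if the first release has a lower build number than the second.
# Hypothetical helper: only the first build number is compared.
is_older_build() {
    [ "$(el7_build "$1")" -lt "$(el7_build "$2")" ]
}

# Example use on the target machine:
#   is_older_build "$(uname -r)" 3.10.0-1160.31.1.el7.x86_64 && echo "kernel is behind"
```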
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

IgorKorotin
Posts: 3
Joined: 2021/07/12 09:03:56

Post by IgorKorotin » 2021/07/14 15:19:16

Hi, Trevor

Thanks for the reply.
I do understand that this kernel is old and might not be an official one. I'm not the one who decides which kernel to use.
We use CentOS in an embedded project, which is why it is RT. On top of that, we carry some proprietary kernel patches, though they do not touch the fs source folder at all.

With all due respect, my questions were different. We have the zero-length file issue, and some articles indicate that it was addressed in the past. So I wonder whether somebody can point me to the source code repository and perhaps tell me what to search for. For now I have no clue what this XFS fix looks like. I don't even know the approximate kernel version where it was fixed.
The kernel repo on kernel.org has tons of XFS commits. It's not realistic to analyse them all.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Post by TrevorH » 2021/07/14 15:39:26

The latest kernel rpm changelog has 3702 lines in it before it reaches the -1062 kernel entry. Of those 3702 lines, 124 refer to xfs. That's 124 xfs changes in the current kernel that yours does not have. Several of those 124 fixes look like they might* be applicable to you; the ones that stand out from a quick read are:

Code:

- [fs] xfs: fix inode allocation block res calculation precedence (Brian Foster) [1857203]
- [fs] xfs: Fix tail rounding in xfs_alloc_file_space() (Bill O'Donnell) [1833223]
- [fs] xfs: simplify xfs_idata_realloc (Brian Foster) [1751015]
- [fs] xfs: remove if_real_bytes (Brian Foster) [1751015]
- [fs] xfs: properly serialise fallocate against AIO+DIO (Carlos Maiolino) [1786004]
- [fs] xfs: refactor xfs_buf_log_item reference count handling (Bill O'Donnell) [1583799]
- [fs] xfs: clean up xfs_trans_brelse() (Bill O'Donnell) [1583799]
- [fs] xfs: don't ever put nlink > 0 inodes on the unlinked list (Carlos Maiolino) [1721498]
- [fs] xfs: validate allocated inode number (Carlos Maiolino) [1721498]
* I'm not an xfs programmer, so I do not know which of these might be useful. There are others in that list of 124 that could easily be your problem but are not immediately obvious from the changelog entry. You can read the full list by using

rpm -q --changelog kernel-3.10.0-1160.31.1.el7.x86_64 | head -3702 | grep xfs | less
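
The head -3702 count is specific to that one changelog. Here is a sketch that stops at the header of the old build instead. It assumes the changelog headers start with "*" and carry the version in brackets, such as [3.10.0-1062.el7]; the function name xfs_entries_until is hypothetical.

```shell
# Read an rpm changelog on stdin and print the lines mentioning xfs,
# stopping at the first header line ("* date author [version]") that
# contains the given stop string.
xfs_entries_until() {
    awk -v stop="$1" '
        /^\*/ && index($0, stop) { exit }   # reached the header of the old build
        /xfs/ { print }                     # keep only xfs changelog lines
    '
}

# Example use (assumes the 1160.31.1 kernel package is installed):
#   rpm -q --changelog kernel-3.10.0-1160.31.1.el7.x86_64 | xfs_entries_until '[3.10.0-1062.' | less
```

Piping the result through wc -l instead of less gives the count of xfs entries newer than the old build.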

The link https://access.redhat.com/solutions/272673 refers to kernels in RHEL 6.3 and lower so it's unlikely to be applicable to CentOS 7. The stackexchange link also says they were running 6.3 and the bugzilla link likewise. None of those are likely to be useful since the kernel in CentOS 7 will already have those fixes.

So rather than wander round aimlessly looking at inapplicable bugs you should update to $latest and then retest to see if the problem is fixed or not. If it is fixed then you don't need to do anything else (other than upgrade). If it isn't then you know you still have a valid bug and should report it to Red Hat via bugzilla.redhat.com
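
One way to make the retest repeatable is to record which files were non-empty before a power cycle and report any that come back empty afterwards. A minimal sketch, assuming plain file paths without newlines; the function names and the /data and /root/manifest.txt paths are placeholders, not anything from your setup.

```shell
# Print every non-empty regular file under a directory, staying on one
# filesystem. Run this before the power cycle and save the output.
record_nonempty() {
    find "$1" -xdev -type f -size +0 -print
}

# Read a saved list of paths on stdin and print those that still exist
# as regular files but now have size zero. Files that vanished entirely
# are not reported by this sketch.
report_truncated() {
    while IFS= read -r path; do
        [ -f "$path" ] && [ ! -s "$path" ] && printf '%s\n' "$path"
    done
    return 0
}

# Example use:
#   record_nonempty /data > /root/manifest.txt    # before the power cycle
#   report_truncated < /root/manifest.txt         # after the reboot
```

If the second command prints nothing on the updated kernel across many cycles, but prints paths on the old one, that points at a fixed bug rather than at the test itself.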

IgorKorotin
Posts: 3
Joined: 2021/07/12 09:03:56

Post by IgorKorotin » 2021/07/14 15:43:11

Ok, thanks for the explanations.
