I need your help: I'm facing a sudden, very high load average on an NFS server.
The server runs CentOS 7.6.1810, kernel version 3.10.0-957.el7.x86_64.
At random times the load average climbs very quickly, in less than a minute, so it becomes impossible to connect to the server to check what is happening.
I have to restart the machine to restore service.
The data, transferred from another remote server, is stored on a RAID 6 BTRFS volume.
We are currently transferring large amounts of data (approximately 10 to 15 TB per day) from multiple remote servers.
This data is transferred over an NFS mount point between multiple remote servers.
The server shares its entire BTRFS volume with the remote servers with these options:
cat /etc/exports
/btrfsvolume distantIP(rw,no_root_squash,no_subtree_check)
We found these error messages in /var/log/messages just before the crash:
Mar 9 19:32:44 kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:22366]
Mar 9 19:32:44 kernel: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [nfsd:22365]
It looks like this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1095436 but I want to be sure before upgrading or changing the OS.
Thanks for your help