Autofs amnesia

Post by UserRandom76543 » 2021/11/12 23:49:28

Ahoy all,

I keep running into autofs mounts that disappear, seemingly at random. The ~300 affected hosts were all built identically, and I have not seen a pattern in uptime or in which servers hit the problem. Restarting autofs fixes all of the mounts. I also don't see a pattern in which NFS server or client is involved, so autofs itself looks like the primary point of failure.

In the failure state some mounts keep working while others simply hang. For example, "temp" still shows files, while "not_temp" hangs forever on a stat:
/data/scratch/temp
/data/scratch/not_temp
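
To spot this state on a host without hanging my own shell, a small probe along these lines could help. It is only a minimal sketch: the function name `probe` and the 5 second timeout are my own assumptions, and the lstat runs in a child process so that a stuck lookup blocks the child and not the caller.

```python
import subprocess
import sys

# Tiny child program: lstat the path given as argv[1], exit non-zero on error.
_CHILD = "import os, sys; os.lstat(sys.argv[1])"


def probe(path, timeout=5.0):
    """Return 'ok', 'error', or 'hung' for path without blocking the caller."""
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        rc = proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: we do not wait again, because a child stuck in
        # uninterruptible sleep may never exit.
        proc.kill()
        return "hung"
    return "ok" if rc == 0 else "error"


if __name__ == "__main__":
    # Paths taken from the example above.
    for p in ("/data/scratch/temp", "/data/scratch/not_temp"):
        print(p, probe(p))
```

Run from cron or a fleet tool, the 'hung' results would at least show which keys and hosts are affected at a given time.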

Versions:
autofs-5.0.7-116
3.10.0-1160.24.1.el7.x86_64
CentOS Linux release 7.9.2009 (Core)

The autofs config looks like:

auto.master:
/data/scratch /etc/auto.mounts tcp hard intr timeo=600 retrans=2 async --ghost

auto.mounts:
temp servername:/ifs/scratch/&
not_temp servername:/ifs/scratch/&

Example of what I see with stat/strace:
access("/etc/selinux/config", F_OK) = 0
open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=106172832, ...}) = 0
mmap(NULL, 106172832, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7f2caebd1000
close(4) = 0
open("/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=2502, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2cb5fa9000
read(4, "# Locale name alias data base.\n#"..., 4096) = 2502
read(4, "", 4096) = 0
close(4) = 0
munmap(0x7f2cb5fa9000, 4096) = 0
open("/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en_US/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/en/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
lstat("/data/scratch/temp",

What I have tried so far:
A pcap shows zero NFS activity when I try to access these broken directories.
I tried to increase the logging on the running daemon with 'automount -l 7 /data/scratch', to avoid restarting the process (a restart fixes the problem).
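
Another check that needs no restart is to look at what the kernel thinks is mounted under the autofs mount point. The following is a minimal sketch, assuming the usual /proc/mounts layout (device, mount point, filesystem type, options, ...); the helper name `mounts_under` is hypothetical. If a hanging key has no nfs entry at all, that would fit the pcap result and suggest the lookup never gets past the automount request.

```python
import os


def mounts_under(prefix, mounts_text):
    """Return (mountpoint, fstype) pairs at or below prefix.

    mounts_text is expected to be in /proc/mounts format.
    """
    base = prefix.rstrip("/")
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mountpoint, fstype = fields[1], fields[2]
        if mountpoint == base or mountpoint.startswith(base + "/"):
            found.append((mountpoint, fstype))
    return found


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as fh:
        for mountpoint, fstype in mounts_under("/data/scratch", fh.read()):
            print(fstype, mountpoint)
```

On a healthy host I would expect one autofs line for /data/scratch plus an nfs line for each key that is currently mounted.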


Any thoughts on other methods of debugging this?
