Centos 7 boots into emergency mode

Post by **someotherguy** » 2021/12/10 19:00:54

After powering down my home server and transferring it from one UPS to another,
CentOS 7 now boots only into emergency mode. I can't find a way to copy the system log to my desktop
to attach to this post, so I'm listing a few entries from journalctl -xb that seem like they might be relevant:
.
.
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#13 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#13 Sense Key : Medium Error [current] [descriptor]
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#13 Add. Sense: Unrecovered read error - auto reallocate failed
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#13 CDB: Read (10) 28 00 29 eb 3d a0 00 00 08 00
localhost.localdomain kernel: blk_update_request: I/O error, dev sda sector 703282592
localhost.localdomain kernel: Buffer I/O error on dev sda, logical block 87910324, async page read
.
.
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#18 Sense Key : Medium Error [current] [descriptor]
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#18 Add. Sense: Unrecovered read error - auto reallocate failed
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#18 CDB: Read (10) 28 00 29 eb 3d a0 00 00 08 00
localhost.localdomain kernel: blk_update_request: I/O error, dev sda sector 703282592
localhost.localdomain kernel: Buffer I/O error on dev sda, logical block 87910324, async page read
.
.
.
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#27 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#27 Sense Key : Medium Error [current] [descriptor]
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#27 Add. Sense: Unrecovered read error - auto reallocate failed
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#27 CDB: Read (10) 28 00 00 00 08 20 00 00 18 00
localhost.localdomain kernel: blk_update_request: I/O error, dev sda sector 2080
.
.
.
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=0s
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#17 Sense Key : Medium Error [current] [descriptor]
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#17 Add. Sense: Unrecovered read error - auto reallocate failed
localhost.localdomain kernel: sd 0:0:0:0: [sda] tag#17 CDB: Read (10) 28 00 00 00 08 20 00 00 08 00
localhost.localdomain kernel: blk_update_request: I/O error, dev sda sector 2080
localhost.localdomain kernel: Buffer I/O error on dev sda, logical block 4, async page read
.
.
.
similar entries for tag#6, tag#25, tag#28, tag#24
.
localhost.localdomain kernel: Buffer I/O error on dev sda1, logical block 4, async page read
.
similar entries for tag#29, tag#22
.
.
localhost.localdomain kernel: Buffer I/O error on dev sda1, logical block 87910324, async page read
.
.
localhost.localdomain kernel: Buffer I/O error on dev sda1, logical block 4, async page read

multiple repeats of above line
.
.
localhost.localdomain systemd-vconsole-setup[669]: usr/bin/loadkeys failed with error code 1.
.
.
localhost.localdomain systemd[1]: Job dev-disk-by\x2duuid-9dedb8f1\x2d855a\x2d4baa\x2da04d411d5c6b.device/start timed out.
localhost.localdomain systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-9dedb8f1\x2d855a\x2d4baa\x2da04d411d5c6b.device
-- Subject: Unit dev-disk-by\x2duuid-9dedb8f1\x2d855a\x2d4baa\x2da04d411d5c6b.device has failed
.
.
localhost.localdomain systemd[1]: Dependency failed for /boot.
-- Subject: Unit boot.mount has failed
--
--The result is dependency
localhost.localdomain systemd[1]: Dependency failed for Local File Systems
-- Subject: Unit local-fs.target has failed

Similar series of messages for:
rhel-autorelabel-mark service
selinux-policy-migrate-local-changes@targeted.service
.
.
localhost.localdomain systemd[1]: Job local-fs.target/start failed with the result 'dependency'

localhost.localdomain systemd[1]: Job boot.mount/start failed with the result 'dependency'
.
.
localhost.localdomain systemd[1]: Job dev-disk-by\x2duuid-9dedb8f1\x2d855a\x2d4baa\x2da04d411d5c6b.device/start failed with the result 'time out'.
.
.
localhost.localdomain kdumpctl[769]: no permission to write to /boot
localhost.localdomain dracut[1371]: no permission to write to /boot
localhost.localdomain kdumpctl[769]: mkdumprd: failed to make kdump initrd
localhost.localdomain kdumpctl[769]: Starting kdump: [FAILED]
localhost.localdomain systemd[1]: kdump.service: main process exited, code=exited, status=1/FAILURE
localhost.localdomain systemd[1]: Failed to start crash recovery kernel arming
--Subject: Unit kdump service has failed
.
.
.
localhost.localdomain sedispatch[781]: Connection error(Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory:
AVC will be dropped

above line repeats

My server has the OS installed on an SSD (sda) with the data on a 4-disk RAID10. I also see multiple listings of errors on ATA1, which I believe is
the DVD drive. I have not listed them since it's hard to see how they could be related to the current problem. In addition to sda I see a listing of sda1, and I don't understand why.

My working hypothesis is that the root problem is that the SSD is failing, and that the solution is to do a clean install on a new SSD.
Any helpful comments greatly appreciated.

Post by **TrevorH** » 2021/12/10 20:48:05

Your disk is dead or dying. Since you say it only started after moving the machine, it's probably worth checking that all the connections are secure before you try anything else. /dev/sda is your whole physical hard disk, the first one seen by the BIOS, and sda1 and sda2 are partitions on that device.
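As a rough cross-check of the log in the first post, the sketch below decodes the two READ(10) CDBs quoted there and un-escapes the systemd device unit name, to show that every error points at the same few sectors of sda and at a by-uuid device path. The function names `decode_read10` and `unit_to_path` are made up for illustration, and the 4096-byte buffer block size is an assumption (it is consistent with the sector and logical block numbers in the log, but the log does not state it).

```python
import re


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) command")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: starting logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


def unit_to_path(unit):
    """Turn an escaped systemd .device unit name back into a device path."""
    name = unit.rsplit(".", 1)[0]        # drop the ".device" suffix
    path = "/" + name.replace("-", "/")  # an unescaped "-" stands for "/"
    # "\xNN" escapes stand for the literal character with that hex code
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), path)


print(decode_read10("28 00 29 eb 3d a0 00 00 08 00"))  # (703282592, 8)
print(decode_read10("28 00 00 00 08 20 00 00 18 00"))  # (2080, 24)

# Assumption: 512-byte sectors grouped into 4096-byte buffer blocks (8 sectors each)
print(703282592 // 8)  # 87910324

# Unit name copied as posted in the log
print(unit_to_path(r"dev-disk-by\x2duuid-9dedb8f1\x2d855a\x2d4baa\x2da04d411d5c6b.device"))
# /dev/disk/by-uuid/9dedb8f1-855a-4baa-a04d411d5c6b
```

The decoded addresses match the "I/O error, dev sda sector ..." lines, so the repeated entries are retries against the same unreadable spots rather than many separate failures. The device that timed out is the filesystem systemd wanted to mount on /boot, which is why boot.mount and local-fs.target failed with result 'dependency'.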

Post by **someotherguy** » 2021/12/10 21:41:28

Thanks for the quick reply. I didn't actually move the server; I only swapped a different UPS into the rack, unplugged the server power cord from the old one and plugged it into the new one. So a clean install on a new SSD is apparently in the cards. My next question is, will I be able to access my data on the RAID after doing so, and is there anything I should do to facilitate that? Also, I was originally planning to take this as an opportunity to upgrade to CentOS 8, until I learned that it was EOL. Would you recommend sticking with CentOS 7 or going to CentOS Stream?

I had assumed that SSDs are essentially bulletproof, having no moving parts, in addition to being faster than HDDs. This whole experience has caused me to question the wisdom of installing the OS on a single SSD as opposed to the RAID.

Post by **Whoever** » 2021/12/11 03:05:59

someotherguy wrote (2021/12/10 21:41:28):

> I had assumed that SSDs are essentially bulletproof, having no moving parts, in addition to being faster than HDDs. This whole experience has caused me to question the wisdom of installing the OS on a single SSD as opposed to the RAID.

Absolutely not. You need to monitor SSDs with smartd just like any other hard drive. SSDs have a limited life and "wear" out.
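smartd is the monitoring daemon from the smartmontools package; the same package ships smartctl for one-off checks. As a minimal sketch of such a one-off check (not a replacement for running smartd), the following wraps `smartctl -H` and pulls out the overall health verdict. It assumes Python 3.7 or later, root privileges, and an ATA/SATA device whose report uses the "overall-health self-assessment test result" wording; the function names `parse_health` and `check` are hypothetical.

```python
import subprocess


def parse_health(report):
    """Return the overall health verdict from `smartctl -H` text, or "UNKNOWN"."""
    for line in report.splitlines():
        # ATA/SATA devices print "... test result: PASSED" (or a failure verdict)
        if "overall-health self-assessment test result:" in line:
            return line.split(":", 1)[1].strip()
    return "UNKNOWN"


def check(device="/dev/sda"):
    """Run smartctl on one device and return its health verdict."""
    # smartctl's exit status is a bit mask, so a non-zero value is not
    # treated as an exception here; only the printed report is inspected.
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)
```

Calling `check("/dev/sda")` as root on the server would return the verdict string for the SSD. A passing verdict does not rule out a failing drive, so the kernel's medium errors in the log still take precedence.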
