Boot fails after update ipmi_si No such device

schatzman
Posts: 3
Joined: 2022/01/08 16:59:56


Post by schatzman » 2022/01/08 17:13:29

After the latest "minor" CentOS Stream 8 update, my x86_64 Intel server fails to boot.

The boot log has multiple "Failed to start: Load Kernel Modules" messages.

After booting into emergency mode,

systemctl status systemd-modules-load.service reports

Failed to insert 'ipmi_si': No such device

Attempts to boot from any of the older installed kernels also fail with the same error.

Please help!!

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK


Post by TrevorH » 2022/01/08 20:41:56

Failure to load that specific module does not sound like an issue that should cause the boot to fail. The module is one designed for physical servers with a remote control interface - an IPMI. When one is not present then an attempt to load ipmi_si will fail with that error so it's expected and will not stop you from booting.

I'd suggest re-reading your logs to see if there are other errors present. The most likely cause of boot failure is that something is listed in /etc/fstab that either does not exist or fails to mount for some reason. If the fstab line in question does _not_ contain 'nofail' as an option then that will fail the boot.
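As a rough way to check the fstab theory, the sketch below lists the mount points of fstab entries that lack the 'nofail' option. The function name list_without_nofail is made up for this example, and the output is only a list of candidates to inspect (the root filesystem will normally appear in it), not proof that any entry is what breaks the boot.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every fstab entry
# whose options field does not include 'nofail'.
# Usage: list_without_nofail [FILE]    (FILE defaults to /etc/fstab)
list_without_nofail() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }   # skip blank lines and comments
        NF < 4               { next }   # skip lines without an options field
        {
            n = split($4, opts, ",")    # field 4 holds the mount options
            found = 0
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail") found = 1
            if (!found) print $2        # field 2 is the mount point
        }
    ' "${1:-/etc/fstab}"
}
```

Calling list_without_nofail with no argument reads /etc/fstab; passing a path lets you try it on a copy first.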

arturocp
Posts: 1
Joined: 2022/01/17 00:32:03


Post by arturocp » 2022/01/17 01:14:48

I had the same issue. I created /etc/modprobe.d/blacklist-ipmi.conf with:
blacklist ipmi_si
blacklist ipmi_devintf
blacklist ipmi_msghandler
blacklist ipmi_ssif
blacklist ipmi_watchdog
blacklist ipmi_poweroff
blacklist acpi_ipmi
blacklist ibmaem
blacklist ibmpex

I booted and it worked fine!
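For anyone who wants to script this workaround, here is a minimal sketch. The helper name write_blacklist is hypothetical; it just writes one "blacklist MODULE" line per argument to the file you name, and writing under /etc/modprobe.d needs root.

```shell
#!/bin/sh
# Hypothetical helper: write one 'blacklist MODULE' line per module to DEST,
# replacing any existing file at that path.
# Usage: write_blacklist DEST MODULE...
write_blacklist() {
    dest=$1
    shift
    for mod in "$@"; do
        printf 'blacklist %s\n' "$mod"
    done > "$dest"
}

# Example (as root), using the first few module names from the post above:
#   write_blacklist /etc/modprobe.d/blacklist-ipmi.conf ipmi_si ipmi_devintf ipmi_msghandler
```

Pass the full list of module names from the post if you want an identical file.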

technojoecoolusa
Posts: 5
Joined: 2017/09/25 01:02:27


Post by technojoecoolusa » 2022/01/28 07:09:33

schatzman, did you receive that error after upgrading to kernel-4.18.0-358.el8.x86_64? Does booting into kernel-4.18.0-348.7.1.el8_5.x86_64 work fine?

For me, the boot doesn't fail inside a VM, but it does fail on bare-metal hardware with an IPMI installed.

When it fails to boot, does it say something like "BUG: unable to handle kernel NULL pointer dereference at..."? If so, I think I have the same problem. If that is the case, can you capture a kdump? If so, please post it to:

https://bugzilla.redhat.com/show_bug.cgi?id=2043430

I'm trying to set up kdump and capture a dump from my own server, but since it's prod, I can't do that until this weekend.

avoulvou
Posts: 5
Joined: 2022/01/31 20:24:35


Post by avoulvou » 2022/02/01 07:17:18

Hi,

I had a similar case: the failure was reported, but the boot was not interrupted.

My server is a VPS, and the failure appeared after upgrading to the latest kernel:
[root@monitor ~]# uname -a
Linux monitor.mywebhost.gr 4.18.0-358.el8.x86_64 #1 SMP Mon Jan 10 13:11:20 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

[root@monitor ~]# systemctl --failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● systemd-modules-load.service loaded failed failed Load Kernel Modules

LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.

1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.
[root@monitor ~]# systemctl status systemd-modules-load
● systemd-modules-load.service - Load Kernel Modules
Loaded: loaded (/usr/lib/systemd/system/systemd-modules-load.service; static; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2022-02-01 09:09:22 EET; 5min ago
Docs: man:systemd-modules-load.service(8)
man:modules-load.d(5)
Main PID: 660 (code=exited, status=1/FAILURE)

Feb 01 09:09:22 monitor.mywebhost.gr systemd[1]: Starting Load Kernel Modules...
Feb 01 09:09:22 monitor.mywebhost.gr systemd-modules-load[660]: Module 'msr' is builtin
Feb 01 09:09:22 monitor.mywebhost.gr systemd-modules-load[660]: Failed to insert 'ipmi_si': No such device
Feb 01 09:09:22 monitor.mywebhost.gr systemd[1]: systemd-modules-load.service: Main process exited, code=exited, status=1/FAILURE
Feb 01 09:09:22 monitor.mywebhost.gr systemd[1]: systemd-modules-load.service: Failed with result 'exit-code'.
Feb 01 09:09:22 monitor.mywebhost.gr systemd[1]: Failed to start Load Kernel Modules.

Solved with the workaround above: blacklisting the IPMI modules in /etc/modprobe.d/blacklist-ipmi.conf.
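To find out why systemd-modules-load tries to load ipmi_si in the first place, it can help to look for the configuration file that requests it. The sketch below searches the directories documented in modules-load.d(5). The function name find_module_requests is made up for this example, and a match may also come from a comment line, so read the file it reports.

```shell
#!/bin/sh
# Hypothetical helper: print the modules-load.d files that mention MODULE.
# Usage: find_module_requests MODULE [DIR...]
# With no DIR given, the standard modules-load.d(5) directories are searched.
find_module_requests() {
    mod=$1
    shift
    [ "$#" -gt 0 ] || set -- /etc/modules-load.d /run/modules-load.d /usr/lib/modules-load.d
    for dir in "$@"; do
        # -l prints matching file names only; -w matches whole words,
        # so 'ipmi_si' does not also match 'ipmi_ssif'
        grep -lw -- "$mod" "$dir"/*.conf 2>/dev/null
    done
    return 0
}

# Example:
#   find_module_requests ipmi_si
```

If this prints nothing, the load request is coming from somewhere other than those directories.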

joecool-SG
Posts: 4
Joined: 2020/01/22 03:20:18


Post by joecool-SG » 2022/02/08 14:34:44

I can confirm that I had the same issue when upgrading to CentOS Stream (4.18.0-358) and that the modprobe blacklist resolved it. (Thanks, arturocp.)

I also want to address some of the flawed troubleshooting that popped up in this thread.

1) "Failure to load that specific module does not sound like an issue that should cause the boot to fail."

When contradicting direct evidence, I need to see a fact-based justification.
The kernel is booting, it highlights this failure, and it reports only this failure.

If you don't believe the data (log) then ask for more data, but it is a reasonable conclusion that this is the root of the failure.
Disregarding direct evidence for a "sounds like" is to chase after paper dragons.

Some modules do cause the kernel to stop loading. It is a matter of applying fact to determine if ipmi is one of them, for the given kernel.


2) "Most likely ... fstab"

I have cifs entries in fstab that do not resolve at boot, and even without nofail, they do not cause a boot failure. This is clearly a suspect idea.

If fstab is the root cause, then (1) there are specific conditions, and those need to be stated (boot device, no ignore flag, etc.), and (2) there are specific messages that will be seen. If you've made it to kernel startup, it has obviously found some of the needed partitions.

The larger issue is that this suggestion seems random: how would a kernel update via dnf break fstab? There is no evidence of that, so it is more misdirection.

hughesjr
Site Admin
Posts: 254
Joined: 2004/12/05 01:51:26
Location: Corpus Christi, Texas, USA


Post by hughesjr » 2022/02/14 13:44:49

The "missing ipmi_si" issue is not the reason for the boot failure. It is an issue for sure, but whatever happens next in the boot sequence is causing your boot failure.

To fix the "ipmi_si No such device" error specifically, you can blacklist the module and remove "ipmi_si" from your initrd image by following the RHEL 8 steps in this article:

https://access.redhat.com/solutions/41278

But that is not the cause of the boot failure. You will get past that error and continue with the boot process.

joecool-SG
Posts: 4
Joined: 2020/01/22 03:20:18


Post by joecool-SG » 2022/02/26 20:25:55

For the official fix, bugzilla reference:

https://bugzilla.redhat.com/show_bug.cgi?id=2052053
