Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

silvio
Posts: 67
Joined: 2008/11/10 13:06:03


Post by silvio » 2022/11/18 08:38:23

Hi,

We upgraded the systems to the latest kernel version and found boot loops on several of them with that kernel.
We see the GRUB menu, and after selecting the kernel the system reboots.

First try was to reinstall all kernels but it changed nothing.
Resolving Dependencies
--> Running transaction check
---> Package kernel.x86_64 0:3.10.0-1160.62.1.el7 will be installed
---> Package kernel.x86_64 0:3.10.0-1160.66.1.el7 will be installed
---> Package kernel.x86_64 0:3.10.0-1160.71.1.el7 will be installed
---> Package kernel.x86_64 0:3.10.0-1160.80.1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==============================================================================================================================================================================================================================================
Package Arch Version Repository Size
==============================================================================================================================================================================================================================================
Reinstalling:
kernel x86_64 3.10.0-1160.62.1.el7 centos7-x86_64-updates 50 M
kernel x86_64 3.10.0-1160.66.1.el7 centos7-x86_64-updates 50 M
kernel x86_64 3.10.0-1160.71.1.el7 centos7-x86_64-updates 50 M
kernel x86_64 3.10.0-1160.80.1.el7 centos7-x86_64-updates 52 M

Transaction Summary
==============================================================================================================================================================================================================================================
Reinstall 4 Packages

Total download size: 203 M
Installed size: 258 M
Is this ok [y/d/N]: y
Downloading packages:
No Presto metadata available for centos7-x86_64-updates
(1/4): kernel-3.10.0-1160.62.1.el7.x86_64.rpm | 50 MB 00:01:05
(2/4): kernel-3.10.0-1160.66.1.el7.x86_64.rpm | 50 MB 00:00:57
(3/4): kernel-3.10.0-1160.71.1.el7.x86_64.rpm | 50 MB 00:00:55
(4/4): kernel-3.10.0-1160.80.1.el7.x86_64.rpm | 52 MB 00:01:03
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 856 kB/s | 203 MB 00:04:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : kernel-3.10.0-1160.66.1.el7.x86_64 1/4
Installing : kernel-3.10.0-1160.62.1.el7.x86_64 2/4
Installing : kernel-3.10.0-1160.71.1.el7.x86_64 3/4
Installing : kernel-3.10.0-1160.80.1.el7.x86_64 4/4
chroot: failed to run command '/usr/sbin/prelink': Permission denied
chroot: failed to run command '/usr/sbin/prelink': Permission denied
chroot: failed to run command '/usr/sbin/prelink': Permission denied
chroot: failed to run command '/usr/sbin/prelink': Permission denied
Verifying : kernel-3.10.0-1160.80.1.el7.x86_64 1/4
Verifying : kernel-3.10.0-1160.71.1.el7.x86_64 2/4
Verifying : kernel-3.10.0-1160.62.1.el7.x86_64 3/4
Verifying : kernel-3.10.0-1160.66.1.el7.x86_64 4/4

Installed:
kernel.x86_64 0:3.10.0-1160.62.1.el7 kernel.x86_64 0:3.10.0-1160.66.1.el7 kernel.x86_64 0:3.10.0-1160.71.1.el7 kernel.x86_64 0:3.10.0-1160.80.1.el7

Complete!
The version 3.10.0-1160.71.1 starts without any problems.
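Since 3.10.0-1160.71.1 boots, one stopgap is to make it the default boot entry until the newer kernel is sorted out. What follows is a minimal sketch, assuming `grubby` manages the boot entries (as on a stock CentOS 7 install). The helper name `pick_fallback_kernel` is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Hypothetical helper: read kernel versions from stdin (one per line)
# and print the newest one that is not the failing version.
pick_fallback_kernel() {
    failing="$1"
    # Drop the failing version, sort the rest as version numbers,
    # and keep the highest one.
    grep -vxF "$failing" | sort -V | tail -n 1
}

# Usage on the affected host (needs root, not executed here):
#   fallback=$(rpm -q kernel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' \
#              | pick_fallback_kernel 3.10.0-1160.80.1.el7.x86_64)
#   grubby --set-default="/boot/vmlinuz-$fallback"
#   grubby --default-kernel    # confirm the new default
```

Keeping the selection logic in a small function makes it possible to try it on a pasted list of versions before touching the boot configuration.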
The systems are AMD-based servers, and we use the following kernel command-line parameters:
nomodeset rhgb quiet audit=1 audit_backlog_limit=8192 vsyscall=none slub_debug=P page_poison=1 mem_encrypt=on
The CPU on all systems is an AMD Ryzen 5 3600 (lscpu output):
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 113
Model name: AMD Ryzen 5 3600 6-Core Processor
Stepping: 0
CPU MHz: 3600.000
CPU max MHz: 3600,0000
CPU min MHz: 2200,0000
BogoMIPS: 7200.20
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 cpb cat_l3 cdp_l3 hw_pstate sme retpoline_amd ssbd ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip overflow_recov succor smca

Does anyone have the same problem?

Best

Silvio

BShT
Posts: 583
Joined: 2019/10/09 12:31:40

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by BShT » 2022/11/23 19:56:09

did you update just the kernel or the whole system?

jlehtone
Posts: 4523
Joined: 2007/12/11 08:17:33
Location: Finland

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by jlehtone » 2022/11/23 21:17:02

silvio wrote:
2022/11/18 08:38:23
First try was to reinstall all kernels but it changed nothing.
...
The version 3.10.0-1160.71.1 starts without any problems:
Why all? If older kernels seem fine and only the latest fails, then why would you risk the working kernels?
In the worst case, the reinstall could have ruined them all.
You surely can run something like:

Code:

yum reinstall kernel\*3.10.0-1160.80.1.el7.x86_64
(Although I'd rather remove the 3.10.0-1160.80.1 packages first, along with the oldest kernel; one functional kernel should be enough.)
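The narrower approach described here could look roughly like the following. This is only a sketch: the function name and the `DRY_RUN` switch are illustrative, and it touches just the `kernel` package rather than everything the `kernel\*` glob would match.

```shell
#!/bin/sh
# Sketch: remove and reinstall one kernel version, leaving the others alone.
# Set DRY_RUN=1 to print the yum commands instead of running them.
reinstall_one_kernel() {
    ver="$1"
    # Never remove the kernel the system is currently running.
    if [ "$ver" = "$(uname -r)" ]; then
        echo "refusing: $ver is the running kernel" >&2
        return 1
    fi
    # With DRY_RUN set, each command is prefixed with "echo".
    ${DRY_RUN:+echo} yum -y remove "kernel-$ver" &&
        ${DRY_RUN:+echo} yum -y install "kernel-$ver"
}

# Example (boot a working kernel such as 3.10.0-1160.71.1 first):
#   DRY_RUN=1 reinstall_one_kernel 3.10.0-1160.80.1.el7.x86_64
```

Running it once with `DRY_RUN=1` shows exactly which packages would be touched before anything is removed.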
silvio wrote:
2022/11/18 08:38:23

Code:

chroot: failed to run command '/usr/sbin/prelink': Permission denied
What causes that? What are the permissions on that file?
How about running rpm -Vf /usr/sbin/prelink to see what rpm thinks about that package?
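`rpm -Vf` prints one line per file that deviates from the package database, each prefixed by a string of flag characters. As a reading aid, the sketch below decodes the most common flags, with letters as documented in the rpm(8) man page. `explain_rpm_verify` is a hypothetical helper, not part of rpm, and it deliberately ignores the less common flags.

```shell
#!/bin/sh
# Sketch: spell out the attribute flags in one line of `rpm -V` output.
explain_rpm_verify() {
    line="$1"
    flags=${line%% *}    # first field, e.g. ".M......."
    case "$flags" in
        missing) echo "file is missing"; return 0 ;;
    esac
    case "$flags" in *S*) echo "size differs" ;; esac
    case "$flags" in *M*) echo "mode (permissions or file type) differs" ;; esac
    case "$flags" in *5*) echo "digest (checksum) differs" ;; esac
    case "$flags" in *U*) echo "owner differs" ;; esac
    case "$flags" in *G*) echo "group differs" ;; esac
    case "$flags" in *T*) echo "mtime differs" ;; esac
}

# On the affected host:
#   ls -l /usr/sbin/prelink
#   rpm -Vf /usr/sbin/prelink | while IFS= read -r l; do
#       echo "$l"; explain_rpm_verify "$l"
#   done
```

No output from `rpm -Vf` means the installed files match what the package shipped; a changed mode would show up as an `M` in the second column.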

If your system can't install a kernel properly, then it is no wonder that the kernel fails.
(You were probably very lucky that the reinstall did not damage the older kernels.)

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by TrevorH » 2022/11/23 22:08:15

There is a known bug in the .80 kernel on older Xeon processors - the ones I've seen mentioned are 55xx series.
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

jlehtone
Posts: 4523
Joined: 2007/12/11 08:17:33
Location: Finland

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by jlehtone » 2022/11/24 09:05:12

Is it good or bad if an AMD CPU from 2019 is so similar to an old Intel CPU (from 2009?) that the same bug affects both?

For the record, the .80 did install and boot cleanly with Intel Core i7-6700. Or so it seems. (Undefined behaviour can be sneaky.)

TrevorH
Site Admin
Posts: 33191
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by TrevorH » 2022/11/24 09:11:14


silvio
Posts: 67
Joined: 2008/11/10 13:06:03

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by silvio » 2022/11/24 11:28:11

BShT wrote:
2022/11/23 19:56:09
did you update just the kernel or the whole system?
The whole system is up to date.

Silvio

silvio
Posts: 67
Joined: 2008/11/10 13:06:03

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by silvio » 2022/11/24 11:34:06

jlehtone wrote:
2022/11/23 21:17:02
silvio wrote:
2022/11/18 08:38:23
First try was to reinstall all kernels but it changed nothing.
...
The version 3.10.0-1160.71.1 starts without any problems:
Why all? If older kernels seem fine and only the latest fails, then why would you risk the working kernels?
In the worst case, the reinstall could have ruined them all.
You surely can run something like:

Code:

yum reinstall kernel\*3.10.0-1160.80.1.el7.x86_64
(Although I'd rather remove the 3.10.0-1160.80.1 packages first, along with the oldest kernel; one functional kernel should be enough.)
silvio wrote:
2022/11/18 08:38:23

Code:

chroot: failed to run command '/usr/sbin/prelink': Permission denied
What causes that? What are the permissions on that file?
How about running rpm -Vf /usr/sbin/prelink to see what rpm thinks about that package?

If your system can't install a kernel properly, then it is no wonder that the kernel fails.
(You were probably very lucky that the reinstall did not damage the older kernels.)
The idea was that if the system fails, the cause was possibly not kernel-related.
The prelink message is expected. Disabling prelink is an item in the OpenSCAP checklist for C7/RH7, and the systems run without problems.

"The prelinking feature changes binaries in an attempt to decrease their startup time. In order to disable it, change or add the following line inside the file /etc/sysconfig/prelink: PRELINKING=no Next, run the following command to return binaries to a normal, non-prelinked state: $ sudo /usr/sbin/prelink -ua"
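The configuration change in the quoted guidance can be sketched as a small function. This is an illustration only: `disable_prelinking` is a made-up name, and the optional path argument exists purely so the edit can be tried on a copy of the file first.

```shell
#!/bin/sh
# Sketch: make sure PRELINKING=no is set in the prelink config file.
disable_prelinking() {
    conf="${1:-/etc/sysconfig/prelink}"
    if grep -q '^PRELINKING=' "$conf" 2>/dev/null; then
        # Rewrite the existing setting via a temporary file.
        tmp=$(mktemp) || return 1
        sed 's/^PRELINKING=.*/PRELINKING=no/' "$conf" > "$tmp" &&
            cat "$tmp" > "$conf"
        rc=$?
        rm -f "$tmp"
        return $rc
    else
        # No setting present yet (or no file): append one.
        echo 'PRELINKING=no' >> "$conf"
    fi
}

# On a real system (as root), followed by the undo step from the quoted text:
#   disable_prelinking
#   /usr/sbin/prelink -ua
```

Writing through `cat` into the existing file, rather than moving the temporary file over it, leaves the file's ownership and mode as they were.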

The systems have been running with this option for more than 3 years, so that is not the cause.

Silvio

silvio
Posts: 67
Joined: 2008/11/10 13:06:03

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by silvio » 2022/11/24 11:37:39

Thanks Trevor,

I have seen this and have opened an entry in the RedHat Bugtracker:
Hope the info helps someone.

Silvio

silvio
Posts: 67
Joined: 2008/11/10 13:06:03

Re: Booting latest Kernel failed on multiple systems (kernel-3.10.0-1160.80.1.el7.x86_64)

Post by silvio » 2022/11/24 11:40:30

jlehtone wrote:
2022/11/24 09:05:12
Is it good or bad if an AMD CPU from 2019 is so similar to an old Intel CPU (from 2009?) that the same bug affects both?

For the record, the .80 did install and boot cleanly with Intel Core i7-6700. Or so it seems. (Undefined behaviour can be sneaky.)
At the moment the only system with an Intel CPU is the database server, so I am not willing to check whether the kernel runs on that system :-)

Silvio
