[SOLVED] NVIDIA: API and kernel mod mismatch

jody
Posts: 53
Joined: 2015/05/12 12:58:08

[SOLVED] NVIDIA: API and kernel mod mismatch

Post by jody » 2022/01/11 15:26:14

Hi

Currently my workstation does not boot into graphics mode.
In /var/log/messages I have several copies of this line:

Code: Select all

Jan 11 15:57:18 aim-seadragon kernel: NVRM: API mismatch: the client has the version 470.94, but#012NVRM: this kernel module has the version 470.86.  Please#012NVRM: make sure that this kernel module and all NVIDIA driver#012NVRM: components have the same version.
On the other hand, yum reports version 470.94 for both packages:

Code: Select all

$ sudo yum info kmod-nvidia.x86_64 nvidia-x11-drv.x86_64
Loaded plugins: fastestmirror, nvidia
Loading mirror speeds from cached hostfile
 * base: pkg.adfinis.com
 * elrepo: elrepo.reloumirrors.net
 * epel: pkg.adfinis.com
 * extras: pkg.adfinis.com
 * updates: pkg.adfinis.com
Installed Packages
Name        : kmod-nvidia
Arch        : x86_64
Version     : 470.94
Release     : 1.el7_9.elrepo
Size        : 124 M
Repo        : installed
From repo   : elrepo
Summary     : nvidia kernel module(s)
URL         : http://www.nvidia.com/
License     : Proprietary
Description : This package provides the nvidia kernel module(s) built
            : for the Linux kernel using the x86_64 family of processors.

Name        : nvidia-x11-drv
Arch        : x86_64
Version     : 470.94
Release     : 1.el7_9.elrepo
Size        : 26 M
Repo        : installed
From repo   : elrepo
Summary     : NVIDIA OpenGL X11 display driver files
URL         : http://www.nvidia.com/
License     : Distributable
Description : This package provides the proprietary NVIDIA OpenGL X11 display driver files.

That is, the driver package and the kernel module package both appear to be at the same version (470.94).

How can I fix this mismatch?
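
A minimal sketch for pulling the two version numbers out of the NVRM message, assuming a POSIX shell and sed. The helper name `nvrm_versions` and its output format are made up for illustration. Here the "client" version matches the installed packages, while the "module" version is what the kernel actually loaded.

```shell
# Hypothetical helper: extract the client and kernel-module versions
# from NVRM "API mismatch" lines read on stdin.
nvrm_versions() {
    sed -n 's/.*client has the version \([0-9.]*[0-9]\), but.*kernel module has the version \([0-9.]*[0-9]\).*/client=\1 module=\2/p'
}

# The log line quoted in the post above.
line='Jan 11 15:57:18 aim-seadragon kernel: NVRM: API mismatch: the client has the version 470.94, but#012NVRM: this kernel module has the version 470.86.  Please#012NVRM: make sure that this kernel module and all NVIDIA driver#012NVRM: components have the same version.'
printf '%s\n' "$line" | nvrm_versions    # prints: client=470.94 module=470.86

# On the affected machine, something like this would summarise all occurrences:
#   sudo grep 'NVRM: API mismatch' /var/log/messages | nvrm_versions | sort | uniq -c
```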
Last edited by jody on 2022/01/12 14:53:53, edited 1 time in total.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: NVIDIA: API and kernel mod mismatch

Post by TrevorH » 2022/01/11 17:08:02

Did you reboot after updating kmod-nvidia from 470.86 to 470.94? At the very least you need to quit the GUI, then from a root command prompt remove the loaded nvidia kernel module with modprobe -r nvidia, load the new copy, and restart the GUI. It is probably easier to reboot...
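
To see whether the module that is loaded right now differs from the one on disk, a sketch along these lines may help. It assumes a Linux system with sysfs and modinfo; the helper name `nvidia_version_status` is hypothetical.

```shell
# Hypothetical helper: report whether the loaded nvidia module and the
# module file on disk have the same version.
#   $1 = version of the loaded module (empty if not loaded)
#   $2 = version of the module file that modprobe would load
nvidia_version_status() {
    if [ -z "$1" ]; then
        echo "nvidia module not loaded"
    elif [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: loaded $1, on disk $2"
    fi
}

# /sys/module/nvidia/version exists only while the module is loaded;
# modinfo reads the .ko file for the running kernel.
loaded=$(cat /sys/module/nvidia/version 2>/dev/null || true)
ondisk=$(modinfo -F version nvidia 2>/dev/null || true)
nvidia_version_status "$loaded" "$ondisk"
```

A "mismatch" result after a reboot suggests the old module is being loaded from somewhere other than the freshly installed file, for example from an initramfs that was not rebuilt.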
The future appears to be RHEL or Debian. I think I'm going Debian.
Info for USB installs on http://wiki.centos.org/HowTos/InstallFromUSBkey
CentOS 5 and 6 are deadest, do not use them.
Use the FAQ Luke

jody
Posts: 53
Joined: 2015/05/12 12:58:08

Re: NVIDIA: API and kernel mod mismatch

Post by jody » 2022/01/12 08:48:31

Hi TrevorH
Yes, I had rebooted, but the situation is still the same: it didn't enter the GUI, the main console still shows the boot messages, and /var/log/messages contains the same messages about the different versions.

The modprobe also didn't work:

Code: Select all

$ sudo modprobe -r nvidia
modprobe: FATAL: Module nvidia is in use.
but

Code: Select all

$ ps ax | grep X
 1539 pts/9    S+     0:00 grep --colour=auto X
 3982 tty7     Ssl+ 10180:34 /usr/bin/X :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
Under what circumstances is nvidia in use without X running?
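
Note that the ps listing above does show /usr/bin/X on tty7 (PID 3982). Apart from X, helper modules such as nvidia_modeset, nvidia_drm and nvidia_uvm also hold references on nvidia. A minimal sketch for checking who holds the module, assuming the usual lsmod column layout; the helper name and the sample input are made up for illustration.

```shell
# Hypothetical helper: read `lsmod` output on stdin and print the reference
# count and dependent modules for the nvidia module.
nvidia_users() {
    awk '$1 == "nvidia" { print "refcount=" $3, "used_by=" ($4 == "" ? "-" : $4) }'
}

# Made-up sample in lsmod format, for illustration only:
printf '%s\n' \
    'Module                  Size  Used by' \
    'nvidia_drm             12345  1' \
    'nvidia_modeset         23456  1 nvidia_drm' \
    'nvidia                 34567  2 nvidia_modeset' | nvidia_users
# prints: refcount=2 used_by=nvidia_modeset

# On a real system:
#   lsmod | nvidia_users
#   sudo lsof /dev/nvidia*     # processes holding the device nodes open
```

The dependent modules have to be removed before nvidia itself, and only after X has stopped.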

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: NVIDIA: API and kernel mod mismatch

Post by TrevorH » 2022/01/12 09:19:45

Look at the dates/times on /boot/initramfs* and see if they were rebuilt around the time that you updated kmod-nvidia. Is there enough space on /boot to allow for a new initramfs or did the install fail?
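
The date comparison can be scripted roughly as follows. This is only a heuristic sketch, assuming GNU stat and rpm are available and that kmod-nvidia is installed exactly once; the helper names are hypothetical.

```shell
# Hypothetical helper: decide whether an initramfs predates a package install.
#   $1 = initramfs modification time (seconds since epoch)
#   $2 = package install time (seconds since epoch)
initramfs_status() {
    if [ "$1" -lt "$2" ]; then
        echo "stale"
    else
        echo "rebuilt"
    fi
}

# True if the argument is a non-empty string of digits.
is_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

img="/boot/initramfs-$(uname -r).img"
img_mtime=$(stat -c %Y "$img" 2>/dev/null || true)
pkg_time=$(rpm -q --qf '%{INSTALLTIME}\n' kmod-nvidia 2>/dev/null || true)

if is_number "$img_mtime" && is_number "$pkg_time"; then
    echo "$img: $(initramfs_status "$img_mtime" "$pkg_time")"
else
    echo "could not read both timestamps (needs stat, rpm and an installed kmod-nvidia)"
fi
```

A "stale" result means the image is older than the package install, so it may still contain the previous module.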

jody
Posts: 53
Joined: 2015/05/12 12:58:08

Re: NVIDIA: API and kernel mod mismatch

Post by jody » 2022/01/12 10:02:20

These are my initramfs:

Code: Select all

$ ls -l initram*
-rw-r--r--. 1 root root 43567558 Mar 27  2018 initramfs-0-rescue-549e40f2ffb94c3698cd1b18c829d611.img
-rw-------. 1 root root 84454692 Nov 22 17:25 initramfs-3.10.0-1160.45.1.el7.x86_64.img
-rw-------. 1 root root 66714227 Oct 25 08:36 initramfs-3.10.0-1160.45.1.el7.x86_64kdump.img
-rw-------. 1 root root 84452919 Dec 13 08:51 initramfs-3.10.0-1160.49.1.el7.x86_64.img
-rw-------. 1 root root 66760307 Jan 11 17:01 initramfs-3.10.0-1160.49.1.el7.x86_64kdump.img
and these are what I think are the kernel modules:

Code: Select all

$ ls -l /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia*
-rw-r--r--. 1 root root  4438080 Dec 14 14:53 /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia-drm.ko
-rw-r--r--. 1 root root 51583032 Dec 14 14:53 /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia.ko
-rw-r--r--. 1 root root  2124944 Dec 14 14:53 /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia-modeset.ko
-rw-r--r--. 1 root root   246080 Dec 14 14:53 /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia-peermem.ko
-rw-r--r--. 1 root root 35782528 Dec 14 14:53 /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/nvidia/nvidia-uvm.ko
I reinstalled kmod-nvidia.x86_64 and nvidia-x11-drv.x86_64; both installed without error messages.
It looks like the initramfs images were rebuilt too (the "*.x86_64.img" files, not the "*.x86_64kdump.img" ones):

Code: Select all

-rw-r--r--. 1 root root 43567558 Mar 27  2018 initramfs-0-rescue-549e40f2ffb94c3698cd1b18c829d611.img
-rw-------. 1 root root 84462750 Jan 12 10:42 initramfs-3.10.0-1160.45.1.el7.x86_64.img
-rw-------. 1 root root 66714227 Oct 25 08:36 initramfs-3.10.0-1160.45.1.el7.x86_64kdump.img
-rw-------. 1 root root 84459566 Jan 12 10:43 initramfs-3.10.0-1160.49.1.el7.x86_64.img
-rw-------. 1 root root 66760307 Jan 11 17:01 initramfs-3.10.0-1160.49.1.el7.x86_64kdump.img
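
Reinstalling the packages is one way to trigger the rebuild; the images can also be regenerated directly with dracut. The sketch below only prints the commands (a dry run). The helper name `kvers_from_initramfs` is hypothetical, and the file names are the ones from the listing above.

```shell
# Hypothetical helper: turn initramfs file names (one per line on stdin) into
# kernel versions, skipping the rescue and kdump images.
kvers_from_initramfs() {
    sed -n -e '/rescue/d' -e '/kdump\.img$/d' -e 's/^initramfs-\(.*\)\.img$/\1/p'
}

# File names taken from the listing above.
names='initramfs-0-rescue-549e40f2ffb94c3698cd1b18c829d611.img
initramfs-3.10.0-1160.45.1.el7.x86_64.img
initramfs-3.10.0-1160.45.1.el7.x86_64kdump.img
initramfs-3.10.0-1160.49.1.el7.x86_64.img
initramfs-3.10.0-1160.49.1.el7.x86_64kdump.img'

# Dry run: only print the commands; run them as root to actually rebuild.
printf '%s\n' "$names" | kvers_from_initramfs | while read -r kver; do
    echo "dracut -f /boot/initramfs-$kver.img $kver"
done
```

Afterwards, `lsinitrd /boot/initramfs-<version>.img | grep nvidia` shows whether the image contains the module.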
Regarding the space left in the boot partition:

Code: Select all

$ df -B 1 /boot/
Filesystem     1B-blocks      Used Available Use% Mounted on
/dev/sda1      517713920 412311552 105402368  80% /boot
Would it be permissible to manually remove the files pertaining to older versions (in my case 1160.45.1) from /boot?
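
For a rough idea of how tight /boot is, the numbers above can be put side by side. A small sketch, with a hypothetical helper name; the live variant in the comments assumes GNU df and stat.

```shell
# Hypothetical helper: bytes left on /boot after writing one more image.
#   $1 = available bytes, $2 = size of the image in bytes
boot_headroom() {
    echo $(( $1 - $2 ))
}

# Numbers from the df and ls output above: 105402368 bytes available,
# newest initramfs 84459566 bytes.
boot_headroom 105402368 84459566    # prints 20942802

# Live values on a real system:
#   avail=$(df -B1 --output=avail /boot | tail -n 1)
#   size=$(stat -c %s "/boot/initramfs-$(uname -r).img")
#   boot_headroom "$avail" "$size"
```

So one more image of this size fits with about 20 MB to spare, but not two. Whether that is enough during an update depends on how the image is replaced, which this sketch does not check.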

jlehtone
Posts: 4523
Joined: 2007/12/11 08:17:33
Location: Finland

Re: NVIDIA: API and kernel mod mismatch

Post by jlehtone » 2022/01/12 13:35:40

You could run:

Code: Select all

sudo package-cleanup --oldkernels
That builds the list of older kernel packages and hands it to "yum remove". It keeps the two latest kernels and never removes the currently running one.
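
The selection rule can be imitated in a few lines to preview what would be removed. This is a rough approximation, not the actual package-cleanup logic. It assumes GNU sort and head; the package names are derived from the initramfs names earlier in the thread, and the running kernel is assumed to be 1160.49.1.

```shell
# Rough approximation of the rule described above, NOT the real
# package-cleanup implementation.
#   $1 = number of newest kernels to keep
#   $2 = running kernel version (as printed by `uname -r`)
#   stdin = installed kernel packages, one per line (as from `rpm -q kernel`)
old_kernels() {
    sort -V | head -n "-$1" | { grep -v -F -- "$2" || true; }
}

installed='kernel-3.10.0-1160.45.1.el7.x86_64
kernel-3.10.0-1160.49.1.el7.x86_64'

# Keep two: nothing is left to remove, so this prints nothing.
printf '%s\n' "$installed" | old_kernels 2 3.10.0-1160.49.1.el7.x86_64
# Keep one: prints kernel-3.10.0-1160.45.1.el7.x86_64
printf '%s\n' "$installed" | old_kernels 1 3.10.0-1160.49.1.el7.x86_64
```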


However, that does not help you at all, because you have only two versions installed. You could keep only one with:

Code: Select all

sudo package-cleanup --oldkernels --count=1
To have old kernels removed automatically when a new kernel is installed, one could edit 'installonly_limit' in /etc/yum.conf.
The default is 5, i.e. keep five kernels.
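
A small sketch for reading the current setting, assuming the plain key=value layout of yum.conf. The helper name is hypothetical and the sample fragment is illustrative, not copied from the poster's machine.

```shell
# Hypothetical helper: print the installonly_limit value from yum.conf-style
# text on stdin (last assignment wins; prints nothing if unset).
get_installonly_limit() {
    sed -n 's/^[[:space:]]*installonly_limit[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Illustrative fragment:
printf '%s\n' '[main]' 'gpgcheck=1' 'installonly_limit=5' | get_installonly_limit   # prints 5

# On a real system:
#   get_installonly_limit < /etc/yum.conf
# To keep only two kernels from now on, the [main] section would contain:
#   installonly_limit=2
```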

You do have a third kernel there, the rescue kernel. Package 'dracut-config-rescue' (if installed) adds that in some situations.

You also have those "kdump" files, which means the kernel dump service is enabled (it is an option in the installer and enabled by default).
There is a way to disable it so that those kdump files are not (re)generated.
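
To see how much of /boot the kdump images take up, their sizes can be summed from an `ls -l` listing. A minimal sketch with a hypothetical helper name; the input lines are the two kdump entries posted earlier in the thread.

```shell
# Hypothetical helper: sum the sizes (5th column of `ls -l`) of kdump images.
kdump_bytes() {
    awk '/kdump\.img$/ { total += $5 } END { printf "%d\n", total }'
}

# The two kdump lines from the listing earlier in the thread.
printf '%s\n' \
    '-rw-------. 1 root root 66714227 Oct 25 08:36 initramfs-3.10.0-1160.45.1.el7.x86_64kdump.img' \
    '-rw-------. 1 root root 66760307 Jan 11 17:01 initramfs-3.10.0-1160.49.1.el7.x86_64kdump.img' | kdump_bytes
# prints 133474534

# To stop the service so the images are no longer regenerated (as root):
#   systemctl stop kdump
#   systemctl disable kdump
# The existing *kdump.img files presumably still have to be deleted by hand.
```

That is roughly 133 MB of a 518 MB /boot partition.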

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: NVIDIA: API and kernel mod mismatch

Post by TrevorH » 2022/01/12 13:55:49

If you are short on space in /boot, then consider disabling/removing kdump, which is only of any use if you know how to analyze kernel dumps or have someone who can do it for you.

TrevorH
Site Admin
Posts: 33202
Joined: 2009/09/24 10:40:56
Location: Brighton, UK

Re: NVIDIA: API and kernel mod mismatch

Post by TrevorH » 2022/01/12 13:56:22

Also, now that you have rebuilt the initramfs files, did you reboot to see if the problem is still present?

jody
Posts: 53
Joined: 2015/05/12 12:58:08

Re: NVIDIA: API and kernel mod mismatch

Post by jody » 2022/01/12 14:53:31

@TrevorH: thanks - after rebooting the workstation with the "fresh" initramfs the problem was gone.

@jlehtone: thanks for the "package-cleanup" command!
