1. Check whether the default nouveau driver is present

Run all commands as root.

Run:

ls | grep nouveau
lsmod | grep nouveau
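
The lsmod check above can be wrapped in a small script that gives a yes/no answer. This is only a sketch: the function name nouveau_loaded is hypothetical, and it assumes the module listing is in /proc/modules format (the file lsmod reads), where the first field is the module name.

```shell
#!/bin/sh
# Return 0 if a module named exactly "nouveau" appears in the listing.
# nouveau_loaded is a hypothetical helper; the listing defaults to /proc/modules.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    [ -r "$modules_file" ] || return 1
    # The first whitespace-separated field is the module name.
    awk '$1 == "nouveau" { found = 1 } END { exit !found }' "$modules_file"
}

if nouveau_loaded; then
    echo "nouveau is loaded"
else
    echo "nouveau is not loaded"
fi
```

Matching on the first field avoids false positives from modules that merely list nouveau as a dependent.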

Configure an available repository and install dkms to manage the graphics driver modules:

yum install -y epel-release
yum install -y kernel-headers kernel-devel dkms

2. Enable the nvidia driver

Run:

vim /lib/modprobe.d/dist-blacklist.conf

First, comment out the line "blacklist nvidiafb" by putting a # in front of it.
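
The same edit can be made without opening an editor. The following is a minimal sketch, assuming the line reads exactly "blacklist nvidiafb" (optionally with surrounding whitespace); comment_out_nvidiafb is a hypothetical helper name, and the .bak copy is an addition not mentioned in the steps above.

```shell
#!/bin/sh
# Prefix an uncommented "blacklist nvidiafb" line with "#".
# comment_out_nvidiafb is a hypothetical helper; it keeps the previous
# file contents in FILE.bak (an assumption, not part of the original steps).
comment_out_nvidiafb() {
    conf_file="$1"
    cp -p "$conf_file" "$conf_file.bak" || return 1
    # "&" in the replacement stands for the whole matched line.
    sed 's/^[[:space:]]*blacklist[[:space:]][[:space:]]*nvidiafb[[:space:]]*$/#&/' \
        "$conf_file.bak" > "$conf_file"
}

# Usage (as root):
#   comment_out_nvidiafb /lib/modprobe.d/dist-blacklist.conf
```

Lines that are already commented out do not match the pattern, so running the function again leaves the file unchanged.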

Back up the initramfs image and regenerate it:

mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut -v /boot/initramfs-$(uname -r).img $(uname -r) --force
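
If this step is repeated, the mv above can replace the first backup with an image that was already regenerated. A minimal sketch of a guard follows; backup_initramfs is a hypothetical helper name, not a dracut command.

```shell
#!/bin/sh
# Move IMAGE to IMAGE.bak unless a backup already exists.
# backup_initramfs is a hypothetical helper name, not part of dracut.
backup_initramfs() {
    img="$1"
    if [ -e "$img.bak" ]; then
        echo "keeping existing backup: $img.bak" >&2
        return 0
    fi
    mv "$img" "$img.bak"
}

# Usage (as root):
#   backup_initramfs "/boot/initramfs-$(uname -r).img"
#   dracut -v "/boot/initramfs-$(uname -r).img" "$(uname -r)" --force
```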

3. Change the boot configuration

Command:

vim /etc/default/grub

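Parameter: GRUB_CMDLINE_LINUX

Append this string to the end of its value, inside the closing quote: " rd.driver.blacklist=nouveau"

Note the leading space, which separates it from the preceding option.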
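
The edit described above could also be scripted. This sketch assumes the GRUB_CMDLINE_LINUX value is enclosed in double quotes on a single line; append_nouveau_blacklist is a hypothetical helper name, and the .bak copy is an added precaution.

```shell
#!/bin/sh
# Append " rd.driver.blacklist=nouveau" inside the closing quote of
# GRUB_CMDLINE_LINUX, unless the option is already there.
# append_nouveau_blacklist is a hypothetical helper name.
append_nouveau_blacklist() {
    grub_file="$1"
    if grep -q '^GRUB_CMDLINE_LINUX=.*rd\.driver\.blacklist=nouveau' "$grub_file"; then
        return 0
    fi
    cp -p "$grub_file" "$grub_file.bak" || return 1
    # \1 holds everything up to, but not including, the closing quote.
    sed 's/^\(GRUB_CMDLINE_LINUX=".*\)"[[:space:]]*$/\1 rd.driver.blacklist=nouveau"/' \
        "$grub_file.bak" > "$grub_file"
}

# Usage (as root):
#   append_nouveau_blacklist /etc/default/grub
```

The early return makes the function safe to run more than once, so the option is never duplicated.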

Then run the following commands. The change takes effect after a reboot:

grub2-mkconfig -o /etc/grub2-efi.cfg
grub2-mkconfig -o /etc/grub2.cfg

4. Install the CUDA environment

Install CUDA:

wget https://developer.download.nvidia.com/compute/cuda/12.8.1/local_installers/cuda-repo-rhel9-12-8-local-12.8.1_570.124.06-1.x86_64.rpm
rpm -ivh cuda-repo-rhel9-12-8-local-12.8.1_570.124.06-1.x86_64.rpm
dnf clean all
dnf -y install cuda-toolkit-12-8

Graphics driver (install one of the two):

# Proprietary kernel modules
dnf -y module install nvidia-driver:latest-dkms
# Open-source kernel modules
dnf -y module install nvidia-driver:open-dkms

Install cuDNN:

wget https://developer.download.nvidia.com/compute/cudnn/9.8.0/local_installers/cudnn-local-repo-rhel9-9.8.0-1.0-1.x86_64.rpm
rpm -ivh cudnn-local-repo-rhel9-9.8.0-1.0-1.x86_64.rpm
dnf clean all
dnf -y install cudnn
dnf -y install cudnn-cuda-12

5. Test after rebooting

Commands:

lsmod | grep nvidia
nvidia-smi
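
The two checks above can be combined into one script. This is a sketch only: list_nvidia_modules is a hypothetical helper, and the module names it looks for (nvidia, nvidia_modeset, nvidia_uvm, nvidia_drm) are an assumption about what the driver package loads. Some of them, such as nvidia_uvm, may only appear once a CUDA program has run.

```shell
#!/bin/sh
# Print each NVIDIA kernel module from the list below that appears in a
# module listing (default: /proc/modules, the file lsmod reads).
# list_nvidia_modules is a hypothetical helper name.
list_nvidia_modules() {
    modules_file="${1:-/proc/modules}"
    [ -r "$modules_file" ] || return 1
    for mod in nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
        awk -v m="$mod" '$1 == m { print $1 }' "$modules_file"
    done
}

loaded=$(list_nvidia_modules) || loaded=""
if [ -n "$loaded" ] && command -v nvidia-smi >/dev/null 2>&1; then
    # $loaded is left unquoted so the names print on one line.
    echo "loaded modules:" $loaded
    # Query only the GPU name and driver version, one CSV line per GPU.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
    echo "nvidia modules not loaded or nvidia-smi not found" >&2
fi
```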
