Refer to the link below.
Installing the NVIDIA driver
During installation you will probably hit the following error:
an error occurred while performing building kernel modules see /var/log/nvidia-installer.log for details. unrecognized command-line option…
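The cause is that the system's default gcc version differs from the gcc version the kernel was built with. Installing gcc-12 fixes it: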
That’s an issue with the Ubuntu kernel. For whatever reasons, the ubuntu kernel team decided to use gcc-12 for kernel compilation while the 22.04 system compiler is gcc-11. Please install gcc-12 from ubuntu repo to be able to compile the nvidia modules again.
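To confirm the compiler mismatch described above before installing anything, a minimal sketch is shown here. The helper name kernel_gcc_major is hypothetical, and it assumes /proc/version contains a token of the form gcc-NN, as Ubuntu 22.04 kernels built with gcc-12 typically do. Other kernels may format this string differently.

```shell
# Hypothetical helper: extract NN from the first "gcc-NN" token in a
# /proc/version-style string. Prints nothing if no such token exists.
kernel_gcc_major() {
    printf '%s\n' "$1" | grep -oE 'gcc-[0-9]+' | head -n 1 | sed 's/^gcc-//'
}

# Usage on a real machine (commented out so nothing is installed by accident):
#   kernel_gcc_major "$(cat /proc/version)"   # compiler that built the kernel
#   gcc -dumpversion                          # default system compiler
#   sudo apt install gcc-12                   # install the compiler the kernel used
```

If the first command prints 12 while gcc -dumpversion reports 11, the mismatch from the quoted reply applies.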
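Laptop suspend/resume
On Linux, after a suspend/resume cycle, Ollama sometimes fails to discover your NVIDIA GPU and falls back to running on the CPU. You can work around this driver bug by reloading the NVIDIA UVM driver with sudo rmmod nvidia_uvm && sudo modprobe nvidia_uvm.
When this happens, run the four commands below to stop ollama, reload the module, and start ollama again: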
The 999 error is a generic “unknown error” code, which isn’t super helpful.
What happens if you try removing the uvm module:
sudo systemctl stop ollama
sudo rmmod nvidia_uvm
sudo modprobe nvidia_uvm
sudo systemctl start ollama
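How do you confirm this is the problem? Stop the ollama service and start ollama manually in the foreground; the log is then printed to the terminal: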
sudo systemctl stop ollama
ollama serve
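A log line like the following appears. It shows that the GPU could not be initialized, so ollama fell back to running the model on the CPU: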
msg="unknown error initializing cuda driver library /usr/lib/x86_64-linux-gnu/libcuda.so.550.144.03: cuda driver library init failure: 999"
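The check for the log line above can be scripted. The following is only a sketch: the helper name has_cuda_init_failure is hypothetical, and the usage comment assumes ollama runs as a systemd service named ollama, so its log is available through journalctl.

```shell
# Hypothetical helper: reads log text on stdin and succeeds (exit 0) if it
# contains the CUDA driver init failure with code 999.
has_cuda_init_failure() {
    grep -q 'cuda driver library init failure: 999'
}

# Usage on a real machine (commented out; assumes a systemd unit named "ollama"):
#   if journalctl -u ollama --no-pager | has_cuda_init_failure; then
#       echo "CUDA init failed with 999; reload nvidia_uvm and restart ollama"
#   fi
```

The match is on the fixed message text rather than the libcuda.so path, since the driver version in that path changes between installations.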