Installing NVIDIA Graphics Driver 396 and CUDA 9.1 on Ubuntu 16.04

Date: 2019-11-09

Installing the NVIDIA Graphics Driver and CUDA Toolkit on Ubuntu

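On Ubuntu (16.04 LTS desktop is used here; 17.04 and later use a different display server, so the details may differ), installing the NVIDIA graphics driver often ends in a boot loop that never reaches the desktop. The approach recommended here installs the latest driver (version 396) and the CUDA Toolkit, and was tested and found working on the latest Titan V card.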

1. Install from the Ubuntu software repositories (recommended)

The simplest approach is to install Synaptic, search for NVIDIA, and install the newest driver it lists.

sudo apt install synaptic
sudo synaptic

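Drivers installed this way have been tested by Ubuntu, so they are the safer choice. They are somewhat older, though: on my Ubuntu 16.04 LTS install, the default NVIDIA driver is version 384.

You can install the NVIDIA 384 driver directly (strongly recommended):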

sudo apt install nvidia-384

To install the latest 396 driver instead, proceed as follows (tested working on a Titan V):

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-396

# For development use
sudo apt install nvidia-396-dev
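
After rebooting, one way to confirm that the driver actually loaded is to query it with nvidia-smi, the utility that ships with the driver packages. The sketch below is a minimal check and not part of the original steps; the function name check_nvidia_driver is invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the GPU name and driver version if the
# NVIDIA driver utilities are present, otherwise say so and return 1.
check_nvidia_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver does not appear to be installed"
        return 1
    fi
    # Print one "name, driver_version" line per GPU.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

check_nvidia_driver || echo "driver check failed"
```

If the reported version is not the one just installed, the old driver is probably still in use and a reboot may be pending.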

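If the driver is downloaded from the NVIDIA website and installed from there, the system fails to reboot afterwards. I tried several times, it failed every time, and I gave up.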

2. Install CUDA (pay attention to the installer options)

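The CUDA installer installs the graphics driver automatically. It can be downloaded from https://developer.nvidia.com/cuda-downloads and already supports the latest Volta architecture (currently used only by the Titan V graphics card and the Tesla compute cards built on the V100 chip).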

# Download the CUDA 9.1 installer and the patch package dated 2018-05-05:
wget -c https://developer.nvidia.com/compute/cuda/9.1/Prod/local_installers/cuda_9.1.85_387.26_linux
wget -c https://developer.nvidia.com/compute/cuda/9.1/Prod/patches/3/cuda_9.1.85.3_linux
# Then make both files executable with sudo chmod +x ... and run them.

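However, after installing this way the system hangs on reboot. The recovery steps later in this article may help.
- I suggest installing version 8.x. After installing 9.1, rebooting leads to a login loop and the desktop cannot be reached.
  - Use the run-file installer; the deb install seems to make the system hang at startup (not confirmed).
- I later found that the likely cause is the graphics driver install. When the installer asks whether to install the driver, answer no.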
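
The download-and-run steps above, together with the advice to decline the bundled driver, can be sketched as follows. This is an illustration under assumptions: the helper name make_executable is invented here, the file names are the ones fetched by the wget commands above, and the installer invocations are only printed because they are interactive and need root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mark each existing file executable and report
# any file that is missing. Returns non-zero if something was missing.
make_executable() {
    local rc=0 f
    for f in "$@"; do
        if [ -f "$f" ]; then
            chmod +x "$f"
        else
            echo "missing: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# File names as downloaded by the wget commands in this section.
installer=cuda_9.1.85_387.26_linux
patch=cuda_9.1.85.3_linux

if make_executable "$installer" "$patch"; then
    # When the installer asks whether to install the bundled graphics
    # driver, answer "no" so the driver installed earlier is kept.
    echo "ready: run sudo ./$installer, then sudo ./$patch"
fi
```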

Ubuntu 18.04:

https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64

Install:

sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

After installation, some further setup is needed before CUDA can be used. The installer prints the following:

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.1
Samples:  Installed in /home/openthings, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-9.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.1/lib64, or, add /usr/local/cuda-9.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.1/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver
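
The two environment settings requested in the summary can be applied as shown below. This is a minimal sketch that assumes the toolkit landed in /usr/local/cuda-9.1 as reported above; CUDA_HOME is only a convenience variable chosen here. Appending the two export lines to ~/.bashrc would make them persist across sessions.

```shell
# Assumed install prefix, taken from the installer summary above.
CUDA_HOME=/usr/local/cuda-9.1

# Prepend the CUDA directories; the ${VAR:+...} form avoids leaving a
# stray ":" when the variable is empty or unset.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# If the toolkit is present, nvcc should now be found on PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc not found on PATH"
fi
```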

3. Install the driver from NVIDIA directly (not recommended)

To install a newer driver, download it from the NVIDIA website (http://www.nvidia.cn/Download/index.aspx?lang=cn).

The installer requires the X server to be shut down. Run:

sudo service lightdm stop

Press Ctrl+Alt+F1 to switch to a text console. Pressing Ctrl+Alt+F7 returns to the graphical session.

When the installer has finished, restart lightdm by running:

sudo service lightdm start
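
Put together, the sequence described in this section looks roughly like the sketch below. It only prints the commands unless DRY_RUN=0 is set, since stopping lightdm ends the graphical session. DRY_RUN, DRIVER_RUN, and the run helper are names invented for this sketch, and the installer file name is a placeholder for whatever file was actually downloaded.

```shell
#!/usr/bin/env bash
# Placeholder for the downloaded installer; replace with the real file.
DRIVER_RUN=${DRIVER_RUN:-./NVIDIA-driver-installer.run}
# Default to printing commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: echo the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_driver_from_runfile() {
    # Stop the X server first; give up if that fails.
    run sudo service lightdm stop || return 1
    # Run the NVIDIA installer from the text console.
    run sudo sh "$DRIVER_RUN" || echo "installer failed" >&2
    # Bring the desktop back whether or not the installer succeeded.
    run sudo service lightdm start
}

install_driver_from_runfile
```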

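However, this route is not well tested. Besides being complicated to install, it can leave the system hanging after a reboot so that you cannot log in.

In that case, boot into the "Advanced" then "Recovery" mode and reconfigure from the command line.

Run: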

dpkg-reconfigure lightdm

If that still does not work, the only option left is to reinstall the system.

4. Install the Docker engine with NVIDIA support

With the NVIDIA-enabled Docker engine installed, containers can use the GPU. The steps are as follows:

# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
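
Before running the test container, it can help to check that Docker actually registered the nvidia runtime. The sketch below scans the text printed by docker info for a "Runtimes" line that mentions nvidia. The helper name has_nvidia_runtime is invented here, and the check assumes docker info lists runtimes on a line of that form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `docker info` text on stdin and succeed
# if a "Runtimes" line mentions nvidia.
has_nvidia_runtime() {
    grep -i 'runtimes' | grep -qi 'nvidia'
}

if command -v docker >/dev/null 2>&1; then
    if docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime registered"
    else
        echo "nvidia runtime not found in docker info"
    fi
else
    echo "docker is not installed"
fi
```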

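Note that running Docker as shown above now supports the GPU directly, with no separate nvidia-docker command required. This greatly improves compatibility with container orchestration systems, and Kubernetes can now also run GPU workloads in Docker containers.

See http://www.javashuo.com/article/p-gsqaalml-ex.html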

The current release depends on Docker 18.03. If a different version is already installed, you can specify the version to install, as follows:

sudo apt install docker-ce=18.03.1~ce-0~ubuntu
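
To find the exact version string to pin, apt-cache madison docker-ce lists what the configured repositories offer. The sketch below filters that listing by prefix and prints the matching install command. pick_version is a hypothetical helper, and it assumes the usual "package | version | source" column layout of the madison output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `apt-cache madison` output on stdin and
# print the first version that starts with the prefix given as $1.
pick_version() {
    awk -F'|' -v prefix="$1" '
        {
            version = $2
            gsub(/ /, "", version)   # strip the padding spaces
            if (index(version, prefix) == 1) { print version; exit }
        }'
}

if command -v apt-cache >/dev/null 2>&1; then
    version=$(apt-cache madison docker-ce 2>/dev/null | pick_version 18.03)
    if [ -n "$version" ]; then
        echo "sudo apt install docker-ce=$version"
        # Holding the package keeps later upgrades from replacing it.
        echo "sudo apt-mark hold docker-ce"
    else
        echo "no docker-ce 18.03 build found in the configured repositories"
    fi
fi
```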

For details, see:

https://github.com/NVIDIA/nvidia-docker