How to Set Up a GPU Deep Learning Environment

Date: 2019-12-04

This article walks you through a complete installation of the following:

Tensorflow-gpu

CUDA

cuDNN

VS

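First, we assume you have already installed Anaconda; if not, go install it. As for how, look up any tutorial on Baidu, since there are plenty of Anaconda installation guides. Remember to add Anaconda to your environment variables; any such tutorial will cover that too.

Good. Now that Anaconda is installed, proceed as follows:

First, check whether your graphics card can run GPU workloads.

Start by opening the NVIDIA Control Panel (if you don't know how, look it up on Baidu; I'll skip it here to save space).

Click Help > System Information > Components, then look here:

OK, this shows that your graphics card, with its current driver, supports up to CUDA 10.2.120. You can of course install a lower version, though that comes with requirements of its own, which we'll cover later.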
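As a rough illustration of the comparison above, the sketch below checks that a CUDA toolkit version is no newer than the version reported in the control panel. The function names are hypothetical, and only the major and minor numbers are compared; it does not capture the extra requirements for older versions mentioned above.

```python
def parse_version(text):
    """Turn a dotted version string such as '10.2.120' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(driver_cuda, wanted_cuda):
    """Return True if wanted_cuda is not newer than the driver's CUDA version.

    Only the major and minor components are compared.
    """
    return parse_version(wanted_cuda)[:2] <= parse_version(driver_cuda)[:2]


if __name__ == "__main__":
    # 10.2.120 is the value read from the control panel in this article;
    # 9.0 is the toolkit version installed later on.
    print(driver_supports("10.2.120", "9.0"))
```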

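Now decide which version of tensorflow-gpu to use. In my opinion, you have to fix the tensorflow-gpu version first; only then can you determine the CUDA and cuDNN versions. Generally speaking, the higher the version, the higher the hardware requirements, so as a compromise I chose 1.8.0. How you choose depends on your hardware: if it is decent, you can pick a higher version. Some code is also picky and only runs on a specific TensorFlow version, so pick a suitable GPU version before installing rather than installing blindly.

On the TensorFlow website you can find a chart like this:

By the time you read this, though, the chart will surely be out of date: Google keeps shipping new releases, and several generations have probably come out since. That doesn't matter; pick something close. Don't chase the newest release, since the latest version may have compatibility problems.

I chose version 1.8.0. As the chart shows, it requires Python 3.5-3.6 (we'll use 3.6), VS2015, CUDA 9, and cuDNN 7.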
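To keep the chosen combination in one place, the requirements can be written down as a small lookup table. This is only a sketch: the names are hypothetical, and the table holds just the single row quoted in this article; any other release has to be read off the chart on the TensorFlow website.

```python
# Only the row quoted in this article is recorded here.
REQUIREMENTS = {
    "1.8.0": {
        "python": ("3.5", "3.6"),
        "compiler": "VS2015",
        "cuda": "9",
        "cudnn": "7",
    },
}


def requirements_for(tf_version):
    """Return the recorded requirements, or None if the version is not listed."""
    return REQUIREMENTS.get(tf_version)


def python_is_supported(tf_version, python_version):
    """Check a 'major.minor' Python version against the recorded row."""
    row = requirements_for(tf_version)
    return row is not None and python_version in row["python"]


if __name__ == "__main__":
    print(requirements_for("1.8.0"))
    print(python_is_supported("1.8.0", "3.6"))
```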

OK, that's everything we need. Next, go download it all!

Here are the downloads; go get every one of them:

CUDA: https://developer.nvidia.com/cuda-toolkit-archive

cuDNN: https://developer.nvidia.com/rdp/cudnn-archive

TensorFlow: https://github.com/fo40225/tensorflow-windows-wheel

DXSDK_jun10.exe: https://www.microsoft.com/en-us/download/details.aspx?id=6812
This is Microsoft's DirectX Software Development Kit; install it so that you can compile cuda_samples later.

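Open the links above, pick the versions you want, and download them. Pay close attention to matching the CUDA and cuDNN versions.

Now let's start installing!

First, create a new environment in Anaconda, like this:

Python 3.6 is specified here because of the TensorFlow chart above; if you are installing an older GPU version, you may need Python 3.5.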
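The exact command from the screenshot is not preserved here, so the following is an assumed equivalent rather than the author's own input. It only builds and prints a conda create command line for an environment named tf-gpu (the name used later in this article) with Python 3.6; the helper name is hypothetical.

```python
def conda_create_command(env_name, python_version):
    """Build the argument list for creating a conda environment."""
    return ["conda", "create", "--name", env_name, "python=" + python_version]


if __name__ == "__main__":
    # Prints the command to type into an Anaconda Prompt.
    print(" ".join(conda_create_command("tf-gpu", "3.6")))
```

Typing the printed command into an Anaconda Prompt creates the environment; conda asks for confirmation before installing anything.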

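Once the environment is created, there is nothing more to do with it for now; we move on to other things.

Now open your VS2015 download and install it. After downloading, you should have these files:

Double-click the installer and wait a while.

Follow the steps shown, then click Next.

We only install VS for its C++ environment. You can use VS to run Python too if you like, but I use PyCharm, since that is what I started with.

When the installation finishes, move on to the next step.

Oh, one more thing. If this is your first time installing VS, congratulations. If you have installed VS before, well, lucky you: you need to completely remove the old VS first. For how to do that, see: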

https://blog.csdn.net/a359877454/article/details/52679041

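OK, the next thing to install is this:

DXSDK_jun10.exe

Nothing special here; just run the installer. It doesn't matter whether it ends with an error, as long as it runs through once.

Next comes the main event, CUDA. Download everything for the version you are installing, including the patches, then install the base installer first and the patches afterwards.

These are all the CUDA files I downloaded: one base installer plus 4 patches.

If the installer reports an incompatibility at the start, that is fine: just uncheck the last item, Driver components, in the installer's component list.

Once everything is installed, check whether the installation succeeded: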
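The check used in the screenshot is not preserved here. One common way to confirm the toolkit is installed, offered as an assumption rather than the author's method, is to look for nvcc on PATH and read the release number from the output of nvcc --version. The helper names below are hypothetical.

```python
import re
import shutil
import subprocess


def find_nvcc():
    """Return the full path of nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


def cuda_release(nvcc_output):
    """Extract the release number (e.g. '9.0') from `nvcc --version` output."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc is None:
        print("nvcc was not found on PATH")
    else:
        result = subprocess.run([nvcc, "--version"], stdout=subprocess.PIPE,
                                universal_newlines=True)
        print(cuda_release(result.stdout))
```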

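Extract the cuDNN archive.

Inside are three directories: bin, include, and lib. Copy the three folders into the corresponding CUDA folders (that is, add the files from these three cuDNN directories to CUDA's own bin, include, and lib folders; nothing in the CUDA folders needs to be deleted, and no files will be overwritten). The default location is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0. Besides the three folders, the extracted cuDNN archive contains one more file, which also goes into the v9.0 folder.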
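The copy step can also be scripted. The sketch below is one possible way to do it, not the author's procedure: it walks cuDNN's bin, include, and lib folders and copies each file into the matching CUDA folder, skipping anything that already exists. The function name is hypothetical, the extra loose file is not handled, and writing under C:\Program Files normally requires administrator rights.

```python
import os
import shutil


def merge_cudnn_into_cuda(cudnn_root, cuda_root):
    """Copy cuDNN's bin, include and lib files into the CUDA folders.

    Existing files are never overwritten. Returns the list of copied paths.
    """
    copied = []
    for folder in ("bin", "include", "lib"):
        source_dir = os.path.join(cudnn_root, folder)
        # os.walk also descends into subfolders of bin, include and lib.
        for current, _dirs, files in os.walk(source_dir):
            relative = os.path.relpath(current, cudnn_root)
            target_dir = os.path.join(cuda_root, relative)
            os.makedirs(target_dir, exist_ok=True)
            for name in files:
                target = os.path.join(target_dir, name)
                if not os.path.exists(target):
                    shutil.copy2(os.path.join(current, name), target)
                    copied.append(target)
    return copied
```

It would be called with the extracted cuDNN folder as the first argument and the v9.0 folder as the second.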

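After this step, compile cuda_samples by opening the solution

file. Choose the solution file that matches your VS version; I use VS2015, so I open vs2015.sln.

Once opened, it looks like this:

Note that the settings in the red box above must be set to 64-bit (x64) and Release.

Then find 1_Utilities in the panel on the right, right-click it, and choose Build.

After a moment, output like this appears at the bottom:

Then you're all set!

With configuration done, we can verify that it works, mainly using CUDA's bundled deviceQuery.exe and bandwidthTest.exe. First open cmd and cd to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\demo_suite under the install directory, then run bandwidthTest.exe and deviceQuery.exe in turn.

What we care about is whether the final line reads Result = PASS.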
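If you save the output of the two tools to text, the final check can be automated. The sketch below assumes the tools print a line of the form "Result = PASS"; the function name is hypothetical.

```python
import re


def reports_pass(output):
    """Return True if the tool output contains a 'Result = PASS' line."""
    pattern = r"^\s*Result\s*=\s*PASS\s*$"
    return re.search(pattern, output, re.MULTILINE) is not None


if __name__ == "__main__":
    sample = "deviceQuery output goes here\nResult = PASS\n"
    print(reports_pass(sample))
```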

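The environment is now basically set up; the last step is installing tensorflow-gpu.

My way of installing it is a bit unusual: I download the package first and then install it, whereas many bloggers install directly with pip or conda.

I'm a beginner, so I downloaded it first and installed it afterwards, because downloading through pip is too slow.

All we need to do is locate the downloaded whl file,

then open

First type activate tf-gpu, where tf-gpu is the name of the environment you created in Anaconda earlier.

Then type pip install, press the space bar without pressing Enter, drag your whl file into the window, and then press Enter.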
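Before installing, it can help to confirm that the wheel was built for the Python version in your environment. Wheel file names end in python tag, ABI tag, and platform tag, so the Python tag can be read directly from the name. The helper names and the file name in the example are hypothetical, shown only to illustrate the format.

```python
import sys


def wheel_python_tag(wheel_filename):
    """Return the Python tag (e.g. 'cp36') from a wheel file name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[:-len(".whl")]
    parts = stem.split("-")
    if len(parts) < 5:
        raise ValueError("not a valid wheel file name: %s" % wheel_filename)
    # name-version[-build]-python-abi-platform: the Python tag is third from the end.
    return parts[-3]


def matches_current_python(wheel_filename):
    """Compare the wheel's CPython tag with the running interpreter."""
    current = "cp%d%d" % (sys.version_info.major, sys.version_info.minor)
    return wheel_python_tag(wheel_filename) == current


if __name__ == "__main__":
    # Hypothetical file name, used only as an example of the naming scheme.
    print(wheel_python_tag("tensorflow_gpu-1.8.0-cp36-cp36m-win_amd64.whl"))
```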

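Then comes another long wait until the installation finishes.

Once it finishes, your GPU setup is complete. Now just open PyCharm and, under File > Settings > Project: code, select the Python interpreter of the new Anaconda environment (in my case, Python 3.6 from tf-gpu).

Then click OK. Now you can try whether your code runs on the GPU!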
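A quick way to try this, sketched here under the assumption that you are on a TensorFlow 1.x release such as 1.8.0, is to ask TensorFlow whether it can see a GPU. The function name is hypothetical; tf.test.is_gpu_available() is the 1.x-era check and is deprecated in later releases.

```python
def tensorflow_gpu_status():
    """Describe whether TensorFlow can be imported and sees a GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not importable in this environment"
    # TensorFlow 1.x check; newer releases deprecate it in favour of
    # tf.config.list_physical_devices("GPU").
    if tf.test.is_gpu_available():
        return "TensorFlow can see a GPU"
    return "TensorFlow is installed but no GPU is visible"


if __name__ == "__main__":
    print(tensorflow_gpu_status())
```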

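One thing to note: since you have installed the GPU build of TensorFlow, don't also install the CPU build; two TensorFlow packages in one environment will conflict.

After installing all of this, we can run the following code to check which problems remain. Note that its DLL checks look for CUDA 8.0 and cuDNN 5.1/6 files, so if the import fails under the CUDA 9 and cuDNN 7 setup described above, missing-DLL messages for those older files are to be expected: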

import ctypes
import imp
import sys


def main():
    try:
        import tensorflow as tf
        print("TensorFlow successfully installed.")
        if tf.test.is_built_with_cuda():
            print("The installed version of TensorFlow includes GPU support.")
        else:
            print("The installed version of TensorFlow does not include GPU support.")
        sys.exit(0)
    except ImportError:
        print("ERROR: Failed to import the TensorFlow module.")

    candidate_explanation = False

    python_version = sys.version_info.major, sys.version_info.minor
    print("\n- Python version is %d.%d." % python_version)
    if not (python_version == (3, 5) or python_version == (3, 6)):
        candidate_explanation = True
        print("- The official distribution of TensorFlow for Windows requires "
              "Python version 3.5 or 3.6.")

    try:
        _, pathname, _ = imp.find_module("tensorflow")
        print("\n- TensorFlow is installed at: %s" % pathname)
    except ImportError:
        # A missing TensorFlow module is itself an explanation for the failure.
        candidate_explanation = True
        print("""
- No module named TensorFlow is installed in this Python environment. You may
  install it using the command `pip install tensorflow`.""")

    try:
        msvcp140 = ctypes.WinDLL("msvcp140.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'msvcp140.dll'. TensorFlow requires that this DLL be
  installed in a directory that is named in your %PATH% environment
  variable. You may install this DLL by downloading Microsoft Visual
  C++ 2015 Redistributable Update 3 from this URL:
  https://www.microsoft.com/en-us/download/details.aspx?id=53587""")

    try:
        cudart64_80 = ctypes.WinDLL("cudart64_80.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'cudart64_80.dll'. The GPU version of TensorFlow
  requires that this DLL be installed in a directory that is named in
  your %PATH% environment variable. Download and install CUDA 8.0 from
  this URL: https://developer.nvidia.com/cuda-toolkit""")

    try:
        nvcuda = ctypes.WinDLL("nvcuda.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'nvcuda.dll'. The GPU version of TensorFlow requires that
  this DLL be installed in a directory that is named in your %PATH%
  environment variable. Typically it is installed in 'C:\Windows\System32'.
  If it is not present, ensure that you have a CUDA-capable GPU with the
  correct driver installed.""")

    cudnn5_found = False
    try:
        cudnn5 = ctypes.WinDLL("cudnn64_5.dll")
        cudnn5_found = True
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'cudnn64_5.dll'. The GPU version of TensorFlow
  requires that this DLL be installed in a directory that is named in
  your %PATH% environment variable. Note that installing cuDNN is a
  separate step from installing CUDA, and it is often found in a
  different directory from the CUDA DLLs. You may install the
  necessary DLL by downloading cuDNN 5.1 from this URL:
  https://developer.nvidia.com/cudnn""")

    cudnn6_found = False
    try:
        cudnn = ctypes.WinDLL("cudnn64_6.dll")
        cudnn6_found = True
    except OSError:
        candidate_explanation = True

    if not cudnn5_found or not cudnn6_found:
        print()
        if not cudnn5_found and not cudnn6_found:
            print("- Could not find cuDNN.")
        elif not cudnn5_found:
            print("- Could not find cuDNN 5.1.")
        else:
            print("- Could not find cuDNN 6.")
        # Print the guidance whichever cuDNN version is missing, not only in the else branch.
        print("""
  The GPU version of TensorFlow requires that the correct cuDNN DLL be installed
  in a directory that is named in your %PATH% environment variable. Note that
  installing cuDNN is a separate step from installing CUDA, and it is often
  found in a different directory from the CUDA DLLs. The correct version of
  cuDNN depends on your version of TensorFlow:

  * TensorFlow 1.2.1 or earlier requires cuDNN 5.1. ('cudnn64_5.dll')
  * TensorFlow 1.3 or later requires cuDNN 6. ('cudnn64_6.dll')

  You may install the necessary DLL by downloading cuDNN from this URL:
  https://developer.nvidia.com/cudnn""")

    if not candidate_explanation:
        print("""
- All required DLLs appear to be present. Please open an issue on the
  TensorFlow GitHub page: https://github.com/tensorflow/tensorflow/issues""")

    sys.exit(-1)


if __name__ == "__main__":
    main()

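If the result looks like this, congratulations: you have successfully installed tensorflow-gpu, and from here on you can test whatever code you write!

Note: the steps above are what I distilled from a large number of installation guides, and they are how I installed everything myself. I made plenty of mistakes along the way, but in the end I did get the GPU environment working.

These steps will not necessarily work for everyone; they are simply a faithful record of how I did it. If the installation fails when you follow them, look up other people's methods on Baidu. Don't worry, a failed installation won't harm your computer: I failed 3 times and spent three days before I finally succeeded.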