经过近30小时探索,终于把深度学习环境tensorflow gpu版本 安装成功。最难搞定就是编译过程,就是14步,又耗时又不好找问题。找了各种文章,很多都无法成功。上面分享的是比较靠谱的文章,只不过没有安装jupyter,我在此基础上加入了安装jupyter:
1、配置goolge 服务器的ubuntu 16.04,可以自行配置,之后进入terminal
2、升级
sudo apt-get update
sudo apt-get upgrade
3、安装 Anaconda3
https://www.howtoing.com/how-to-install-the-anaconda-python-distribution-on-ubuntu-16-04
curl -O https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-x86_64.sh
我们现在可以通过SHA-256校验和通过加密散列验证来验证安装程序的数据完整性。我们将使用sha256sum命令以及脚本的文件名:
sha256sum Anaconda3-5.1.0-Linux-x86_64.sh
bash Anaconda3-5.1.0-Linux-x86_64.sh
一路yes
Prepending PATH=/home/sammy/anaconda3/bin to PATH in /home/sammy/.bashrc A backup will be made to: /home/sammy/.bashrc-anaconda3.bak
source ~/.bashrc
一旦你这样做,你可以验证你的安装通过使用conda命令,例如与list :
conda list
4、 Verify You Have a CUDA-Capable GPU:
lspci | grep -i nvidia
5、Verify You Have a Supported Version of Linux
uname -m && cat /etc/*release
6、Install Dependencies
sudo apt-get install build-essential
sudo apt-get install cmake git unzip zip
sudo apt-get install pylint
7、Install linux kernel header
uname -r
sudo apt-get install linux-headers-$(uname -r)
8、Download the NVIDIA CUDA Toolkit:
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-9.1
9、Reboot the system to load the NVIDIA drivers
sudo reboot
10、Go to terminal and type:
nano ~/.bashrc
in the end of the file, add:
export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
ctrl+x then y to save and exit
source ~/.bashrc
sudo ldconfig
nvidia-smi
11、Install cuDNN 7.1.3
Goto https://developer.nvidia.com/cudnn and download Membership required
After login
Download the following:
cuDNN v7.1.3 Runtime Library for Ubuntu16.04 (Deb)
cuDNN v7.1.3 Developer Library for Ubuntu16.04 (Deb)
cuDNN v7.1.3 Code Samples and User Guide for Ubuntu16.04 (Deb)
Goto downloaded folder and in terminal perform following:
sudo dpkg -i libcudnn7-doc_7.1.3.16-1+cuda9.1_amd64.deb
sudo dpkg -i llibcudnn7_7.1.3.16-1+cuda9.1_amd64.deb
sudo dpkg -i libcudnn7-dev_7.1.3.16-1+cuda9.1_amd64.deb
Verifying cuDNN installation:
cp -r /usr/src/cudnn_samples_v7/ $HOME
cd $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN
If cuDNN is properly installed and running on your Linux system, you will see a message similar to the following:
Test passed!
12、Install Dependencies
libcupti (required)
sudo apt-get install libcupti-dev
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
Bazel (required)
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip
sudo apt-get install openjdk-8-jdk
wget https://github.com/bazelbuild/bazel/releases/download/0.11.1/bazel_0.11.1-linux-x86_64.deb
sudo dpkg -i bazel_0.11.1-linux-x86_64.deb
To install these packages for Python 3.n, issue the following command:
sudo apt-get install python3-numpy python3-dev python3-pip python3-wheel
13、 Configure Tensorflow from source:
source ~/.bashrc
sudo ldconfig
wget https://github.com/tensorflow/tensorflow/archive/v1.7.0.zip
unzip v1.7.0.zip
cd tensorflow-1.7.0
./configure
Give python path in
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Press enter two times
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
Do you wish to build TensorFlow with Apache Kafka Platform support? [y/N]: N
Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
Do you wish to build TensorFlow with GDR support? [y/N]: N
Do you wish to build TensorFlow with VERBS support? [y/N]: N
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
Do you wish to build TensorFlow with CUDA support? [y/N]: Y
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 9.0]: 9.1
Please specify the location where CUDA 9.1 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: 7.1.3
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/lib/x86_64-linux-gnu
Do you wish to build TensorFlow with TensorRT support? [y/N]: N
Now we need compute capability which we have noted at step 1 eg. 5.0
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.0] 5.0
Do you want to use clang as CUDA compiler? [y/N]: N
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: /usr/bin/gcc
Do you wish to build TensorFlow with MPI support? [y/N]: N
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: -march=native
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:N
14、Build Tensorflow using bazel
Do following to create symbolic link to cuda/include/math_functions.hpp from cuda/include/crt/math_functions.hpp to fix math_functions.hpp is not found error.
sudo ln -s /usr/local/cuda/include/crt/math_functions.hpp /usr/local/cuda/include/math_functions.hpp
bazel build --config=opt --config=cuda --incompatible_load_argument_is_label=false //tensorflow/tools/pip_package:build_pip_package
This process will take a lot of time. It may take 1 – 2 hours or maybe even more.
The bazel build command builds a script named build_pip_package. Running this script as follows will build a .whl file within the tensorflow_pkg directory:
To build whl file issue following command:
bazel-bin/tensorflow/tools/pip_package/build_pip_package tensorflow_pkg
Activate your virtual environment here if you use.
To install tensorflow with pip:
cd tensorflow_pkg
#python3是默认
pip install tensorflow*.whl
15、安装tensorflow-gpu
pip install tensorflow-gpu==1.7
16、Verify Tensorflow installation
import tensorflow as tf
if not tf.test.gpu_device_name():
warnings.warn('没有找到GPU.')
else:
print('默认的GPU设备: {}'.format(tf.test.gpu_device_name()))
结果
Hello, TensorFlow!
最后的安装完成之后的文件结构
启用jupyter notebook
jupyter notebook --ip 0.0.0.0 --port 8888