Setup and installation for machine learning with CUDA-8.0.61 and cuDNN-6 on Ubuntu 16.04 LTS- Part 2

TensorFlow installation is as simple as running a few commands, provided you have the correct versions of CUDA and cuDNN.

To start with, I will explain how to uninstall the previous versions of CUDA and cuDNN that were installed in Part 1. It is important to know how to reconfigure the installation, since these tools can break at any time due to version changes and frequent updates, and reinstalling the OS every time is not a solution.

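To remove the NVIDIA drivers, use these commands (the first one applies only if the driver was installed from NVIDIA's .run installer). The package patterns are quoted so the shell does not expand them against files in the current directory,

sudo /usr/bin/nvidia-uninstall
sudo apt-get remove --purge 'nvidia-*'
sudo apt-get --purge remove 'nvidia-cuda*'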

Just to make sure, try listing out the packages,


apt list --installed | grep cuda

and uninstall each package by,

sudo apt-get remove <package>
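The listing step above can also be scripted. The following is a minimal sketch, not part of the original setup: it assumes the usual `name/suite,now version arch [installed]` line layout of `apt list --installed`, and the sample text is made up for illustration. The function name `cuda_packages` is hypothetical.

```python
def cuda_packages(apt_list_output, keyword="cuda"):
    """Return names of installed packages whose listing line mentions keyword."""
    names = []
    for line in apt_list_output.splitlines():
        # Package lines look like "name/suite,now version arch [installed]";
        # the "Listing..." header and blank lines contain no "/".
        if "/" not in line:
            continue
        # Match against the whole line, as `grep cuda` does.
        if keyword in line:
            names.append(line.split("/", 1)[0])
    return names


# Made-up sample output, for illustration only.
sample = """Listing...
cuda-8-0/unknown,now 8.0.61-1 amd64 [installed]
libcudnn6/now 6.0.21-1+cuda8.0 amd64 [installed,local]
vim/xenial-updates,now 2:7.4.1689 amd64 [installed]
"""

for name in cuda_packages(sample):
    print("sudo apt-get remove", name)
```

Like `grep cuda`, this matches the whole line, so a package such as libcudnn6 is picked up through its version string even though its name does not contain "cuda".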

Disable the nouveau driver (the free driver for NVIDIA cards that ships with Ubuntu) before installing the NVIDIA driver.

Create or edit this file (root permissions are needed to write under /etc),

sudo vi /etc/modprobe.d/blacklist-nouveau.conf

with this content,

blacklist nouveau
options nouveau modeset=0
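A quick way to confirm the file contains both directives is to parse it. This is only a sketch under the assumption that the file uses the plain one-directive-per-line layout shown above; the helper name `nouveau_blacklisted` is hypothetical.

```python
def nouveau_blacklisted(conf_text):
    """Return True if both nouveau directives appear as active (uncommented) lines."""
    directives = set()
    for raw in conf_text.splitlines():
        # Drop comments and normalise whitespace before comparing.
        line = raw.split("#", 1)[0].strip()
        if line:
            directives.add(" ".join(line.split()))
    return ("blacklist nouveau" in directives
            and "options nouveau modeset=0" in directives)


print(nouveau_blacklisted("blacklist nouveau\noptions nouveau modeset=0\n"))
print(nouveau_blacklisted("# blacklist nouveau\n"))
```

To check the real file, pass the result of `open("/etc/modprobe.d/blacklist-nouveau.conf").read()` to the function.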

Regenerate the kernel initramfs (the initramfs is used to mount the root file system / during boot), then reboot,

sudo update-initramfs -u
sudo reboot

After the reboot, the nouveau driver should no longer be loaded.

Install CUDA 8.0 and the cuBLAS patch from the .deb files downloaded from the NVIDIA CUDA archives.

cuda-8.0 installation

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-8.0
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get upgrade cuda-8.0

Once the installation is done, install cuDNN 6. Download the .deb files from the cuDNN download page and install the runtime library, the development library, and the code samples.

cudnn6 for cuda-8.0


sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
sudo dpkg -i libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb
sudo dpkg -i libcudnn6-doc_6.0.21-1+cuda8.0_amd64.deb

Add these lines to ~/.bashrc so that the CUDA binaries and libraries are found,

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
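The `${PATH:+:${PATH}}` form appends a colon and the old value only when the variable is already set and non-empty, which avoids leaving a stray empty entry in the list. The sketch below mimics that expansion in Python purely to illustrate the behaviour; the function name `prepend_entry` is hypothetical.

```python
def prepend_entry(entry, current):
    """Mimic the shell expansion  entry${VAR:+:${VAR}}.

    `current` is the existing value of the variable, or None if it is unset.
    """
    # ${VAR:+:${VAR}} expands to ":" + VAR only if VAR is set and non-empty.
    suffix = ":" + current if current else ""
    return entry + suffix


print(prepend_entry("/usr/local/cuda/bin", "/usr/bin:/bin"))
print(prepend_entry("/usr/local/cuda/lib64", None))
```

With an unset or empty variable the result is just the new entry, with no trailing colon.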

Copy the cuDNN samples directory to your home directory and compile the sample program (cp needs -r to copy a directory).

cp -r /usr/src/cudnn_samples_v6 ~/
cd ~/cudnn_samples_v6/mnistCUDNN
make clean
make
./mnistCUDNN

If you get this error,

cudnnGetVersion() : 6021 , CUDNN_VERSION from cudnn.h : 6021 (6.0.21)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 6 Capabilities 6.1, SmClock 1417.5 Mhz,
 MemSize (Mb) 4035, MemClock 3504.0 Mhz, Ecc=0, boardGroupID=0
Using device 0

Testing single precision
CUDNN failure
Error: CUDNN_STATUS_INTERNAL_ERROR
mnistCUDNN.cpp:394
Aborting...

just run the sample as root (for example, sudo ./mnistCUDNN),

cuDNN 6 sample testing

Test Passed!! You now have CUDA-8.0 with cuDNN-6
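The first line of the sample output reports the cuDNN version both as an integer (6021) and in dotted form (6.0.21). A small sketch of that decoding follows; it assumes the major*1000 + minor*100 + patch encoding that matches the pair shown in the output above, and the function name is hypothetical.

```python
def decode_cudnn_version(version):
    """Split an integer such as 6021 into (major, minor, patch).

    Assumes the encoding major*1000 + minor*100 + patch, which is
    consistent with "6021 (6.0.21)" in the sample output.
    """
    major, rest = divmod(version, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_cudnn_version(6021))
```

This can be handy when comparing the version reported by cudnnGetVersion() with the one a framework expects.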

To install TensorFlow, execute these commands. For Python 2.7,

sudo apt-get install libcupti-dev

sudo apt-get install python-pip python-dev

pip install tensorflow-gpu

or for python 3.5,

sudo apt-get install libcupti-dev

sudo apt-get install python3-pip python3-dev

pip3 install tensorflow-gpu

After installation, test it by running a sample program,

# Python
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

tensorflow on nvidia 1050ti

If everything is installed correctly, the hello-world program above also logs the GPU device TensorFlow is running on.
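For a more explicit check than reading the log, the visible devices can be filtered by type. This is a sketch only: the helper `gpu_device_names` and the stand-in device entries are made up so the snippet runs without TensorFlow, and the commented-out call assumes the `device_lib` module that ships with TensorFlow 1.x.

```python
from types import SimpleNamespace


def gpu_device_names(devices):
    """Return the names of entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


# Stand-in entries (made up) so the sketch runs without TensorFlow installed.
fake_devices = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:GPU:0", device_type="GPU"),
]
print(gpu_device_names(fake_devices))

# With TensorFlow 1.x installed, the real list would come from:
#   from tensorflow.python.client import device_lib
#   print(gpu_device_names(device_lib.list_local_devices()))
```

An empty list from the real call would suggest that TensorFlow does not see the GPU, for instance because of a CUDA or cuDNN version mismatch.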


Setup and installation for machine learning with CUDA and cuDNN on Ubuntu 16.04 LTS- Part 1

Important: This post installs CUDA 9.0 with cuDNN 7, which does not work with the prebuilt TensorFlow 1.4 binaries (at the time of writing). I realized this only after the installation. I will go through TensorFlow 1.4 with CUDA 8 and cuDNN 6 in the next post.

Tensorflow 1.4 release notes

All our prebuilt binaries have been built with CUDA 8 and cuDNN 6. We anticipate releasing TensorFlow 1.5 with CUDA 9 and cuDNN 7.
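The release note above boils down to a small compatibility table. The sketch below encodes only what the quote states; the names `PREBUILT_REQUIREMENTS`, `required_versions` and `is_compatible` are hypothetical, and the 1.5 entry was merely anticipated at the time, so treat it as provisional.

```python
# (CUDA major, cuDNN major) per TensorFlow release, taken from the quoted
# release note. The 1.5 entry was only "anticipated", so it is provisional.
PREBUILT_REQUIREMENTS = {
    "1.4": ("8", "6"),
    "1.5": ("9", "7"),
}


def required_versions(tf_version):
    """Look up the requirement for a version string such as '1.4.0'."""
    key = ".".join(tf_version.split(".")[:2])
    return PREBUILT_REQUIREMENTS.get(key)


def is_compatible(tf_version, cuda_major, cudnn_major):
    """True if the installed CUDA/cuDNN majors match the table entry."""
    return required_versions(tf_version) == (str(cuda_major), str(cudnn_major))


print(is_compatible("1.4.0", 9, 7))  # the combination installed in this post
print(is_compatible("1.4.0", 8, 6))  # the combination covered in the next post
```

Releases that are not in the table return None from `required_versions`, so `is_compatible` reports False rather than guessing.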

Step 1: Download the Ubuntu .iso from ubuntu/downloads. This will download an .iso image file to your PC. In my case, the file is ubuntu-16.04.3-desktop-amd64.iso

Step 2: Create a bootable USB stick or burn a DVD from the image, then install Ubuntu on the PC. I downloaded the .iso in Windows 10 and used the default DVD writer program to burn it to a disk.

Step 3: Boot into the fresh installation. Open a terminal in Ubuntu and update the package lists.

sudo apt-get update

Step 4: Download NVIDIA drivers.

I updated the drivers through the Additional Drivers menu in Ubuntu.
Make sure that you have NVIDIA graphics driver 384.81 or newer for CUDA 9.
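Driver versions need to be compared numerically, component by component, because a plain string comparison would rank 384.111 below 384.81. A minimal sketch, with hypothetical helper names and example version strings chosen only for illustration:

```python
def version_tuple(version):
    """Turn a dotted version string such as '384.81' into (384, 81)."""
    return tuple(int(part) for part in version.split("."))


def driver_is_new_enough(installed, minimum="384.81"):
    """Compare driver versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


for candidate in ("375.66", "384.81", "384.111"):
    print(candidate, driver_is_new_enough(candidate))
```

The installed version string itself would have to come from the system, for example from the output of nvidia-smi.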

Step 5: Download and Install CUDA.

Method 1 : Download the .run file (cuda_9.0.176_384.81_linux.run for ubuntu 16.04).

Press Ctrl+Alt+F1 to switch to a text console (tty), stop the X server (on a default Ubuntu 16.04 desktop, sudo service lightdm stop), then execute the command.

sudo sh cuda_9.0.176_384.81_linux.run

Accept the licence terms, skip installing the driver that comes with it (at least it did not work for me), install the OpenGL driver, allow it to manage the X server configuration, and accept all default paths.

NVIDIA driver CUDA 9

If the driver installation fails, go to /tmp, remove the X server lock files, and retry the installation. The lock files belong to root, so sudo is needed,

cd /tmp
sudo rm -rf .X*

Once the installation completes, press Ctrl+Alt+F7 to return to the login screen.
Install the third-party libraries needed to build the CUDA samples,

sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev

Go to a samples directory, e.g. /usr/local/cuda/samples/5_Simulations/particles, and build the sample,

sudo make

If everything goes well, this compiles the sample and creates an executable. Run it with,

./particles

This opens the demo application.

CUDA 9 demo

To check the device status, build and run the deviceQuery sample,

cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery

CUDA-8.0 deviceQuery

To uninstall CUDA if something goes wrong, go to /usr/local/cuda/bin and run the uninstall script.

The default installation path will be /usr/local/cuda/

Method 2: installing from .deb


sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

Try compiling a sample program to check that CUDA is installed correctly.

Add these lines to ~/.bashrc,

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Step 6: Install cuDNN.

Method 1: Download the .deb files from the cuDNN download page and install the runtime library, the development library, and the code samples.


sudo dpkg -i libcudnn7_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.3.11-1+cuda9.0_amd64.deb

Method 2: Install from the downloaded tar file, cudnn-9.0-linux-x64-v7.tgz

If there is any error when running cuDNN, check that the libcudnn*.so* files are present in /usr/local/cuda/lib64 and that cudnn.h is present in /usr/local/cuda/include

If you are installing from a tar file, cuDNN can be installed by simply copying these files to the respective folders of the CUDA installation.


tar -xzvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
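The file check described above (libcudnn*.so* under lib64, cudnn.h under include) can be scripted. This is a sketch with a hypothetical helper name; the demonstration runs against a throwaway directory whose file names are stand-ins, so it works on a machine without CUDA.

```python
from pathlib import Path
import tempfile


def cudnn_files_present(cuda_root="/usr/local/cuda"):
    """Return (header_found, sorted list of libcudnn*.so* file names)."""
    root = Path(cuda_root)
    header_found = (root / "include" / "cudnn.h").is_file()
    libs = sorted(p.name for p in (root / "lib64").glob("libcudnn*.so*"))
    return header_found, libs


# Demonstrate on a temporary tree that mimics the layout (stand-in files).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "include").mkdir()
    (root / "lib64").mkdir()
    (root / "include" / "cudnn.h").touch()
    (root / "lib64" / "libcudnn.so.7").touch()
    print(cudnn_files_present(tmp))
```

Calling `cudnn_files_present()` with no argument checks the default /usr/local/cuda location used in this post.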

Go to cuDNN samples directory and compile the sample program.


cd /usr/src/cudnn_samples_v7/conv_sample

sudo make clean

sudo make

This compiles the code. Running the sample then prints the result, which lets us verify the cuDNN installation.


$ sudo ./conv_sample
Testing single precision
Testing conv
^^^^ CUDA : elapsed = 4.41074e-05 sec,
Test PASSED
Testing half precision (math in single precision)
Testing conv
^^^^ CUDA : elapsed = 4.00543e-05 sec,
Test PASSED

Cool !! CUDA 9.0 with cuDNN 7 is installed in your system.
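If the sample is run from a script, its output can be checked automatically rather than by eye. The sketch below assumes only the line format shown in the output above; the function name `summarize_conv_sample` is hypothetical.

```python
def summarize_conv_sample(output):
    """Count 'Testing conv' runs and 'Test PASSED' lines in conv_sample output."""
    runs = passed = 0
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Testing conv"):
            runs += 1
        elif line == "Test PASSED":
            passed += 1
    return runs, passed


# Output copied from the run shown above.
sample_output = """Testing single precision
Testing conv
^^^^ CUDA : elapsed = 4.41074e-05 sec,
Test PASSED
Testing half precision (math in single precision)
Testing conv
^^^^ CUDA : elapsed = 4.00543e-05 sec,
Test PASSED
"""

runs, passed = summarize_conv_sample(sample_output)
print("all tests passed" if runs == passed and runs > 0 else "some tests failed")
```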

Support and documentation.

CUDA developer zone

NVIDIA Linux Display driver archive

I got an error while compiling the mnist code sample. I am not sure what the issue is, so I am just pasting the error below,

/usr/src/cudnn_samples_v7/mnistCUDNN$ sudo make
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -IFreeImage/include  -m64    -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -IFreeImage/include   -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -IFreeImage/include   -o mnistCUDNN.o -c mnistCUDNN.cpp
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
                 from /usr/local/cuda/include/cuda_runtime.h:90,
                 from /usr/include/cudnn.h:64,
                 from mnistCUDNN.cpp:30:
/usr/local/cuda/include/cuda_runtime_api.h:1683:101: error: use of enum ‘cudaDeviceP2PAttr’ without previous declaration
  __cudart_builtin__ cudaError_t CUDARTAPI cudaDeviceGetP2PAttribute(int *value, enum cudaDeviceP
                                                                                      ^
/usr/local/cuda/include/cuda_runtime_api.h:2930:102: error: use of enum ‘cudaFuncAttribute’ without previous declaration
 __cudart_builtin__ cudaError_t CUDARTAPI cudaFuncSetAttribute(const void *func, enum cudaFuncAtt
                                                                                      ^
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
                 from /usr/local/cuda/include/cuda_runtime.h:90,
                 from /usr/include/cudnn.h:64,
                 from mnistCUDNN.cpp:30:
/usr/local/cuda/include/cuda_runtime_api.h:5770:92: error: use of enum ‘cudaMemoryAdvise’ without previous declaration
  __host__ cudaError_t CUDARTAPI cudaMemAdvise(const void *devPtr, size_t count, enum cudaMemoryA
                                                                                      ^
/usr/local/cuda/include/cuda_runtime_api.h:5827:98: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
 t__ cudaError_t CUDARTAPI cudaMemRangeGetAttribute(void *data, size_t dataSize, enum cudaMemRang
                                                                                      ^
/usr/local/cuda/include/cuda_runtime_api.h:5864:102: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
 cudaError_t CUDARTAPI cudaMemRangeGetAttributes(void **data, size_t *dataSizes, enum cudaMemRang
                                                                                      ^
Makefile:200: recipe for target 'mnistCUDNN.o' failed
make: *** [mnistCUDNN.o] Error 1
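A long compiler log like this is easier to read once the repeated errors are collapsed. The sketch below pulls out the distinct enum names that gcc reports as undeclared; it assumes the exact message wording and quote characters seen in the log above, and the function name is hypothetical. The sample lines are copied from that log.

```python
import re

# Matches the gcc message format seen in the log, including its curly quotes.
ENUM_ERROR = re.compile(
    r"error: use of enum ‘(\w+)’ without previous declaration"
)


def undeclared_enums(log_text):
    """Return the distinct enum names from the errors, in order of appearance."""
    return list(dict.fromkeys(ENUM_ERROR.findall(log_text)))


sample_log = """\
/usr/local/cuda/include/cuda_runtime_api.h:1683:101: error: use of enum ‘cudaDeviceP2PAttr’ without previous declaration
/usr/local/cuda/include/cuda_runtime_api.h:5827:98: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
/usr/local/cuda/include/cuda_runtime_api.h:5864:102: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
"""

print(undeclared_enums(sample_log))
```

These enums appear to be declared in CUDA's driver_types.h, so one possible cause, which is an unconfirmed guess, is that the compiler is picking up a driver_types.h from a different CUDA version than the cuda_runtime_api.h it is compiling against.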