Important: This post installs CUDA 9.0 with cuDNN 7, which does not work with TensorFlow 1.4 (at the time of writing). I realized this only after the installation. I will go through TensorFlow 1.4 with CUDA 8 and cuDNN 6 in the next post.
From the TensorFlow 1.4 release notes:
All our prebuilt binaries have been built with CUDA 8 and cuDNN 6. We anticipate releasing TensorFlow 1.5 with CUDA 9 and cuDNN 7.
Step 1: Download the Ubuntu .iso from ubuntu/downloads. This saves an .iso image file to your PC. In my case, the file is ubuntu-16.04.3-desktop-amd64.iso
Step 2: Create a bootable USB stick or burn a DVD from the image, then install Ubuntu on the PC. I downloaded the .iso on Windows 10 and used the default DVD writer program to burn it to a disc.
Step 3: Boot into the fresh installation. Open a terminal in Ubuntu and refresh the package lists.
sudo apt-get update
Step 4: Download NVIDIA drivers.
I updated the drivers through the Additional Drivers menu in Ubuntu.
Make sure that you have NVIDIA graphics driver 384.81 or newer for CUDA 9.
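The driver requirement above can be checked from a terminal. The following is a minimal sketch, not an official NVIDIA check: version_ge is a hypothetical helper written for this post, and it assumes GNU sort (for the -V version sort) and that nvidia-smi was installed along with the driver.

```shell
# version_ge A B: succeeds when version A is the same as or newer than B.
# Hypothetical helper for illustration; relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="384.81"   # minimum driver version for CUDA 9, as stated in this post

if command -v nvidia-smi >/dev/null 2>&1; then
    # Ask the driver for its version; keep only the first GPU's answer.
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$required"; then
        echo "Driver $installed is new enough for CUDA 9"
    else
        echo "Driver $installed is older than $required; update it first"
    fi
else
    echo "nvidia-smi not found; the NVIDIA driver may not be installed"
fi
```

If nvidia-smi is missing, the script only reports that and makes no changes to the system.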
Step 5: Download and Install CUDA.
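Method 1: Download the .run file (cuda_9.0.176_384.81_linux.run for Ubuntu 16.04).
Press Ctrl+Alt+F1 to switch from the graphical session to a text console (tty1), log in, and run the installer. Note that this key combination only switches consoles; it does not by itself stop the X server.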
sudo sh cuda_9.0.176_384.81_linux.run
Accept the licence terms, skip the driver bundled with the installer (at least it did not work for me), install the OpenGL driver, allow the installer to manage the X server configuration, and accept all default paths.

If the driver installation fails, go to /tmp, remove the X server lock files, and retry the installation.
cd /tmp
rm -rf .X*
Once the installation completes, press Ctrl+Alt+F7 to return to the login screen.
Install the third-party libraries needed to build the CUDA samples:
sudo apt-get install g++ freeglut3-dev build-essential libx11-dev libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
Go to a samples directory, e.g. /usr/local/cuda/samples/5_Simulations/particles, and run:
sudo make
If everything goes well, this compiles the sample and creates an executable. Run it with:
./particles
This shows the demo application.

To check the device status, run deviceQuery from its samples directory (build it there with sudo make first):
/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery

To uninstall CUDA if something goes wrong, go to /usr/local/cuda/bin and run the uninstall script.
The default installation path will be /usr/local/cuda/
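To confirm which toolkit release ended up under the default path, the output of nvcc --version can be inspected. This is a rough sketch: cuda_release is a hypothetical helper name, and it assumes the usual "Cuda compilation tools, release X.Y, V..." line in that output.

```shell
# cuda_release: read `nvcc --version` output on stdin and print the release
# number (for example 9.0). Hypothetical helper name, for illustration.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

nvcc_bin="/usr/local/cuda/bin/nvcc"   # default path used in this post

if [ -x "$nvcc_bin" ]; then
    echo "CUDA toolkit release: $("$nvcc_bin" --version | cuda_release)"
else
    echo "No nvcc at $nvcc_bin; CUDA may be installed elsewhere"
fi
```

For the installation described here, the reported release should be 9.0.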
Method 2: Install from the .deb package.
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
Try compiling a sample program to check that CUDA is installed correctly.
Add these lines to ~/.bashrc:
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
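The ${VAR:+:${VAR}} part of the two export lines may look cryptic. The sketch below shows what it does, using a hypothetical prepend_dir helper that is not part of CUDA or bash: the colon and the old value are appended only when the old value is non-empty, so no stray colon ends up in the result (an empty entry in PATH or LD_LIBRARY_PATH is treated as the current directory).

```shell
# prepend_dir DIR LIST: print LIST with DIR put in front, using the same
# ${VAR:+...} expansion as the export lines. Hypothetical helper.
prepend_dir() {
    dir="$1"
    list="$2"
    # ${list:+:${list}} expands to ":$list" only when list is non-empty,
    # so an empty list does not leave a trailing colon behind.
    printf '%s\n' "${dir}${list:+:${list}}"
}

prepend_dir /usr/local/cuda/bin "/usr/bin:/bin"   # prints /usr/local/cuda/bin:/usr/bin:/bin
prepend_dir /usr/local/cuda/lib64 ""              # prints /usr/local/cuda/lib64
```

After editing .bashrc, open a new terminal or run source ~/.bashrc so the new values take effect.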
Step 6: Install cuDNN.
Method 1: Download the .deb files from the cuDNN download page and install the runtime library, the development library, and the code samples.
sudo dpkg -i libcudnn7_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.3.11-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.3.11-1+cuda9.0_amd64.deb
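To see which cuDNN version is actually installed, the version macros in cudnn.h can be read. This is a small sketch rather than an official tool: cudnn_version is a hypothetical helper, and the two header locations are assumptions (/usr/include for the .deb packages, /usr/local/cuda/include for the tar method), so adjust them to your system.

```shell
# cudnn_version HEADER: print MAJOR.MINOR.PATCHLEVEL as defined in cudnn.h.
# Hypothetical helper; prints nothing if the macros are not found.
cudnn_version() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") print major "." minor "." patch }
    ' "$1"
}

# Assumed header locations; only the ones that exist are reported.
for header in /usr/include/cudnn.h /usr/local/cuda/include/cudnn.h; do
    if [ -f "$header" ]; then
        echo "$header: cuDNN $(cudnn_version "$header")"
    fi
done
```

If both locations report a version and the two differ, an older copy of the header may be shadowing the new one.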
Method 2: Install from the downloaded tar file, cudnn-9.0-linux-x64-v7.tgz
If any error occurs when running cuDNN, check that the libcudnn*.so* files are present in /usr/local/cuda/lib64 and that cudnn.h is present in /usr/local/cuda/include
When installing from a tar file, cuDNN can be installed by simply copying these files to the respective folders of the CUDA installation.
tar -xzvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
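The file check described for the tar method can be scripted. The following is a minimal sketch, assuming the default /usr/local/cuda location used in this post; check_cudnn_files is a hypothetical helper, not something shipped with cuDNN.

```shell
# check_cudnn_files CUDA_ROOT: succeed only if the cuDNN header and at least
# one libcudnn shared library exist under CUDA_ROOT. Hypothetical helper.
check_cudnn_files() {
    root="$1"
    [ -f "$root/include/cudnn.h" ] || { echo "missing $root/include/cudnn.h"; return 1; }
    # With default shell settings an unmatched glob stays literal, so ls fails
    # when no library is present, which is the case we want to detect.
    ls "$root"/lib64/libcudnn*.so* >/dev/null 2>&1 || { echo "no libcudnn*.so* in $root/lib64"; return 1; }
    echo "cuDNN files found under $root"
}

check_cudnn_files /usr/local/cuda || echo "cuDNN does not look installed in /usr/local/cuda"
```

The function takes the CUDA root as an argument, so it can also be pointed at a versioned directory if your installation uses one.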
Go to the cuDNN samples directory and compile the sample program.
cd /usr/src/cudnn_samples_v7/conv_sample
sudo make clean
sudo make
This compiles the code. Run the sample and check its output to verify the cuDNN installation.
$ sudo ./conv_sample
Testing single precision
Testing conv
^^^^ CUDA : elapsed = 4.41074e-05 sec,
Test PASSED
Testing half precision (math in single precision)
Testing conv
^^^^ CUDA : elapsed = 4.00543e-05 sec,
Test PASSED
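When the sample is run from a script, the result can be checked automatically instead of by eye. This is only a sketch: count_passed is a hypothetical helper that counts the "Test PASSED" lines, and the input here is the output pasted in this post rather than a live run.

```shell
# count_passed: read sample output on stdin and print how many lines contain
# "Test PASSED". Hypothetical helper for illustration.
count_passed() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable in scripts that run with "set -e".
    grep -c 'Test PASSED' || true
}

# Feed the helper the output shown in this post; it prints 2.
count_passed <<'EOF'
Testing single precision
Testing conv
^^^^ CUDA : elapsed = 4.41074e-05 sec,
Test PASSED
Testing half precision (math in single precision)
Testing conv
^^^^ CUDA : elapsed = 4.00543e-05 sec,
Test PASSED
EOF
```

On a real system the same helper could read from sudo ./conv_sample | count_passed.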
Cool! CUDA 9.0 with cuDNN 7 is now installed on your system.
Support and documentation.
CUDA developer zone
NVIDIA Linux Display driver archive
I got an error while compiling the mnistCUDNN code sample. I am not sure what the issue is, so I am just pasting the error below:
/usr/src/cudnn_samples_v7/mnistCUDNN$ sudo make
/usr/local/cuda/bin/nvcc -ccbin g++ -I/usr/local/cuda/include -IFreeImage/include -m64 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_53,code=compute_53 -o fp16_dev.o -c fp16_dev.cu
g++ -I/usr/local/cuda/include -IFreeImage/include -o fp16_emu.o -c fp16_emu.cpp
g++ -I/usr/local/cuda/include -IFreeImage/include -o mnistCUDNN.o -c mnistCUDNN.cpp
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
from /usr/local/cuda/include/cuda_runtime.h:90,
from /usr/include/cudnn.h:64,
from mnistCUDNN.cpp:30:
/usr/local/cuda/include/cuda_runtime_api.h:1683:101: error: use of enum ‘cudaDeviceP2PAttr’ without previous declaration
__cudart_builtin__ cudaError_t CUDARTAPI cudaDeviceGetP2PAttribute(int *value, enum cudaDeviceP
^
/usr/local/cuda/include/cuda_runtime_api.h:2930:102: error: use of enum ‘cudaFuncAttribute’ without previous declaration
__cudart_builtin__ cudaError_t CUDARTAPI cudaFuncSetAttribute(const void *func, enum cudaFuncAtt
^
In file included from /usr/local/cuda/include/channel_descriptor.h:62:0,
from /usr/local/cuda/include/cuda_runtime.h:90,
from /usr/include/cudnn.h:64,
from mnistCUDNN.cpp:30:
/usr/local/cuda/include/cuda_runtime_api.h:5770:92: error: use of enum ‘cudaMemoryAdvise’ without previous declaration
__host__ cudaError_t CUDARTAPI cudaMemAdvise(const void *devPtr, size_t count, enum cudaMemoryA
^
/usr/local/cuda/include/cuda_runtime_api.h:5827:98: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
t__ cudaError_t CUDARTAPI cudaMemRangeGetAttribute(void *data, size_t dataSize, enum cudaMemRang
^
/usr/local/cuda/include/cuda_runtime_api.h:5864:102: error: use of enum ‘cudaMemRangeAttribute’ without previous declaration
cudaError_t CUDARTAPI cudaMemRangeGetAttributes(void **data, size_t *dataSizes, enum cudaMemRang
^
Makefile:200: recipe for target 'mnistCUDNN.o' failed
make: *** [mnistCUDNN.o] Error 1