Train a neural network with custom images using NVIDIA DIGITS on your machine

I was surprised at how quickly I got the NVIDIA DIGITS framework up and running on my 10-year-old machine, so I decided to start my blog with a tutorial about it.

Until now I was hand-coding neural networks in TensorFlow/TFLearn and feeding data directly into the network. That left little time to play with the network and optimise it, because most of the time was spent just getting it to run.

NVIDIA DIGITS is a framework that gives you a nice UI for creating your network without touching any code, unless you really want to. It does the heavy lifting, such as parsing the train/validation data and creating the input database. I do not yet know how it arranges the data internally. If you have used sklearn, you are used to doing the hard work of arranging data into arrays yourself, so you know where every bit goes. I will extend the tutorial as I figure things out. DIGITS then creates the network from existing models, and this can be customised.

Here I will explain how to create and train a network on your machine that can tell cats from dogs in images, using data from Kaggle (download the data here). Extract the tar file and arrange the data as described in Step 2.

Step 1: Sign up for the NVIDIA developer network, then download and install NVIDIA DIGITS.

I downloaded DIGITS 5 since I already had CUDA 8 installed. At the time of writing, there was no local installer for DIGITS 6. I will write a follow-up tutorial when I move to DIGITS on Docker, which will happen when I upgrade my machine. Currently I use an old Pentium D desktop paired with a GTX 1050 Ti, running Ubuntu 16.04.3. I am waiting to upgrade to Intel Kaby Lake, which means replacing the motherboard, RAM and CPU. When that happens, I will need a fresh installation of everything (CUDA, cuDNN, Python 3, OpenCV, TensorFlow, Anaconda, Arduino, etc.), so there will be more tutorials to follow.


The download is a deb file. Install it using the commands below.
sudo dpkg -i nv-deep-learning-repo-ubuntu1604-ga-cuda8.0-digits5.0_1-1_amd64.deb
sudo apt-get update
sudo apt-get install digits

This starts DIGITS on a local web server, which you can open in your browser at http://localhost
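If the page does not load, it helps to know whether anything is answering at that address at all. Below is a minimal sketch of such a check using only the Python standard library. The function name server_is_up is my own, and it only confirms that some HTTP server responds at the URL, not that it is DIGITS.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost", timeout=3.0):
    """Return True if an HTTP server answers at `url` without an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False
```

Calling server_is_up() with no arguments checks http://localhost, the address used in this post. Pass a different URL if your installation listens elsewhere.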



Step 2: Create a database for training from the images you have.

There should be two top-level directories, train and validate. Under each of them, there should be one directory per class containing that class's images. Here, cats and dogs are the two classes.
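Arranging thousands of images by hand is tedious, so here is a rough sketch of how the layout above could be built with a script. It rests on a few assumptions of mine: the extracted images sit in one flat folder, the file names start with the class label followed by a dot (for example cat.0.jpg and dog.0.jpg), and 20% of each class goes to validate. The function name arrange_images and its parameters are made up for this example.

```python
import random
import shutil
from pathlib import Path


def arrange_images(src_dir, dst_dir, validate_fraction=0.2, seed=0):
    """Copy images from a flat folder into train/ and validate/ class folders.

    Assumes file names start with the class label and a dot, e.g. cat.0.jpg.
    Returns a dict mapping (split, class_dir) to the number of files copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)

    # Group the images by the label at the start of the file name.
    by_class = {}
    for image in sorted(src.glob("*.jpg")):
        label = image.name.split(".")[0]
        by_class.setdefault(label, []).append(image)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    counts = {}
    for label, images in by_class.items():
        rng.shuffle(images)
        n_validate = int(len(images) * validate_fraction)
        splits = {"validate": images[:n_validate], "train": images[n_validate:]}
        for split, files in splits.items():
            class_dir = label + "s"  # cat -> cats, dog -> dogs
            target = dst / split / class_dir
            target.mkdir(parents=True, exist_ok=True)
            for image in files:
                shutil.copy2(image, target / image.name)
            counts[(split, class_dir)] = len(files)
    return counts
```

The script copies rather than moves the files, so the original download stays untouched if the split needs to be redone with a different fraction.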

On the DIGITS webpage, click Datasets -> Images dropdown -> Classification.
Fill in the paths to the training images and the validation images, give the dataset a name and click Create. DIGITS will parse the images and create a database for feeding them into the neural network model.

NVIDIA DIGITS new dataset creation

Step 3: Create a model and train it, that's all!!

On the DIGITS webpage, click Models -> Images dropdown -> Classification.
Choose AlexNet or GoogLeNet, give the model a name and click Create. This will create your neural network in the Caffe framework and start training it. The parameters under ‘Solver options’ and ‘Data transformations’ can be changed, for example to try different optimizers. The network layers can be edited by clicking ‘Customize’.

NVIDIA DIGITS new model creation

I got an out-of-memory ERROR, but it went away when I recreated the model. If anyone knows the reason, please let me know in the comments. I had increased my system swap space to 30GB, so system memory should not have been the problem. I assume it was the GPU memory, which is limited to 4GB. Maybe it is possible to spill over into system memory, or maybe I have to upgrade my GPU. Anyway, to my surprise, it worked.

NVIDIA DIGITS Error out of memory
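
Swap space only extends system RAM, not the memory on the graphics card, so it is worth checking how much GPU memory is already in use before starting a training job. The sketch below is one way to do that, assuming nvidia-smi is on the PATH. It uses the --query-gpu and --format options of nvidia-smi. The two function names are my own.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' lines (MiB, one line per GPU) into int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for the used and total memory of every GPU, in MiB."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_gpu_memory(output)
```

query_gpu_memory() returns one (used, total) pair per GPU. If most of the memory is already taken by another process before training starts, that could explain an out-of-memory error that disappears on a second attempt, though I have not confirmed this was the cause in my case.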

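The CPU utilisation shoots up to 164%!! If the number comes from a tool like top, which counts each core as 100%, then a value above 100% just means both cores of the Pentium D are busy. I do not know yet whether the CPU is a severe bottleneck here.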

NVIDIA DIGITS cats and dogs training


Step 4: Test the network with a random image.

While the network is training, you can watch the accuracy and loss in the graph. Once the training is over, upload a random image and click on ‘Classify one’ to run inference on it.


And it works!


This is very fast compared to the debugging time I spend when I write the network myself. The whole process took me about 20 minutes from start to output. In the next post I will try to explain how to turn it into a standalone Python program.