A detailed tutorial for TensorFlow speech recognition is here. I am going through the initial setup steps it does not mention, and the issues I faced along the way.
Step 1: Download tensorflow source from git
git clone https://github.com/tensorflow/tensorflow.git
This will download the TensorFlow source tree into the directory where the command is executed.
Step 2: Training. The training script is located in tensorflow/examples/speech_commands. Pass the switch --data_url= with an empty value to stop it from downloading the default speech data from TensorFlow. The path to the training data can be set in this script. TensorBoard can be opened with the command ‘tensorboard --logdir /tmp/logs’. Go to the URL that is printed after executing the command.
python tensorflow/examples/speech_commands/train.py --data_url=
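Before training on your own data it helps to check that the data folder looks the way the script expects. The sketch below assumes the layout implied by the sample path used in the inference step (/tmp/speech_dataset/left/a5d485dc_nohash_0.wav), that is, one sub-folder per word with the .wav clips inside it. The function name count_clips_per_label is my own, not part of TensorFlow.

```python
from pathlib import Path


def count_clips_per_label(data_dir):
    """Return {label: number of .wav files} for each sub-folder of data_dir.

    Assumes one sub-folder per spoken word, e.g. data_dir/left/*.wav.
    """
    counts = {}
    for label_dir in sorted(Path(data_dir).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = len(list(label_dir.glob("*.wav")))
    return counts


if __name__ == "__main__":
    # /tmp/speech_dataset is the dataset location used later in this post;
    # change it to wherever your training data lives.
    data_dir = Path("/tmp/speech_dataset")
    if data_dir.is_dir():
        for label, n in count_clips_per_label(data_dir).items():
            print(f"{label}: {n} clips")
```

A label with very few clips, or a folder with zero .wav files, is worth fixing before spending an hour or more on training.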
Step 3: Create a frozen graph after the training ends. Training took about 1.5 hours on a GTX 1050 Ti GPU.
python tensorflow/examples/speech_commands/freeze.py \
  --start_checkpoint=/tmp/speech_commands_train/conv.ckpt-18000 \
  --output_file=/tmp/my_frozen_graph.pb
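The step number in the checkpoint name (18000 here) depends on how long the training ran, so it is easy to pass a checkpoint that does not exist. Below is a small sketch that picks the checkpoint with the highest step from the training folder. It assumes each checkpoint is saved with a companion file ending in .index (for example conv.ckpt-18000.index); if your folder looks different, adjust the pattern. The helper name latest_checkpoint_prefix is hypothetical.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(train_dir, model_name="conv"):
    """Return the checkpoint prefix with the highest step, or None if none found.

    Assumption: every checkpoint has a '<model_name>.ckpt-<step>.index' file.
    """
    pattern = re.compile(rf"^{re.escape(model_name)}\.ckpt-(\d+)\.index$")
    best_step, best_prefix = -1, None
    for path in Path(train_dir).glob(f"{model_name}.ckpt-*.index"):
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # Drop the ".index" suffix to get the value for --start_checkpoint.
            best_prefix = str(path)[: -len(".index")]
    return best_prefix


if __name__ == "__main__":
    train_dir = Path("/tmp/speech_commands_train")
    if train_dir.is_dir():
        print(latest_checkpoint_prefix(train_dir))
```

The printed value can be passed directly as --start_checkpoint to freeze.py.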
Step 4: Inference
python tensorflow/examples/speech_commands/label_wav.py \
  --graph=/tmp/my_frozen_graph.pb \
  --labels=/tmp/speech_commands_train/conv_labels.txt \
  --wav=/tmp/speech_dataset/left/a5d485dc_nohash_0.wav
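If you run inference on your own recording rather than a clip from the dataset, the file should be in the same format the model was trained on. The sketch below uses Python's standard wave module to inspect a file. The expected values (mono, 16 kHz, 16-bit, at most one second) are my assumption about the default training settings, so change them if you trained with different parameters.

```python
import os
import wave


def describe_wav(path):
    """Read basic format information from a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_training_format(info, sample_rate=16000, max_duration_s=1.0):
    """Check the clip against the assumed training format (see note above)."""
    return (
        info["channels"] == 1
        and info["sample_rate"] == sample_rate
        and info["bits_per_sample"] == 16
        and info["duration_s"] <= max_duration_s
    )


if __name__ == "__main__":
    clip = "/tmp/speech_dataset/left/a5d485dc_nohash_0.wav"
    if os.path.exists(clip):
        info = describe_wav(clip)
        print(info, "OK" if matches_training_format(info) else "format differs")
```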
The short voice samples are converted to spectrogram images before processing, so a CNN can be trained on them as it would be on any other image. To create a spectrogram using the provided tool, go to the tensorflow folder which contains the ‘configure’ script and run,
This will start building the TensorFlow source code. Once this is done, use the following command to create a spectrogram image for a wav file. Make sure to give absolute paths; most of the errors I encountered were caused by mismatched paths.
bazel run tensorflow/examples/wav_to_spectrogram:wav_to_spectrogram -- \
  --input_wav=/tensorflow/core/kernels/spectrogram_test_data/short_test_segment.wav \
  --output_image=/tensorflow/tmp/spectrogram.png
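Since mismatched paths were my most common source of errors, here is a small sketch that builds the same command from Python and converts both paths to absolute ones first. It only prints the command; the helper name spectrogram_command is my own, and the relative paths in the example are just placeholders.

```python
import os
import shlex


def spectrogram_command(input_wav, output_image):
    """Build the bazel command line with both paths made absolute."""
    input_wav = os.path.abspath(input_wav)
    output_image = os.path.abspath(output_image)
    return [
        "bazel", "run",
        "tensorflow/examples/wav_to_spectrogram:wav_to_spectrogram",
        "--",
        f"--input_wav={input_wav}",      # no space after '='
        f"--output_image={output_image}",
    ]


cmd = spectrogram_command(
    "tensorflow/core/kernels/spectrogram_test_data/short_test_segment.wav",
    "tmp/spectrogram.png",
)
# Print a copy-pasteable command; run it from the tensorflow source folder.
print(" ".join(shlex.quote(part) for part in cmd))
```

To execute it directly instead of printing, the list can be handed to subprocess.run(cmd, check=True) from the tensorflow source folder.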
If Bazel is not installed, the bazel run command throws an error saying bazel was not found. Bazel is a build tool like Ant or Maven, and it is used to build TensorFlow. It can be installed with the following commands.
sudo apt-get install openjdk-8-jdk
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel
sudo apt-get upgrade bazel
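To confirm that the install worked before starting a long build, you can check whether the tool is on your PATH. This is a minimal sketch using the standard shutil.which; the function name missing_tools is my own.

```python
import shutil


def missing_tools(names):
    """Return the names from the list that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["bazel"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("bazel found at", shutil.which("bazel"))
```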
This is the spectrogram output.
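To make the idea of a spectrogram concrete, here is a rough pure-Python sketch of what such an image contains: the audio is cut into short overlapping windows, and for each window the strength of every frequency is computed. This is not the code used by the TensorFlow tool; the window size, stride and the plain DFT are choices I made to keep the example short.

```python
import cmath
import math


def spectrogram(samples, window_size=64, stride=32):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    # Hann window to soften the edges of each slice.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
        for n in range(window_size)
    ]
    frames = []
    for start in range(0, len(samples) - window_size + 1, stride):
        chunk = [samples[start + n] * window[n] for n in range(window_size)]
        frame = []
        # Naive DFT over the non-negative frequency bins 0..window_size/2.
        for k in range(window_size // 2 + 1):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                for n in range(window_size)
            )
            frame.append(abs(acc))
        frames.append(frame)
    return frames


# A synthetic 200 Hz tone sampled at 1600 Hz (values chosen for illustration).
sample_rate = 1600
tone = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(256)]
frames = spectrogram(tone)
peak_bin = frames[0].index(max(frames[0]))
print("frames:", len(frames), "peak frequency:", peak_bin * sample_rate / 64, "Hz")
```

Each frame becomes one column (or row) of pixels in the image, with brightness given by the magnitude, which is why an image classifier such as a CNN can work on it.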
The GitHub page for the TensorFlow Object Detection API is here.
To use Tensorflow Object detection API,
Step 1: Clone the tensorflow model tree to your PC.
git clone https://github.com/tensorflow/models.git
Step 2: Go to the research folder, install the dependencies, compile the protobuf files and export PYTHONPATH,
or follow the detailed steps here
cd models/research
sudo apt-get install protobuf-compiler python-pil python-lxml
sudo pip install jupyter
sudo pip install matplotlib
sudo pip install pillow
sudo pip install lxml
# From models/research/
protoc object_detection/protos/*.proto --python_out=.
# From models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
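A quick way to see whether the PYTHONPATH export took effect is to ask Python if it can locate the packages, without importing them fully. The sketch below assumes that models/research provides a package named object_detection and that the slim folder provides one named nets; treat both names as assumptions and adjust them to what you see in your checkout. Run it in the same terminal where you exported PYTHONPATH.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the module on the current path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing.
        return False


# Package names assumed from the repository layout (see note above).
for name in ("object_detection", "nets"):
    status = "found" if is_importable(name) else "NOT found, check PYTHONPATH"
    print(f"{name}: {status}")
```

Note that the export only lasts for the current terminal session, so it has to be repeated in every new terminal.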
Step 3: Open the default IPython notebook that comes with the Object Detection API
cd models/research/object_detection
jupyter notebook object_detection_tutorial.ipynb