CNN Applications using Example of Dog Breed Classifier

Sushant Agarwal
4 min read · May 26, 2020

Overview

This project aims to identify a dog's breed from a photo of the dog. If the photo instead contains a human face (or an alien face), the application returns the dog breed that most resembles that person.

There are 266 individual dog breeds pictured on the DogTime website. If you are like me, you can identify no more than 10–15 of them.

The project was worked through in the following steps:

  • Import Datasets
  • Detect Humans
  • Detect Dogs
  • Create a CNN to classify Dog Breeds (from scratch)
  • Use a CNN to classify Dog Breeds (using Transfer Learning)
  • Test algorithm
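The final "test algorithm" step ties the detectors and the classifier together. Here is a minimal sketch of that logic in plain Python. The function name `describe_image` and the three callables (`is_dog`, `is_human`, `predict_breed`) are hypothetical placeholders, not names from the project, and the message returned when neither a dog nor a face is found is an assumption, since the behaviour for that case is not described above.

```python
def describe_image(image, is_dog, is_human, predict_breed):
    """Return a short message for one image.

    is_dog, is_human and predict_breed are hypothetical callables that
    stand in for the dog detector, the face detector and the breed
    classifier built in the earlier steps.
    """
    if is_dog(image):
        return f"Dog detected. Predicted breed: {predict_breed(image)}"
    if is_human(image):
        # For a human face we still run the breed classifier and report
        # the breed that the face most resembles.
        return f"Human detected. Most similar breed: {predict_breed(image)}"
    # Assumption: the original text does not say what happens here.
    return "Neither a dog nor a human face was detected."


# Usage with stub detectors, just to show the control flow.
message = describe_image(
    "photo.jpg",
    is_dog=lambda img: False,
    is_human=lambda img: True,
    predict_breed=lambda img: "Beagle",
)
print(message)
```

Passing the detectors in as arguments keeps the control flow easy to test before any real model is trained.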

In this project we are going to use PyTorch to create our classifier.

PyTorch is an open-source machine learning library for Python, based on Torch, used for applications such as computer vision and natural language processing. It is primarily developed by Facebook’s artificial-intelligence research group, and Uber’s “Pyro” probabilistic programming language is built on it.

About Dataset

This project uses two types of data.

The dog data - this contains images of dogs grouped by breed. The dogs in this dataset are classified into 133 categories based on their breed.

The human data - this is a simple set of human images used to test the model.

How can we apply CNNs to image datasets for classification tasks?

Convolutional neural networks (CNNs) are the concept behind many recent breakthroughs and developments in deep learning.

CNNs have broken the mold and ascended the throne to become the state-of-the-art computer vision technique. Their convolutional layers slide small learned filters over an image to pick out features such as edges and textures, and the later layers weight those features to decide which class the image belongs to.
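To make this concrete, here is a minimal sketch of a small CNN in PyTorch with 133 outputs, one per breed in the dog dataset. The class name `SmallDogNet`, the layer sizes, and the 224x224 input size are assumptions chosen for illustration; this is not the architecture used in the project.

```python
import torch
from torch import nn


class SmallDogNet(nn.Module):
    """Illustrative CNN: three conv blocks followed by a linear classifier."""

    def __init__(self, num_classes: int = 133):
        super().__init__()
        # Each block: convolution (learns filters) -> ReLU -> 2x2 max pooling.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Average each of the 64 feature maps down to a single value.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Map the 64 pooled features to one score per breed.
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)


model = SmallDogNet()
batch = torch.randn(4, 3, 224, 224)  # four random stand-in RGB images
logits = model(batch)
print(logits.shape)  # one row of 133 breed scores per image
```

The scores are raw logits; during training they would typically be passed to a loss such as `nn.CrossEntropyLoss`.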

Tasks that can be done using CNNs:

  • Classifying hand-written digits from the MNIST dataset
  • Identifying images from the CIFAR-10 dataset
  • Categorizing images from the ImageNet dataset
  • And, not to forget, detecting dog breeds

Can pre-trained models be used for these CNN training tasks?

In computer vision, transfer learning is usually expressed through the use of pre-trained models. A pre-trained model is a model that was trained on a large benchmark dataset to solve a problem similar to the one that we want to solve. Actually, several state-of-the-art results in image classification are based on transfer learning solutions (Krizhevsky et al. 2012, Simonyan & Zisserman 2014, He et al. 2016). A comprehensive review on transfer learning is provided by Pan & Yang (2010).

Sometimes a CNN can be as deep as 152 layers, as in the case of the ResNet-152 network. The deeper the network, the more trainable parameters it has, and the more training time and resources it needs. Using transfer learning, we can save a considerable amount of time and resources.
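A rough way to see how depth drives the parameter count is to stack identical convolutional layers and count the weights. The helpers `conv_stack` and `count_parameters` below are hypothetical names written for this sketch, and the toy stack is not a ResNet; it only illustrates the trend.

```python
from torch import nn


def conv_stack(depth: int, channels: int = 64) -> nn.Sequential:
    """Build `depth` 3x3 conv layers; the first maps 3 input channels to `channels`."""
    layers = [nn.Conv2d(3, channels, kernel_size=3, padding=1), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU()]
    return nn.Sequential(*layers)


def count_parameters(model: nn.Module) -> int:
    """Total number of weights and biases in the model."""
    return sum(p.numel() for p in model.parameters())


for depth in (2, 8, 32):
    print(depth, count_parameters(conv_stack(depth)))
```

Each extra 64-channel layer adds 64 * 64 * 9 weights plus 64 biases, so the count grows linearly with depth in this toy setup.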

There are many pre-trained models available, such as:

  • VGG16
  • VGG19
  • ResNet50
  • InceptionV3
  • Xception
  • Inception-ResNetV2

What is the difference between retraining the complete model and using a pre-trained model to initialize the weights?

The above image helps visualize a few possibilities when training a model.

The image on the right (Strategy 3) represents a model in which the convolutional base of a trained model is frozen, and the bottleneck features obtained from it are used to train the layers that follow. This is how a pre-trained model is typically used.

Strategy 2, in the middle, represents a model where we freeze some of the layers and retrain the remaining ones; the resulting model is then used for prediction.

Finally, Strategy 1 on the left represents training a model from scratch. This can be a custom model or one of the standard architectures, trained completely on the new dataset.

In the pre-train and fine-tune paradigm, model training starts with some learned weights that come from a pre-trained model. This has become more standard, especially when it comes to vision-related tasks like object detection and image segmentation.
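In PyTorch, the three strategies differ mainly in which parameters have `requires_grad` switched off. The sketch below uses a tiny, randomly initialised stand-in for the convolutional base so that it runs without downloading anything; in practice the base would come from a pre-trained network such as those listed earlier. The names `make_backbone`, `build_model`, and `trainable_params` are hypothetical, and freezing only the first conv layer for Strategy 2 is an arbitrary choice for illustration.

```python
import torch
from torch import nn


def make_backbone() -> nn.Sequential:
    """Stand-in for a pre-trained convolutional base (random weights here)."""
    return nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )


def build_model(strategy: int, num_classes: int = 133) -> nn.Sequential:
    base = make_backbone()
    head = nn.Linear(32, num_classes)  # new classifier for the dog breeds
    if strategy == 3:
        # Freeze the whole convolutional base; only the head is trained.
        for p in base.parameters():
            p.requires_grad = False
    elif strategy == 2:
        # Freeze only the first conv layer; retrain everything after it.
        for p in base[0].parameters():
            p.requires_grad = False
    elif strategy != 1:
        raise ValueError("strategy must be 1, 2 or 3")
    # Strategy 1 leaves every parameter trainable.
    return nn.Sequential(base, head)


def trainable_params(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


for s in (1, 2, 3):
    model = build_model(s)
    # Hand only the trainable parameters to the optimizer.
    optimizer = torch.optim.SGD(
        [p for p in model.parameters() if p.requires_grad], lr=0.01
    )
    print(f"Strategy {s}: {trainable_params(model)} trainable parameters")
```

Fewer trainable parameters means fewer gradients to compute and store, which is where the time and resource savings come from.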

Comparison

  • Training time: Strategy 1 > Strategy 2 > Strategy 3
  • The computation power required in Strategy 1 may be far greater than in Strategy 3.

Below is a link to a GitHub repository that uses a dog breed dataset. All three strategies are implemented in it.

Link to the GitHub repository: Dog Breed Classifier

Follow me at

Linkedin : https://www.linkedin.com/in/sushantag9/

Twitter : @sushantag9

If you like the post, please share your comments & a 👏
