Chihuahua or Muffin?


Detecting Chihuahuas vs Muffins

A while ago I stumbled upon this funny post by Mariya Yao, which discusses a series of memes that pair completely unrelated objects that closely resemble each other. The post then compares several machine learning classification APIs to see which one predicts best.

Based on this, I wondered how well a raw ImageNet-pretrained network would do. I used the pretrained resnet34 provided by PyTorch, through the fastai library.
According to this, the model has a top-1 accuracy of 73.3% on ImageNet. I will only test it on the edge cases studied by Mariya.

I also wanted to show how little code is needed to do this with the fastai library. It is extraordinary how fast it can be! I then used the library to further train a model on a small set of muffin and chihuahua images using transfer learning.

This experiment was run with fastai 1.0.38 and PyTorch 1.0.

Setup

I start by importing the fastai libraries and adding some setup code so that plots look nicer on OS X.

In [1]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2

from fastai import *
from fastai.vision import *
In [2]:
# Use in case of retina display
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('retina')
In [3]:
path = Path('/home/duguet/data/chihuahua-muffin')

Here, I set up my data folder. The easiest layout for me is one subfolder per class. This is the structure:

In [4]:
path_string = str(path)
!tree -d $path_string
/home/duguet/data/chihuahua-muffin
├── models
├── other
├── test
│   ├── chihuahua
│   └── muffin
└── train
    ├── chihuahua
    └── muffin

8 directories

I created a folder train, which we do not need yet for predicting, and a folder test, which holds the images Mariya used for testing these algorithms. Both folders contain the subfolders chihuahua and muffin, one per class.
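To make the folder-per-class convention concrete, here is a minimal sketch in plain Python that turns such a layout into (file, label) pairs, taking the label from the parent folder name. The helper `list_labeled_images` and the demo file names are hypothetical and only serve the illustration; fastai has its own loaders for this layout.

```python
from pathlib import Path
import tempfile

def list_labeled_images(root):
    """Return sorted (relative_path, label) pairs; the label is the parent folder name."""
    root = Path(root)
    return sorted(
        (p.relative_to(root).as_posix(), p.parent.name)
        for p in root.glob('*/*')
        if p.is_file()
    )

# Demo on a throwaway folder that mimics the test layout (file names are made up)
with tempfile.TemporaryDirectory() as tmp:
    for label, name in [('chihuahua', 'a.png'), ('muffin', 'b.png'), ('muffin', 'c.png')]:
        folder = Path(tmp) / label
        folder.mkdir(exist_ok=True)
        (folder / name).touch()
    labeled = list_labeled_images(tmp)

for rel_path, label in labeled:
    print(rel_path, '->', label)
```

Pointing the same helper at the real test folder should list each image next to its class, assuming the structure shown in the tree output.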

In [5]:
testset = path/'test'; trainset=path/'train'

In short, I downloaded the 16 images used for testing and put them in the test folder (the testset path above), eight per class.

In [6]:
!tree $path_string/test
/home/duguet/data/chihuahua-muffin/test
├── chihuahua
│   ├── test12.png
│   ├── test14.png
│   ├── test15.png
│   ├── test2.png
│   ├── test4.png
│   ├── test6.png
│   ├── test7.png
│   └── test9.png
└── muffin
    ├── test1.png
    ├── test10.png
    ├── test11.png
    ├── test13.png
    ├── test16.png
    ├── test3.png
    ├── test5.png
    └── test8.png

2 directories, 16 files

Test ImageNet Pretrained Network

The full ImageNet dataset covers more than 20k categories, but the ILSVRC competition, for years the biggest image classification competition, used only 1000 synsets (or classes). The model available in PyTorch (and inherited by the fastai library) is trained on these 1000 classes.

If you want to predict with the pretrained model, there are two ways to do it: the PyTorch way and the fastai way.

The PyTorch way

You need to download a file with the list of the classes of the model. You can find one here.

In [7]:
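import ast

# The file holds a Python dict literal (class id -> label); literal_eval parses it without running arbitrary code
with open('/home/duguet/data/imagenet/imagenet1000_clsid_to_human.txt', 'r') as f:
    imagenet_classes = ast.literal_eval(f.read())
#imagenet_classes = pickle.load(open('/home/duguet/data/imagenet/classes.pkl', 'rb'))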

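To use a model for prediction, it needs to be set to eval mode, so that layers such as batch normalization behave deterministically rather than as they do during training.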

In [8]:
imagenet_model = models.resnet34(pretrained=True).eval()
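As an illustration of what switching modes changes, here is a toy sketch in plain Python. `ToyDropout` is a hypothetical stand-in, not PyTorch's `nn.Dropout`, and dropout is used only because it is the simplest layer to show; in resnet34 the layers affected are the batch normalization ones. Its `eval()` returns the object itself, which mirrors why the call above can be chained.

```python
import random

class ToyDropout:
    """Toy stand-in for a dropout layer (hypothetical, not PyTorch's nn.Dropout)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True              # layers start in training mode
        self._rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self                       # returning self allows chaining

    def __call__(self, xs):
        if not self.training:
            return list(xs)               # eval mode: inputs pass through unchanged
        keep = 1.0 - self.p
        # training mode: zero each value with probability p, rescale the survivors
        return [x / keep if self._rng.random() < keep else 0.0 for x in xs]

layer = ToyDropout(p=0.5)
x = [1.0, 2.0, 3.0, 4.0]
train_out = layer(x)           # random: some entries zeroed, the rest doubled
eval_out = layer.eval()(x)     # deterministic: identical to the input
print(train_out, eval_out)
```

In training mode the output changes from call to call, which is exactly what you do not want when comparing predictions across images.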

We also need to find the class numbers corresponding to chihuahua and muffin.

In [9]:
print(list(imagenet_classes.values()).index('Chihuahua'))
print(list(imagenet_classes.values()).index('muffin'))
151
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-603ad247df74> in <module>
      1 print(list(imagenet_classes.values()).index('Chihuahua'))
----> 2 print(list(imagenet_classes.values()).index('muffin'))

ValueError: 'muffin' is not in list

Unfortunately, muffin is not among the 1000 classes that the pretrained network provides. On the ImageNet website, muffin appears to be in the misc subset. So for the muffin images, the model will return the closest thing it knows. I’m curious to see what it is.
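The `ValueError` comes from `list.index`, which raises when the label is missing. It also needs an exact match against the whole label string, while many entries list several comma-separated synonyms. A small helper can return an empty list instead. This is only a sketch, assuming the mapping is a plain dict of class id to label as loaded earlier; `find_class_ids` and `classes_excerpt` are hypothetical names, and the excerpt holds just three entries for illustration.

```python
# Tiny excerpt of the id -> label mapping, for illustration only
classes_excerpt = {
    0: 'tench, Tinca tinca',
    1: 'goldfish, Carassius auratus',
    151: 'Chihuahua',
}

def find_class_ids(classes, term):
    """Return the ids whose label lists `term` among its comma-separated names (case-insensitive)."""
    term = term.lower()
    return [
        idx for idx, label in classes.items()
        if term in (name.strip().lower() for name in label.split(','))
    ]

chihuahua_ids = find_class_ids(classes_excerpt, 'chihuahua')   # [151]
muffin_ids = find_class_ids(classes_excerpt, 'muffin')         # [] rather than a ValueError
print(chihuahua_ids, muffin_ids)
```

Called on the full `imagenet_classes` dict, the same function would confirm whether any label mentions muffin without interrupting the notebook.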

We go over all the files in the test set (both subfolders) and print out their image and predicted class.

In [10]:
import glob
In [11]:
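# Evaluate the old PyTorch way
testlist = glob.glob(str(testset/'*/*'))

fig,ax = plt.subplots(4,4,figsize=(15,10))

for i,ax_ in enumerate(ax.flatten()):
    try:
        img_file = testlist[i]
        img = open_image(img_file).resize(224)
        # Note: pixels stay in [0, 1] as loaded; no ImageNet mean/std normalization is applied
        batch = torch.stack([img.data])  # add the batch dimension
        with torch.no_grad():            # inference only, so skip gradient tracking
            out = imagenet_model(batch)
        solution = to_np(out)
        idx = solution.argmax()          # index of the highest-scoring class
        img.show(ax=ax_, title=imagenet_classes[idx])
#         print(f'file: {Path(img_file).name}. Detected {imagenet_classes[idx]}, {solution.max()}')
    except Exception as e:
        # Report the failure instead of silently skipping the image
        print(f'Skipping subplot {i}: {e!r}')
        continue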
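The loop keeps only the single best class for each image. Since muffin is not among the classes, the runners-up are informative too. The sketch below, in plain Python, turns raw scores into probabilities with a softmax and returns the top k labels. `softmax`, `top_k`, and the toy scores and labels are all made up for illustration and are not model output.

```python
import math

def softmax(scores):
    """Convert raw scores (logits) to probabilities; subtracting the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(scores, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(labels[i], probs[i]) for i in ranked[:k]]

# Made-up scores for four made-up labels, purely to exercise the functions
toy_labels = {0: 'class_a', 1: 'class_b', 2: 'class_c', 3: 'class_d'}
toy_scores = [1.0, 3.0, 0.5, 2.0]

best = top_k(toy_scores, toy_labels, k=2)
for label, prob in best:
    print(f'{label}: {prob:.2f}')
```

With the real model, one would presumably pass something like `to_np(out)[0].tolist()` as the scores and `imagenet_classes` as the labels.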