Flipkart Object Localization

This is my attempt at the Object Localization problem from the Flipkart GRiD challenge.

Problem Statement

The problem is a class-agnostic object detection problem: only bounding boxes must be predicted, without any class labels. Pre-trained weights are not allowed.
The training data was provided by Flipkart and consisted of 14k images.
https://dare2compete.com/o/Flipkart-GRiD-Teach-The-Machines-2019-74928

Solution

The proposed solution is a CNN with 4 neurons in the output layer, one for each of the 4 output values, i.e. the bounding box coordinates to be determined. The task is therefore solved as a regression problem.
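
As a rough illustration of the regression framing, the sketch below represents a box as four numbers and scores a prediction with mean squared error. The `(x1, x2, y1, y2)` ordering, the normalization by image size, the example values, and the choice of MSE are all assumptions made for illustration; this README does not state the coordinate format or the loss that was actually used.

```python
from typing import List, Sequence


def normalize_box(box: Sequence[float], width: int, height: int) -> List[float]:
    """Scale a pixel-space box (x1, x2, y1, y2) to the [0, 1] range."""
    x1, x2, y1, y2 = box
    return [x1 / width, x2 / width, y1 / height, y2 / height]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over the four box coordinates."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Hypothetical ground-truth box on a 640x480 image (assumed values).
target = normalize_box([160, 480, 120, 360], width=640, height=480)
# Hypothetical network output: one number per output neuron.
pred = [0.30, 0.70, 0.25, 0.75]
loss = mse(pred, target)
```

Normalizing the targets keeps all four outputs on the same scale regardless of image resolution, which tends to make a regression loss easier to optimize.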

[Figure: Given Image and Bounded Box]

Architecture

The architecture of the CNN is based on the popular VGG-16 architecture commonly used for image classification problems. The final layer has been modified to output four bounding box coordinates, and its activation has been changed from softmax to ReLU.
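
To make the modification concrete, the sketch below walks through the layer shapes of a standard VGG-16 with the 1000-class softmax layer swapped for a 4-unit ReLU output. It is a shape calculation only, not the training code from this repository. The 224x224 input size and the two 4096-unit dense layers are assumptions taken from the standard VGG-16 design; the model here may use different values.

```python
# Standard VGG-16 convolutional configuration: numbers are the output channels
# of a 3x3 convolution, "M" is a 2x2 max-pool with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def feature_shape(input_size, cfg=VGG16_CFG):
    """Return (height, width, channels) after the convolutional stack.

    3x3 convolutions with padding 1 keep the spatial size; each pool halves it.
    """
    size, channels = input_size, 3
    for layer in cfg:
        if layer == "M":
            size //= 2
        else:
            channels = layer
    return size, size, channels


def relu(values):
    """ReLU clamps negative outputs to zero, so predicted coordinates are >= 0."""
    return [max(0.0, v) for v in values]


h, w, c = feature_shape(224)        # assumed 224x224 RGB input
flattened = h * w * c               # input width of the first dense layer
head = [flattened, 4096, 4096, 4]   # final layer: 4 units instead of 1000 classes
```

ReLU suits this output because pixel coordinates are never negative, whereas softmax would force the four values to sum to 1.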

Results

The CNN model was trained on Google Colab for 10 epochs. It achieved 79% accuracy on the test set.
The accuracy is somewhat low because of the limited number of images available for training such a model.
More data augmentation would perhaps have helped as well.
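
One augmentation that fits a localization task is a horizontal flip, since the box has to be mirrored together with the image. The sketch below is a minimal, hypothetical example and not code from this repository. It assumes boxes are stored as `(x1, x2, y1, y2)` pixel edges, with `x2` exclusive, and represents the image as a plain nested list.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed order: (x1, x2, y1, y2)


def hflip(image: Sequence[Sequence[int]], box: Box) -> Tuple[List[List[int]], Box]:
    """Mirror an image left-to-right and adjust its bounding box to match."""
    width = len(image[0])
    flipped = [list(reversed(row)) for row in image]
    x1, x2, y1, y2 = box
    # The x edges are reflected and swapped so that x1 <= x2 still holds;
    # the y edges are unchanged by a horizontal flip.
    return flipped, (width - x2, width - x1, y1, y2)


# Toy 2x4 "image" whose box covers the leftmost column.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
flipped_image, flipped_box = hflip(image, (0, 1, 0, 2))
```

After the flip the box covers the rightmost column, which holds the same pixel values the original box enclosed.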