Verifying and Validating College ID Images

2020, Dec 9    

Akansh Maurya Amith P Sudhanshu Dubey
Mentor Name: Prof. Kavi Arya, Omkar Manjrekar, Andrea F
Duration of Internship: 08/05/2020−27/06/2020

** I cannot share much about the pipeline and procedure because our paper is under review.

Cover page
Figure 1: Cover page

Abstract

E-yantra organizes various competitions and MOOCs for students across India. The staff spends a significant amount of time validating ID cards uploaded by eYRC, eYIC and MOOCs participants. Through this project we present an efficient system to verify and validate college ID cards. Our approach is to identify valid college ID cards from a pool of images which may contain other photos too. This is done by a deep image classification model. To extract the textual information from the image we performed text detection and text recognition. We explored different models, evaluated their speed and accuracy for our dataset. Validation of ID card, matching of results of OCR to the ground truth, is done by a custom algorithm which uses partial string matching and fuzzy-set. Also to improve accuracy of our system we have performed some Image processing techniques of which one was to correct the orientation of Image automatically before passing through our classification and OCR model.

Problem Definition
Figure 2 :Problem Definition

Overview of pipeline

Pipeline
Figure 3: Pipeline

The pipeline for this project consists of four-part. A classification model, that can verify the correct College ID card image from the pool of images. Before passing the Images to the text detection and recognition model, we corrected the image orientation by passing through the Rotate-Net model. The last stage of the pipeline was to match the OCR results from the ground truth to validate an ID card, this was done by the custom algorithm with fuzzy string matching. In the subsequent sections, we have compared different deep learning models, evaluated them on our dataset, and selected the appropriate to achieve maximum efficiency.

Data exploration

To build the model pipeline and define the problem statement more transparently it was important to introspect the image data we have and measure the variation. As a dataset we started with around 2000 images of which 1800 were valid college ID cards. The rest 200 images consist of aadhaar cards, pan cards, passport size photo and other random images. Most of the images in our dataset were clicked by mobile camera, thus some images were blur, low resolution, presence of background noise and randomly oriented. Many of them were end to end cropped, but it was estimated that only around 2-3 percent of images have background noise. It was estimated that around 8 percent of images were oriented at different angle

Data Distribution
Figure 4: Data Distribution

Classification Model

Every college has a unique ID card format. Some of the college ID cards were of shape, height ratio width, 2:5, others were 5:2. Each college ID card was having a different color, different logos, in some information was typed others were handwritten. This variation of image data made our classification problem more complex.
However there were few features that humans can identify, every ID card was rectangular in shape, had a passport size photo of the candidate present inside the ID card, also some textual information like Name, Roll No., College/University Name was present. Now our task was to segregate these images. On the other side, non-ID cards have some pattern associated with them, aadhaar cards have a lot of similar features like the presence of Indian emblem, Indian flag colored strips, QR code, etc. So was the case for PAN cards.
So we made our classification model with four classes, College ID cards, Aadhaar Cards, PAN cards, and others. Others include those images which rarely occur in the dataset, mainly selfies, backgrounds, photos of faces, etc.

Classification Task
Figure 5: Classification Task

Rotate-Net

We have seen that around 10 percent of images in our dataset were in the wrong orientation, specifically for our case they were orientated. ResNet model requires images to be of size 224X224X3. We resized images and also normalized them according to the mean and standard deviation of ImageNet Data set. We trained our model to thirty epochs using Adam optimizer with learning rate of 0.0001 and categorical cross-entropy loss function. With this we got accuracy of 91 percent on our validation set. To improve our accuracy we made few changes in our dataset.Every ID card has a passport size photo of the student. If we try to correct the orientation of faces present in the ID card, eventually we will correct the ID card orientation. So in our dataset we introduced the images of faces available in the open-source data repository. Our hypothesis was that extracting features of faces is easier than the text data present.

Rotate-Net
Figure 6: Rotate-Net

Text detection and Recognition

Comparison between OCR
Figure 7: Comparison between OCR

Results of OCR

Text detection and Recognition
Figure 8: Text detection and Recognition

Fuzzy String Matching

The word fuzzy means something that is difficult to understand and explain precisely. In the context of string matching, fuzzy string matching implies imprecise or approximate string matching. Thus, rather than a strict true or false, fuzzy string matching would tell how similar are the two given strings.
In this project, we have employed fuzzy string matching to match the text recognised from the ID card to the information given by students while registering. The reasons for using fuzzy string matching rather than an exact string matching are as follow:
The text recognition model isn't 100% reliable and so it may make mistakes. An exact string matching would fail in this scenario, while the fuzzy one would give some score of similarity

  • Students may enter information which is logically the same as that on their ID card but is not literally the same. As an example, students often interchange "Engineering" with "Engg." and vice versa. They are not the exact same string but have the same meaning.
  • Thus, using fuzzy string matching makes this ID card validation robust and close to the real world.

Fuzzy Set Matching Results
Figure 9: Fuzzy Set Matching Results

Credits:

Credits