Pan Card Fraud Detection Using Computer Vision (updated February 2024)

Hold on! First we need to understand what SSIM is.

What is SSIM?

The Structural Similarity Index (SSIM) is a perceptual metric that quantifies the image quality degradation that is caused by processing such as data compression or by losses in data transmission.

How does SSIM work?

SSIM is a full-reference metric: it requires two images of the same shot, that is, two images that look graphically identical to the human eye. The second image is generally compressed or otherwise of a different quality, and measuring that difference is the goal of this index.

What is the real-world use of SSIM?

SSIM is mostly used in the video industry, but it also has strong applications in photography.

How does SSIM help in detection?

SSIM actually measures the perceptual difference between two similar images. It cannot judge which of the two is better: that must be inferred from knowing which is the original one and which has been exposed to additional processing such as compression or filters.
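To make the metric concrete, here is a minimal sketch of the SSIM formula evaluated once over the whole image. It is a simplification: library implementations such as scikit-image's structural_similarity evaluate the same expression over small sliding windows and average the results, so their scores will differ from this one. The helper name global_ssim and the synthetic images are illustrative assumptions; the constants 0.01 and 0.03 are the usual defaults for the two stabilising terms.

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over the whole image (simplified, hypothetical helper)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilises the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Synthetic stand-ins for the grayscale card images (not real data)
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
tampered = original.copy()
tampered[16:48, 16:48] = 0  # blank out a region to simulate tampering

score_same = global_ssim(original, original)
score_tampered = global_ssim(original, tampered)
print(score_same, score_tampered)
```

An image compared with itself scores 1, and any edit pulls the score below 1, which is the property the tampering check relies on.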

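# Compute the Structural Similarity Index (SSIM) between the two images,
# ensuring that the difference image is returned
from skimage.metrics import structural_similarity

(score, diff) = structural_similarity(original_gray, tampered_gray, full=True)
diff = (diff * 255).astype("uint8")
print("SSIM Score is : {}".format(score * 100))

# NOTE: the comparison line is missing from the source; the 0.9 cut-off is an assumed placeholder
SSIM_THRESHOLD = 0.9
if score >= SSIM_THRESHOLD:
    print("The given pan card is original")
else:
    print("The given pan card is tampered")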

Output :

SSIM Score is : 31.678790332739425 The given pan card is tampered

Let’s break down what just happened in the above code!

The structural similarity index helps us determine exactly where, in terms of (x, y) coordinates, the image differences are. Here, we are trying to find similarities between the original and the tampered image.

The lower the SSIM score, the lower the similarity; in other words, a higher SSIM score means the two images are more alike.

Generally, SSIM values of 0.97, 0.98, or 0.99 indicate good-quality reconstruction techniques.

Experience real-time threshold and contours on images

A contour can be explained simply as a curve joining all the continuous points along a boundary that have the same color or intensity. The contour detection algorithm finds the edges in an image and also arranges them in a hierarchy.

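# Calculating threshold and contours
# NOTE: the line that creates `thresh` is missing from the source; the inverse-binary
# Otsu threshold below is an assumption consistent with the threshold image shown later
thresh = cv2.threshold(diff, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)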

Here we use the threshold function of OpenCV (cv2), which applies a threshold to the image stored as an array. This function transforms the grayscale image into a binary image using a mathematical formula.
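The thresholding step can be sketched in plain NumPy. This is an illustration of an inverse-binary rule (pixels at or below the cut-off become 255, the rest become 0), not the article's exact call; the tiny diff array and the hand-picked cut-off of 128 are assumptions made for the example.

```python
import numpy as np

# Hypothetical 4x4 difference map scaled to 0..255:
# bright values mean "similar", dark values mean "different".
diff = np.array([[250, 252, 249, 251],
                 [248,  40,  35, 250],
                 [251,  30,  45, 249],
                 [250, 251, 252, 250]], dtype=np.uint8)

threshold_value = 128  # assumed cut-off, chosen by hand for this sketch

# Inverse binary threshold: dark (different) pixels -> 255, everything else -> 0
thresh = np.where(diff <= threshold_value, 255, 0).astype(np.uint8)
print(thresh)
```

After this step the differing region is white on a black background, which is the form a contour finder expects.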

findContours works on a binary image and retrieves the contours. These contours are a useful tool for shape analysis and recognition. grab_contours then extracts the list of contours from the value returned by findContours, whose layout differs between OpenCV versions.

Creating bounding boxes (contours)

# loop over the contours
for c in cnts:
    # applying contours on image
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(original, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.rectangle(tampered, (x, y), (x + w, y + h), (0, 0, 255), 2)

The bounding rectangle gives the position, width, and height of the smallest upright box around a contour, from which the width-to-height ratio of the object can be found. We compute the bounding box of each contour and then draw it on both input images to show where the two images differ.
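As a rough illustration of what a bounding rectangle is, the sketch below finds the smallest upright box around the non-zero pixels of a binary mask using NumPy only. It is a stand-in for the idea, not OpenCV's implementation: it handles a single region, and the helper name bounding_rect and the toy mask are assumptions.

```python
import numpy as np

def bounding_rect(mask):
    """Return (x, y, w, h) of the smallest upright box around all non-zero pixels."""
    ys, xs = np.nonzero(mask)
    x, y = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x + 1
    h = int(ys.max()) - y + 1
    return x, y, w, h

# Toy binary mask with one white region (rows 2..4, columns 3..6)
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 255

rect = bounding_rect(mask)
x, y, w, h = rect
aspect_ratio = w / h  # width-to-height ratio of the region
print(rect, aspect_ratio)
```

The (x, y, w, h) order matches the tuple unpacked in the loop above, so the same corner arithmetic, (x, y) to (x + w, y + h), applies when drawing the box.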

#Display original image with contour
print('Original Format Image')
original_contour = Image.fromarray(original)
original_contour.save("pan_card_tampering/image/original_contour_image.png")
original_contour

Output :

Original format Image

Inference :

Here in the above output, you can see that the original image is shown with the contours (bounding boxes) on it using fromarray() function.

Also, you can simply save the image using the save() function (Optional).

#Display tampered image with contour
print('Tampered Image')
tampered_contour = Image.fromarray(tampered)
tampered_contour.save("pan_card_tampering/image/tampered_contours_image.png")
tampered_contour

Output :

Tampered Image

Inference: The same goes for the tampered image, but one can notice that some of the contours are missing in the tampered image.

Here’s the illustration of the above result:

# Display difference image with black
print('Different Image')
difference_image = Image.fromarray(diff)
difference_image.save("pan_card_tampering/image/difference_image.png")
difference_image

Output :

Different Image

Inference :

Here is another way to show the contours, in terms of a heated threshold: we separate the heated zone (the text/image zone) from the normal zone (without text/image).

The heated zone, i.e. the zone that has text or images, is shown as the dark (black) region, and the rest as a light (nearly white) zone.

#Display threshold image with white
print('Threshold Image')
threshold_image = Image.fromarray(thresh)
threshold_image.save("pan_card_tampering/image/threshold_image.png")
threshold_image

Output :

Threshold Image

Inference: Everything here is the same; the only change is that the colors are swapped. Here, white shows the heated zone and black shows the normal zone.


Finding the structural similarity of the images helped us find the differences and similarities in their shape.

Similarly, finding the threshold, and the contours based on that threshold, for the images converted to grayscale and then to binary also helped us in shape analysis and recognition.

As our SSIM score is ~31.7%, we can say that the image the user provided is fake or tampered with.

Finally, we visualized the differences and similarities between the images by displaying them with contours, the difference image, and the threshold image.


This project can be used in different organizations where customers or users need to provide any kind of id in order to get themselves verified. The organization can use this project to find out whether the ID is original or fake. Similarly, this can be used for any type of ID like Aadhar, voter id, etc.


Thank you for reading my article 🙂

I hope you have enjoyed the practical implementation and line-by-line explanation of PAN card fraud detection!

I’m providing the code link here so that you can also learn from and contribute to this project to make it even better.

Don’t miss my previous article, “Drug discovery using machine learning”, published in Analytics Vidhya’s Medium publication. Refer to this link

If you have any queries, you can connect with me on LinkedIn; refer to this link

About me

Greetings, everyone. I’m currently working as a Data Science Associate Analyst at Zorba Consulting India. Alongside my part-time work, I have an immense interest in the same field, i.e. Data Science, and in related areas of Artificial Intelligence such as Computer Vision, Machine Learning, and Deep Learning. Feel free to collaborate with me on any project in these domains (LinkedIn).

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.


Top 10 Computer Vision Start

Computer vision is an interdisciplinary field that studies how computers can gain a high-level understanding of digital pictures or videos. From an engineering standpoint, it tries to automate operations that the human visual system can do. Computer vision uses artificial intelligence to train computers to examine and comprehend pictures at a high level. This includes video tracking and object identification, as well as scene reconstruction and navigation mapping.

Several businesses, including real estate, retail, and even dentistry, have discovered creative methods to utilize computer vision.

SenseTime: SenseTime, a worldwide firm based in China, aspires to positively improve human civilization using AI tools. Academic research is one of their strong skills, and they have made significant contributions to AI research. One of SenseTime’s achievements has been the development of an AI system that identifies things better than the human eye. Several of their other research areas have already helped to boost various industries, including automotive, education, finance, and healthcare.

Megvii: Megvii creates Face++ Cognitive Services, a platform that provides computer vision technologies that let your apps read and comprehend the environment better. Face++’s easy and powerful APIs and SDKs enable you to effortlessly integrate leading, deep learning-based image analysis recognition technologies into your projects.

Hawk-Eye Innovations: Hawk-Eye blends sports and computer science by analyzing player decisions using AI vision. Its technology can process data from the players in real time while the ball or even the automobile moves. The statistics obtained from this data serve as the foundation for a variety of products and services aimed at revolutionizing the sport by assessing the impartiality of the game and improving its broadcast.

Shield AI: Shield AI collaborates with the Departments of Defense and Homeland Security, as well as other federal, state, and municipal departments and agencies, to develop next-generation intelligence, reconnaissance, and surveillance technologies. Nova, Shield AI’s initial product, is a Hivemind-powered drone that scans buildings while transmitting video and creating maps.

Verkada: Many new business and residential buildings now include extra security and surveillance systems. Surveillance cameras, smart doors, home automation, and other access control systems are examples. Verkada is a firm that offers a cloud platform through which all of this equipment may be remotely managed and monitored. The firm was launched in 2016 by Stanford University graduates and has swiftly piqued the interest of investors due to its great growth potential. Verkada, which was named in Forbes’ list of next billion-dollar startups in 2019, was valued at $1.6 billion the following year. It provides security cameras and access control solutions that employ computer vision to authenticate individuals and identify dangers.

Metropolis: Metropolis is an artificial intelligence and computer vision startup that aims to reinvent parking and empower the future of mobility.

Trigo: Trigo improves the productivity of grocery shops by combining computer vision with machine learning and artificial intelligence. Consumers in Trigo-connected establishments will not have to wait in line at the checkout. Instead, customers may scan the things over the phone, pay for them, and then depart when completed. Trigo began working with Cloud Servers in 2023 to speed growth and extend its portfolio.

Onfido: Onfido fosters confidence in the digital world by assisting companies in digitally verifying people’s identities. Its Identity Record Check compares users’ information to a variety of recognized worldwide databases and credit reference organizations.

Nauto: The primary goal of Nauto is to make driving safer for everyone by utilizing computer vision. Its artificial intelligence detects real-time hazards by recording the driver’s attention, vehicle movement, and other contextual data. With the assistance of this critical data, the driver has extra time to respond and avoid or stop before something unfavorable occurs. The AI-powered Nauto fleet insurer platform is used by over 400 companies, and the technology has processed over a billion miles of driver photos.

Top Computer Vision Jobs To Apply In December 2023

You can apply for these computer vision jobs this December

Computer vision is a field of

Computer Vision Data Scientist at Endovision

New Delhi

Endovision is a Hong Kong-based med-tech company that helps endoscopists reduce cancer miss rates with the aid of real-time video analysis using AI. We have generated interest worldwide, and this is the hottest area of research in the field of endoscopy. Our partners are located in Hong Kong, Japan, and India, with the primary focus on Hong Kong at the moment. We are looking to hire a new ‘Research Engineer’ for our team. You’d help create AI-first products by implementing state-of-the-art research papers and contributing to the company IP and the technology stack deployed in the Nvidia Jetson ecosystem. You’d work on cutting-edge deep learning, computer vision, and graphics problems with an emphasis on endoscopy, with an opportunity to collaborate with research scientists and engineers at Endovision and its partnering institutions. Candidates should have experience (academic or industry) with computer vision and deep learning in at least two of the following: neural networks (CNNs, RNNs, autoencoders, etc.) and transfer learning; generative deep learning methods, especially for image generation from images and videos; numerical optimization.

Engineer/Interns – Computer Vision and Machine Learning at DeepSight AI Labs Pvt Ltd

Gurugram, Haryana Profile Description: A passionate developer with a drive to work in a hot startup. You will be working in a team in the area of computer vision,

Key Expectations

You should be passionate and have strong entrepreneurial skills to solve the problems that you come across

You have a never-give-up attitude

You are a quick learner and respect others’ knowledge

You are ready to challenge any technology expert or traditional technology belief to create new innovation.


Computer Vision Engineer at CIRPI

Chennai, Tamil Nadu The company is creating a home design solution (AI) A platform to explore new Architectural designs for consumers. It is an upcoming organization in need of staff.

It is looking for a computer vision engineer who wants to be a part of the small team and has passion, dedication towards creating the best user experience

Expert problem-solving skills, computer programming skills, and strong mathematical skills would be appreciated, familiarity with libraries and frameworks for computer vision, machine learning, deep learning, and data science is highly valuable

It wants to find a programmer who has demonstrated initiative to learn AI on her/his own.


Computer Vision and Image Processing at SensoVision Systems

Bengaluru

Experience: 2-5 years

Roles and Responsibilities

Working on computer vision/image processing applications like object classification, segmentation, etc.

Working on deep learning algorithms for machine vision-based inspection use cases.

Working on Machine Vision Cameras involving a variety of Lens, Filters, and other Optical Elements to collect data.

Having full-stack development experience will be a plus.

Educational Qualification

Bachelor’s degree in marketing, business or related field.

Skill(s) required

OpenCV, Python, Linux and C programming, PHP, Django, cloud, TensorFlow; machine vision camera experience will be a bonus.

Desired Skills and Experience

Image processing, algorithms, Python, C programming, machine vision, Django, Linux, PHP, OpenCV

Computer Vision Specialist at Bosch Engineering and Business Solutions

Education Requirements:

A Master’s or Ph.D. is preferred (Computer Science / Machine Learning, etc.,) from tier 1 institutions.

Job Requirements

Good knowledge of computer vision algorithms and hands-on experience in Opencv, machine learning, and deep learning for vision data

Experience in the medical or industry domain on computer vision will be preferable

Ability to drive product class algorithm development including data acquisition, processing, and deployment in edge or cloud

Partner closely with product and engineering leaders throughout the lifecycle of the project. Ensure that necessary data is captured; analytic needs are well-defined upfront and coordinate the analytic needs

Drive efforts to enable product and engineering leaders to share their knowledge and insights through clear and concise communication, education and data visualization

Should have independently handled a project technically and provided directions to the other team members

Experience in turning ideas into actionable designs

Able to persuade stakeholders and champion effective techniques through the development

Ongoing technical authority role with our larger customers

Strong interpersonal and communication skills: the ability to tell a clear, concise actionable story with data, to folks across various levels of the company

Able to lead the project independently

Technical directions to junior in the team, like to sort the respective task for responsible team members.

Technical Skills

Expertise with one of the following DL frameworks;

Tensorflow, Keras, Caffe, Pytorch

Proficient in OpenCV – and image processing stacks

Knowledge in Machine learning algorithms – like regression, SVM, clustering etc.,

Good to have – programming skills – C/C++

Good to have – knowledge in containers – like Dockerization

Good to have – knowledge in CI/CD pipelines.

Preferable: knowledge working with IP cameras / GIGE cameras / live data acquisition/data acquisition optimization

Domain expertise in one the areas: automotive or medical or industry

Should have experience in the cloud or edge deployment architectures

Tech-savvy and willing to work with open-source Tools

Applying statistical and machine learning techniques, such as mean-variance, k-means, nearest-neighbor, support vector, Bayesian time-series, and network analysis to identify outliers, classify events or actors and correlate anomalous sequences of events.

Blood Cell Detection In Image Using Naive Approach

This article was published as a part of the Data Science Blogathon.

In this article we cover the basics of object detection problems before moving on to the deep learning architectures that can be used to solve them. Let us first discuss the problem statement that we’ll be working on.

Table of Contents

Understanding the Problem Statement Blood Cell Detection

Dataset Link

Naive Approach for Solving Object Detection Problem

Steps to Implement Naive Approach

Load the Dataset

Data Exploration

Prepare Dataset for Naive Approach

Create Train and Validation Set

Define Classification Model Architecture

Train the Model

Make Predictions


Understanding the Problem Statement: Blood Cell Detection

Here is a sample image from the data set. You can see that there are some red-shaded regions and a blue or purple region.

In the above image, the red-shaded regions are the RBCs (Red Blood Cells), the purple-shaded regions are the WBCs, and the small black highlighted portions are the platelets.

As you can see in this particular image, we have multiple objects and multiple classes.

We are converting this to a single class single object problem for simplicity. That means we are going to consider only WBCs.

Hence, just a single class, WBC, and ignore the rest of the classes. Also, we will only keep the images that have a single WBC.

So the images which have multiple WBCs will be removed from this data set. Here is how we will select the images from this data set.

So, we have removed image 2 and image 5: image 5 has no WBC, whereas image 2 has 2 WBCs. The other images remain part of the data set. Similarly, each image in the test set will have only one WBC.

Now, for each image, we have a bounding box around the WBC. As you can see in this image, we have the file name of the image along with the bounding box coordinates for the box around the WBC.

In the next section, we will cover the simplest approach or the naive approach for solving this object detection problem.

Dataset Link

Naive Approach for Solving Object Detection Problem

In this section, we are going to discuss a naive approach for solving the object detection problem. Let’s first understand the task: we have to detect WBCs in an image of blood cells, as you can see in the image below.

The simplest way is to divide the image into multiple patches. For this image, we have divided it into four patches.

We classify each of these patches: the first patch has no WBC, the second patch has a WBC, and the third and fourth patches do not have any WBC.

We are already familiar with the classification process and how to build classification algorithms, so we can easily classify each of these individual patches as yes or no for the presence of a WBC.

In the image below, the patch that has a WBC (the green box) can itself serve as the bounding box. In this case, we take the coordinates of this patch and return them as the bounding box for the WBC.
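The idea above can be sketched as a small function: given one classifier score per patch, return the coordinates of the best-scoring patch as the predicted bounding box. The [xmin, xmax, ymin, ymax] order, the 640x480 patch layout, the 0.5 cut-off, and the scores themselves are all assumptions made for illustration.

```python
# Patch boxes in [xmin, xmax, ymin, ymax] order for an assumed 640x480 image
PATCHES = [
    [0, 320, 0, 240],      # patch 1: top-left
    [320, 640, 0, 240],    # patch 2: top-right
    [0, 320, 240, 480],    # patch 3: bottom-left
    [320, 640, 240, 480],  # patch 4: bottom-right
]

def predict_box(patch_scores, patches=PATCHES, min_score=0.5):
    """Return the box of the highest-scoring patch, or None if no patch looks like a WBC."""
    best = max(range(len(patches)), key=lambda i: patch_scores[i])
    if patch_scores[best] < min_score:
        return None
    return patches[best]

scores = [0.05, 0.91, 0.10, 0.02]  # made-up classifier outputs, one per patch
box = predict_box(scores)
print(box)
```

The predicted box can only ever be one of the four patches, which is why this approach is coarse: the true WBC box is usually much smaller than a quarter of the image.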

In order to implement this approach, we first need to prepare our training data. One question might be: why do we need to prepare the training data at all? We already have the images and the bounding boxes that go with them.

Well, if you remember, our training data is in the following format, where we have the WBC label and the bounding box coordinates.

Note that these bounding box coordinates are for the complete image, but we are going to divide the image into four patches, so we need the bounding box coordinates for each of those four patches. Our next question is: how do we do that?

We have to define a new training data set with the file name, as you can see in the image below. We have the different patches, and for each patch we have the Xmin, Xmax, Ymin, and Ymax values that denote its coordinates, and finally our target variable, WBC: is a WBC present in the patch or not?

Now in this case it would become a simple classification problem. So for each image, we’ll divide it into four different patches and create the bounding box coordinates for each of these patches.

The next question is: how do we create these bounding box coordinates? It’s really simple.

Suppose we have an image of size 640 x 480. The origin is (0, 0), the image has an x-axis and a y-axis, and the opposite corner has the coordinate value (640, 480).

If we find the midpoint, it is (320, 240). Once we have these values, we can easily find the coordinates for each of the patches. For the first patch, (Xmin, Ymin) is (0, 0) and (Xmax, Ymax) is (320, 240).

Similarly, we can find the coordinates for the second, third, and fourth patches. Once we have the bounding box values for each of these patches, the next task is to identify whether there is a WBC within the patch or not.

Here we can clearly see that patch 2 has a WBC while the other patches do not, but we cannot manually label every patch of every image in the data set.
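The midpoint arithmetic can be written as a short helper. This is a minimal sketch; the function name four_patches and the (xmin, xmax, ymin, ymax) tuple order are assumptions chosen to match the coordinate names used in this section.

```python
def four_patches(width, height):
    """Split a width x height image into four equal patches.

    Each patch is returned as (xmin, xmax, ymin, ymax).
    """
    mid_x, mid_y = width // 2, height // 2
    return [
        (0, mid_x, 0, mid_y),           # patch 1: top-left
        (mid_x, width, 0, mid_y),       # patch 2: top-right
        (0, mid_x, mid_y, height),      # patch 3: bottom-left
        (mid_x, width, mid_y, height),  # patch 4: bottom-right
    ]

patches = four_patches(640, 480)
for number, patch in enumerate(patches, start=1):
    print(number, patch)
```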

Now in the next section, we are going to implement the naive approach.

Steps to Implement Naive Approach

In the last section, we discussed the Naive approach for object detection. Let us now define the steps to implement this approach on the blood cell detection problem.

These are the steps that we will follow:

Load the Dataset

Data Exploration

Prepare the Dataset for Naive Approach

Create Train and Validation set

Define classification model Architecture

Train the model

Make Predictions

So let’s go to the next section and implement the above steps.

1 Loading Required Libraries and Dataset

So let’s first start with loading the required libraries: “numpy” and “pandas”, then “matplotlib” to visualize the data, some libraries to work with the images and resize them, and finally the torch library.

# Importing Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import os
from PIL import Image
from skimage.transform import resize
import torch
from torch import nn

Now we will fix a random seed value.

# Fixing a random seed value to stop potential randomness
seed = 42
rng = np.random.RandomState(seed)

Here we’ll mount the drive, since the data set is stored on the drive.

# mount the drive
from google.colab import drive
drive.mount('/content/drive')

Since the data on the drive is available in zip format, we have to unzip it. After unzipping, we can see that all the images are stored in a folder called images. After the images, we also have a CSV file, train.csv.

# unzip the dataset from drive
!unzip /content/drive/My Drive/

Source: Author

2 Data Exploration

So let us read the CSV file and find out what information is stored in this ‘train.csv’ file.

## Reading target file
data = pd.read_csv('train.csv')
data.shape

Here we print the first few rows of the CSV file. We can see that the file has image_names along with the cell_type, which denotes RBC, WBC, and so on, and finally the bounding box coordinates for this particular object in this particular image.


Source: Author

If we check the value counts for RBCs, WBCs, and platelets, we’ll see that RBCs have the maximum count, followed by WBCs and platelets.


Source: Author

For simplicity, we are going to consider only the WBCs here, so we select only the rows with cell_type WBC. Now you can see that we have image_names with only the cell_type WBC against them, along with the bounding box coordinates.

(data.loc[data['cell_type'] =='WBC']).head()

Source: Author

Let’s look at an image from the original data set and its shape. We can see that the shape is (480, 640, 3), so this is an RGB image with three channels. This is the first image in the data set.

image = plt.imread('images/' + '1.jpg')
print(image.shape)
plt.imshow(image)

Source: Author

The next step is to create patches out of this image, so we are going to learn how to divide it into four patches. We know that the image is 640 pixels wide and 480 pixels high, so the midpoint is (320, 240) and the origin is (0, 0).

Source: Author

We have the coordinates for all of these patches in the image, and here we use them to create the patches. Because the image array is indexed by rows first and columns second, the format of the coordinates is Ymin, Ymax, Xmin, Xmax. For the first patch, (Ymin, Ymax) is (0, 240) and (Xmin, Xmax) is (0, 320). Similarly, image_2, image_3, and image_4 are the second, third, and fourth patches. This is how we can create patches from the image.

# creating 4 patches from the image
# format ymin, ymax, xmin, xmax
image_1 = image[0:240, 0:320, :]
image_2 = image[0:240, 320:640, :]
image_3 = image[240:480, 0:320, :]
image_4 = image[240:480, 320:640, :]

Source: Author
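A quick way to convince yourself that the four slices cover the image exactly once is to cut a stand-in array and stitch the pieces back together. The synthetic array below is an assumption used in place of the real photo; only its (480, 640, 3) shape matches the data set.

```python
import numpy as np

# Synthetic stand-in for a (rows, columns, channels) = (480, 640, 3) image
image = (np.arange(480 * 640 * 3) % 256).astype(np.uint8).reshape(480, 640, 3)

# Rows (y) are sliced first, columns (x) second
patches = [
    image[0:240, 0:320, :],
    image[0:240, 320:640, :],
    image[240:480, 0:320, :],
    image[240:480, 320:640, :],
]
shapes = [p.shape for p in patches]

# Stitch the patches back together: two on top, two on the bottom
top = np.hstack(patches[:2])
bottom = np.hstack(patches[2:])
rebuilt = np.vstack([top, bottom])

print(shapes)
print(np.array_equal(rebuilt, image))
```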

Now we need to assign a target value to each of these patches. To do that, we calculate the intersection over union (IoU), for which we have to find the intersection area and the union area.

Source: Author

The intersection area is simply the rectangle where the two boxes overlap. To find its area, we need the Xmin, Xmax, Ymin, and Ymax coordinates of this rectangle.

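def iou(box1, box2):
    # boxes are [xmin, xmax, ymin, ymax]; find the corners of the intersection rectangle
    Irect_xmin, Irect_ymin = max(box1[0], box2[0]), max(box1[2], box2[2])
    Irect_xmax, Irect_ymax = min(box1[1], box2[1]), min(box1[3], box2[3])
    if Irect_xmax < Irect_xmin or Irect_ymax < Irect_ymin:
        # the boxes do not overlap
        target = inter_area = 0
    else:
        inter_area = np.abs((Irect_xmax - Irect_xmin) * (Irect_ymax - Irect_ymin))
        box1_area = (box1[1] - box1[0]) * (box1[3] - box1[2])
        box2_area = (box2[1] - box2[0]) * (box2[3] - box2[2])
        union_area = box1_area + box2_area - inter_area
        iou_value = inter_area / union_area
        # NOTE: the line turning the IoU into a 0/1 target is missing from the source;
        # the 0.1 cut-off used here is an assumption
        target = int(iou_value > 0.1)
    return target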

We have our original bounding box coordinates from the train CSV file. When the two boxes below are passed to the “iou” function, the target comes out to be 0, because the second patch (box1) does not overlap the bounding box (box2) at all. You can try other patches as well and see how the target value changes.

box1 = [320, 640, 0, 240]
box2 = [93, 296, 1, 173]
iou(box1, box2)

The output is 0. Now the next step is to prepare the dataset.
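For comparison, here is a worked sketch that returns the raw IoU value instead of a 0/1 target, applied to the same bounding box against patch 1 and patch 2. The helper name iou_value is an assumption; the box format [xmin, xmax, ymin, ymax] and the numbers come from the example above.

```python
def iou_value(box1, box2):
    """Intersection over union of two boxes given as [xmin, xmax, ymin, ymax]."""
    ix_min, iy_min = max(box1[0], box2[0]), max(box1[2], box2[2])
    ix_max, iy_max = min(box1[1], box2[1]), min(box1[3], box2[3])
    if ix_max < ix_min or iy_max < iy_min:
        return 0.0  # no overlap
    inter = (ix_max - ix_min) * (iy_max - iy_min)
    area1 = (box1[1] - box1[0]) * (box1[3] - box1[2])
    area2 = (box2[1] - box2[0]) * (box2[3] - box2[2])
    return inter / (area1 + area2 - inter)

wbc_box = [93, 296, 1, 173]

# Patch 1 contains the whole box: intersection 203 * 172 = 34916, union 320 * 240 = 76800
iou_patch_1 = iou_value([0, 320, 0, 240], wbc_box)
# Patch 2 starts at x = 320, to the right of the box, so there is no overlap
iou_patch_2 = iou_value([320, 640, 0, 240], wbc_box)

print(iou_patch_1, iou_patch_2)
```

Patch 1 overlaps the box with an IoU of roughly 0.45, while patch 2 gives exactly 0, which is why only patch 1 would be labelled as containing the WBC.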

3 Preparing Dataset for Naive Approach

So far we have explored only a single image from the dataset. Let us now perform these steps for all the images in the data set. First of all, here is the complete data that we have.


Source: Author

Now we convert the cell types to numbers: RBC is zero, WBC is one, and platelets are two.

data['cell_type'] = data['cell_type'].replace({'RBC': 0, 'WBC': 1, 'Platelets': 2})

Now we have to select the images which have only a single WBC.

Source: Author

So first of all we are creating a copy of the dataset and then keeping only WBCs and removing any image which has more than one WBC.

## keep only Single WBCs
data_wbc = data.loc[data.cell_type == 1].copy()
data_wbc = data_wbc.drop_duplicates(subset=['image_names', 'cell_type'], keep=False)
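The effect of drop_duplicates with keep=False is that every image appearing more than once among the WBC rows is removed entirely, not reduced to one row. The plain-Python sketch below mimics that filtering with a Counter so the logic is visible; it is a stand-in for the pandas call, and the rows and file names are made up.

```python
from collections import Counter

# Made-up rows: image 2.jpg has two WBCs, image 3.jpg only has an RBC
rows = [
    {"image_names": "1.jpg", "cell_type": 1},
    {"image_names": "2.jpg", "cell_type": 1},
    {"image_names": "2.jpg", "cell_type": 1},
    {"image_names": "3.jpg", "cell_type": 0},
    {"image_names": "4.jpg", "cell_type": 1},
]

# Step 1: keep only WBC rows (cell_type == 1)
wbc_rows = [r for r in rows if r["cell_type"] == 1]

# Step 2: count WBC rows per image and keep images that appear exactly once
counts = Counter(r["image_names"] for r in wbc_rows)
single_wbc = [r for r in wbc_rows if counts[r["image_names"]] == 1]

kept = [r["image_names"] for r in single_wbc]
print(kept)
```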

Now that we have selected the images, we set the patch coordinates based on our input image size. We read the images one by one and store the bounding box coordinates of the WBC for each image. We then extract the patches from the image using the patch coordinates defined here.

Next, we find the target value for each of these patches using the IoU function that we defined. Finally, we resize the patches to the standard size of (224, 224, 3) and create our final input data and target data from them.

# create empty lists
X = []
Y = []

# set patch co-ordinates
patch_1_coordinates = [0, 320, 0, 240]
patch_2_coordinates = [320, 640, 0, 240]
patch_3_coordinates = [0, 320, 240, 480]
patch_4_coordinates = [320, 640, 240, 480]

for idx, row in data_wbc.iterrows():
    # read image
    image = plt.imread('images/' + row.image_names)
    bb_coordinates = [row.xmin, row.xmax, row.ymin, row.ymax]

    # extract patches
    patch_1 = image[patch_1_coordinates[2]:patch_1_coordinates[3], patch_1_coordinates[0]:patch_1_coordinates[1], :]
    patch_2 = image[patch_2_coordinates[2]:patch_2_coordinates[3], patch_2_coordinates[0]:patch_2_coordinates[1], :]
    patch_3 = image[patch_3_coordinates[2]:patch_3_coordinates[3], patch_3_coordinates[0]:patch_3_coordinates[1], :]
    patch_4 = image[patch_4_coordinates[2]:patch_4_coordinates[3], patch_4_coordinates[0]:patch_4_coordinates[1], :]

    # set default values
    target_1 = target_2 = target_3 = target_4 = inter_area = 0

    # figure out if the patch contains the object
    ## for patch_1
    target_1 = iou(patch_1_coordinates, bb_coordinates)
    ## for patch_2
    target_2 = iou(patch_2_coordinates, bb_coordinates)
    ## for patch_3
    target_3 = iou(patch_3_coordinates, bb_coordinates)
    ## for patch_4
    target_4 = iou(patch_4_coordinates, bb_coordinates)

    # resize the patches
    patch_1 = resize(patch_1, (224, 224, 3), preserve_range=True)
    patch_2 = resize(patch_2, (224, 224, 3), preserve_range=True)
    patch_3 = resize(patch_3, (224, 224, 3), preserve_range=True)
    patch_4 = resize(patch_4, (224, 224, 3), preserve_range=True)

    # create final input data
    X.extend([patch_1, patch_2, patch_3, patch_4])
    # create target data
    Y.extend([target_1, target_2, target_3, target_4])

# convert these lists to single numpy array
X = np.array(X)
Y = np.array(Y)

Now, let’s print the shape of our original data and the new data that we have just created. So we can see that we originally had 240 images. Now we have divided these images into four parts so we have (960,224,224,3). This is the shape of the images.

# 4 patches for every image data_wbc.shape, X.shape, Y.shape

Let’s quickly look at one of the images. Here is the original image. For its fourth (last) patch, the assigned target is one.

image = plt.imread('images/' + '1.jpg')
plt.imshow(image)

Source: Author

If we check another patch of this image, say the first one, the target is zero. In the same way, you can verify that all the images have been converted into patches and that the targets have been assigned accordingly.

plt.imshow(X[0].astype('uint8')), Y[0]

Source: Author

4 Preparing Train and Validation Sets

Now that we have the dataset, we are going to prepare our training and validation sets. Note that each image has the shape (224, 224, 3).

# 4 patches for every image
data_wbc.shape, X.shape, Y.shape

The output is:-

((240, 6), (960, 224, 224, 3), (960,))

PyTorch expects the channels first, so we move the channel axis to give each image the shape (3, 224, 224).

X = np.moveaxis(X, -1, 1)
X.shape

The output is:-

(960, 3, 224, 224)
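As a quick illustration of what np.moveaxis does here, the sketch below applies the same call to a tiny made-up batch and checks that it matches an explicit transpose. The array sizes are arbitrary and only serve the example.

```python
import numpy as np

# small stand-in batch: 2 images, 4x5 pixels, 3 channels (channels last)
batch = np.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)

# move the last axis (channels) to position 1
channels_first = np.moveaxis(batch, -1, 1)

# the same result written as an explicit transpose
via_transpose = batch.transpose(0, 3, 1, 2)

same = np.array_equal(channels_first, via_transpose)

# image 0, row 1, column 2, channel 1 holds the same value in both layouts
pixel_preserved = batch[0, 1, 2, 1] == channels_first[0, 1, 1, 2]

print(channels_first.shape, same, pixel_preserved)
```

Only the axis order changes; no pixel values are modified.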

Next, we normalize the image pixel values by dividing by the maximum value in the array.

X = X / X.max()

Using the train_test_split function, we create the training and validation sets.

from sklearn.model_selection import train_test_split

X_train, X_valid, Y_train, Y_valid = train_test_split(X, Y, test_size=0.1, random_state=seed)
X_train.shape, X_valid.shape, Y_train.shape, Y_valid.shape

The output of the above code is:-

((864, 3, 224, 224), (96, 3, 224, 224), (864,), (96,))
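The split sizes follow from test_size=0.1 applied to 960 samples. The sketch below reproduces only that arithmetic and the idea of a seeded shuffle in plain Python. It is not how scikit-learn is implemented internally, and seed = 42 is a placeholder because the article's seed value is defined outside this section.

```python
import math
import random

n_samples = 960
test_size = 0.1
seed = 42  # placeholder value, not taken from the article

# 10% of the samples go to validation, the rest to training
n_valid = math.ceil(n_samples * test_size)
n_train = n_samples - n_valid

# shuffle the indices reproducibly, then cut them into two groups
indices = list(range(n_samples))
random.Random(seed).shuffle(indices)
valid_idx = indices[:n_valid]
train_idx = indices[n_valid:]

print(n_train, n_valid)
```

Fixing the seed means the same patches land in the validation set on every run, which keeps results comparable between experiments.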

Now we convert both the training and validation sets into tensors, because they are currently NumPy arrays.

X_train = torch.FloatTensor(X_train)
Y_train = torch.FloatTensor(Y_train)
X_valid = torch.FloatTensor(X_valid)
Y_valid = torch.FloatTensor(Y_valid)

5 Model Building

Now we’re going to build our model. First, we install the pytorch-model-summary library.

!pip install pytorch-model-summary

Source: Author

This library is simply used to print the model summary in PyTorch. We import its summary function.

from pytorch_model_summary import summary

Here is the architecture for our naive approach. It is a sequential model. The first Conv2d layer takes 3 input channels and has 64 filters of size 5 with a stride of 2, followed by a ReLU activation. Next comes a max-pooling layer with a window size of 4 and a stride of 2, and then a second Conv2d layer. We flatten the output of that Conv2d layer and pass it through a linear (dense) layer with a sigmoid activation.

## model architecture
model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=64, kernel_size=5, stride=2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=4, stride=2),
    nn.Conv2d(in_channels=64, out_channels=64, kernel_size=5, stride=2),
    nn.Flatten(),
    nn.Linear(40000, 1),
    nn.Sigmoid()
)
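The 40000 input features of the linear layer can be checked by hand. The sketch below applies the standard output-size formula for convolution and pooling layers with no padding and dilation 1, which are the defaults used in the model above. The helper name conv_out is made up for this example.

```python
def conv_out(size, kernel_size, stride, padding=0):
    """Output length along one spatial dimension for a convolution or
    pooling layer with dilation 1 and floor rounding."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 224
size = conv_out(size, kernel_size=5, stride=2)  # first Conv2d  -> 110
size = conv_out(size, kernel_size=4, stride=2)  # MaxPool2d     -> 54
size = conv_out(size, kernel_size=5, stride=2)  # second Conv2d -> 25

# 64 output channels, each 25 x 25
flattened = 64 * size * size
print(size, flattened)  # 25 40000
```

If the input size, kernel sizes, or strides change, the first argument of nn.Linear has to change with them.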

If we print the model, we see the architecture that we have just defined.


Source: Author

Using the summary function, we can look at the model summary. It lists the layers, the output shape of each layer, and the number of trainable parameters in each layer. Our model is now ready.

print(summary(model, X_train[:1]))

Source: Author

6 Train the Model

Let us train this model. We define the loss and the optimizer: binary cross-entropy as the loss and Adam as the optimizer. We then transfer the model to the GPU if one is available. The model is trained on mini-batches of the input patches.

## loss and optimizer
criterion = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

## GPU device
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()
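To make the loss concrete, here is a minimal plain-Python sketch of mean binary cross-entropy, the quantity that BCELoss computes with its default mean reduction. The predictions and targets are made-up numbers, and the function name is chosen for this example.

```python
import math


def binary_cross_entropy(probabilities, targets):
    """Mean binary cross-entropy over plain Python lists."""
    losses = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(probabilities, targets)
    ]
    return sum(losses) / len(losses)


# made-up predictions for three patches with targets 1, 0, 1
confident_right = binary_cross_entropy([0.9, 0.1, 0.9], [1, 0, 1])
confident_wrong = binary_cross_entropy([0.1, 0.9, 0.1], [1, 0, 1])
print(confident_right, confident_wrong)
```

Confident correct predictions give a small loss, while confident wrong predictions are penalized heavily, which is what drives the sigmoid output toward 0 or 1 during training.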

We train for 15 epochs with a batch size of 32. In the output, we can see that the loss decreases at each epoch, and after the last epoch the training is complete.

# batch size of the model
batch_size = 32

# defining the training phase
model.train()
for epoch in range(15):
    # setting initial loss as 0
    train_loss = 0.0
    # to randomly pick the images without replacement in batches
    permutation = torch.randperm(X_train.size()[0])
    # to keep track of training loss
    training_loss = []
    # for loop for training on batches
    for i in range(0, X_train.size()[0], batch_size):
        # taking the indices from randomly generated values
        indices = permutation[i:i+batch_size]
        # getting the images and labels for a batch
        batch_x, batch_y = X_train[indices], Y_train[indices]
        if torch.cuda.is_available():
            batch_x, batch_y = batch_x.cuda().float(), batch_y.cuda().float()
        # clearing all the accumulated gradients
        optimizer.zero_grad()
        # mini batch computation
        outputs = model(batch_x)
        # calculating the loss for a mini batch
        loss = criterion(outputs.squeeze(), batch_y)
        # storing the loss for every mini batch
        training_loss.append(loss.item())
        # calculating the gradients
        loss.backward()
        # updating the parameters
        optimizer.step()
    training_loss = np.average(training_loss)
    print('epoch: \t', epoch, '\t training loss: \t', training_loss)
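The batching logic in the loop can be looked at on its own. This sketch uses Python's random module in place of torch.randperm, which is a substitution made only to keep the example free of dependencies, and it shows that one epoch visits every training sample exactly once.

```python
import random

n_samples = 864
batch_size = 32

# shuffled sample indices, playing the role of torch.randperm
permutation = list(range(n_samples))
random.Random(0).shuffle(permutation)

# consecutive slices of the shuffled indices form the mini-batches
batches = [
    permutation[i:i + batch_size]
    for i in range(0, n_samples, batch_size)
]

n_batches = len(batches)
seen = sorted(idx for batch in batches for idx in batch)
print(n_batches, seen == list(range(n_samples)))
```

With 864 training patches and a batch size of 32, each epoch consists of 27 full batches, so no partial batch is left over at the end.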

Source: Author

7 Make Predictions

Let us now use this model to make predictions. Here I take only the first five inputs from the validation set and transfer them to the GPU.

output = model(X_valid[:5].to('cuda')).cpu().detach().numpy()

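Here is the output for the first five patches. A value close to 0 means the model predicts that there is no WBC in the patch, and a value close to 1 means it predicts that there is a WBC. For the first two patches the output is close to 0, so the model predicts no WBC.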


This is the output:

array([[0.00641595], [0.01172841], [0.99919134], [0.01065345], [0.00520921]], dtype=float32)
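To turn these sigmoid outputs into yes/no decisions, one common choice is to apply a cut-off. The sketch below uses the five values printed above. The 0.5 threshold is an assumption, since the article does not state one.

```python
# the five sigmoid outputs printed above
output = [[0.00641595], [0.01172841], [0.99919134], [0.01065345], [0.00520921]]

threshold = 0.5  # assumed cut-off, not specified in the article

# 1 means "WBC present in the patch", 0 means "no WBC"
labels = [int(row[0] > threshold) for row in output]
print(labels)  # [0, 0, 1, 0, 0]
```

Only the third patch is classified as containing a WBC.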

Let’s also plot the images. For the third image, the model says that there is a WBC, and we can see that there is indeed a WBC in this image.


Source: Author

Similarly, we can check another image, so we take the first image. This was our input patch, and there is no WBC in it.


Source: Author


This was a very simple method for making predictions, that is, for identifying the patch or portion of the image that contains a WBC.


In this article, we walked through a practical implementation of blood cell detection on an image dataset using a naive approach. The real challenge is to solve the business problem and develop the model. While working with image data, you have to handle a few tasks such as working with bounding boxes, calculating the IoU value, and choosing an evaluation metric. A next step (future task) for this article is the case where one image contains more than one object, and the task is to detect every object in each image.

I hope the article helped you understand how to detect blood cells in image data and how to build detection models that apply this technique in the medical analysis domain.

About the Author

Hi, I am Kajal Kumari. I have completed my Master’s from IIT(ISM) Dhanbad in Computer Science & Engineering. As of now, I am working as a Machine Learning Engineer in Hyderabad. Here is my LinkedIn profile if you want to connect with me.

End Notes

Thanks for reading!

If you want to read my previous blogs, you can read Previous Data Science Blog posts from here.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.


Thumb Pc Uses Google Software To Give Computer Vision To Robots And Drones

A new USB stick computer uses Google’s machine-learning software to give drones and robots the equivalent of a human eye, and add new smarts to cameras.

Movidius’ Fathom Neural Compute Stick isn’t your conventional PC. It is instead designed to analyze pixels and provide the right context for images.

Fathom provides the much-needed horsepower for devices like drones, robots and cameras to run computer vision applications like image recognition. These devices alone typically don’t have the ability to run computer vision applications.

Fathom uses an embedded version of Google’s TensorFlow machine learning software for vision processing. The device can be plugged into the USB port of a device or a developer board like Raspberry Pi, which in turn can power a drone or robot. It needs a 64-bit Linux OS and 50MB of hard drive space.

With a Fathom stick, simple robots or drones could do a lot more than they typically can do now. For example, a drone could use the Fathom to avoid obstacles and automatically navigate to specific locations. Or when riding a bike, a helmet camera could automatically start recording video after identifying a certain object like a street sign.

It could also bring a higher level of situational awareness to IP-based home security systems. Connected cameras are expected to be able to differentiate between humans and animals, with the computing handled by a Fathom stick plugged in the USB port of a home security hub.

Other applications for Fathom include 3D modeling and scanning, immersive gaming, augmented reality and gesture recognition.

In a way, the Fathom is a smaller and more power-efficient version of the Nvidia Jetson TX1 developer board, which is also targeted at robots, drones, self-driving cars and Internet of Things devices. Fathom is like a mobile equivalent of the TX1 — it doesn’t have the raw horsepower, but it’s very fast at doing specific vision recognition tasks while consuming less power.

Fathom was described as a “discrete deep learning accelerator,” by Jack Dashwood, the marketing communications director at Movidius.

Fathom is based on the Myriad 2 processor already in DJI’s flagship Phantom 4 autonomous drone, which can sense obstacles. Dashwood couldn’t say if Fathom could be plugged directly into products like GoPro.

Movidius estimated the price of Fathom to be under $100. An initial run will ship to researchers, hobbyists and companies that are developing, testing and playing with products. Fathom will become commercially available in the fourth quarter of this year.

Fathom delivers 150 gigaflops of performance while consuming under 1.2 watts of power. The vision processing happens locally; there’s no need for devices to connect to cloud services to recognize and identify images, Dashwood said.

Fathom relies on machine-learning to crunch images, and needs to be trained to analyze pixels and provide the right context to images. That entails the creation of rich data sets against which images can be verified. That learning model is usually developed on a PC, and then transferred to work with the TensorFlow software stack on the smaller Fathom.

In most cases, there are many pixels that must be analyzed in order to get a complete understanding of an image – for example, when a person is happy, the lips take on a different structure. There’s no one way to train Fathom to recognize all images, and learning models may be different for cameras, drones, robots and self-driving cars.

The creation of the rich data sets needed for image understanding involves steps like classification and labeling of pixels. Fathom uses a combination of algorithms and pixel association to understand images. In machine learning models, sentiment and face recognition capabilities have become fairly common, while distance measurement and simultaneous localization and mapping — which involves analyzing images to update a map — remain a challenge, Dashwood said.

Fathom also has 12 vector processors that can be programmed to do a variety of tasks. The computer also has a custom GPU subsystem that is central to vision processing.

4 Ways To Backup Iphone To Computer Without Using Itunes

Apple is, unfortunately, discontinuing iTunes for Mac users after 18 years. The software offers extensive features including a mobile device management utility. With iTunes being one of the favorite backup services, many users are puzzled about how to connect their iPhone to their computer for backup. 

If you’re stuck with a similar question, you’re in luck! In this article, we will teach you four ways to back up your iPhone to your computer without iTunes.

Ways to Backup iPhone to Computer Without Using iTunes

Apart from iTunes, you can back up your iPhone through Finder, to iCloud, to Google Drive, or with third-party applications. The backup process is direct with Finder, while getting the backup onto your computer takes a bit longer with the other methods.

Using Finder 

Apple introduced Finder as the replacement for iTunes. The creators made the process of backup simpler through Finder. Unlike the backup process on iTunes, through Finder, the steps are basic and easier to understand. Follow these instructions to back your iPhone up using Finder:


On iCloud

iCloud is Apple’s exclusive cloud-based storage space. You can connect all your devices to the same iCloud storage with the same Apple ID. It holds all types of file data such as your photos, music, documents, and so on.

You can also use iCloud to backup your data from your iPhone. After that, you can import data from your iCloud to your Mac. Follow this process to back your data up from iCloud:

Using Google Drive

Google Drive is a cloud-based file storage and data synchronization service developed by Google. The data is synchronized between all devices that share the same Google account. 

Like iCloud, you can transfer data from your iPhone to Google Drive and then import the data from your computer. Follow these instructions to backup your iPhone to your computer using Google Drive: 

Third-party Application

If you’re comfortable using third-party applications to back up your data, that is another option. Since it’s your photos and data we are talking about, make sure the app you select comes from a trusted source.

How to Restore iPhone Backup?

You can swiftly recover the data you’ve backed up on your iCloud, Mac/PC, Android, and iPhone. If you’ve already set up your iPhone, you must erase all of its contents to restore the backup from iCloud.

Here are the steps you need to follow to reset your iPhone

To restore a backup from iCloud, you must have a good internet connection. After fulfilling the requirements, follow these instructions to recover your backed-up data to your iPhone:

Press and hold the power key of your iPhone to turn it on. 

Follow along with the instructions until you land on the Apps & Data page.

From the list of options, select one of these:

Restore from iCloud Backup

If you have created an iCloud backup to transfer to your iPhone, select Restore from iCloud Backup. After selecting this option follow these instructions.

To log in to iCloud, enter the credentials for your Apple ID.

Choose a backup to recover.

To restore your purchases, re-enter your Apple ID. You can skip this step for later.

Wait for the restore to complete.

Restore from Mac or PC

You can transfer data from your Mac or your PC to your new iPhone. If you have backed up the data you want to recover on your Mac or PC, select Restore from Mac or PC. Follow these instructions after selecting the option:

On Mac

If your device is a Mac, follow these steps to recover your backup on iPhone:


On PC

Follow these instructions to recover your backup data from your PC to your iPhone:

Note: If you’ve encrypted your backup file, you need to enter the password you’ve set before performing the recovery.

Transfer Directly from iPhone

If you wish to transfer all data from your previous iPhone and not a backup, you can select the option Transfer Directly from iPhone. Follow these instructions to load your data from another iPhone:

Turn both devices on and bring them close to each other. Make sure Bluetooth is turned on.

On your older iPhone, you will receive a message offering to Set up New iPhone. Select Unlock to Continue and unlock your older iPhone.

After you see an animation on your newer iPhone, hold your older iPhone over it and center the animation in the camera viewfinder to scan it.

On your newer iPhone, enter the passcode of your other iPhone. This step will move information such as your Wi-Fi to your new iPhone. Set up your FaceID or skip it for later.

On the Transfer Your Data page, select Transfer from iPhone.

Wait for the transfer to complete.

Move Data from Android

If the data you want on your iPhone is stored on an Android device, select the Move Data from Android option. Before you begin the data transfer, plug both of your devices into a power source and connect them to a strong Wi-Fi network. To move your information from an Android device to an iPhone, follow these instructions:
