Image Classification Using Convolutional Neural Network With Python
This article was published as a part of the Data Science Blogathon
Hello guys! In this blog, I am going to discuss everything about image classification.
In the past few years, Deep Learning has proved to be a very powerful tool because of its ability to handle huge amounts of data. Its use of hidden layers lets it outperform traditional techniques, especially for pattern recognition. One of the most popular deep neural networks is the Convolutional Neural Network (CNN).
A convolutional neural network (CNN) is a type of Artificial Neural Network (ANN) used in image recognition and processing, specially designed to work with pixel data.
Before moving further, we need to understand what a neural network is. Let’s go…
Neural Network: A neural network is constructed from several interconnected nodes called “neurons”. Neurons are arranged into the input layer, hidden layer, and output layer. The input layer corresponds to our predictors/features and the output layer to our response variable(s).
Multi-Layer Perceptron (MLP): A neural network with an input layer, one or more hidden layers, and one output layer is called a multi-layer perceptron (MLP). The perceptron on which it is built was invented by Frank Rosenblatt in 1957. The MLP described below has 5 input nodes, 5 hidden nodes spread over two hidden layers, and one output node.
How does this Neural Network work?– Input layer neurons receive incoming information from the data which they process and distribute to the hidden layers.
– That information, in turn, is processed by hidden layers and is passed to the output neurons.
– The information in this artificial neural network (ANN) is processed through an activation function, which imitates the way biological neurons fire.
– Each neuron applies an activation function and has a threshold value.
– The threshold value is the minimum value that must be possessed by the input so that it can be activated.
– The task of the neuron is to perform a weighted sum of all the input signals and apply the activation function on the sum before passing it to the next(hidden or output) layer.
Let us understand what a weighted sum is. Say we have input values a1, a2, a3, a4 with weights w1, w2, w3, w4 feeding into one hidden-layer neuron, say nj; then the weighted sum is represented as
Sj = Σ (i = 1 to 4) wi · ai + bj,   where bj is the bias of node nj
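As a quick illustration (a minimal NumPy sketch with made-up numbers, not part of the original article), this weighted sum can be computed directly:

import numpy as np

# Hypothetical inputs and weights for one hidden neuron n_j
a = np.array([0.5, 1.0, 2.0, 3.0])   # inputs a1..a4
w = np.array([0.2, 0.4, 0.1, 0.3])   # weights w1..w4
b_j = 0.5                            # bias for node n_j

# Weighted sum S_j = sum(w_i * a_i) + b_j
S_j = np.dot(w, a) + b_j
print(S_j)  # 0.1 + 0.4 + 0.2 + 0.9 + 0.5 = 2.1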
What are Activation Functions? These functions are needed to introduce non-linearity into the network. The activation function is applied to the weighted sum, and that output is passed to the next layer.
*Possible Functions*
• Sigmoid: Sigmoid function is differentiable. It produces output between 0 and 1.
• Hyperbolic Tangent: Hyperbolic Tangent is also differentiable. This Produces output between -1 and 1.
• ReLU: ReLU (Rectified Linear Unit) is the most popular activation function and is widely used in deep learning.
• Softmax: The softmax function is used for multi-class classification problems. It is a generalization of the sigmoid function. It also produces outputs between 0 and 1, and they sum to 1 across the classes.
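For reference, here is a minimal NumPy sketch (not from the original article) of these four functions:

import numpy as np

def sigmoid(x):              # output in (0, 1)
    return 1 / (1 + np.exp(-x))

def tanh(x):                 # output in (-1, 1)
    return np.tanh(x)

def relu(x):                 # max(0, x)
    return np.maximum(0, x)

def softmax(x):              # outputs between 0 and 1 that sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(sigmoid(0.0), tanh(0.0), relu(-2.0), softmax(np.array([1.0, 2.0, 3.0])))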
Now, let’s go with our topic CNN…
CNN: Now imagine there is an image of a bird, and you want to identify whether it is really a bird or something else. The first thing you should do is feed the pixels of the image, in the form of arrays, to the input layer of the neural network (MLP networks are used to classify such things). The hidden layers carry out feature extraction by performing various calculations and operations. There are multiple hidden layers, such as the convolution, ReLU, and pooling layers, that perform feature extraction from your image. Finally, there is a fully connected layer that identifies the exact object in the image. You can understand this easily from the following figure:
Convolution: The convolution operation involves matrix arithmetic, and every image is represented as an array of values (pixels).
Let us understand this with an example:
a = [2,5,8,4,7,9]
b = [1,2,3]
In the convolution operation, the filter b is slid over a; at each position the overlapping elements are multiplied element-wise and summed, producing one element of the new array a*b.
The first three elements of a are multiplied by the corresponding elements of b, and the products are summed to give the first element of a*b; the window then shifts by one position.
This process continues until the filter has passed over the whole array.
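A small NumPy sketch of the sliding-window operation described above (note that CNN layers compute this windowed sum directly, whereas NumPy's np.convolve would flip the kernel first):

import numpy as np

a = np.array([2, 5, 8, 4, 7, 9])
b = np.array([1, 2, 3])

# slide b over a: multiply each window element-wise with b and sum the products
result = np.array([np.sum(a[i:i + len(b)] * b) for i in range(len(a) - len(b) + 1)])
print(result)  # [36 33 37 45]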
Pooling: A convolved feature map can still be quite large, so pooling reduces its size while keeping the important features and patterns. The pooled feature maps are then flattened and fed into a feed-forward neural network, which is also called a Multi-Layer Perceptron.
Up to this point, we have seen concepts that are important for our building CNN model.
Now we will move forward to see a case study of CNN.
1) Here we are going to import the necessary libraries which are required for performing CNN tasks.
import numpy as np
%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import tensorflow as tf

2) Here is the code required to form the CNN model:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation="relu", input_shape=(180,180,3)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(64, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(128, (3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(550, activation="relu"),   # Adding the hidden layers
    tf.keras.layers.Dropout(0.1, seed=2023),
    tf.keras.layers.Dense(400, activation="relu"),
    tf.keras.layers.Dropout(0.3, seed=2023),
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.Dropout(0.4, seed=2023),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dropout(0.2, seed=2023),
    tf.keras.layers.Dense(5, activation="softmax")   # Adding the output layer
])
A convolved image can be too large, so it is reduced without losing features or patterns; this is what pooling does.
Creating the neural network starts with initializing it using the Sequential model from Keras.
Flatten()- Flattening transforms a two-dimensional matrix of features into a vector of features.
3) Now let’s see a summary of the CNN model
model.summary()

It will print the following output:
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 178, 178, 16) 448 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 89, 89, 16) 0 _________________________________________________________________ conv2d_1 (Conv2D) (None, 87, 87, 32) 4640 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 43, 43, 32) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 41, 41, 64) 18496 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 20, 20, 64) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 18, 18, 128) 73856 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 9, 9, 128) 0 _________________________________________________________________ flatten (Flatten) (None, 10368) 0 _________________________________________________________________ dense (Dense) (None, 550) 5702950 _________________________________________________________________ dropout (Dropout) (None, 550) 0 _________________________________________________________________ dense_1 (Dense) (None, 400) 220400 _________________________________________________________________ dropout_1 (Dropout) (None, 400) 0 _________________________________________________________________ dense_2 (Dense) (None, 300) 120300 _________________________________________________________________ dropout_2 (Dropout) (None, 300) 0 _________________________________________________________________ dense_3 (Dense) (None, 200) 60200 _________________________________________________________________ dropout_3 (Dropout) (None, 200) 0 _________________________________________________________________ dense_4 (Dense) (None, 5) 1005 ================================================================= Total params: 6,202,295 Trainable params: 6,202,295 Non-trainable params: 04) So now we are required to specify optimizers.
from tensorflow.keras.optimizers import RMSprop, SGD, Adam
adam = Adam(lr=0.001)

The optimizer is used to reduce the cost calculated by cross-entropy.
The loss function is used to calculate the error
The metrics term is used to represent the efficiency of the model
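The compile call itself is not shown in the article's code; a minimal sketch of how the optimizer, loss, and metric described above are typically wired together (categorical cross-entropy is assumed here because the generators below use class_mode='categorical' and the output layer is softmax):

model.compile(optimizer=adam,                   # the Adam optimizer defined above
              loss='categorical_crossentropy',  # cross-entropy loss for the 5-class softmax output
              metrics=['acc'])                  # accuracy, reported as 'acc'/'val_acc' during training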
5)In this step, we will see how to set the data directory and generate image data.
bs = 30                                                            # Setting batch size
train_dir = "D:/Data Science/Image Datasets/FastFood/train/"       # Setting training directory
validation_dir = "D:/Data Science/Image Datasets/FastFood/test/"   # Setting testing directory

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255.
train_datagen = ImageDataGenerator(rescale=1.0/255.)
test_datagen = ImageDataGenerator(rescale=1.0/255.)

# Flow training images in batches using the train_datagen generator.
# flow_from_directory lets the classifier read the labels directly from the names of the directories the images lie in.
train_generator = train_datagen.flow_from_directory(train_dir, batch_size=bs, class_mode='categorical', target_size=(180,180))

# Flow validation images in batches using the test_datagen generator
validation_generator = test_datagen.flow_from_directory(validation_dir, batch_size=bs, class_mode='categorical', target_size=(180,180))

The output will be:
Found 1465 images belonging to 5 classes.
Found 893 images belonging to 5 classes.

6) The final step is fitting the model.
history = model.fit(train_generator,
                    validation_data=validation_generator,
                    steps_per_epoch=150,
                    epochs=30,
                    validation_steps=50,
                    verbose=2)

The output will be:
Epoch 1/30 5/5 - 4s - loss: 0.8625 - acc: 0.6933 - val_loss: 1.1741 - val_acc: 0.5000
Epoch 2/30 5/5 - 3s - loss: 0.7539 - acc: 0.7467 - val_loss: 1.2036 - val_acc: 0.5333
Epoch 3/30 5/5 - 3s - loss: 0.7829 - acc: 0.7400 - val_loss: 1.2483 - val_acc: 0.5667
Epoch 4/30 5/5 - 3s - loss: 0.6823 - acc: 0.7867 - val_loss: 1.3290 - val_acc: 0.4333
Epoch 5/30 5/5 - 3s - loss: 0.6892 - acc: 0.7800 - val_loss: 1.6482 - val_acc: 0.4333
Epoch 6/30 5/5 - 3s - loss: 0.7903 - acc: 0.7467 - val_loss: 1.0440 - val_acc: 0.6333
Epoch 7/30 5/5 - 3s - loss: 0.5731 - acc: 0.8267 - val_loss: 1.5226 - val_acc: 0.5000
Epoch 8/30 5/5 - 3s - loss: 0.5949 - acc: 0.8333 - val_loss: 0.9984 - val_acc: 0.6667
Epoch 9/30 5/5 - 3s - loss: 0.6162 - acc: 0.8069 - val_loss: 1.1490 - val_acc: 0.5667
Epoch 10/30 5/5 - 3s - loss: 0.7509 - acc: 0.7600 - val_loss: 1.3168 - val_acc: 0.5000
Epoch 11/30 5/5 - 4s - loss: 0.6180 - acc: 0.7862 - val_loss: 1.1918 - val_acc: 0.7000
Epoch 12/30 5/5 - 3s - loss: 0.4936 - acc: 0.8467 - val_loss: 1.0488 - val_acc: 0.6333
Epoch 13/30 5/5 - 3s - loss: 0.4290 - acc: 0.8400 - val_loss: 0.9400 - val_acc: 0.6667
Epoch 14/30 5/5 - 3s - loss: 0.4205 - acc: 0.8533 - val_loss: 1.0716 - val_acc: 0.7000
Epoch 15/30 5/5 - 4s - loss: 0.5750 - acc: 0.8067 - val_loss: 1.2055 - val_acc: 0.6000
Epoch 16/30 5/5 - 4s - loss: 0.4080 - acc: 0.8533 - val_loss: 1.5014 - val_acc: 0.6667
Epoch 17/30 5/5 - 3s - loss: 0.3686 - acc: 0.8467 - val_loss: 1.0441 - val_acc: 0.5667
Epoch 18/30 5/5 - 3s - loss: 0.5474 - acc: 0.8067 - val_loss: 0.9662 - val_acc: 0.7333
Epoch 19/30 5/5 - 3s - loss: 0.5646 - acc: 0.8138 - val_loss: 0.9151 - val_acc: 0.7000
Epoch 20/30 5/5 - 4s - loss: 0.3579 - acc: 0.8800 - val_loss: 1.4184 - val_acc: 0.5667
Epoch 21/30 5/5 - 3s - loss: 0.3714 - acc: 0.8800 - val_loss: 2.0762 - val_acc: 0.6333
Epoch 22/30 5/5 - 3s - loss: 0.3654 - acc: 0.8933 - val_loss: 1.8273 - val_acc: 0.5667
Epoch 23/30 5/5 - 3s - loss: 0.3845 - acc: 0.8933 - val_loss: 1.0199 - val_acc: 0.7333
Epoch 24/30 5/5 - 3s - loss: 0.3356 - acc: 0.9000 - val_loss: 0.5168 - val_acc: 0.8333
Epoch 25/30 5/5 - 3s - loss: 0.3612 - acc: 0.8667 - val_loss: 1.7924 - val_acc: 0.5667
Epoch 26/30 5/5 - 3s - loss: 0.3075 - acc: 0.8867 - val_loss: 1.0720 - val_acc: 0.6667
Epoch 27/30 5/5 - 3s - loss: 0.2820 - acc: 0.9400 - val_loss: 2.2798 - val_acc: 0.5667
Epoch 28/30 5/5 - 3s - loss: 0.3606 - acc: 0.8621 - val_loss: 1.2423 - val_acc: 0.8000
Epoch 29/30 5/5 - 3s - loss: 0.2630 - acc: 0.9000 - val_loss: 1.4235 - val_acc: 0.6333
Epoch 30/30 5/5 - 3s - loss: 0.3790 - acc: 0.9000 - val_loss: 0.6173 - val_acc: 0.8000

The above call trains the neural network using the training set and evaluates its performance on the test set. It returns two metrics for each epoch, 'acc' and 'val_acc', which are the accuracy of predictions obtained on the training set and the accuracy attained on the test set respectively.
Conclusion: Hence, we see that reasonable accuracy has been achieved. Anyone can run this model with more epochs or other parameter changes to try to improve it further.
I hope you liked my article. Do share with your friends, colleagues.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
Intent Classification With Convolutional Neural Networks
This article was published as a part of the Data Science Blogathon
Introduction

Text classification is a machine-learning approach that groups text into pre-defined categories. It is an integral tool in Natural Language Processing (NLP) used for varied tasks like spam vs. non-spam email classification, sentiment analysis of movie reviews, detection of hate speech in social media posts, etc. Although there are a lot of machine learning algorithms available for text classification, like Naive Bayes, Support Vector Machines, Logistic Regression, etc., in this article we will be using a deep-learning-based convolutional neural network architecture to perform intent classification of text commands.
What are CNNs?

Though CNNs are associated more frequently with computer vision problems, recently they have been used in NLP with interesting results. CNNs are just several layers of convolutions with non-linear activation functions like ReLU, tanh, or softmax applied to the results.
A 1-D convolution is shown in the above image. A filter/kernel of size 3 is passed over the input of size 6. Convolution is a mathematical operation where the elements in the filter are multiplied element-wise with the input over which the filter is currently present and the corresponding products are summed up to obtain the output element (as is shown by c3 = w1i2 + w2i3 + w3i4). The filter keeps going over the input, performing convolutions, and obtaining the output elements. We need 2-D convolutions in image processing tasks since images are 2-D vectors, but 1-D convolutions are enough for 1-D text manipulations. A convolutional neural network is simply a neural network where layers that perform convolutions are present. There can be multiple filters present in a single convolutional layer, which help to capture information about different input features.
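To make the windowed sum (e.g. c3 = w1·i2 + w2·i3 + w3·i4) concrete, here is a minimal tf.keras sketch; the input values and filter weights below are made up purely for illustration:

import numpy as np
import tensorflow as tf

x = np.array([1., 2., 3., 4., 5., 6.], dtype=np.float32).reshape(1, 6, 1)   # (batch, steps, channels): input i1..i6
conv = tf.keras.layers.Conv1D(filters=1, kernel_size=3, use_bias=False)
_ = conv(x)                                                                 # build the layer so its weights exist
conv.set_weights([np.array([0.5, -1.0, 0.25], dtype=np.float32).reshape(3, 1, 1)])  # filter weights w1, w2, w3
print(conv(x).numpy().ravel())  # each output element is w1*i_k + w2*i_{k+1} + w3*i_{k+2}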
Why CNNs in text classification?

The filters/kernels in CNNs can help identify relevant patterns in text data – bigrams, trigrams, or n-grams (contiguous sequences of n words) depending on kernel size. Since CNNs are translation invariant, they can detect these patterns irrespective of their position in the sentence. The local order of words is not that important in text classification, so CNNs can perform this task effectively. Each filter/kernel detects a specific feature, such as whether the sentence contains positive (‘good’, ‘amazing’) or negative (‘bad’, ‘terrible’) terms in the case of sentiment analysis. Like sentiment analysis, most text classification tasks are determined by the presence or absence of some key phrases anywhere in the sentence. This can be effectively modelled by CNNs, which are good at extracting local and position-invariant features from data. Hence we have chosen CNNs for our intent classification task.
Loading the Dataset

import pandas as pd
commands = pd.read_csv('TextCommands.csv')
commands.columns = ['text', 'label', 'misc']
commands.head()

The dataset looks like this:
Source: Author’s Jupyter notebook
The different intents/labels are numbered from 1 to 26. The dataset is pretty balanced among the different labels. The dataset should ideally be balanced because a severely imbalanced dataset can be challenging to model and require specialized techniques.
Data Preprocessing

Data preprocessing is a particularly important task in NLP. We apply three main pre-processing methods here:
Tokenizing: Keras’ inbuilt Tokenizer API is fit on the dataset; it splits the sentences into words and creates a dictionary of all unique words found, each assigned a unique integer. Each sentence is then converted into an array of integers representing the words present in it.
Sequence Padding: The array representing each sentence in the dataset is filled with zeroes to the left to make the size of the array 10 and bring all arrays to the same length.
Finally, the labels are converted into one-hot vectors using the to_categorical function from Keras.utils library.
The corresponding code:

import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical

MAX_SEQUENCE_LENGTH = 10
MAX_NUM_WORDS = 5000

tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
tokenizer.fit_on_texts(commands['text'])
sequences = tokenizer.texts_to_sequences(commands['text'])
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

data = pad_sequences(sequences, maxlen=MAX_SEQUENCE_LENGTH)
labels = to_categorical(np.asarray(commands['label']))
print('Shape of data tensor:', data.shape)
print('Shape of label tensor:', labels.shape)

142 unique tokens are found in our dataset. Next, we need to split the data into train and test sets. Random shuffling of indices is used to split the dataset into roughly 90% training data and the rest test data.
VALIDATION_SPLIT = 0.1

indices = np.arange(data.shape[0])
np.random.shuffle(indices)
data = data[indices]
labels = labels[indices]
num_validation_samples = int(VALIDATION_SPLIT * data.shape[0])

x_train = data[:-num_validation_samples]
y_train = labels[:-num_validation_samples]
x_val = data[-num_validation_samples:]
y_val = labels[-num_validation_samples:]

Model Building

We start by importing the necessary packages to build the model and creating an embedding layer.
from keras.layers import Dense, Input, GlobalMaxPooling1D
from keras.layers import Conv1D, MaxPooling1D, Embedding, Flatten
from keras.models import Model
from keras.models import Sequential
from keras.initializers import Constant

EMBEDDING_DIM = 60
num_words = min(MAX_NUM_WORDS, len(word_index) + 1)
embedding_layer = Embedding(num_words, EMBEDDING_DIM, input_length=MAX_SEQUENCE_LENGTH, trainable=True)

A Keras functional model is implemented. It has the following layers:
An input layer that takes the array of length 10 representing a sentence.
An embedding layer of dimension 60 whose weights can be updated during training. It helps to convert each word into a fixed-length dense vector of size 60. The input dimension is set as the size of the vocabulary and the output dimension is 60. Each word in the input will hence get represented by a vector of size 60.
Two convolutional layers (Conv1D) with 64 filters each, kernel size of 3, and relu activation.
A max-pooling layer(MaxPooling1D) with pool size 2. Max Pooling in CNN is an operation that selects the maximum element from the region of the input which is covered by the filter/kernel. Pooling reduces the dimensions of the output, but it retains the most important information.
A flatten layer to flatten the input without affecting batch size. If the input to the flatten layer is a tensor of shape 1 X 3 X 64, the output will be a tensor of shape 1 X 192.
A dense (fully connected) layer of 100 units and relu activation.
A dense layer of 26 units and softmax activation that outputs the final probabilities of belonging to each of the 26 classes. Softmax activation is used here since it goes best with categorical cross-entropy loss, which is the loss we are going to be using to train the model.
The model architecture is shown below :
Source: Created by Author
The code for building the model :
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(64, 3, activation='relu')(embedded_sequences)
x = Conv1D(64, 3, activation='relu')(x)
x = MaxPooling1D(2)(x)
x = Flatten()(x)
x = Dense(100, activation='relu')(x)
preds = Dense(27, activation='softmax')(x)
model = Model(sequence_input, preds)
model.summary()

The model is compiled with categorical cross-entropy loss and the rmsprop optimizer. Categorical cross-entropy is a loss function commonly used for multi-class classification tasks. The rmsprop optimizer is a gradient-based optimization technique that uses a moving average of squared gradients to normalize the gradient; this helps to overcome the vanishing gradients problem. Accuracy is used as the main performance metric. The model summary can be seen below:
Source: Author’s Jupyter Notebook
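The compile call is not shown in the snippet above; a minimal sketch consistent with the loss, optimizer, and metric the article describes:

model.compile(loss='categorical_crossentropy',  # matches the one-hot labels produced by to_categorical
              optimizer='rmsprop',              # the rmsprop optimizer described above
              metrics=['accuracy'])             # accuracy as the main performance metric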
Model Training and Evaluation

The model is trained for 30 epochs with batch size 50.
s = 0.0
for i in range(1, 50):
    model.fit(x_train, y_train, batch_size=50, epochs=30, validation_data=(x_val, y_val))
    # evaluate the model
    scores = model.evaluate(x_val, y_val, verbose=0)
    s = s + (scores[1] * 100)
# evaluate the model
scores = model.evaluate(x_val, y_val, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

The accuracy of our model comes out to be 94.87%! You can try improving the accuracy further by playing around with the model hyperparameters, further tuning the model architecture, or changing the train-test split ratio.
Using the model to classify a new unseen text command

We can use our trained model to classify new text commands not present in the dataset into one of the 26 different labels. Each new text has to be tokenized and padded before being fed as input to the model. The model.predict() function returns the probabilities of the data belonging to each of the 26 classes. The class with the greatest probability is the predicted class.
# new instance where we do not know the answer
sequences_new = tokenizer.texts_to_sequences(Xnew)
data = pad_sequences(sequences_new, maxlen=MAX_SEQUENCE_LENGTH)
# make a prediction
yprob = model.predict(data)
yclasses = yprob.argmax(axis=-1)
# show the inputs and predicted outputs
for i in range(10):
    print("X=%s, Predicted=%s" % (Xnew[i], yclasses[i]))

The output from the above code is:
Source: Author’s Jupyter notebook
Conclusion

To conclude, Natural Language Processing is a continuously expanding field filled with emerging technologies and applications. It has a massive impact in areas like chatbots, social media monitoring, recommendation systems, machine translation, etc. Now that you have learned how to use CNNs for text classification, go ahead and try to apply them in other areas of Natural Language Processing. The results might end up surprising you!
Thank you for reading.
Read here about NLP using CNNs for Sentence Classification!
Connect at: [email protected]
The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion.
Approaching Classification With Neural Networks
This article was published as a part of the Data Science Blogathon.
Introduction on Classification

Classification is one of the basic tasks that a machine can be trained to perform. This can include classifying whether it will rain today based on weather data, determining the expression of a person from a facial image, or the sentiment of a review based on its text, etc. Classification is extensively applied in various applications, making it one of the most fundamental tasks under supervised machine learning.

But before we start with the classification, let's get started…

About the Dataset

The dataset we are using to train our model is the Iris Dataset. This dataset consists of 150 samples belonging to 3 species of Iris flower, i.e. Iris Setosa, Iris Versicolour and Iris Virginica. This is a multi-variate dataset, i.e. there are 4 features provided for each sample: sepal length, sepal width, petal length and petal width. We need to use these 4 features and classify the type of iris species. Thus a multi-class classification model is used to train on this dataset. More information about this dataset can be found here.

Getting Started with Classification

Let's get started by first importing the required libraries.
import os
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras import losses
from tensorflow.keras import metrics
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
from tensorflow import keras
from sklearn.preprocessing import LabelEncoder

Check the version of TensorFlow installed by running:

print(tf.__version__)

Next, we need to download and extract the dataset from here. Then move it to the location of the notebook/script, or copy the location of the dataset. Now read the CSV file from that location:

file_path = 'iris_dataset.csv'
df = pd.read_csv(file_path)
df.head()

We can see that our dataset has 4 input features and 1 target variable. The target variable consists of 3 classes, i.e. 'Iris-setosa', 'Iris-versicolor' and 'Iris-virginica'. Now let's further prepare our dataset for model training.
Data Preparation

First, let's check if our dataset contains any null values.

print(df.isnull().sum())

There are no null values. Therefore we can continue to separate the inputs and targets.

X = df.drop('target', axis=1)
y = df['target']

Now that we have separated the input features (X) and target labels (y), let's split the dataset into training and validation sets. For this purpose let's use Scikit-Learn's train_test_split method.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print("The length of training dataset: ", len(X_train))
print("The length of validation dataset: ", len(X_test))

In the above code, we have split the dataset such that the validation data contains 20% of the randomly selected samples from the whole dataset. Let's now do some further processing before we create the model.
Data Processing

Since we have the data split ready, let's now do some basic processing like feature scaling and encoding labels. The input features contain attributes of the petal and sepal, i.e. length and width in centimetres. These features are numerical and need to be normalized, i.e. the data is transformed such that its mean is 0 and its standard deviation is 1.
Let’s use Scikit-learn’s StandardScalar module to do the same.
features_encoder = StandardScaler()
features_encoder.fit(X_train)

X_train = features_encoder.transform(X_train)
X_test = features_encoder.transform(X_test)

Now we should encode the categorical target labels. This is because our model won't be able to understand categories represented as strings. Therefore let's encode the labels using Scikit-learn's LabelEncoder module.

label_encoder = LabelEncoder()
label_encoder.fit(y_train)

y_train = label_encoder.transform(y_train).reshape(-1, 1)
y_test = label_encoder.transform(y_test).reshape(-1, 1)

Now let's check the shapes of the datasets:

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

Great! Now we are ready to define and train our model.
Creating the Model

Let's define the model for classification using the Keras Sequential API. We can stack the required layers and define the model architecture. For this model, let's use Dense layers to define the input, output and intermediate layers.

model = Sequential([
    layers.Dense(8, activation="relu", input_shape=(4,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(3, activation="softmax")
])

In the above model, we have defined 4 Dense layers. The output layer consists of 3 neurons, i.e. equal to the number of output labels present. We are using the softmax activation function at the final layer because it enables the model to provide probabilities for each of the labels. The output label that has the highest probability is the prediction determined by the model. In the other layers, we have used the relu activation function.

Now let's compile the model by defining the loss function, optimizer and metrics.
Now let’s compile the model by defining the loss function, optimizer and metrics.
model.compile(optimizer=optimizers.SGD(),
              loss=losses.SparseCategoricalCrossentropy(),
              metrics=[metrics.SparseCategoricalAccuracy()])
According to the above code, we have used SGD (Stochastic Gradient Descent) as the optimizer with its default learning rate of 0.01. The SparseCategoricalCrossentropy loss function is used. We are using SparseCategoricalCrossentropy rather than the CategoricalCrossentropy loss function because our output categories are in integer format; CategoricalCrossentropy would be the right choice when the categories are one-hot encoded. Finally, we are using SparseCategoricalAccuracy as the metric that is tracked.
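A small illustration (with made-up labels and predictions) of why the two losses expect different label formats, yet give the same value:

import numpy as np
from tensorflow.keras import losses

y_int = np.array([0, 2, 1])                                   # integer labels -> SparseCategoricalCrossentropy
y_onehot = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])        # one-hot labels -> CategoricalCrossentropy
y_pred = np.array([[0.8, 0.1, 0.1], [0.2, 0.2, 0.6], [0.1, 0.7, 0.2]])

print(losses.SparseCategoricalCrossentropy()(y_int, y_pred).numpy())
print(losses.CategoricalCrossentropy()(y_onehot, y_pred).numpy())   # same result, different label format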
Now let’s train the model…
Model Training and Evaluation

Now let's train our model using the processed training data for 200 epochs, and provide the test dataset for validation.

history = model.fit(x=X_train, y=y_train, epochs=200, validation_data=(X_test, y_test), verbose=0)

Now we have trained our model using the training dataset. Before evaluation, let's check the summary of the model we have defined.

# Check model summary
model.summary()

Now let's evaluate the model on the testing dataset.

# Perform model evaluation on the test dataset
model.evaluate(X_test, y_test)

Those are great results… Now let's define some helper functions to plot the accuracy and loss curves.

# Plot history
# Function to plot loss
def plot_loss(history):
    plt.plot(history.history['loss'], label='loss')
    plt.plot(history.history['val_loss'], label='val_loss')
    plt.ylim([0, 10])
    plt.xlabel('Epoch')
    plt.ylabel('Error (Loss)')
    plt.legend()
    plt.grid(True)

# Function to plot accuracy
def plot_accuracy(history):
    plt.plot(history.history['sparse_categorical_accuracy'], label='accuracy')
    plt.plot(history.history['val_sparse_categorical_accuracy'], label='val_accuracy')
    plt.ylim([0, 1])
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.grid(True)

Now let's pass the model training history and check the model's performance on the dataset.

plot_loss(history)
plot_accuracy(history)

We can see from the graphs below that the model has learnt over time to classify the different species almost accurately.
Save and Load Model

Since we have the trained model, we can export it for further use cases, deploy it in applications, or continue the training from where we left off. We can do this by using the save method and exporting the model in H5 format.

# Save the model
model.save("trained_classifier_model.h5")

We can load the saved model checkpoint by using the load_model method.

# Load the saved model and perform classification
loaded_model = models.load_model('trained_classifier_model.h5')

Now let's try to get predictions from the loaded model. Since the model uses softmax as the output activation function, we need to use the np.argmax() method to pick the class with the highest probability.

# The results the model returns are softmax outputs i.e. the probabilities of each class.
results = loaded_model.predict(X_test)
preds = np.argmax(results, axis=1)

Now we can evaluate the predictions by using metric functions.

# Predictions
print(accuracy_score(y_test, preds))
print(classification_report(y_test, preds))

Awesome! Our results match the previous ones.
Till now we have trained a deep neural network using TensorFlow to perform basic classification tasks using tabular data. By using the above method, we can train classifier models on any tabular dataset with any number of input features. By leveraging the different types of layers available in Keras, we can optimize and have more control over the model training, thus improving the metric performance. It is recommended to try replicating the above procedure on other datasets and experiment by changing different hyperparameters like learning rate, the number of layers, optimizers etc until we get desirable model performance.
Plant Seedlings Classification Using CNN – With Python Code
This article was published as a part of the Data Science Blogathon
Introduction

Hello Readers!!

In this article, we will classify plant seedlings from images. The dataset has 12 sets of images, and our ultimate aim is to classify the plant species from an image.

If you want to learn more about the dataset, check this Link. We are going to perform multiple steps such as importing the libraries and modules, reading images and resizing them, cleaning the images, preprocessing of images, model building, model training, reducing overfitting, and finally predictions on the testing dataset.
📌Check out my latest articles here
📌Solve Sudoku using Deep Learning, check here
Image Source
TABLE OF CONTENTS
PROBLEM STATEMENT
IMPORT LIBRARIES
GETTING THE DATA AND RESIZING THE IMAGES
CLEANING THE IMAGES AND REMOVING THE BACKGROUND
CONVERTING THE LABELS INTO NUMBERS
DEFINING OUR MODEL AND SPLITTING THE DATASET
PREVENTING OVERFITTING
DEFINING THE CONVOLUTIONAL NEURAL NETWORK
FITTING THE CNN ONTO THE DATA
CONFUSION MATRIX
GETTING PREDICTIONS
PROBLEM STATEMENT

This dataset is provided by the Aarhus University Signal Processing group. This is a typical image recognition problem statement. We are provided with a dataset of images that has plant photos at various stages of growth. Each photo has its unique id and filename. The dataset contains 960 unique plants that are from 12 plant species. The final aim is to build a classifier that is capable of determining the plant species from a photo.
List of Species
Black-grass
Charlock
Cleavers
Common Chickweed
Common wheat
Fat Hen
Loose Silky-bent
Maize
Scentless Mayweed
Shepherds Purse
Small-flowered Cranesbill
Sugar beet
IMPORT LIBRARIES

First, import all the necessary libraries for our further analysis. We are going to use NumPy, Pandas, matplotlib, OpenCV, Keras, and scikit-learn. Check the commands below for importing all the required libraries.

import numpy as np               # MATRIX OPERATIONS
import pandas as pd              # EFFICIENT DATA STRUCTURES
import matplotlib.pyplot as plt  # GRAPHING AND VISUALIZATIONS
import math                      # MATHEMATICAL OPERATIONS
import cv2                       # IMAGE PROCESSING - OPENCV
from glob import glob            # FILE OPERATIONS
import itertools

# KERAS AND SKLEARN MODULES
from keras.utils import np_utils
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers import BatchNormalization
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, CSVLogger
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# GLOBAL VARIABLES
scale = 70
seed = 7

GETTING THE DATA AND RESIZING THE IMAGES

For training our model, we need to read the data first. Our dataset has images of different sizes, so we are going to resize them. Reading the data and resizing the images are performed in a single step. Check the code below for complete information on how these operations are performed.

path_to_images = 'plant-seedlings-classification/train/*/*.png'
images = glob(path_to_images)

trainingset = []
traininglabels = []
num = len(images)
count = 1

# READING IMAGES AND RESIZING THEM
for i in images:
    print(str(count) + '/' + str(num), end='\r')
    trainingset.append(cv2.resize(cv2.imread(i), (scale, scale)))
    traininglabels.append(i.split('/')[-2])
    count = count + 1

trainingset = np.asarray(trainingset)
traininglabels = pd.DataFrame(traininglabels)

CLEANING THE IMAGES AND REMOVING THE BACKGROUND

Cleaning the images is a very important and intensive step. We will be performing the following steps in order to clean the images:
Convert the RGB images into the HSV
In order to remove the noise, we will have to blur the images
In order to remove the background, we will have to create a mask.
new_train = []
sets = []
getEx = True

for i in trainingset:
    blurr = cv2.GaussianBlur(i, (5, 5), 0)
    hsv = cv2.cvtColor(blurr, cv2.COLOR_BGR2HSV)
    # GREEN PARAMETERS
    lower = (25, 40, 50)
    upper = (75, 255, 255)
    mask = cv2.inRange(hsv, lower, upper)
    struc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, struc)
    boolean = mask > 0                      # boolean mask of the (green) plant pixels
    new = np.zeros_like(i, np.uint8)
    new[boolean] = i[boolean]
    new_train.append(new)
    if getEx:
        plt.subplot(2, 3, 1); plt.imshow(i)        # ORIGINAL
        plt.subplot(2, 3, 2); plt.imshow(blurr)    # BLURRED
        plt.subplot(2, 3, 3); plt.imshow(hsv)      # HSV CONVERTED
        plt.subplot(2, 3, 4); plt.imshow(mask)     # MASKED
        plt.subplot(2, 3, 5); plt.imshow(boolean)  # BOOLEAN MASKED
        plt.subplot(2, 3, 6); plt.imshow(new)      # NEW PROCESSED IMAGE
        plt.show()
        getEx = False

new_train = np.asarray(new_train)

# CLEANED IMAGES
for i in range(8):
    plt.subplot(2, 4, i + 1)
    plt.imshow(new_train[i])

CONVERTING THE LABELS INTO NUMBERS
The labels are strings and these are hard to process. So we’ll convert these labels into a binary classification.
The classification can be represented by an array of 12 numbers which will follow the condition:
0 if the species is not detected.
1 if the species is detected.
Example: If Blackgrass is detected, the array will be = [1,0,0,0,0,0,0,0,0,0,0,0]
labels = preprocessing.LabelEncoder()
labels.fit(traininglabels[0])
print('Classes' + str(labels.classes_))
encodedlabels = labels.transform(traininglabels[0])
clearalllabels = np_utils.to_categorical(encodedlabels)
classes = clearalllabels.shape[1]
print(str(classes))
traininglabels[0].value_counts().plot(kind='pie')

DEFINING OUR MODEL AND SPLITTING THE DATASET

In this step, we are going to split the training dataset for validation. We are using the train_test_split() function from scikit-learn. Here we are splitting the dataset keeping test_size=0.1, meaning 10% of the total data is used as testing data and the other 90% as training data. Check the code below for splitting the dataset.

new_train = new_train / 255
x_train, x_test, y_train, y_test = train_test_split(new_train, clearalllabels, test_size=0.1, random_state=seed, stratify=clearalllabels)

PREVENTING OVERFITTING

Overfitting is a problem in machine learning in which our model performs very well on training data but performs poorly on testing data.
The problem of overfitting is severe in deep learning where deep neural networks get overfitted. The problem of overfitting affects our end results badly.
To get rid of it, we need to reduce it. In this problem, we are using the ImageDataGenerator() function, which randomly changes the characteristics of the images and provides randomness in the data. Check the code below for how to reduce overfitting.

generator = ImageDataGenerator(rotation_range=180, zoom_range=0.1, width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True, vertical_flip=True)
generator.fit(x_train)

DEFINING THE CONVOLUTIONAL NEURAL NETWORK

Our dataset consists of images, so we can't use machine learning algorithms like linear regression, logistic regression, decision trees, etc. We need a deep neural network for the images. In this problem, we are going to use a convolutional neural network. This neural network takes images as input and provides the final output as a species value. We are using 4 convolution layers and 3 fully connected layers. We are also using multiple functions like Sequential(), Conv2D(), BatchNormalization, MaxPooling, Dropout, and Flatten.
We are using a convolutional neural network for training.
This model has 4 convolution layers.
This model has 3 fully connected layers.
np.random.seed(seed)
model = Sequential()

model.add(Conv2D(filters=64, kernel_size=(5, 5), input_shape=(scale, scale, 3), activation='relu'))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization(axis=3))
model.add(Dropout(0.1))

model.add(Conv2D(filters=128, kernel_size=(5, 5), activation='relu'))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(filters=128, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization(axis=3))
model.add(Dropout(0.1))

model.add(Conv2D(filters=256, kernel_size=(5, 5), activation='relu'))
model.add(BatchNormalization(axis=3))
model.add(Conv2D(filters=256, kernel_size=(5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization(axis=3))
model.add(Dropout(0.1))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(classes, activation='softmax'))

model.summary()

FITTING THE CNN ONTO THE DATA

Next we fit the CNN model onto our dataset so that the model learns from the training data and its weights get updated. This trained CNN model can then be used to get the final predictions on our testing dataset. There are some prerequisites we have to take care of, like reducing the learning rate, finding the best weights for the model, and saving these calculated weights so that we can use them further for testing and getting predictions.
We need the following as per our general knowledge
Best weights for the model
Reduce learning rate
Save the last weights of the model
lrr = ReduceLROnPlateau(monitor='val_acc', patience=3, verbose=1, factor=0.4, min_lr=0.00001)

filepath = "drive/DataScience/PlantReco/weights.best_{epoch:02d}-{val_acc:.2f}.hdf5"
checkpoints = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')

filepath = "drive/DataScience/PlantReco/weights.last_auto4.hdf5"
checkpoints_full = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=False, mode='max')

callbacks_list = [checkpoints, lrr, checkpoints_full]

# MODEL
# hist = model.fit_generator(datagen.flow(trainX, trainY, batch_size=75),
#                            epochs=35, validation_data=(testX, testY),
#                            steps_per_epoch=trainX.shape[0], callbacks=callbacks_list)

# LOADING MODEL
model.load_weights("../input/plantrecomodels/weights.best_17-0.96.hdf5")

dataset = np.load("../input/plantrecomodels/Data.npz")
data = dict(zip(("x_train", "x_test", "y_train", "y_test"), (dataset[k] for k in dataset)))
x_train = data['x_train']
x_test = data['x_test']
y_train = data['y_train']
y_test = data['y_test']

print(model.evaluate(x_train, y_train))  # Evaluate on train set
print(model.evaluate(x_test, y_test))    # Evaluate on test set

CONFUSION MATRIX

A confusion matrix is a way to check how our model performs on data. It is a good way to analyse the error in the model. Check the code below for the confusion matrix.

# PREDICTIONS
y_pred = model.predict(x_test)
y_class = np.argmax(y_pred, axis=1)
y_check = np.argmax(y_test, axis=1)

cmatrix = confusion_matrix(y_check, y_class)
print(cmatrix)

GETTING PREDICTIONS

In the final part, we are getting our predictions on the testing dataset. Check the code below for getting the predictions using the trained model.
path_to_test = '../input/plant-seedlings-classification/test/*.png'
pics = glob(path_to_test)

testimages = []
tests = []
count = 1
num = len(pics)

for i in pics:
    print(str(count) + '/' + str(num), end='\r')
    tests.append(i.split('/')[-1])
    testimages.append(cv2.resize(cv2.imread(i), (scale, scale)))
    count = count + 1

testimages = np.asarray(testimages)

newtestimages = []
sets = []
getEx = True

for i in testimages:
    blurr = cv2.GaussianBlur(i, (5, 5), 0)
    hsv = cv2.cvtColor(blurr, cv2.COLOR_BGR2HSV)
    lower = (25, 40, 50)
    upper = (75, 255, 255)
    mask = cv2.inRange(hsv, lower, upper)
    struc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (11, 11))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, struc)
    boolean = mask > 0                      # boolean mask of the plant pixels
    masking = np.zeros_like(i, np.uint8)
    masking[boolean] = i[boolean]
    newtestimages.append(masking)
    if getEx:
        plt.subplot(2, 3, 1); plt.imshow(i)
        plt.subplot(2, 3, 2); plt.imshow(blurr)
        plt.subplot(2, 3, 3); plt.imshow(hsv)
        plt.subplot(2, 3, 4); plt.imshow(mask)
        plt.subplot(2, 3, 5); plt.imshow(boolean)
        plt.subplot(2, 3, 6); plt.imshow(masking)
        plt.show()
        getEx = False

newtestimages = np.asarray(newtestimages)

# OTHER MASKED IMAGES
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.imshow(newtestimages[i])

newtestimages = newtestimages / 255
prediction = model.predict(newtestimages)

# PREDICTION TO A CSV FILE
pred = np.argmax(prediction, axis=1)
predStr = labels.classes_[pred]
result = {'file': tests, 'species': predStr}
result = pd.DataFrame(result)
result.to_csv("Prediction.csv", index=False)

End Notes

So in this article, we had a detailed discussion on Plant Seedlings Classification Using CNN. I hope you learned something from this blog and that it will help you in the future. Thanks for reading and for your patience. Good luck!
You can check my articles here: Articles
Email id: [email protected]
Connect with me on LinkedIn: LinkedIn.
The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion.
Rotate Image Without Cutting Off Sides Using Opencv Python
Rotating an image is the most basic operation in image editing. The python OpenCV library provides the methods cv2.getRotationMatrix2D(),cv2.rotate() to do this task very easily.
The cv2.rotate() function rotates the image only by 0, 90, 180 or 270 degrees, whereas cv2.getRotationMatrix2D() can rotate the image to any specified angle. In the article below, we will rotate the image without cropping or cutting off its sides using OpenCV Python.
To rotate an image using the cv2.getRotationMatrix2D() method then we need to follow below three steps −
First, we need to get the centre of rotation.
Next by using the getRotationMatrix2D() method, we need to create the 2D-rotation matrix.
Finally, by using the warpAffine() function in OpenCV, we need to apply the affine transformation to the image to correct the geometric distortions or deformations of the image.
Using the cv2.getRotationMatrix2D() function

The function creates a transformation matrix for the input image array, therefore it is used for rotating an image. If the value of the angle parameter is positive, then the image gets rotated in the counter-clockwise direction. If you want to rotate the image clockwise, then the angle needs to be negative.

Syntax

cv2.getRotationMatrix2D(center, angle, scale)

Parameters
center: Center of the rotation for the input image.
angle: The angle of rotation in degrees.
scale: An isotropic scale factor. Which scales the image up or down according to the value provided.
Example

Let’s take an example and rotate the image using the trigonometric functions of the math module.

import cv2
import math

def rotate_image(array, angle):
    height, width = array.shape[:2]
    image_center = (width / 2, height / 2)

    rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1)

    radians = math.radians(angle)
    sin = math.sin(radians)
    cos = math.cos(radians)
    bound_w = int((height * abs(sin)) + (width * abs(cos)))
    bound_h = int((height * abs(cos)) + (width * abs(sin)))

    rotation_mat[0, 2] += ((bound_w / 2) - image_center[0])
    rotation_mat[1, 2] += ((bound_h / 2) - image_center[1])

    rotated_mat = cv2.warpAffine(array, rotation_mat, (bound_w, bound_h))
    return rotated_mat

img = cv2.imread('Images/car.jpg', 1)
rotated_image = rotate_image(img, 256)
cv2.imshow('Rotated image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Input image

Output

The output rotated image is displayed below.
The input image is successfully rotated to the 256 degrees angle.
Example

In this example, we will rotate an image using cv2.getRotationMatrix2D() and Python's built-in abs() function.

import cv2

def rotate_image(arr, angle):
    height, width = arr.shape[:2]
    # get the image center
    image_center = (width / 2, height / 2)

    rotation_arr = cv2.getRotationMatrix2D(image_center, angle, 1)

    abs_cos = abs(rotation_arr[0, 0])
    abs_sin = abs(rotation_arr[0, 1])
    bound_w = int(height * abs_sin + width * abs_cos)
    bound_h = int(height * abs_cos + width * abs_sin)

    rotation_arr[0, 2] += bound_w / 2 - image_center[0]
    rotation_arr[1, 2] += bound_h / 2 - image_center[1]

    rotated_arr = cv2.warpAffine(arr, rotation_arr, (bound_w, bound_h))
    return rotated_arr

img = cv2.imread('Images/cat.jpg', 1)
rotated_image = rotate_image(img, 197)
cv2.imshow('Original image', img)
cv2.imshow('Rotated image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Original Image

Rotated Image

The input image is successfully rotated to the 197 degrees angle.
cv2.rotate()

The cv2.rotate() function rotates an image frame in multiples of 90 degrees (0, 90, 180 or 270). The function rotates the image in three different ways using the rotateCode = 0, 1 or 2 parameter.

Syntax

cv2.rotate(src, rotateCode[, dst])

Parameters
src: Input image
rotateCode: It specifies how to rotate the image.
dst: It is the output image of the same size and depth as the input image.
Returns

It returns a rotated image.
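The three integer rotate codes correspond to named constants in OpenCV, which are usually more readable; a small sketch (reusing the same image file as the example below):

import cv2

img = cv2.imread('Images/logo.jpg', 1)

# cv2.ROTATE_90_CLOCKWISE == 0, cv2.ROTATE_180 == 1, cv2.ROTATE_90_COUNTERCLOCKWISE == 2
rotated_90cw = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
rotated_180 = cv2.rotate(img, cv2.ROTATE_180)
rotated_90ccw = cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE)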
Example

In this example, the input image “Fruits.jpg” will be rotated 90 degrees in the anticlockwise direction.

import cv2
import numpy as np

img = cv2.imread('Images/logo.jpg', 1)
rotated_image = cv2.rotate(img, rotateCode=2)
cv2.imshow('Original image', img)
cv2.imshow('Rotated image', rotated_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

Original Image

Rotated Image

Using the np.rot90() function

The numpy.rot90() method is used to rotate an array by 90 degrees. If it is sufficient to rotate the input by only 90 degrees, this is a simple and easy way to do it.

Example

In this example, we will take an input rectangular image “car.jpg” with 850x315 dimensions.

import cv2
import numpy as np

img = cv2.imread('Images/car.jpg', 1)
rotated_image = np.rot90(img)
cv2.imwrite('Rotated image.jpg', rotated_image)
cv2.imshow('InputImage', img)
cv2.waitKey(0)

Original Image

Rotated Image

The method rotates the array from the first axis towards the second axis, so the given image is rotated in the anti-clockwise direction.
A Quick Introduction To K-Nearest Neighbor (KNN) Classification Using Python
This article was published as a part of the Data Science Blogathon.
Introduction

This article concerns one of the supervised ML classification algorithms: the KNN (K Nearest Neighbors) algorithm. It is one of the simplest and most widely used classification algorithms, in which a new data point is classified based on its similarity to a specific group of neighboring data points. This gives competitive results.
Working
For a given data point in the set, the algorithm finds the distances between this point and all other points in the dataset, picks the K data points closest to the initial point, and votes for the category that has the highest frequency among them. Usually, Euclidean distance is taken as the measure of distance. Thus the end resultant model is just the labeled data placed in a space. This algorithm is popularly known for various applications like genetics, forecasting, etc. The algorithm is best when more features are present, and it outperforms SVM in this case.
KNN can help reduce overfitting, but for that we need to choose the best value for K. So how do we choose K? Generally, we use the square root of the number of samples in the dataset as the value for K. An optimal value has to be found, since a lower value may lead to overfitting and a higher value may require heavy computation of distances. Using an error plot may help; another method is the elbow method. You can take the square-root rule or follow the elbow method, as sketched below.
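A minimal sketch of the error-plot (elbow) idea for choosing K, assuming scaled train/test splits like the X_train, X_test, y_train, y_test created later in this article:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

errors = []
for k in range(1, 21):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)                              # assumes the splits defined further below
    errors.append(np.mean(knn.predict(X_test) != y_test))  # misclassification rate for this K

plt.plot(range(1, 21), errors, marker='o')
plt.xlabel('K')
plt.ylabel('Error rate')
plt.show()  # pick K near the "elbow", where the error stops improving sharply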
Let’s dive deep into the different steps of K-NN for classifying a new data point
Step 1: Select the value of K neighbors(say k=5)
Step 2: Find the K (5) nearest data point for our new data point based on euclidean distance(which we discuss later)
Step 3: Among these K data points count the data points in each category
Step 4: Assign the new data point to the category that has the most neighbors of the new datapoint
Example
Let’s start the programming by importing essential libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import sklearn
Importing of the dataset and slicing it into independent and dependent variables
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [1, 2, 3]].values
y = dataset.iloc[:, -1].values
Since our dataset contains categorical (string) variables, we have to encode them using LabelEncoder.

from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
X[:, 0] = le.fit_transform(X[:, 0])
We are performing a train-test split on the dataset. We are setting the test size to 0.20, which means our training sample contains 320 records and our test sample contains 80 records.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
Next, we apply feature scaling to the training and test sets of independent variables to bring the values to a smaller, comparable range.

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
Now we have to create and train the K Nearest Neighbor model with the training set
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, y_train)
We are using 3 parameters in the model creation. n_neighbors is set to 5, which means 5 neighboring points are used for classifying a given point. The distance metric we are using is Minkowski; its equation is given below.
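The Minkowski distance between two points x and y is d(x, y) = (Σ |xi − yi|^p)^(1/p); with p = 2 it reduces to the Euclidean distance, and with p = 1 to the Manhattan distance. A tiny sketch with made-up points:

import numpy as np

def minkowski(x, y, p):
    return np.sum(np.abs(x - y) ** p) ** (1 / p)

x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
print(minkowski(x, y, 2))  # 5.0 -> the Euclidean distance
print(minkowski(x, y, 1))  # 7.0 -> the Manhattan distance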
As per the equation, we have to select the p-value also.
In our problem, we are choosing p as 2 (you can also set the metric to "euclidean").
Our Model is created, now we have to predict the output for the test set
y_pred = classifier.predict(X_test)
y_test
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 1], dtype=int64) array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1], dtype=int64)We can evaluate our model using the confusion matrix and accuracy score by comparing the predicted and actual test values
from sklearn.metrics import confusion_matrix,accuracy_score cm = confusion_matrix(y_test, y_pred) ac = accuracy_score(y_test,y_pred)
confusion matrix –
[[64  4]
 [ 3 29]]

The accuracy is 0.95.
# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Training the K-NN model on the Training set
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
ac = accuracy_score(y_test, y_pred)
The media shown in this article are not owned by Analytics Vidhya and is used at the Author’s discretion.