CV Assignment 2 Group02
Group No 02

Group Member Names:


1. MANASH JYOTI BARMAN 2022AA05102
2. MAHAJAN BARKHA 2022AA05044
3. RISHAB SHARMA 2022AA05051
4. DEEPANKAR KALITA 2022AA05159

Dataset Link: https://drive.google.com/drive/folders/1SGkY8kBdrIdFu-sjzNfkZxRvmg_MMb6i?usp=sharing

IMPORTS

from google.colab import drive

import pandas as pd
import os
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import numpy as np
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization,
                                     Activation, MaxPooling2D, Flatten,
                                     Dense, concatenate)
from tensorflow.keras import layers, models
from tensorflow.keras.models import Model
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
from PIL import Image

STEP 1
Read the dataset and plot the dataset distribution

drive.mount('/content/drive')

Mounted at /content/drive

Checks for the existence of a specified dataset directory for butterfly images on a Google Drive
path. If the directory exists, it lists and prints the contents of the directory; if not, it outputs a
message indicating the directory was not found.

dataset_path = '/content/drive/My Drive/cv/butterfly_dataset'

# Check if the directory exists
if os.path.exists(dataset_path):
    # List the contents of the directory
    contents = os.listdir(dataset_path)

    # Print the contents
    print("Contents of", dataset_path)
    for item in contents:
        print(item)
else:
    print("Directory not found:", dataset_path)

Contents of /content/drive/My Drive/cv/butterfly_dataset


Testing_set.csv
Training_set.csv
test
train

Reads the training CSV file from the dataset directory (dataset_path) into a pandas DataFrame. This
DataFrame holds the image filenames and their labels and is used to structure the training dataset for
the machine learning model.

train_df = pd.read_csv(dataset_path + '/Training_set.csv')

Visualizes the distribution of butterfly species in a training dataset using a bar chart, with the
species names on the x-axis and their respective counts on the y-axis. It customizes the plot to
improve readability by setting the figure size, adding a title, labels, and adjusting the x-axis
labels to be centered and rotated for better visibility.

# Plot dataset distribution


plt.figure(figsize=(10, 6))
sns.countplot(data=train_df, x='label')
plt.title('Dataset Distribution')
plt.xlabel('Butterfly Species')
plt.ylabel('Count')
# Rotate labels to 90 degrees and align them to center
plt.xticks(rotation=90, ha='center')
plt.tight_layout()
plt.show()
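As a quick numeric complement to the bar chart, the per-class counts can also be printed directly from the DataFrame. This is an optional sketch, not part of the original notebook, reusing train_df and its 'label' column:

# Optional: print how many images each species has
class_counts = train_df['label'].value_counts()
print(class_counts.head(10))                 # ten most frequent species
print("Total classes:", class_counts.shape[0])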
STEP 2
Extract local features

We use SIFT (Scale-Invariant Feature Transform) to extract local features.

train_image_paths = [
    '/content/drive/My Drive/cv/butterfly_dataset/train/' + fname
    for fname in train_df['filename']
]

Defines a function to extract SIFT (Scale-Invariant Feature Transform) features from a list of
image paths. It iterates through each image, converts it to grayscale, and uses the SIFT
algorithm to detect and compute descriptors, which are then appended to a list. This process is
applied to a training dataset, enabling the extraction of local features from each image for
further analysis or model input.

def extract_sift_features(image_paths):
    sift = cv2.SIFT_create()
    descriptors_list = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints, descriptors = sift.detectAndCompute(img, None)
        descriptors_list.append(descriptors)
    return descriptors_list

train_descriptors = extract_sift_features(train_image_paths)
print("got local features")
got local features
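As a sanity check on the detector, the keypoints of a single image can be drawn with OpenCV. The sketch below is illustrative only and is not part of the pipeline; it reuses train_image_paths from above.

# Illustrative only: visualize SIFT keypoints on the first training image
sample = cv2.imread(train_image_paths[0], cv2.IMREAD_GRAYSCALE)
sift_vis = cv2.SIFT_create()
kp = sift_vis.detect(sample, None)
vis = cv2.drawKeypoints(sample, kp, None,
                        flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
plt.imshow(vis, cmap='gray')
plt.title(f'{len(kp)} SIFT keypoints')
plt.axis('off')
plt.show()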

STEP 3

Extract global features

Color histograms capture the color distribution of images.

This function extracts color histograms from a collection of image paths, using OpenCV for
image processing. For each image, it calculates a 3D histogram across the three color channels
(BGR, as loaded by cv2.imread) with a specified number of bins, normalizes the histogram, flattens it
into a vector, and collects these vectors into a list that represents the color distribution of each image.
With the default bins=(8, 8, 8), each flattened vector has 8 x 8 x 8 = 512 entries, which matches the
histogram shape reported in STEP 4.

def extract_color_histograms(image_paths, bins=(8, 8, 8)):
    histograms = []
    for path in image_paths:
        img = cv2.imread(path)
        hist = cv2.calcHist([img], [0, 1, 2], None, bins,
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        histograms.append(hist)
    return histograms

# Extract color histograms


train_histograms = extract_color_histograms(train_image_paths)
print("got global features")

got global features

STEP 4
Fuse local and global features

Fusing local (SIFT) and global (color histogram) features is achieved by concatenating their
feature vectors. However, because these features have different dimensions and scales,
normalization is necessary for effective fusion.

This standardizes and condenses feature representations, facilitating their integration into
machine learning models, particularly for tasks like image classification or recognition.

def aggregate_descriptors(descriptors_list):
    aggregated_descriptors = []
    for descriptors in descriptors_list:
        if descriptors is not None and len(descriptors) > 0:
            # Average all SIFT descriptors of an image into a single 128-D vector
            aggregated_descriptor = np.mean(descriptors, axis=0)
        else:
            # No keypoints found for this image: fall back to a zero vector
            # (SIFT descriptors are 128-dimensional)
            aggregated_descriptor = np.zeros(128)
        aggregated_descriptors.append(aggregated_descriptor)
    return np.array(aggregated_descriptors)

# Aggregate SIFT descriptors
train_descriptors_aggregated = aggregate_descriptors(train_descriptors)

Diagnostic step to verify the dimensions of these datasets, ensuring they are suitable for further
processing or model input.

# Verify shapes
print(f"Aggregated Descriptors Shape: {train_descriptors_aggregated.shape}")
print(f"Color Histograms Shape: {np.array(train_histograms).shape}")

Aggregated Descriptors Shape: (6499, 128)


Color Histograms Shape: (6499, 512)

Horizontally stacks the two arrays, train_descriptors_aggregated and train_histograms, merging
them into a single array named features_fused. This enriches the dimensionality and informational
content of the input available for model training.

# Shapes are now compatible for horizontal stacking
features_fused = np.hstack((train_descriptors_aggregated, np.array(train_histograms)))
print(f"Fused feature Shape: {np.array(features_fused).shape}")

Fused feature Shape: (6499, 640)
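StandardScaler is imported but not applied in the notebook as shown. If the local and global feature blocks were to be brought to a comparable scale before fusion, as the note at the start of this step suggests, one possible sketch (an assumption, not the submitted approach) would be:

# Optional sketch: standardize each feature block before stacking
scaler_sift = StandardScaler()
scaler_hist = StandardScaler()
sift_scaled = scaler_sift.fit_transform(train_descriptors_aggregated)
hist_scaled = scaler_hist.fit_transform(np.array(train_histograms))
features_fused_scaled = np.hstack((sift_scaled, hist_scaled))
print(features_fused_scaled.shape)  # still (6499, 640)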

STEP 5
Add convolution and fully connected (FC) layers

Load an image to determine the input dimensions, then compute the length of the fused feature vector
and the number of unique classes in train_df; the class count is needed to configure the final layer of
the neural network.

# Load image to determine the input shape


image = Image.open(dataset_path + '/train/Image_1.jpg')

# Get image dimensions


width, height = image.size
channels = 3 # The image is in RGB format

input_shape = (height, width, channels)


print(input_shape)

(224, 224, 3)

feature_vector_length = features_fused.shape[1]
num_classes = train_df['label'].nunique()

print(feature_vector_length, num_classes)

640 75

The model integrates Convolutional Neural Network (CNN) blocks for image processing with
externally fused features (e.g., SIFT + Color Histograms) to enhance learning from both image
textures and manually extracted features. It uses CNN layers to process images, concatenates
the CNN output with the fused feature vector, and finally passes through dense layers to classify
images into predefined categories.

# CNN Block
def cnn_block(input_layer, filters):
    x = Conv2D(filters=filters, kernel_size=(3, 3), padding='same')(input_layer)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    return x

# Define the CNN model


input_img = Input(shape=(input_shape)) # Image input

# CNN Blocks
x = cnn_block(input_img, 32)
x = cnn_block(x, 64)
x = cnn_block(x, 128)

# Flatten the output of the CNN


cnn_output = Flatten()(x)

# A second input layer (named 'fused_features_input') receives the externally
# fused features so they can be concatenated with cnn_output.
fused_features_input = Input(shape=(640,), name='fused_features_input')

# Concatenate the CNN output and the fused features


combined_input = concatenate([cnn_output, fused_features_input])

# Fully Connected Layers


x = Dense(512, activation='relu')(combined_input)
x = Dense(256, activation='relu')(x)
output = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=[input_img, fused_features_input], outputs=output)
The code compiles the model with the Adam optimizer and sets categorical crossentropy as the
loss function, targeting multi-class classification tasks. It also specifies accuracy as the metric to
monitor the model's performance during training and evaluation.

model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
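After compiling, model.summary() can be called to confirm the two-input architecture and parameter counts. This is an optional check, not shown in the original notebook.

# Optional: inspect the layer graph and parameter counts
model.summary()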

STEP 6

Train and test the classifier performance (plot the loss accuracy graphs and confusion matrix)

labels = train_df['label'].values

Iterates through a directory of images, converting each image into a NumPy array and
aggregating them into a single 4D array. It's designed to handle and recover from potential
errors during the image loading process, ensuring that only successfully processed images are
included in the final dataset.

# Directory path where the images are located
train_path = os.path.join(dataset_path, 'train')

# Initialize an empty list to store the images
images = []

# Loop through image names
for i in range(1, 6500):
    try:
        # Construct the image path
        image_path = os.path.join(train_path, f'Image_{i}.jpg')

        # Open the image using PIL
        img = Image.open(image_path)

        # Convert the image to a NumPy array
        img_array = np.array(img)

        # Append the image array to the list of images
        images.append(img_array)
    except Exception as e:
        # If there's any error loading or converting the image, print the error message
        print(f"Error loading image {image_path}: {e}")

# Convert the list of images to a NumPy array
images_array = np.array(images)

# Now, 'images_array' is a 4D NumPy array containing all the images in the
# specified directory with shape (number_of_images, height, width, channels)
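Stacking with np.array only yields a clean 4D array if every image has the same height, width, and channel count. A brief hedged check (not in the original notebook) that all loaded images match the (224, 224, 3) shape found in STEP 5:

# Optional sanity check: confirm every loaded image has the expected shape
expected_shape = (224, 224, 3)
bad = [i for i, arr in enumerate(images) if arr.shape != expected_shape]
print(f"{len(bad)} images deviate from {expected_shape}")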
Encodes the string species labels into integer indices with LabelEncoder, then converts them into
one-hot vectors with to_categorical, producing the target format expected by the categorical
cross-entropy loss used for training the machine learning model.

# Initialize a LabelEncoder
label_encoder = LabelEncoder()

# Fit label encoder and return encoded labels


encoded_labels = label_encoder.fit_transform(labels)

# 'encoded_labels' contains numerical labels for classes


# 'encoded_labels' for one-hot encoding
labels_encoded = to_categorical(encoded_labels)
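To see what the encoding produced, the mapping between species names and integer indices can be inspected. This is an optional sketch using the label_encoder and labels_encoded created above.

# Optional: inspect the label encoding
print("Number of classes:", len(label_encoder.classes_))
print("First few classes:", label_encoder.classes_[:5])
print("One-hot label shape:", labels_encoded.shape)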

Divides the dataset into training and testing subsets; the image data (images_array), the one-hot
labels, and the externally fused features (features_fused) are split in the same proportion so they
stay aligned. It reserves 20% of the data for testing to evaluate the model's performance and uses a
fixed random state for reproducible splits.

# Split data into training and testing sets
X_train, X_test, y_train, y_test, features_train, features_test = train_test_split(
    images_array, labels_encoded, features_fused, test_size=0.2, random_state=42)
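A brief shape check (optional, not part of the original notebook) confirms that the images, labels, and fused features stayed aligned through the split:

# Optional: verify the three splits line up
print(X_train.shape, features_train.shape, y_train.shape)
print(X_test.shape, features_test.shape, y_test.shape)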

The model is trained on the training data (X_train and features_train) and the corresponding labels
(y_train), with 20% of that data held out for validation to monitor overfitting.

Training runs for 15 epochs with a batch size of 32, allowing iterative optimization and evaluation
on unseen data during the training phase.

history = model.fit([X_train, features_train], y_train,
                    validation_split=0.2, epochs=15, batch_size=32)

Epoch 1/15
130/130 [==============================] - 767s 6s/step - loss:
13.9035 - accuracy: 0.0498 - val_loss: 5.0412 - val_accuracy: 0.0827
Epoch 2/15
130/130 [==============================] - 800s 6s/step - loss: 3.7303
- accuracy: 0.1330 - val_loss: 4.2940 - val_accuracy: 0.0962
Epoch 3/15
130/130 [==============================] - 813s 6s/step - loss: 3.3004
- accuracy: 0.2041 - val_loss: 3.4779 - val_accuracy: 0.1644
Epoch 4/15
130/130 [==============================] - 806s 6s/step - loss: 2.8412
- accuracy: 0.2847 - val_loss: 3.4505 - val_accuracy: 0.2135
Epoch 5/15
130/130 [==============================] - 767s 6s/step - loss: 2.4336
- accuracy: 0.3729 - val_loss: 3.0252 - val_accuracy: 0.2510
Epoch 6/15
130/130 [==============================] - 804s 6s/step - loss: 2.0999
- accuracy: 0.4374 - val_loss: 2.9357 - val_accuracy: 0.2942
Epoch 7/15
130/130 [==============================] - 788s 6s/step - loss: 1.7395
- accuracy: 0.5155 - val_loss: 2.6957 - val_accuracy: 0.3856
Epoch 8/15
130/130 [==============================] - 817s 6s/step - loss: 1.5285
- accuracy: 0.5670 - val_loss: 2.5848 - val_accuracy: 0.4067
Epoch 9/15
130/130 [==============================] - 803s 6s/step - loss: 1.2344
- accuracy: 0.6506 - val_loss: 2.5770 - val_accuracy: 0.4260
Epoch 10/15
130/130 [==============================] - 769s 6s/step - loss: 0.9910
- accuracy: 0.7110 - val_loss: 2.6132 - val_accuracy: 0.4038
Epoch 11/15
130/130 [==============================] - 801s 6s/step - loss: 0.8295
- accuracy: 0.7468 - val_loss: 2.6823 - val_accuracy: 0.4346
Epoch 12/15
130/130 [==============================] - 799s 6s/step - loss: 0.7428
- accuracy: 0.7704 - val_loss: 2.6877 - val_accuracy: 0.4346
Epoch 13/15
130/130 [==============================] - 798s 6s/step - loss: 0.5765
- accuracy: 0.8177 - val_loss: 2.8450 - val_accuracy: 0.4558
Epoch 14/15
130/130 [==============================] - 800s 6s/step - loss: 0.3781
- accuracy: 0.8803 - val_loss: 3.5168 - val_accuracy: 0.3827
Epoch 15/15
130/130 [==============================] - 763s 6s/step - loss: 0.3657
- accuracy: 0.8860 - val_loss: 3.0345 - val_accuracy: 0.4337

Apply the trained model to the test dataset, consisting of image data (X_test) and the
corresponding extracted features (features_test), to generate predictions (y_pred). Because the
final layer uses softmax, the model outputs a probability distribution over the classes for each
sample in the test dataset.

#Applying the learned model on the test data to make predictions


y_pred = model.predict([X_test, features_test])

41/41 [==============================] - 57s 1s/step
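Since the final layer is softmax, y_pred holds class probabilities. The predicted species names can be recovered with argmax plus the fitted label encoder; the sketch below is illustrative only.

# Optional: map predicted probabilities back to species names
pred_indices = np.argmax(y_pred, axis=1)
pred_names = label_encoder.inverse_transform(pred_indices)
print(pred_names[:5])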

Evaluate the model's performance on the test dataset by calculating the loss and accuracy,
comparing the predicted outputs against the true labels (y_test).

It feeds both the image data (X_test) and the extracted features (features_test) to the model. The
test accuracy (as a percentage) and the test loss are shown, providing insight into the model's
generalization ability on unseen data.

# Evaluate the model
test_loss, test_accuracy = model.evaluate([X_test, features_test], y_test)
print(f"Test accuracy: {test_accuracy*100:.2f}%, Test loss: {test_loss}")

41/41 [==============================] - 57s 1s/step - loss: 3.3029 - accuracy: 0.4269
Test accuracy: 42.69%, Test loss: 3.3028740882873535

Visualize the model's training and validation loss and accuracy over epochs. Plotting the
progression of both metrics helps in assessing the model's learning behavior and its
generalization ability.

# Plotting the training/validation loss and accuracy


import matplotlib.pyplot as plt

plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Accuracy Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.show()
Convert the predicted probability vectors and the one-hot encoded true labels to class indices,
then compute and visualize the confusion matrix using a heatmap.

This matrix helps in evaluating the model's performance, highlighting the accuracy of
predictions across different classes by displaying the number of correct and incorrect
predictions.

# Convert predicted probabilities to class indices
Y_pred_classes = np.argmax(y_pred, axis=1)
# Convert one-hot test labels to class indices
Y_true = np.argmax(y_test, axis=1)

# Compute the confusion matrix


confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)

# Plot the confusion matrix


plt.figure(figsize=(10,8))
sns.heatmap(confusion_mtx, annot=True, fmt='d')
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix')
plt.show()
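Beyond the confusion matrix, per-class precision, recall, and F1 can be summarized with scikit-learn's classification_report. This is an optional addition reusing Y_true, Y_pred_classes, and the fitted label encoder.

# Optional: per-class precision/recall/F1 summary
from sklearn.metrics import classification_report
print(classification_report(Y_true, Y_pred_classes,
                            labels=np.arange(num_classes),
                            target_names=label_encoder.classes_))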
