CV Assignment 2 Group02
##IMPORTS:
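The import cell is not reproduced in this export; the following is a representative set of imports consistent with the code used throughout the notebook (an assumption, not the original cell):
import os
import numpy as np
import pandas as pd
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
from google.colab import drive
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                                     MaxPooling2D, Flatten, Dense, Concatenate)
from tensorflow.keras.models import Model
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import load_img, img_to_array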
##STEP 1
Read the dataset and plot the dataset distribution
drive.mount('/content/drive')
Mounted at /content/drive
Checks for the existence of a specified dataset directory for butterfly images on a Google Drive
path. If the directory exists, it lists and prints the contents of the directory; if not, it outputs a
message indicating the directory was not found.
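A minimal sketch of this check, assuming the dataset directory used by the image paths later in the notebook:
dataset_path = '/content/drive/My Drive/cv/butterfly_dataset/'  # assumed location
if os.path.exists(dataset_path):
    print(os.listdir(dataset_path))  # list the directory contents
else:
    print("Dataset directory not found")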
Reads a CSV file from directory path (denoted by dataset_path) into a pandas DataFrame. This
DataFrame is used to load and structure the training dataset, which is utilized for training the
machine learning model.
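A loading sketch; the CSV filename is an assumption, since only dataset_path is referenced in the text:
train_df = pd.read_csv(dataset_path + 'Training_set.csv')  # hypothetical filename
print(train_df.head())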
Visualizes the distribution of butterfly species in a training dataset using a bar chart, with the
species names on the x-axis and their respective counts on the y-axis. It customizes the plot to
improve readability by setting the figure size, adding a title, labels, and adjusting the x-axis
labels to be centered and rotated for better visibility.
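A plotting sketch along these lines, assuming the species name is stored in the 'label' column (the same column used later for num_classes):
plt.figure(figsize=(15, 6))
train_df['label'].value_counts().plot(kind='bar')  # count of images per species
plt.title('Distribution of Butterfly Species in the Training Set')
plt.xlabel('Species')
plt.ylabel('Count')
plt.xticks(rotation=90, ha='center')  # rotated, centered x-axis labels
plt.tight_layout()
plt.show()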
train_image_paths = ['/content/drive/My Drive/cv/butterfly_dataset/train/' + fname
                     for fname in train_df['filename']]
##STEP 2
Extract local features (SIFT)
Defines a function to extract SIFT (Scale-Invariant Feature Transform) features from a list of image paths. It iterates through each image, reads it in grayscale, and uses the SIFT algorithm to detect keypoints and compute descriptors, which are appended to a list. Applied to the training dataset, this extracts local features from each image for further analysis or model input.
def extract_sift_features(image_paths):
    sift = cv2.SIFT_create()
    descriptors_list = []
    for path in image_paths:
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        keypoints, descriptors = sift.detectAndCompute(img, None)
        if descriptors is None:
            # No keypoints found: use an empty array so later aggregation still works
            descriptors = np.zeros((0, 128))
        descriptors_list.append(descriptors)
    return descriptors_list
train_descriptors = extract_sift_features(train_image_paths)
print("got local features")
got local features
##STEP 3
Extract global features (color histograms)
This function extracts color histograms from a collection of image paths, utilizing OpenCV for
image processing. For each image, it calculates a 3D histogram across the RGB color space with
a specified number of bins, normalizes the histogram, flattens it into a vector, and aggregates
these vectors into a list to represent the color distribution features of the images.
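The histogram code itself is not reproduced above; a sketch consistent with the description, assuming 8 bins per channel. That choice gives 8×8×8 = 512 histogram values, which together with the 128-dimensional mean SIFT descriptor matches the 640-dimensional fused vector reported in Step 5.
def extract_color_histograms(image_paths, bins=(8, 8, 8)):
    histograms = []
    for path in image_paths:
        img = cv2.imread(path)  # load the image (BGR)
        hist = cv2.calcHist([img], [0, 1, 2], None, bins,
                            [0, 256, 0, 256, 0, 256])  # 3D colour histogram
        hist = cv2.normalize(hist, hist).flatten()  # normalize and flatten to a vector
        histograms.append(hist)
    return histograms

train_histograms = extract_color_histograms(train_image_paths)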
##STEP 4
Fuse local and global features
Fusing the local (SIFT) and global (color histogram) features is achieved by concatenating their feature vectors. However, because these features have different dimensions and scales, normalization is necessary for effective fusion (see the sketch after the shape-verification code below).
This standardizes and condenses the feature representations, making them easier to integrate into machine learning models, particularly for tasks like image classification or recognition.
def aggregate_descriptors(descriptors_list):
    aggregated_descriptors = []
    for descriptors in descriptors_list:
        if len(descriptors) > 0:
            aggregated_descriptor = np.mean(descriptors, axis=0)
        else:
            # Assuming at least one image has non-empty descriptors
            aggregated_descriptor = np.zeros(descriptors_list[0].shape[1])
        aggregated_descriptors.append(aggregated_descriptor)
    return np.array(aggregated_descriptors)

# Aggregate each image's SIFT descriptors into a single mean vector
train_descriptors_aggregated = aggregate_descriptors(train_descriptors)
Diagnostic step to verify the dimensions of these datasets, ensuring they are suitable for further
processing or model input.
# Verify shapes
print(f"Aggregated Descriptors Shape: {train_descriptors_aggregated.shape}")
print(f"Color Histograms Shape: {np.array(train_histograms).shape}")
##STEP 5
Add convolution and fully connected layers
Load an image to determine its dimensions, then calculate the length of the fused feature vector and the number of unique classes from the dataframe train_df; these values are needed to configure the input and final layers of the neural network.
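A sketch of the dimension check, assuming the first training image is inspected (the choice of image is an assumption):
sample_img = cv2.imread(train_image_paths[0])
print(sample_img.shape)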
(224, 224, 3)
feature_vector_length = features_fused.shape[1]
num_classes = train_df['label'].nunique()
print(feature_vector_length , num_classes)
640 75
The model integrates Convolutional Neural Network (CNN) blocks for image processing with
externally fused features (e.g., SIFT + Color Histograms) to enhance learning from both image
textures and manually extracted features. It uses CNN layers to process images, concatenates
the CNN output with the fused feature vector, and finally passes through dense layers to classify
images into predefined categories.
# CNN Block
def cnn_block(input_layer, filters):
    x = Conv2D(filters=filters, kernel_size=(3, 3), padding='same')(input_layer)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    return x
# CNN Blocks
input_img = Input(shape=(224, 224, 3))  # image input; size from the shape check above
x = cnn_block(input_img, 32)
x = cnn_block(x, 64)
x = cnn_block(x, 128)
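# The assembly of the full model is not reproduced here; the following is a sketch
# consistent with the description above (the dense-layer size is an assumption).
features_input = Input(shape=(feature_vector_length,))  # fused SIFT + histogram vector

x = Flatten()(x)                                       # flatten the CNN feature maps
x = Concatenate()([x, features_input])                 # fuse CNN output with handcrafted features
x = Dense(256, activation='relu')(x)                   # assumed hidden layer size
output = Dense(num_classes, activation='softmax')(x)   # one unit per butterfly class

model = Model(inputs=[input_img, features_input], outputs=output)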
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
##STEP 6
Train and test the classifier performance (plot the loss accuracy graphs and confusion matrix)
labels = train_df['label'].values
Iterates through a directory of images, converting each image into a NumPy array and
aggregating them into a single 4D array. It's designed to handle and recover from potential
errors during the image loading process, ensuring that only successfully processed images are
included in the final dataset.
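A loading sketch consistent with that description; the 224x224 target size matches the image shape reported in Step 5, and the error handling is illustrative:
images = []
valid_indices = []
for i, path in enumerate(train_image_paths):
    try:
        img = img_to_array(load_img(path, target_size=(224, 224)))  # H x W x 3 array
        images.append(img)
        valid_indices.append(i)
    except Exception as e:
        print(f"Skipping {path}: {e}")  # recover from unreadable images
images_array = np.stack(images)  # shape: (num_images, 224, 224, 3)
# If any images failed to load, features_fused and the labels would need the
# same filtering (via valid_indices) so that all arrays stay aligned.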
# Initialize a LabelEncoder
label_encoder = LabelEncoder()
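# Fit the encoder on the string labels and one-hot encode them for the
# categorical cross-entropy loss used later (a sketch of the assumed encoding step).
labels_encoded = label_encoder.fit_transform(labels)
labels_onehot = to_categorical(labels_encoded, num_classes=num_classes)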
Divides the dataset into training and testing subsets, splitting the image data (images_array) and the externally fused features (features_fused) in the same proportion. It reserves 20% of the data for testing to evaluate the model's performance and uses a fixed random state for reproducible splits.
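A split along those lines, assuming the one-hot labels from the encoding step above (the seed value is an assumption):
X_train, X_test, features_train, features_test, y_train, y_test = train_test_split(
    images_array, features_fused, labels_onehot,
    test_size=0.2, random_state=42)  # 20% held out for testing, fixed seed for reproducibility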
The model is then trained on the training data (X_train and features_train) and the corresponding labels (y_train), with 20% of that data set aside for validation to monitor overfitting.
Training runs for 15 epochs with a batch size of 32, allowing iterative optimization and evaluation on unseen data during the training phase.
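A training call consistent with that description (a sketch; the hyperparameters are those stated above):
history = model.fit(
    [X_train, features_train], y_train,
    validation_split=0.2,  # 20% of the training data used for validation
    epochs=15,
    batch_size=32)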
Epoch 1/15
130/130 [==============================] - 767s 6s/step - loss: 13.9035 - accuracy: 0.0498 - val_loss: 5.0412 - val_accuracy: 0.0827
Epoch 2/15
130/130 [==============================] - 800s 6s/step - loss: 3.7303 - accuracy: 0.1330 - val_loss: 4.2940 - val_accuracy: 0.0962
Epoch 3/15
130/130 [==============================] - 813s 6s/step - loss: 3.3004 - accuracy: 0.2041 - val_loss: 3.4779 - val_accuracy: 0.1644
Epoch 4/15
130/130 [==============================] - 806s 6s/step - loss: 2.8412 - accuracy: 0.2847 - val_loss: 3.4505 - val_accuracy: 0.2135
Epoch 5/15
130/130 [==============================] - 767s 6s/step - loss: 2.4336 - accuracy: 0.3729 - val_loss: 3.0252 - val_accuracy: 0.2510
Epoch 6/15
130/130 [==============================] - 804s 6s/step - loss: 2.0999 - accuracy: 0.4374 - val_loss: 2.9357 - val_accuracy: 0.2942
Epoch 7/15
130/130 [==============================] - 788s 6s/step - loss: 1.7395 - accuracy: 0.5155 - val_loss: 2.6957 - val_accuracy: 0.3856
Epoch 8/15
130/130 [==============================] - 817s 6s/step - loss: 1.5285 - accuracy: 0.5670 - val_loss: 2.5848 - val_accuracy: 0.4067
Epoch 9/15
130/130 [==============================] - 803s 6s/step - loss: 1.2344 - accuracy: 0.6506 - val_loss: 2.5770 - val_accuracy: 0.4260
Epoch 10/15
130/130 [==============================] - 769s 6s/step - loss: 0.9910 - accuracy: 0.7110 - val_loss: 2.6132 - val_accuracy: 0.4038
Epoch 11/15
130/130 [==============================] - 801s 6s/step - loss: 0.8295 - accuracy: 0.7468 - val_loss: 2.6823 - val_accuracy: 0.4346
Epoch 12/15
130/130 [==============================] - 799s 6s/step - loss: 0.7428 - accuracy: 0.7704 - val_loss: 2.6877 - val_accuracy: 0.4346
Epoch 13/15
130/130 [==============================] - 798s 6s/step - loss: 0.5765 - accuracy: 0.8177 - val_loss: 2.8450 - val_accuracy: 0.4558
Epoch 14/15
130/130 [==============================] - 800s 6s/step - loss: 0.3781 - accuracy: 0.8803 - val_loss: 3.5168 - val_accuracy: 0.3827
Epoch 15/15
130/130 [==============================] - 763s 6s/step - loss: 0.3657 - accuracy: 0.8860 - val_loss: 3.0345 - val_accuracy: 0.4337
Apply the previously trained model to the test dataset, consisting of image data (X_test) and the
corresponding extracted features (features_test), to generate predictions (y_pred). The model
outputs the predicted classes or probabilities for each sample in the test dataset, depending on
its configuration.
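A prediction sketch, feeding both inputs to the trained model:
y_pred = model.predict([X_test, features_test])  # per-class probabilities for each test image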
Evaluate the model's performance on the test dataset by calculating the loss and accuracy, comparing the predicted outputs against the true labels (y_test).
Both the image data (X_test) and the extracted features (features_test) are fed to the model. The test accuracy (as a percentage) and the test loss are reported, providing insight into the model's generalization ability on unseen data.
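An evaluation sketch consistent with that description (the print format is illustrative):
test_loss, test_accuracy = model.evaluate([X_test, features_test], y_test)
print(f"Test Accuracy: {test_accuracy * 100:.2f}%")
print(f"Test Loss: {test_loss:.4f}")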
Visualize the model's training and validation loss and accuracy over epochs. It plots the
progression of training and validation metrics to help in assessing the model's learning
performance and generalization ability, concluding with a print statement displaying the test set
accuracy and loss.
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Loss Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Accuracy Over Epochs')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
Convert model predictions and true labels from one-hot encoded vectors to their class indices,
then compute and visualize the confusion matrix using a heatmap.
This matrix helps in evaluating the model's performance, highlighting the accuracy of
predictions across different classes by displaying the number of correct and incorrect
predictions.
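A sketch of that step, recovering class indices with argmax and plotting the confusion matrix as a heatmap (figure size and colour map are assumptions):
y_pred_classes = np.argmax(y_pred, axis=1)  # predicted class indices
y_true_classes = np.argmax(y_test, axis=1)  # true class indices from the one-hot labels

cm = confusion_matrix(y_true_classes, y_pred_classes)
plt.figure(figsize=(15, 12))
sns.heatmap(cm, cmap='Blues')  # counts of correct and incorrect predictions per class pair
plt.title('Confusion Matrix')
plt.xlabel('Predicted Class')
plt.ylabel('True Class')
plt.show()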