Assignment 02# - Machine Learning 2023

University of Management and Technology
Machine Learning
Assignment 02#
Submitted To:
Sir Hafiz Abdurrahman
Submitted By:
Ijaz ull haq
F2019266400
21/11/2023
How to Develop a CNN for MNIST Handwritten Digit Classification
MNIST is an acronym that stands for the Modified National Institute of Standards and
Technology dataset.
It is a dataset of small square 28×28 pixel grayscale images of handwritten single digits
between 0 and 9, with 60,000 images for training and a further 10,000 held out for testing.
The task is to classify a given image of a handwritten digit into one of 10 classes
representing integer values from 0 to 9, inclusive.
It is a widely used and well-understood dataset and, for the most part, is “solved.”
Top-performing models are deep learning convolutional neural networks that achieve a
classification accuracy above 99%, with an error rate between 0.4% and 0.2% on the
hold-out test dataset.
The example below loads the MNIST dataset using the Keras API and creates a plot of
the first nine images in the training dataset.
# example of loading the mnist dataset
from tensorflow.keras.datasets import mnist
from matplotlib import pyplot as plt
# load dataset
(trainX, trainy), (testX, testy) = mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
    # define subplot
    plt.subplot(330 + 1 + i)
    # plot raw pixel data
    plt.imshow(trainX[i], cmap=plt.get_cmap('gray'))
# show the figure
plt.show()
Model Evaluation Methodology
Although the MNIST dataset is effectively solved, it can be a useful starting point for
developing and practicing a methodology for solving image classification tasks using
convolutional neural networks.
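Instead of reviewing the literature on well-performing models on the dataset, we can
develop a new model from scratch.
To estimate how well such a model performs, we can use k-fold cross-validation, which
follows this general outline: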
# example of k-fold cv for a neural net
from sklearn.model_selection import KFold
data = ...
# prepare cross validation
kfold = KFold(5, shuffle=True, random_state=1)
# enumerate splits
for train_ix, test_ix in kfold.split(data):
    model = ...
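To make the splitting step concrete, here is a minimal sketch in plain Python of what a shuffled k-fold split produces. The helper name kfold_indices() is hypothetical and this is not scikit-learn's implementation (its shuffle order will differ); it only illustrates the idea that every sample lands in exactly one test fold.

```python
import random

def kfold_indices(n_samples, n_folds=5, seed=1):
    """Yield (train_ix, test_ix) index lists for shuffled k-fold splits."""
    indices = list(range(n_samples))
    # shuffle once with a fixed seed so the folds are reproducible
    random.Random(seed).shuffle(indices)
    # spread any remainder over the first folds so sizes differ by at most one
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_ix = indices[start:start + size]
        train_ix = indices[:start] + indices[start + size:]
        yield train_ix, test_ix
        start += size

# ten toy samples stand in for the real images
for train_ix, test_ix in kfold_indices(10):
    print('train=%d test=%d' % (len(train_ix), len(test_ix)))
```

With ten samples and five folds, each iteration trains on eight indices and tests on the remaining two.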
Load Dataset
We know some things about the dataset.
For example, we know that the images are all pre-aligned (e.g. each image only contains
a hand-drawn digit), that the images all have the same square size of 28×28 pixels, and
that the images are grayscale.
Therefore, we can load the images and reshape the data arrays to have a single color
channel.
# load dataset
(trainX, trainY), (testX, testY) = mnist.load_data()
# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))
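As a rough illustration of what adding a single color channel means, the sketch below uses nested Python lists instead of NumPy arrays. The helpers add_channel_axis() and shape() are hypothetical names introduced only for this example, and the tiny 3×3 images are stand-ins for the real 28×28 digits.

```python
def add_channel_axis(images):
    """Turn (n, height, width) nested lists into (n, height, width, 1)."""
    return [[[[pixel] for pixel in row] for row in image] for image in images]

def shape(nested):
    """Return the shape of a regular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# two hypothetical 3x3 "images" instead of real 28x28 digits
images = [[[0, 128, 255]] * 3, [[255, 128, 0]] * 3]
print(shape(images))                    # (2, 3, 3)
print(shape(add_channel_axis(images)))  # (2, 3, 3, 1)
```

The pixel values are unchanged; each one is simply wrapped in a trailing axis of length 1, which is the layout a 2D convolutional layer expects for grayscale input.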
We also know that there are 10 classes and that classes are represented as unique
integers.
We can, therefore, use a one-hot encoding for the class element of each sample,
transforming the integer into a 10-element binary vector with a 1 at the index of the class
value and 0 values for all other classes. We can achieve this with
the to_categorical() utility function.
# one hot encode target values
trainY = to_categorical(trainY)
testY = to_categorical(testY)
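For intuition, a one-hot encoding can be sketched by hand in a few lines of plain Python. The function one_hot() below is a hypothetical stand-in written for this example, not the Keras utility itself, and the labels are made up.

```python
def one_hot(labels, num_classes=10):
    """Encode integer class labels as binary vectors of length num_classes."""
    encoded = []
    for label in labels:
        vector = [0.0] * num_classes
        vector[label] = 1.0  # a single 1 at the index of the class value
        encoded.append(vector)
    return encoded

# hypothetical labels, not taken from the dataset
for label, vector in zip([5, 0, 4], one_hot([5, 0, 4])):
    print(label, vector)
```

Each label becomes a vector with exactly one non-zero entry, which matches the shape of the 10-node softmax output the model is trained against.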
The load_dataset() function implements these behaviors and can be used to load the
dataset.
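# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY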
Define Model
Next, we need to define a baseline convolutional neural network model for the problem.
The model has two main aspects: the feature extraction front end comprised of
convolutional and pooling layers, and the classifier backend that will make a prediction.
For the convolutional front-end, we can start with a single convolutional layer with a small
filter size (3,3) and a modest number of filters (32) followed by a max pooling layer. The
filter maps can then be flattened to provide features to the classifier.
Given that the problem is a multi-class classification task, we know that we will require an
output layer with 10 nodes in order to predict the probability distribution of an image
belonging to each of the 10 classes. This will also require the use of a softmax activation
function. Between the feature extractor and the output layer, we can add a dense layer to
interpret the features, in this case with 100 nodes.
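The role of the softmax activation can be sketched in plain Python: it turns the 10 raw output values into a probability distribution, and the predicted class is the index with the highest probability. The scores below are invented for illustration and do not come from a trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # subtract the maximum first for numerical stability
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

# hypothetical raw outputs of the 10 output nodes for one image
logits = [1.2, 0.3, -0.8, 2.5, 0.0, 4.1, -1.5, 0.7, 0.2, 1.0]
probs = softmax(logits)
predicted = probs.index(max(probs))
print('sum=%.3f predicted=%d' % (sum(probs), predicted))
```

Here the largest raw score sits at index 5, so the sketch predicts the digit 5.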
The define_model() function below will define and return this model.
# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
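It can help to trace how the data shape changes through these layers. The arithmetic below is a sketch that assumes the Keras defaults used above, namely no padding and a stride of 1 for the convolution, and a pooling stride equal to the pool size; conv_output() is a hypothetical helper written for this example.

```python
def conv_output(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling."""
    return (size - kernel) // stride + 1

side = conv_output(28, 3)                 # 3x3 convolution, stride 1
conv_shape = (side, side, 32)
side = conv_output(side, 2, stride=2)     # 2x2 max pooling, stride 2
pool_shape = (side, side, 32)
flat = side * side * 32                   # length of the flattened vector

conv_params = (3 * 3 * 1 + 1) * 32        # weights plus one bias per filter
dense_params = (flat + 1) * 100           # weights plus one bias per node
output_params = (100 + 1) * 10
total_params = conv_params + dense_params + output_params

print(conv_shape, pool_shape, flat)
print(total_params)
```

Under those assumptions the feature maps shrink from 28×28 to 26×26 and then 13×13, giving 5,408 features for the dense layer, which holds almost all of the model's 542,230 trainable parameters.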
Evaluate Model
After the model is defined, we need to evaluate it.
The model will be evaluated using five-fold cross-validation. The value of k=5 was chosen
to provide a baseline for repeated evaluation without being so large as to require a
long running time. Each test set will be 20% of the training dataset, or about 12,000
examples, close to the size of the actual test set for this problem.
The training dataset is shuffled prior to being split, and the same shuffling is performed
each time (the random seed is fixed), so that any model we evaluate will have the same
train and test datasets in each fold, providing an apples-to-apples comparison between
models.
The evaluate_model() function below implements these behaviors, taking the training
dataset as arguments and returning a list of accuracy scores and training histories that
can be later summarized.
# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # define model
        model = define_model()
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories
Complete Example
We need a function that will drive the test harness.
This involves calling all of the defined functions.
# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # evaluate model
    scores, histories = evaluate_model(trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)
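The harness prepares the pixel data before evaluation. As a small sketch of that scaling step, the plain-Python function below maps 8-bit grayscale values into the range 0 to 1; scale_pixels() is a hypothetical name and the patch of values is made up for illustration.

```python
def scale_pixels(rows):
    """Map 8-bit pixel intensities (0-255) to floats in the range 0.0-1.0."""
    return [[pixel / 255.0 for pixel in row] for row in rows]

# a hypothetical 2x3 patch of raw grayscale values
patch = [[0, 51, 255], [102, 204, 153]]
for row in scale_pixels(patch):
    print(['%.1f' % value for value in row])
```

Dividing by 255 keeps the relative brightness of every pixel while putting all inputs on a small, common scale.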
We now have everything we need; the complete code example for a baseline
convolutional neural network model on the MNIST dataset is listed below.
# baseline cnn model for mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot as plt
from sklearn.model_selection import KFold
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# evaluate a model using k-fold cross-validation
def evaluate_model(dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # define model
        model = define_model()
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        plt.subplot(2, 1, 1)
        plt.title('Cross Entropy Loss')
        plt.plot(histories[i].history['loss'], color='blue', label='train')
        plt.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        plt.subplot(2, 1, 2)
        plt.title('Classification Accuracy')
        plt.plot(histories[i].history['accuracy'], color='blue', label='train')
        plt.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    plt.show()

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    plt.boxplot(scores)
    plt.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # evaluate model
    scores, histories = evaluate_model(trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()
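To show how the per-fold scores are condensed into one summary line, here is a small sketch using only the Python standard library. The accuracy values are made up for illustration and are not results from running the model; summarize() is a hypothetical helper. It uses pstdev() because NumPy's std() computes the population standard deviation by default.

```python
from statistics import mean, pstdev

def summarize(scores):
    """Return mean and population standard deviation as percentages."""
    return mean(scores) * 100, pstdev(scores) * 100

# made-up fold accuracies for illustration only, not measured results
scores = [0.985, 0.987, 0.986, 0.988, 0.984]
avg, spread = summarize(scores)
print('Accuracy: mean=%.3f std=%.3f, n=%d' % (avg, spread, len(scores)))
```

A small standard deviation across folds suggests the estimate is stable; a large one suggests the score depends heavily on which samples end up in the test fold.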
