Machine Learning Assignment
Programming Homework 1, Spring 2015
Transformed Image
For each preprocessed image, each pixel corresponds to an attribute taking an integer value from 0 to 16. Thus, the data has 64 attributes. Although each attribute is an integer from 0 to 16, you must treat it as real-valued. There are 10 classes, one corresponding to each digit.
Note: The reason for the preprocessing is to normalize the data to help correct for small distortions and to reduce dimensionality. The resulting images provide a good approximation of the original images for classification purposes. Please visit http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits at the UCI Machine Learning Repository [Lichman, 2013] for more information about this data set.
You are provided five (5) data files.
optdigits_train.dat contains (a permuted version of) the original training data.
optdigits_train_trans.dat contains the transformed image version of the examples in the file optdigits_train.dat.
optdigits_test_trans.dat contains the transformed image version of the test examples.
optdigits_trial.dat contains an example of each digit from the original validation set.
optdigits_trial_trans.dat contains the transformed image version of the examples from optdigits_trial.dat.
Each file contains one data example per line. Each line contains 65 integers separated by a single space. The first 64 integers correspond to the values of the attributes (i.e., a number from 0 to 16) and the last integer is the example's class label (i.e., the corresponding digit 0, 1, . . . , 9). The training and test data sets have 3823 and 1797 examples, respectively.
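For concreteness, a minimal Python sketch for loading one of these files (assuming NumPy is available and the underscore-separated file names listed above) could look like this:

    import numpy as np

    def load_optdigits(path):
        """Load a data file with 65 space-separated integers per line:
        the first 64 are attribute values (0-16), the last is the digit label."""
        data = np.loadtxt(path)
        X = data[:, :64].astype(float)   # attributes, treated as real-valued
        y = data[:, 64].astype(int)      # class labels: digits 0..9
        return X, y

    # Example usage:
    # X_train, y_train = load_optdigits("optdigits_train_trans.dat")
    # X_test, y_test = load_optdigits("optdigits_test_trans.dat")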
You are asked to implement the technique of K-nearest neighbors classification using Euclidean
distance as the distance metric and apply it, using K = 1 and K = 3, to the data set of optical
handwritten digits described above. As a tie-breaking rule during classification, select the lowest
digit as the class label (i.e., if there is a tie between 3 and 7, pick 3 as the label).
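One possible way to implement this is sketched below in Python with NumPy; it is a minimal sketch, not the required implementation. Note that np.argmax returns the first index achieving the maximum vote count, which realizes the lowest-digit tie-breaking rule:

    import numpy as np

    def knn_predict(X_train, y_train, x, k):
        """Classify example x by majority vote among its k nearest training
        examples under Euclidean distance; ties go to the lowest digit."""
        dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))  # Euclidean distances
        nearest = np.argsort(dists, kind="stable")[:k]     # indices of the k closest
        votes = np.bincount(y_train[nearest], minlength=10)
        return int(np.argmax(votes))  # first (lowest) digit with the most votes

    # Example usage (data loaded as in the sketch above):
    # label = knn_predict(X_train, y_train, X_test[0], k=3)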
1-Nearest Neighbor
Here, you will evaluate the performance of a 1-nearest neighbor classifier by producing its learning
graph, which is a plot of the test error as a function of the number of training examples (i.e., the size
of the training set). In particular, compute and display/plot the learning graph for training sets consisting of the first m = 10, 50, 100, 500, 1000, 3823 (all) examples in the optdigits_train_trans.dat file. Recall that the test error is the misclassification error obtained from evaluating the classifier on the test data. The file optdigits_test_trans.dat contains the test data, which in this case is the transformed version of the images.
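A possible sketch of this computation (assuming the data has been loaded as above; the plotting details are up to you) is:

    import numpy as np
    import matplotlib.pyplot as plt

    def one_nn_test_errors(X_train, y_train, X_test, y_test, sizes):
        """Test misclassification error of a 1-NN classifier trained on the
        first m examples, for each m in sizes."""
        errors = []
        for m in sizes:
            Xm, ym = X_train[:m], y_train[:m]
            # squared Euclidean distances between every test and training example
            d2 = ((X_test ** 2).sum(axis=1)[:, None]
                  + (Xm ** 2).sum(axis=1)[None, :]
                  - 2.0 * X_test @ Xm.T)
            preds = ym[d2.argmin(axis=1)]        # label of the nearest neighbor
            errors.append(float(np.mean(preds != y_test)))
        return errors

    sizes = [10, 50, 100, 500, 1000, 3823]
    # errs = one_nn_test_errors(X_train, y_train, X_test, y_test, sizes)
    # plt.plot(sizes, errs, marker="o")
    # plt.xlabel("number of training examples (m)"); plt.ylabel("test error")
    # plt.savefig("learning_graph_1nn.png")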
3-Nearest Neighbors
For each image in the trial set, provide the indexes to the 3-nearest neighbor examples in the
training dataset. The index to an example is the row, or line number, of the file with the training
dataset.
1. For each transformed image example in the trial data set given in the optdigits_trial_trans.dat file, identify the indexes to the 3-nearest neighbors in the training dataset in the transformed space, using the training data file of transformed images given in optdigits_train_trans.dat.
2. List, in increasing order of Euclidean distance, the indexes to the 3-nearest neighbors you identified for each of the 10 exemplars in the trial dataset, and include their respective labels. Also provide the output labels of the corresponding 3-nearest-neighbors classifier for each of the trial examples. How many of the trial examples are correctly classified by the 3-nearest-neighbors classifier? (A sketch of this retrieval appears after this list.)
3. (Optional) Display the corresponding original 32x32-pixel, black-and-white images, which you can find in the optdigits_train.dat file, of the 3-nearest neighbors of each of the 10 exemplars in the trial data set, along with their respective labels. Display the images as a row, starting with the exemplar itself (whose original image you get from the file optdigits_trial.dat), and
followed by the original images for the 3-nearest neighbors, in increasing order of Euclidean
distance.
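A minimal sketch of the retrieval needed for items 1 and 2 (indexes reported as 1-based line numbers of the training file, as described above) could be:

    import numpy as np

    def three_nearest(X_train, y_train, x):
        """Return (1-based index, distance, label) of the 3 nearest training
        examples to x, in increasing order of Euclidean distance."""
        dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
        order = np.argsort(dists, kind="stable")[:3]
        return [(int(i) + 1, float(dists[i]), int(y_train[i])) for i in order]

    # Example usage over the 10 trial exemplars (data loaded as above):
    # X_trial, y_trial = load_optdigits("optdigits_trial_trans.dat")
    # for x in X_trial:
    #     print(three_nearest(X_train, y_train, x))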
NOTE: The first 1024 binary values of each line in the files optdigits_train.dat and optdigits_trial.dat encode the original bitmap images as bit vectors. You need to appropriately reshape that vector to obtain the original bitmap image in matrix form.
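For instance, the reshaping might look like the following sketch (assuming the 1024 bits of a line are stored row by row, left to right):

    import numpy as np
    import matplotlib.pyplot as plt

    def bitmap_image(line_values):
        """Reshape the first 1024 binary values of a line from
        optdigits_train.dat or optdigits_trial.dat into a 32x32 image."""
        bits = np.asarray(line_values[:1024], dtype=int)
        return bits.reshape(32, 32)  # assumes row-major (left-to-right) bit order

    # Example: display one original image (row loaded with np.loadtxt):
    # plt.imshow(bitmap_image(row), cmap="gray_r"); plt.show()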
(b) the training and test misclassification error rate together on a separate graph
as a function of the number of rounds of BackProp. Include these two graph plots, along
with a brief discussion of the results, in your report.
2. report the value of the error function on the training and test data, and the training and test misclassification error rates. Tabulate and include those values in your report.
3. evaluate each example in the trial data set on each neural network learned and report the
corresponding output classification.
The first two rows above correspond to the weight files for each consecutive layer and the error-related information for the best and last neural networks obtained during BackProp. The weight file best_nnet_W_1_0.dat contains the weight parameters between the first and second layer as a table with 10 rows and 65 columns (64 + 1 to account for the offset threshold parameter that is input to each unit of each layer). The error-information file best_nnet_err_final.dat contains a single
row of values of the error function on the training and test data, followed by the values of the training and test error rates of the neural network. The case is similar for the files last_nnet_W_1_0.dat and last_nnet_err_final.dat. (You will not be using the files last_nnet_W_1_0.dat and last_nnet_err_final.dat in this homework.) The last two rows of the table of files above correspond to the training and test error-function values and error rates achieved by the neural networks found at each iteration of BackProp. Each one of those files is a sequence of values (written as a single column in the file), one for each round of BackProp. You can use those values to plot the evolution of the respective error values during BackProp.
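For example, the per-round values could be read and plotted along the following lines; the file names below are placeholders, so substitute the actual per-round error-rate file names listed in the table above:

    import numpy as np
    import matplotlib.pyplot as plt

    # Placeholder file names: replace them with the per-round error-rate files
    # named in the table above (one value per row, one row per BackProp round).
    train_err = np.loadtxt("train_error_rate_per_round.dat")  # hypothetical name
    test_err = np.loadtxt("test_error_rate_per_round.dat")    # hypothetical name

    rounds = np.arange(1, len(train_err) + 1)
    plt.plot(rounds, train_err, label="training error rate")
    plt.plot(rounds, test_err, label="test error rate")
    plt.xlabel("round of BackProp")
    plt.ylabel("misclassification error rate")
    plt.legend()
    plt.savefig("backprop_error_rates.png")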
Neural-Network Classification Code. Given the weight-parameter files for a neural network obtained using the learning bash-shell script just described, we can use the neural network for classification. To do this, we first copy all the inputs of all the examples we want to classify into the file X.dat, one example per row, and one column per input feature. Suppose the neural-network architecture is a perceptron with 64 input units and 10 output units. Then copy the weight parameters into the file W_1_0.dat. (For example, if we want to classify using the best neural network found during BackProp in the example above, we would simply copy the content of the file best_nnet_W_1_0.dat into the file W_1_0.dat.) Then we execute the following bash-shell script
./classify_nnet_script [64 10]
which will produce the file Y.dat with the classification output that the neural network we are evaluating assigns to each of the examples in the file X.dat; the classification of each example in X.dat appears as a single column in Y.dat, in the same order as given in X.dat.
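As a usage sketch (assuming the transformed trial data has been loaded as in the nearest-neighbor part, and that Y.dat contains one predicted label per row), the trial examples could be classified and checked as follows:

    import shutil
    import subprocess
    import numpy as np

    # Write the inputs to classify: one example per row, one column per feature.
    np.savetxt("X.dat", X_trial.astype(int), fmt="%d")

    # Use the best network found during BackProp (64-input, 10-output perceptron).
    shutil.copy("best_nnet_W_1_0.dat", "W_1_0.dat")

    # Run the provided wrapper script; the argument form "[64 10]" is taken from
    # the handout, here assumed to mean the two arguments "64" and "10".
    subprocess.run(["./classify_nnet_script", "64", "10"], check=True)

    # Read the predicted labels and compare against the true trial labels.
    y_pred = np.loadtxt("Y.dat").astype(int)
    print("trial accuracy:", float(np.mean(y_pred == y_trial)))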
On the Neural-Network Black-box. The implementation of the actual neural-network algorithms is in a Matlab-like language called Octave. The wrapper assumes that you have installed either Octave or Matlab. You need to edit the line in the script corresponding to the call to Matlab or Octave, and replace it with the correct command call and path to the Matlab or Octave program installed in your system. (The source code was not tested in Matlab; it may need minor modifications to run in that environment.) Once you have specified a program and edited the script accordingly, the wrapper will call the corresponding program for you.
GNU Octave is freely distributed software. See http://www.gnu.org/software/octave/ for download information.
One-against-all Approach to Multiclass Problems: You will use a one-against-all approach to cast the L-class classification problem into L individual binary classification problems.
In this approach, for each class c, you will learn an SVM using the provided code to discriminate between class c and the rest, where class c examples correspond to the positive examples (y = +1), while examples of the other classes are the negative examples (y = -1). Suppose that the binary SVM classifier learned this way for class c is $H_c(x) = \mathrm{sign}(f_c(x))$, where
$$f_c(x) = \sum_{l=1}^{m} \alpha_l^{(c)} y^{(l)} K(x^{(l)}, x) + b^{(c)}.$$
Then, we build an overall multi-class classifier H as
$$H(x) \equiv H_{c^*}(x), \quad \text{where} \quad c^* \equiv \underset{c=1,\ldots,L}{\arg\max}\, f_c(x).$$
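A minimal sketch of this combination rule (assuming each binary SVM exposes a real-valued decision function f_c, however the provided SVM code makes it available) is:

    import numpy as np

    def one_against_all_predict(decision_functions, x):
        """decision_functions[c](x) returns f_c(x), the real-valued output of the
        binary SVM trained for class c against the rest. Returns the predicted
        class c* = argmax_c f_c(x), so that H(x) = H_{c*}(x)."""
        scores = np.array([f(x) for f in decision_functions])
        return int(np.argmax(scores))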
What to Turn In
You need to submit the following (electronically via Blackboard):
1. A written report (in PDF) that includes the information/plots/graphs/images requested
above along with a brief discussion of all your results. For example, for nearest-neighbors,
include the plots, the lists of indexes ordered by Euclidean distance, the classifiers' output labels, the answer to the question, and, for the optional part, a grid of images properly formatted per the layout
instructions given above, along with a brief discussion. In your discussion, compare and
contrast the different classifier models and learning algorithms based on your results.
2. All your code and executables (as a tarred-and-gzipped compressed file), with instructions on how to run your program. A platform-independent executable is preferred; otherwise, also provide instructions on how to compile your program. Please use standard tools, compilers, etc., generally available on most popular platforms.
Collaboration Policy: It is OK to discuss the homework with peers outside your own group,
but each group of students must write and turn in their own report, code, etc., based on their own
work. In addition, every member of the team must be able to answer any question on any aspect of
the submitted report and source code.
References
M. Lichman. UCI machine learning repository, 2013. URL http://archive.ics.uci.edu/ml.