

COMP3308/COMP3608 Artificial Intelligence

Week 10 Tutorial exercises


Support Vector Machines. Ensembles of Classifiers.
This week we have fewer tutorial exercises than usual. We will use the remaining time for questions
about Assignment 2.

Regarding Assignment 2: please do not underestimate the report! It is worth 12/24 marks = 50% of your
mark for this assignment. Students sometimes spend too much time on the code and then do not have
enough time left to write a good-quality report.

Exercise 1. (Homework)
This is a set of 5 multiple-choice questions to be answered online on Canvas. To complete the homework,
please go to Canvas -> Quizzes -> w10-homework-submission. Remember to press the “Submit” button at
the end. Only 1 attempt is allowed.

Here are the questions:


Q1. The problem of finding a decision boundary in SVM can be formulated as an optimisation problem
using Lagrange multipliers. What is the goal?
a) To maximize the margin of the decision boundary.
b) To minimize the margin of the decision boundary.

Q2. After SVM learning, each Lagrange multiplier lambda_i takes either a zero or a non-zero value. Which
one is correct?
a) A non-zero lambda_i indicates that example i is a support vector.
b) A zero lambda_i indicates that example i is a support vector.
c) A non-zero lambda_i indicates that the learning has not yet converged to a global minimum.
d) A zero lambda_i indicates that the learning process has identified support for example i.

Q3. In linear SVM, during training we compute dot products between:


a) training vectors
b) training and testing vectors
c) support vectors
d) support vectors and Lagrange multipliers

Q4. Bagging is only applicable to classification problems and cannot be applied to regression problems.
a) True
b) False

Q5. Boosting is guaranteed to improve the performance of the single classifier it uses.
a) True
b) False

Exercise 2. Bagging, Boosting and Random Forest


a) What are the similarities and differences between Bagging and Boosting?

b) What are the 2 main ideas that are combined in Random Forest?
COMP3308/3608 Artificial Intelligence, s1 2021

Exercise 3. Boosting
Continue the example from slides 24-25. The current probabilities of the training examples to be used by
AdaBoost are given below:

p(x1) p(x2) p(x3) p(x4) p(x5) p(x6) p(x7) p(x8) p(x9) p(x10)
0.07 0.07 0.07 0.07 0.07 0.07 0.07 0.17 0.17 0.17

From these probabilities, a training set T2 has been created, and a classifier C2 has been generated using
T2. Suppose that C2 misclassifies examples x2 and x9. Show how the probabilities of all training examples
are recalculated and normalized. (A code sketch of this reweighting step is given after the exercise.)
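
Here is a minimal Python sketch of one reweighting step, for checking your hand calculation. It assumes
the standard AdaBoost.M1-style update: the weighted error epsilon is the total probability of the
misclassified examples, correctly classified examples are scaled by beta = epsilon/(1 - epsilon), and all
probabilities are then renormalized. Verify this against the exact rule on slides 24-25 before relying on it.

# Minimal sketch of one AdaBoost reweighting step, assuming the
# AdaBoost.M1-style update; check it against the rule on slides 24-25.

def reweight(probs, misclassified):
    error = sum(probs[i] for i in misclassified)   # weighted error of C2
    beta = error / (1.0 - error)                   # scaling factor for correct examples
    new = [p if i in misclassified else p * beta
           for i, p in enumerate(probs)]
    total = sum(new)
    return [p / total for p in new]                # renormalize so the sum is 1

# Probabilities from the table above; x2 and x9 are misclassified (0-indexed: 1 and 8).
probs = [0.07, 0.07, 0.07, 0.07, 0.07, 0.07, 0.07, 0.17, 0.17, 0.17]
print([round(p, 3) for p in reweight(probs, {1, 8})])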

Exercises using Weka

Exercise 4. SVM in Weka – Binary classification


1. Load the diabetes data (diabetes.arff) or another binary classification dataset. The two classes in the
diabetes data are “tested_positive” and “tested_negative”.

2. Choose 10-fold cross-validation. Run the SVM (SMO, located under “functions”). This implementation of
SVM uses John Platt's sequential minimal optimization algorithm to solve the optimization problem (i.e. to
train the SVM classifier) and shows the computed decision boundary.

Examine the output.


a) What type of SVM is used – linear or non-linear? What type of kernel function is used? Where is the
equation of the decision boundary?

b) Let’s try a non-linear SVM. Edit the properties of SMO (by clicking on “SMO”). Keep the same type of
kernel (PolyKernel; it is a general type of kernel and we can use it to create both linear and non-linear SVMs)
but change the exponent from 1 to 2 (by clicking on “PolyKernel”). Now the kernel function is non-linear:
PolyKernel: K(x,y) = <x,y>^2.0. Run the classifier, and observe the new equation of the decision boundary
and the new accuracy. Compare the accuracy of the linear and non-linear SVM: which one is better?
Experiment with other types of kernels (e.g. other polynomial kernels, by increasing the exponent, or RBF).
A scikit-learn sketch of the linear vs. non-linear comparison is given at the end of this exercise.

3. Compare the SVM’s accuracy with that of the backpropagation neural network and the other classifiers
we have studied.
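
Weka is the tool for this exercise, but if you want to cross-check the linear vs. polynomial-kernel
comparison in code, here is a minimal scikit-learn sketch. It uses scikit-learn's built-in breast-cancer
data as a stand-in binary problem (the diabetes ARFF file is not bundled with scikit-learn); the kernels
and the 10-fold cross-validation mirror the Weka setup.

# Minimal scikit-learn sketch: linear SVM vs. degree-2 polynomial-kernel SVM,
# compared with 10-fold cross-validation on a stand-in binary dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

for name, params in [("linear", dict(kernel="linear")),
                     ("poly (exponent 2)", dict(kernel="poly", degree=2))]:
    clf = make_pipeline(StandardScaler(), SVC(**params))  # scale features, then SVM
    scores = cross_val_score(clf, X, y, cv=10)            # 10-fold CV accuracy
    print(f"{name}: {scores.mean():.3f}")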

Exercise 5. SVM in Weka – Multi-class classification


Load the iris data (iris.arff). Choose 10-fold cross-validation and run the SVM classifier.

a) Examine the output. Not 1 but 3 SVMs are generated. Do you know why?

b) Examining the output again, which of the two methods for handling multi-class problems does Weka use? (See the hint and sketch below.)
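
Hint: the two standard ways to build a multi-class classifier from binary SVMs are one-vs-rest (k binary
classifiers for k classes) and one-vs-one, i.e. pairwise (k(k-1)/2 classifiers); note that both happen to
give 3 binary SVMs when k = 3. The scikit-learn sketch below shows both strategies; compare with what
Weka's SMO output reports.

# Sketch of the two standard multi-class decompositions for binary SVMs.
# For the 3-class iris data, both strategies build 3 binary SVMs.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)  # k classifiers
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)   # k*(k-1)/2 classifiers
print("one-vs-rest binary SVMs:", len(ovr.estimators_))
print("one-vs-one binary SVMs: ", len(ovo.estimators_))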

Exercise 6. Ensembles of classifiers in Weka – Bagging, Boosting and Random Forest


1. Load the iris data (iris.arff). Choose 10-fold cross-validation. Run the decision tree (J48) as a single classifier.

2. Run bagging of decision trees (J48). It is under “meta”. The default base classifier is REPTree; you need
to change it to J48. Compare the performance with the single DT classifier from 1).

3. Run boosting (AdaBoostM1) of decision trees (J48). It is also under “meta”, and you need to change the
default base classifier to J48. Compare its performance with the single classifier from 1) and also with the
bagged classifier from 2).

4. Run Random Forest. It’s under “trees”. Note that Random Forest builds its own ensemble of random
trees, so there is no base classifier to change. Compare its performance with the previous classifiers.

5. Use AdaBoost with OneR as the base classifier. Experiment with different numbers of hypotheses M. What
happens to the accuracy on the training data as M increases? What about the accuracy on the test data? Is
this consistent with the AdaBoost theorem? (A scikit-learn sketch mirroring steps 1-5 follows.)
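
If you would like to reproduce steps 1-5 outside Weka, here is a minimal scikit-learn sketch. It mirrors
the comparison rather than Weka's exact algorithms: scikit-learn's decision trees stand in for J48, and a
depth-1 tree (a "decision stump") stands in for OneR as the AdaBoost base learner in step 5.

# Minimal scikit-learn sketch mirroring steps 1-5 (not Weka's exact algorithms:
# sklearn trees stand in for J48, a depth-1 stump stands in for OneR).
from sklearn.datasets import load_iris
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier()

models = {
    "single tree":   tree,
    "bagged trees":  BaggingClassifier(tree, n_estimators=10),
    "boosted trees": AdaBoostClassifier(tree, n_estimators=10),
    "random forest": RandomForestClassifier(n_estimators=10),
    # Step 5: boost a weak learner; vary n_estimators (M) and watch how
    # training accuracy and cross-validated accuracy change.
    "boosted stumps": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                         n_estimators=50),
}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold CV accuracy
    print(f"{name:14s} {scores.mean():.3f}")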

Exercise 7. Do you have any questions about Assignment 2? Ask your tutor!
