Homework 2


This assignment reviews the principles of linear classifiers from Chapter 3.


Text Problems:

3.4


3.15


Computer Problems:

Use the Australian Crab dataset (Text file; Matlab version). The rows of this version are in randomized order. (Ref: Toolbox section http://www.public.iastate.edu/~dicook/ggobi-book/ggobi.html). The columns of the data are 'sp', 'sex', 'index', 'FL', 'RW', 'CL', 'CW', 'BD'. Species (sp) is either 1 or 2, sex is either 1 or 2, and index is the number of the data point in a log book. FL is the frontal lobe size, RW is the rear width of the shell, CL is the carapace (the shell covering the body) length, CW is the carapace width, and BD is the body depth. Each row is a data point.

 


Two Class Problem:

The two classes are the two species of crab. Use the measurement data (FL, RW, CL, CW, BD) to classify the crabs into one of the species. Normalize your data either by mapping each feature linearly into the range [-1, 1] or by “sphering the data” (subtract the mean of each feature and divide by its standard deviation). Use 75% of the data as training data and 25% as test data.
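The two normalization options and the 75/25 split can be sketched as follows. This is a minimal illustration in plain Python, not code supplied with the course; the function names are hypothetical, and fitting the scaling parameters on the training rows only is an assumption the assignment does not state.

```python
import random
import statistics


def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_minmax(train):
    """Per-feature (min, max) computed from the training rows."""
    return [(min(col), max(col)) for col in zip(*train)]


def apply_minmax(rows, params):
    """Map each feature linearly so the training min goes to -1 and the max to +1."""
    return [
        [2.0 * (x - lo) / (hi - lo) - 1.0 if hi > lo else 0.0
         for x, (lo, hi) in zip(row, params)]
        for row in rows
    ]


def fit_sphere(train):
    """Per-feature (mean, sample standard deviation) from the training rows."""
    return [(statistics.mean(col), statistics.stdev(col)) for col in zip(*train)]


def apply_sphere(rows, params):
    """Subtract the mean and divide by the standard deviation, feature by feature."""
    return [
        [(x - mu) / sd if sd > 0 else 0.0 for x, (mu, sd) in zip(row, params)]
        for row in rows
    ]
```

Note that sphered values are centered at zero with unit spread but are not guaranteed to lie inside [-1, 1], and test points scaled with training-set minima and maxima can fall slightly outside that range as well.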

Use the single-sample perceptron to learn the hyperplane between the data sets. Give your results in a confusion matrix.

Perceptron and other linear classifiers Example Code
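For orientation, a single-sample (fixed-increment) perceptron and a confusion matrix might look like the sketch below. It is not the course's example code: the function names, the learning rate `eta`, the epoch limit, and the recoding of the two species as -1 and +1 are all assumptions made for illustration.

```python
def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Single-sample perceptron.

    X is a list of feature lists, y holds labels in {-1, +1}.
    Returns the augmented weight vector a, where a[0] is the bias.
    """
    a = [0.0] * (len(X[0]) + 1)
    for _ in range(max_epochs):
        errors = 0
        for x, label in zip(X, y):
            z = [1.0] + list(x)  # augmented sample
            activation = sum(ai * zi for ai, zi in zip(a, z))
            if label * activation <= 0:  # misclassified or on the boundary
                # Update using this one sample only
                a = [ai + eta * label * zi for ai, zi in zip(a, z)]
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return a


def predict(a, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    s = a[0] + sum(ai * xi for ai, xi in zip(a[1:], x))
    return 1 if s > 0 else -1


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {lab: i for i, lab in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m
```

If the training data are not linearly separable the loop never reaches zero errors, so the epoch limit decides when training stops and the final weights depend on it.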

 

Use the LIBSVM program to design a linear SVM to classify the data. Give the results in a confusion matrix. How many support vectors are there? Try it with a value of C greater than the default of 1. What happened? Were the results better or worse?

A Matlab interface can be found on the SVM page. An example of code using the package is here.


R Package e1071
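As background for the question about C, the soft-margin problem that a linear C-SVM solves can be written as below. The notation is an assumption introduced here: x_i is a training vector, y_i in {-1, +1} is its class, n is the number of training points, and xi_i is the slack of point i.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  & \tfrac{1}{2}\,\mathbf{w}^{\top}\mathbf{w} + C \sum_{i=1}^{n} \xi_i \\
\text{subject to} \quad
  & y_i\,(\mathbf{w}^{\top}\mathbf{x}_i + b) \ge 1 - \xi_i,
    \qquad \xi_i \ge 0, \qquad i = 1, \dots, n
\end{aligned}
```

C weights the total slack against the margin term, so a larger C penalizes margin violations more heavily. The support vectors are the training points with a nonzero dual coefficient, which lie on or inside the margin.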

 


Four Class Problem:

The four classes are the combinations of the two species and the two sexes of crab. Use the measurement data (FL, RW, CL, CW, BD) to classify the crabs into one of the four classes defined by species and sex. Normalize your data either by mapping each feature linearly into the range [-1, 1] or by “sphering the data”. Use 75% of the data as training data and 25% as test data. Perform multi-class learning using the pairwise comparison method with voting.
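Pairwise comparison with voting trains one binary classifier for each of the six pairs of classes and assigns a point to the class that wins the most pairwise decisions. The sketch below shows one way to organize this. The names are hypothetical, `train_binary` stands for any two-class learner (for example a wrapper around a single-sample perceptron) that returns a function giving +1 or -1, and the label encoding and the tie-breaking rule are arbitrary choices made here.

```python
from collections import Counter
from itertools import combinations


def class_label(species, sex):
    """Combine species (1 or 2) and sex (1 or 2) into one label 0..3."""
    return 2 * (species - 1) + (sex - 1)


def train_pairwise(X, y, train_binary):
    """Train one binary classifier per unordered pair of classes.

    train_binary(X_pair, y_pair) receives labels in {-1, +1} and must
    return a function f(x) that gives +1 or -1.
    """
    models = {}
    for i, j in combinations(sorted(set(y)), 2):
        X_pair = [x for x, c in zip(X, y) if c in (i, j)]
        y_pair = [1 if c == i else -1 for c in y if c in (i, j)]
        models[(i, j)] = train_binary(X_pair, y_pair)
    return models


def predict_by_vote(models, x):
    """Each pairwise classifier casts one vote; the class with most votes wins."""
    votes = Counter()
    for (i, j), f in models.items():
        votes[i if f(x) > 0 else j] += 1
    best = max(votes.values())
    # Ties go to the smallest class label (an arbitrary choice)
    return min(c for c, v in votes.items() if v == best)
```

The predicted labels can then be tabulated against the true labels in a 4 x 4 confusion matrix.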

 

1. Use the single-sample perceptron to learn the hyperplanes between the data sets. Give your results in a confusion matrix.

 

2. Use the LIBSVM program to design a linear SVM to classify the data. Give the results in a confusion matrix. How many support vectors are there? As in the two-class problem, try a value of C greater than the default of 1. What happened? Were the results better or worse?

Edited: 02/06/2006