GISETTE (https://archive.ics.uci.edu/ml/datasets/Gisette) is a handwritten digit recognition problem. The problem is to separate the highly confusible digits ‘4’ and ‘9’. This dataset is one of five datasets of the NIPS 2003 feature selection challenge.
We will work on the following:
(a) Standard run: Use all 6000 samples in the training set to train the
model, and test on all test instances, using the linear kernel. Report the train error,
test error, and number of support vectors.
(b) Kernel variations: In addition to the basic linear kernel, investigate two other standard
kernels: RBF (a.k.a. Gaussian kernel; set γ = 0.001) and polynomial (set degree = 2, coef0 = 1; e.g., (1 + x^T x')^2). Which kernel yields the lowest training error? Report
the train error, test error, and number of support vectors for both of these kernels.
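
For reference, the three kernels named above can be written out directly. The following is a minimal pure-Python sketch, not the implementation used by any particular SVM library. The function names and the second argument `z` (standing for a second sample) are illustrative choices. The `gamma` scaling factor inside the polynomial kernel is an assumption: it is fixed to 1 here so that the result matches the (1 + x^T x)^2 form given in the problem statement, whereas SVM libraries typically expose it as a separate parameter with a different default.

```python
import math


def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """Linear kernel: K(x, z) = x^T z."""
    return dot(x, z)


def rbf_kernel(x, z, gamma=0.001):
    """RBF (Gaussian) kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, z, degree=2, coef0=1.0, gamma=1.0):
    """Polynomial kernel: K(x, z) = (gamma * x^T z + coef0)^degree.

    gamma=1.0 is an assumption chosen to reproduce (1 + x^T z)^2.
    """
    return (gamma * dot(x, z) + coef0) ** degree


# Toy vectors for illustration only (not GISETTE data).
x = [1.0, 2.0]
z = [3.0, 0.5]
print(linear_kernel(x, z))      # x^T z = 1*3 + 2*0.5
print(polynomial_kernel(x, z))  # (1 + x^T z)^2
print(rbf_kernel(x, x))         # zero distance, so exp(0)
```

If scikit-learn is used for the experiments, these settings would presumably correspond to `SVC(kernel="linear")`, `SVC(kernel="rbf", gamma=0.001)`, and `SVC(kernel="poly", degree=2, coef0=1, gamma=1)`. After fitting, the `n_support_` attribute holds the per-class support vector counts, and the error is one minus the accuracy returned by `score`.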