## Project 1: Multilayer perceptron

### Overview

**Task**: implement a general multilayer perceptron classifier (supporting at least one hidden layer), trained by the backpropagation algorithm. Employ this model on a task of classifying points on a plane into three categories. Use a validation technique to select the best performing model, then perform final testing.

**Deadline**: March 31st, 23:59 CEST

### Specifics

#### Model

- Multi-Layer Perceptron, having at least one *non-linear* hidden layer
- (Stochastic) Gradient Descent via Back-Propagation (online, “true” or mini-batch)
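As a starting point, the core of such a model can be sketched as below. This is a minimal illustration, not the required implementation: the tanh hidden layer, softmax output, cross-entropy loss and single-sample (online) updates are all choices you are expected to experiment with.

```python
import numpy as np

def init_mlp(n_in, n_hid, n_out, rng):
    """Small random weights for a 1-hidden-layer MLP (bias folded in as an extra column)."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_hid, n_in + 1)),
        "W2": rng.normal(scale=0.1, size=(n_out, n_hid + 1)),
    }

def forward(params, x):
    """x: (n_in,) input. Returns input-with-bias, hidden activations and softmax output."""
    a = np.append(x, 1.0)                       # append constant bias input
    h = np.tanh(params["W1"] @ a)               # non-linear hidden layer
    hb = np.append(h, 1.0)
    z = params["W2"] @ hb
    y = np.exp(z - z.max()); y /= y.sum()       # numerically stable softmax
    return a, h, hb, y

def backprop_step(params, x, target, lr=0.1):
    """One online SGD step with cross-entropy loss; target is a one-hot vector."""
    a, h, hb, y = forward(params, x)
    d_out = y - target                                      # softmax + cross-entropy gradient
    d_hid = (params["W2"][:, :-1].T @ d_out) * (1 - h**2)   # back through tanh
    params["W2"] -= lr * np.outer(d_out, hb)
    params["W1"] -= lr * np.outer(d_hid, a)
```

The bias is handled by appending a constant 1 to each layer's input, so no separate bias vectors are needed.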

#### Data

- one-line header, then one sample per line
- points in a 2D plane (2 real-valued inputs)
- three output classes (`A`, `B` and `C`)
- train set – `2d.trn.dat`, 8000 samples – training data (estimation and validation)
- test set – `2d.tst.dat`, 2000 samples – testing data
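Loading can be a one-liner with `np.loadtxt`. The sketch below assumes each data line holds the two coordinates followed by the class letter; check the actual header of `2d.trn.dat` and adjust the column layout if it differs.

```python
import numpy as np

def load_2d_data(path):
    """Read a '2d.*.dat' file: skip the one-line header; assume columns x, y, class letter."""
    raw = np.loadtxt(path, skiprows=1, dtype=str)
    X = raw[:, :2].astype(float)                               # 2 real-valued inputs
    labels = np.array([ord(c) - ord("A") for c in raw[:, 2]])  # 'A'/'B'/'C' -> 0/1/2
    return X, labels
```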

#### Training

Split the training data set into a bigger *estimation* subset and a smaller *validation* subset.^{1} Use this split to perform *model selection*, i.e. find the best-performing combination of *hyper-parameters* (*model architecture*: number of hidden layers, neuron counts, …; *training parameters*: learning rate, …):

- train the model on the *estimation* subset
- test the model on the *validation* subset (not the *test* set! don’t touch that yet!)
- remember the hyper-parameters of the best performing model
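The split itself can look like the following sketch; the 80/20 ratio and the fixed seed are illustrative choices, not prescribed by the assignment.

```python
import numpy as np

def split_estimation_validation(X, y, val_fraction=0.2, seed=0):
    """Shuffle the training set, then hold out `val_fraction` as the validation subset."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(len(X) * val_fraction)
    est, val = idx[n_val:], idx[:n_val]
    return X[est], y[est], X[val], y[val]
```

Shuffling before splitting matters if the file happens to list samples grouped by class.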

(*Sanity check*: a properly working network should reach a classification accuracy of \(\geq 95\%\))

Also try experimenting with some of the following:

- input preprocessing (e.g. normalization/rescaling)
- activation functions (*logsig*, *tanh*, *softmax*, …)
- output encoding (*one-hot encoding*, *ordinal*)
- training length and/or *early-stopping*
- learning rate schedule
- weight initialization type (*uniform*, *Gaussian*, *sparse*, *orthogonal*) and scale
- momentum type (*none*, *classic*, *Nesterov’s accelerated gradient*) and strength
- regularization:
  - implicit (weight decay, …)
  - explicit (\(L_1\), \(L_2\), …)
- regression error metric used for training (*square error*, *categorical cross-entropy*/*log-loss*, …)
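Two of the simplest items above, input normalization and one-hot output encoding, could be sketched like this (z-scoring is just one reasonable rescaling choice):

```python
import numpy as np

def normalize(X, mean=None, std=None):
    """Z-score the inputs; statistics should come from the estimation subset only,
    then be reused for validation and test data."""
    if mean is None:
        mean, std = X.mean(axis=0), X.std(axis=0)
    return (X - mean) / std, mean, std

def one_hot(labels, n_classes=3):
    """Encode integer class labels 0..n_classes-1 as one-hot rows."""
    return np.eye(n_classes)[labels]
```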

#### Testing

Using the best performing set of hyper-parameters (on the validation set), train a *new* model on the full *training set*, then perform final testing on the *test set*. Report classification accuracy, regression error and calculate a confusion matrix.
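Computing the confusion matrix is straightforward; a minimal sketch, with rows = actual classes and columns = predicted classes as the report requires, plus a helper for the column-normalized (100% per column) view:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes=3):
    """Counts: rows = actual classes, columns = predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

def column_percentages(cm):
    """Normalize so that each column sums to 100%."""
    return 100.0 * cm / cm.sum(axis=0, keepdims=True)
```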

#### Bonus

Train the model using a more sophisticated method, such as:

- Scaled Conjugate Gradient [2 pt]
- A newer method (published after 2010): Adagrad, RMSprop, Adam, … [1 pt]
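For orientation, an Adam update for a single weight array can be sketched as follows; the default hyper-parameters shown are the commonly cited ones from the original paper, and in your network you would keep one `state` dict per weight matrix.

```python
import numpy as np

def adam_update(w, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step for one weight array.
    state holds the first/second moment estimates and the step count."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad       # first-moment estimate
    state["v"] = b2 * state["v"] + (1 - b2) * grad**2    # second-moment estimate
    m_hat = state["m"] / (1 - b1 ** state["t"])          # bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return w - lr * m_hat / (np.sqrt(v_hat) + eps)
```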

### Submission

Submit your code and report for this project as a single archive (e.g. `.zip`).

#### Code

Projects should be written in Python; use of previously finished labs is strongly encouraged. All the “interesting” bits should be identifiable in the code (especially all the relevant equations). You’ll probably need no additional libraries other than the standard `numpy`/`scipy`/`matplotlib` combo. Don’t reinvent the wheel: use `np.loadtxt`/`np.savetxt`/`np.load`/`np.save` and `plt.savefig` where necessary.

Model selection should not be performed by hand; rather, the project should include a runnable program^{2} that goes through the various combinations of hyper-parameters, selects the best-performing model, runs the final test and produces the final outputs.
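The selection loop itself can be as simple as the sketch below; `train_eval` stands for your own training routine, which is assumed to return validation accuracy for a given parameter combination.

```python
import itertools

def grid_search(train_eval, grid):
    """Exhaustively try every hyper-parameter combination.
    grid: dict mapping parameter name -> list of candidate values.
    train_eval: callable taking those parameters, returning validation accuracy."""
    best_acc, best_params = -1.0, None
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        acc = train_eval(**params)
        if acc > best_acc:
            best_acc, best_params = acc, params
    return best_params, best_acc
```

With a small grid this exhaustive search is perfectly adequate; remember to log each combination's estimation and validation error for the report table.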

(If you’d desperately prefer to use another language, write us an e-mail.)

#### Report

Create a report – briefly describing model selection, training and testing – in `.pdf` format. The report should be sufficiently detailed that one can read the description, reimplement your project and, using the provided parameters, arrive at the *same results* (reproducibility). (Assume prior knowledge of neural network algorithms, so for example don’t explain how backpropagation works. But do include details such as whether training was online/mini-batch/batch, and whether and with what strength momentum was used.)

- for each examined model (hyper-parameter combination), report estimation and validation error **[table]**
- for the best model:
  - error vs. time (at least one instance) **[plot]**
  - outputs in 2D **[plot]**
  - confusion matrix **[table]**
    - rows = actual classes
    - columns = predicted classes
    - sum of each column = 100%
- correct submissions with highest (testing) accuracies will be awarded bonus points
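The “outputs in 2D” plot can be produced with `plt.scatter` and saved with `plt.savefig`, for example along these lines (the colours, marker size and file name are arbitrary choices; the `Agg` backend just lets the script run without a display):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, safe for scripted runs
import matplotlib.pyplot as plt

def plot_outputs_2d(X, predicted, path="outputs_2d.png"):
    """Scatter the test points coloured by predicted class (0/1/2 -> A/B/C)."""
    for cls, colour in zip(range(3), ["tab:red", "tab:green", "tab:blue"]):
        mask = predicted == cls
        plt.scatter(X[mask, 0], X[mask, 1], s=5, c=colour, label="ABC"[cls])
    plt.legend()
    plt.savefig(path)
    plt.close()
```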