CS 448 Lab 2, Due 9/20/2011
Perceptron generalization experiment.
This lab assumes you have a working perceptron; if yours does not work, use mine (in examplesFromClass).
You have seen that a simple perceptron can learn to correctly categorize all the patterns in a reasonably large data set.
This looks pretty impressive, but has it learned anything that will be useful in a situation it has never seen before
(i.e., can it apply knowledge from one situation to a similar one)? How would it do on novel stimuli?
View the working perceptron as a black box system of interest; try to figure out how well it generalizes.
The experiment
Your task is to determine (and report) the ability of your perceptron to generalize, and to determine what parameters cause it to generalize better.
Here is one way to test a perceptron's generalization:
- Partition the data set into training and testing data at random.
- Train the perceptron on the training data.
- Test it on the test data (turn off learning during testing).
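The random partition in the first bullet might be sketched as follows. This is only one possible approach, not the required one: the class name PartitionSketch, the Split holder, and the testFraction parameter are all made up for illustration, and your data set may not be stored in a List.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper (not part of the class examples): random train/test split. */
final class PartitionSketch {

    /** Holds the two parts of one random split. */
    static final class Split<T> {
        final List<T> train;
        final List<T> test;

        Split(List<T> train, List<T> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the data, then cuts it so that roughly testFraction
     * of the patterns land in the test set. The input list is not modified.
     */
    static <T> Split<T> partition(List<T> data, double testFraction, Random rng) {
        List<T> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, rng);
        int testSize = (int) Math.round(shuffled.size() * testFraction);
        List<T> test = new ArrayList<>(shuffled.subList(0, testSize));
        List<T> train = new ArrayList<>(shuffled.subList(testSize, shuffled.size()));
        return new Split<>(train, test);
    }
}
```

Passing in the Random (rather than creating one inside) lets you fix the seed while debugging and still get a fresh partition on every real run.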
Step 1: Make a prediction
Assuming you are trying to build a quasimorphic model of machine learning systems, you should practice prediction/correction with your model.
So, before writing code, or at least before running the experiment, make a prediction. How well will it generalize? What percentage of the test data will it get correct? Write down your prediction.
But... before making a prediction, perhaps you should ask yourself this question: "Compared to what?" Say your p'tron correctly categorizes 70% of the test data; that's better than chance, but
is that better than a fake p'tron that always says "Yes"? Is it better than one with random weights?
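Here is a rough sketch of those two baselines. It assumes things the handout does not specify: that patterns are stored as double arrays with boolean labels, that the p'tron says "yes" when its weighted sum exceeds theta, and that random weights are drawn from [-1, 1). The names BaselineSketch, alwaysYesAccuracy, and randomWeightAccuracy are invented for this example; adapt them to however your own perceptron works.

```java
import java.util.Random;

/** Hypothetical baselines to compare a trained perceptron against. */
final class BaselineSketch {

    /** Fraction of patterns a fake classifier that always answers "yes" gets right. */
    static double alwaysYesAccuracy(boolean[] labels) {
        if (labels.length == 0) {
            return 0.0;
        }
        int yes = 0;
        for (boolean label : labels) {
            if (label) {
                yes++;
            }
        }
        return (double) yes / labels.length;
    }

    /** Accuracy of an untrained unit whose weights are drawn uniformly from [-1, 1). */
    static double randomWeightAccuracy(double[][] inputs, boolean[] labels,
                                       double theta, Random rng) {
        if (inputs.length == 0) {
            return 0.0;
        }
        double[] weights = new double[inputs[0].length];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = rng.nextDouble() * 2.0 - 1.0;
        }
        int correct = 0;
        for (int p = 0; p < inputs.length; p++) {
            double sum = 0.0;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * inputs[p][i];
            }
            boolean answer = sum > theta; // assumed output rule: "yes" above threshold
            if (answer == labels[p]) {
                correct++;
            }
        }
        return (double) correct / inputs.length;
    }
}
```

Note that the always-"yes" score is just the fraction of "yes" patterns in the test set, so it can sit well above 50% if the data set is lopsided.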
Step 2: Design your classes (!)
Do this only if you don't want to spend lotsa time debugging.
Consider writing an Experiment class, or a Driver class. Maybe you would like a Result class, or a ResultList class (to hold a bunch of Results), maybe a Table class (for the whole table of results).
See below for insight into what these classes might do.
Step 3: Design an experiment
Two variables you might investigate: the size of the training set, and the ratio of the threshold (theta) to the learning increment (eta).
Here's some pseudo-code:
for each experimental condition (i.e. for each setting of variables) {
    iterate n times { // see Step 5
        perform one experiment
    }
}
where one experiment is:
    randomly partition the data set
    train on the training set
    test on the testing set
    record results
For example, you might make the ratio of theta to eta 1, 100, and 10000 and see if there is any difference in results.
You might vary the size of the test set from 10% to 90% and see if there is any difference.
Make predictions about how these variations will affect the results. Or, you could test some entirely different variable; it's up to you.
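The nested loops in the pseudo-code could look something like the sketch below. It assumes the two variables suggested above (theta/eta ratio and test-set fraction) and treats "one experiment" as any function that takes those two numbers and returns a score, such as the fraction of test patterns it got right. DriverSketch and its run method are hypothetical names, not something from examplesFromClass.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical driver for the nested loops in the pseudo-code. */
final class DriverSketch {

    /**
     * Runs oneExperiment n times for every (ratio, testFraction) pair and
     * returns the mean score for each pair as means[r][f].
     * oneExperiment receives (ratio, testFraction) and returns one score.
     */
    static double[][] run(double[] ratios, double[] testFractions, int n,
                          DoubleBinaryOperator oneExperiment) {
        double[][] means = new double[ratios.length][testFractions.length];
        for (int r = 0; r < ratios.length; r++) {
            for (int f = 0; f < testFractions.length; f++) {
                double total = 0.0;
                for (int i = 0; i < n; i++) { // repeat, then average (see Step 5)
                    total += oneExperiment.applyAsDouble(ratios[r], testFractions[f]);
                }
                means[r][f] = total / n;
            }
        }
        return means;
    }
}
```

For example, `DriverSketch.run(new double[]{1, 100, 10000}, new double[]{0.1, 0.5, 0.9}, 20, (ratio, frac) -> ratio)` exercises the loops with a fake experiment that just echoes the ratio back.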
Step 4: Build your code incrementally.
There is enough happening here to be very confusing if you let it. If you want to avoid confusion and frustration, test your code before you put it all together.
For instance, if the Driver class implements the iteration in the pseudocode above, write and test the Driver code without any perceptrons;
just create the various experiments and let them report, say, their parameters as their results. That way you can create (and debug) the table of results
without waiting for the perceptrons to train. After your Driver can create all the appropriate Experiments and collate their results in a table, then you are ready to actually hook up the perceptron.
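A perceptron-free experiment of the kind described here might be as small as the sketch below. The Experiment interface, the StubExperiment class, and the choice to report testFraction as the fake "result" are all assumptions made for illustration; your own class design may look quite different.

```java
/** Hypothetical abstraction so the Driver can be tested without a perceptron. */
interface Experiment {
    /** Runs once and returns a score; a real one would return test accuracy. */
    double runOnce();
}

/** Stub that skips training and reports one of its own parameters as its "result". */
final class StubExperiment implements Experiment {
    private final double ratio;        // theta / eta for this condition
    private final double testFraction; // share of the data held out for testing

    StubExperiment(double ratio, double testFraction) {
        this.ratio = ratio;
        this.testFraction = testFraction;
    }

    @Override
    public double runOnce() {
        // No perceptron yet: echo a parameter so the results table is easy to check by eye.
        return testFraction;
    }

    /** Human-readable parameters, handy while debugging the Driver. */
    String describe() {
        return "ratio=" + ratio + ", testFraction=" + testFraction;
    }
}
```

If every cell in your table shows the parameter you expect, the Driver and table code are probably fine, and a real experiment that implements the same interface can be dropped in.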
Step 5: Run your experiment, outputting data to a file
One run with a random number generator will not necessarily be representative of average performance; if you repeat the same experiment and average the
results, you can be more confident of your findings. Your program should generate a table of data with averages for easy viewing, and store it in a file.
Put the ratios on one axis and the percentages on the other. During debugging you might want to display
the data from the individual runs grouped in those cells. Make the data easy to understand (you might want to check out
java.text.NumberFormat).
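As a rough illustration of what java.text.NumberFormat can do for the table, here is one possible layout: ratios down the side, test-set percentages across the top, mean accuracy in each cell. The class name TableSketch, the column widths, and the one-decimal percent format are arbitrary choices for this sketch, and it assumes the averages are already sitting in a means[ratio][fraction] array.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.text.NumberFormat;
import java.util.Locale;

/** Hypothetical table writer: ratios down the side, test-set percentages across the top. */
final class TableSketch {

    /** Builds a plain-text table; means[r][f] belongs to ratios[r] and testFractions[f]. */
    static String format(double[] ratios, double[] testFractions, double[][] means) {
        NumberFormat percent = NumberFormat.getPercentInstance(Locale.US);
        percent.setMinimumFractionDigits(1);
        percent.setMaximumFractionDigits(1);
        NumberFormat plain = NumberFormat.getNumberInstance(Locale.US);
        plain.setGroupingUsed(false); // print 10000, not 10,000

        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-12s", "theta/eta"));
        for (double f : testFractions) {
            sb.append(String.format("%14s", "test=" + percent.format(f)));
        }
        sb.append(System.lineSeparator());
        for (int r = 0; r < ratios.length; r++) {
            sb.append(String.format("%-12s", plain.format(ratios[r])));
            for (int f = 0; f < testFractions.length; f++) {
                sb.append(String.format("%14s", percent.format(means[r][f])));
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }

    /** Writes the table to a file, replacing any existing contents. */
    static void write(String fileName, String table) throws IOException {
        try (PrintWriter out = new PrintWriter(fileName)) {
            out.print(table);
        }
    }
}
```

Building the table as a String first means you can print it to the console while debugging and write the same text to the file once it looks right.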
Step 6: Demo your program before class on 9/20
Then email me: 1) your zipped code, 2) your prediction, 3) your baseline ("compared to what?"), and 4) your results.
Include a sentence or two summarizing your results. What was your hypothesis? Was it supported by the experimental findings?
If not, what is your new hypothesis?
Extra credit!
Modify your p'tron so that it generalizes better.