CS-422 - Homework 3 (5%)
Classification
Due by: March 26, 2013
In this assignment you will test and implement various classification
techniques. Sample data files are available at
http://archive.ics.uci.edu/ml/datasets/.
The grade for this assignment will be based on your implementation of the algorithms, the thoroughness
of your evaluation of the algorithms, the results you obtain, and the
clarity of your report. Make sure to explain the results you obtain and do not unnecessarily repeat similar results. The code you write should be modular and well documented.
- Use the spam email dataset from the previous assignment and
identify one additional dataset to work with.
- Use R to test several classification algorithms on your two
datasets. Test the following algorithms (using the e1071, nnet, and
caret packages): naive Bayes classifier, logistic regression, neural
network, and support vector machines.
- Compare the results you obtain with the different methods. Make
sure to test different parameters in the algorithms. Draw conclusions
from your comparison.
- Implement and test the perceptron and logistic regression
algorithms. Compare the results you obtain with the two classifiers
and draw conclusions.
- Compare the logistic regression algorithm when implemented with
gradient descent and stochastic gradient descent. Draw conclusions.
- You are advised (but not required) to use R for implementing the
logistic regression algorithm.
- Write your code in a modular way using functions and make sure to
document it.
- Do not include in your submission any large datasets that we
provided.
- The assignment involves testing many algorithms. Be concise yet
thorough in the way you present your results, and do not include
repetitive results. Every result you present should serve a purpose.
- Follow the electronic submission instructions of assignment 1.
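
As a rough illustration of the implementation tasks above (perceptron, and logistic regression trained with gradient descent versus stochastic gradient descent), the following is a minimal sketch and not a reference solution. It is written in Python rather than R, which the assignment allows for the implementation part. All function names, the learning rates, the epoch counts, and the synthetic two-cluster data are assumptions made for illustration only. Real experiments should use the spam dataset and your second dataset with a proper train/test split.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def add_bias(X):
    """Prepend a constant 1.0 so that w[0] acts as the intercept."""
    return [[1.0] + list(x) for x in X]


def train_perceptron(X, y, epochs=50):
    """Perceptron rule with labels in {0, 1}; stops early once an epoch has no errors."""
    Xb = add_bias(X)
    w = [0.0] * len(Xb[0])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(Xb, y):
            pred = 1 if dot(w, xi) > 0 else 0
            if pred != yi:
                sign = 1.0 if yi == 1 else -1.0
                w = [wj + sign * xj for wj, xj in zip(w, xi)]
                errors += 1
        if errors == 0:
            break
    return w


def train_logistic_gd(X, y, lr=0.1, epochs=200):
    """Batch gradient descent: one update per pass, using the mean gradient."""
    Xb = add_bias(X)
    n = len(Xb)
    w = [0.0] * len(Xb[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(Xb, y):
            err = sigmoid(dot(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def train_logistic_sgd(X, y, lr=0.1, epochs=200, seed=0):
    """Stochastic gradient descent: one update per example, in shuffled order."""
    rng = random.Random(seed)
    Xb = add_bias(X)
    w = [0.0] * len(Xb[0])
    order = list(range(len(Xb)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = sigmoid(dot(w, Xb[i])) - y[i]
            w = [wj - lr * err * xj for wj, xj in zip(w, Xb[i])]
    return w


def accuracy(w, X, y):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if dot(w, xi) > 0 else 0 for xi in add_bias(X)]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def make_toy_data(n=100, seed=0):
    """Two well-separated Gaussian clusters (synthetic stand-in data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        X.append([rng.gauss(center, 0.5), rng.gauss(center, 0.5)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    models = [
        ("perceptron", train_perceptron(X, y)),
        ("logistic (GD)", train_logistic_gd(X, y)),
        ("logistic (SGD)", train_logistic_sgd(X, y)),
    ]
    for name, w in models:
        print(name, "training accuracy:", accuracy(w, X, y))
```

The three trainers share the same data layout and the same accuracy function, so they can be compared on equal footing. For the comparison the assignment asks for, it may be more informative to record the loss or held-out accuracy after each epoch, not only the final value, since gradient descent and stochastic gradient descent differ mainly in how quickly and how smoothly they converge.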
Gady Agam
2013-03-05