CS-422 - Homework 3 (5%)

Classification

Due by: March 26, 2013

Assignment Specifications

In this assignment you will test and implement various classification techniques. Sample data files are available at http://archive.ics.uci.edu/ml/datasets/. The grade for this assignment will be based on your implementation of the algorithms, the thoroughness of your evaluation, the results you obtain, and the clarity of your report. Explain the results you obtain, and do not repeat similar results unnecessarily. The code you write should be modular and well documented.

  1. Use the spam email dataset from the previous assignment and identify one additional dataset to work with.

  2. Use R (with the e1071, nnet, and caret packages) to test the following classification algorithms on your two datasets: naive Bayes, logistic regression, neural networks, and support vector machines.

  3. Compare the results you obtain with the different methods. Make sure to test different parameters in the algorithms. Draw conclusions from your comparison.

  4. Implement and test the perceptron and logistic regression algorithms. Compare the results you obtain with the two classifiers and draw conclusions.

  5. Compare your logistic regression implementation when trained with (batch) gradient descent and with stochastic gradient descent. Draw conclusions.
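
As a rough illustration of items 4 and 5, the following is a minimal sketch of a perceptron and of logistic regression trained with batch gradient descent and with stochastic gradient descent. It is not a reference solution. The assignment does not fix a language for these steps, so Python is assumed here; all function names, the learning rates, the epoch counts, and the toy dataset are hypothetical choices made for the example. Labels are assumed to be coded as 0/1.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def add_bias(X):
    # Prepend a constant 1 so that w[0] acts as the intercept.
    return [[1.0] + list(x) for x in X]

def train_perceptron(X, y, epochs=1000, lr=1.0):
    # Perceptron rule: update the weights only on misclassified samples.
    Xb = add_bias(X)
    w = [0.0] * len(Xb[0])
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(Xb, y):
            pred = 1 if dot(w, x) > 0 else 0
            if pred != t:
                mistakes += 1
                for j in range(len(w)):
                    w[j] += lr * (t - pred) * x[j]
        if mistakes == 0:  # a full pass with no errors: stop early
            break
    return w

def train_logreg_gd(X, y, epochs=500, lr=0.1):
    # Batch gradient descent: one update per pass, using the mean gradient
    # of the negative log-likelihood over all samples.
    Xb = add_bias(X)
    n, d = len(Xb), len(Xb[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for x, t in zip(Xb, y):
            err = sigmoid(dot(w, x)) - t
            for j in range(d):
                grad[j] += err * x[j]
        for j in range(d):
            w[j] -= lr * grad[j] / n
    return w

def train_logreg_sgd(X, y, epochs=500, lr=0.1, seed=0):
    # Stochastic gradient descent: one update per sample, visiting the
    # samples in a freshly shuffled order on every pass.
    rng = random.Random(seed)
    Xb = add_bias(X)
    d = len(Xb[0])
    w = [0.0] * d
    order = list(range(len(Xb)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = sigmoid(dot(w, Xb[i])) - y[i]
            for j in range(d):
                w[j] -= lr * err * Xb[i][j]
    return w

def accuracy(w, X, y):
    preds = [1 if dot(w, x) > 0 else 0 for x in add_bias(X)]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

def log_loss(w, X, y, eps=1e-12):
    # Mean negative log-likelihood; probabilities are clipped to avoid log(0).
    total = 0.0
    for x, t in zip(add_bias(X), y):
        p = min(max(sigmoid(dot(w, x)), eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y)

# Made-up, linearly separable toy data (not one of the assignment datasets).
X_toy = [[-2.0, -1.0], [-1.0, -2.0], [-1.0, -1.0], [-2.0, -2.0],
         [1.0, 2.0], [2.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

w_perc = train_perceptron(X_toy, y_toy)
w_gd = train_logreg_gd(X_toy, y_toy)
w_sgd = train_logreg_sgd(X_toy, y_toy)
for name, w in [("perceptron", w_perc), ("logreg GD", w_gd), ("logreg SGD", w_sgd)]:
    print(name, "accuracy:", accuracy(w, X_toy, y_toy))
print("GD log-loss:", log_loss(w_gd, X_toy, y_toy))
print("SGD log-loss:", log_loss(w_sgd, X_toy, y_toy))
```

On real data such as the spam dataset, the comparison would be made on held-out data rather than on the training set, and it would typically track the loss per epoch for both training variants while varying the learning rate.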

General comments



Gady Agam 2013-03-05