Homework #3


You must submit your work to your instructor before midnight on the due date. Failure to do so will result in late penalties; see the syllabus for grading details.

Submit your work on Blackboard before midnight on the day the homework is due. Here are the requirements for your Blackboard submission:

  • Attach the assignment as a compressed archive file (.zip, .tgz, .tbz2, .rar). Include in the archive a copy of any code you've written to complete the assignment.
  • The name of the file should be: firstName-lastName-HW-assignmentNumber.extension (e.g. Jane-Doe-HW-3.zip)
  • Include your e-mail address in the Comment field when submitting the assignment through the Digital Drop Box.
  • If for any reason you submit the assignment more than once, indicate this in the Comment field by including the word COMPLEMENT.

The purpose of this homework is to give you a chance to better understand encryption.

Here is what you have to do:

  • Find, download, and install FOSS symmetric encryption software, or otherwise use software you already have installed on your computer.
  • Download a story from Project Gutenberg. For the purposes of this homework you should download a plain text (English) story that has at least 500,000 characters.
  • Measure the relative frequency of letters in the plain text.
  • Encrypt the story using the software you just downloaded. Make sure the output is ASCII.
  • Measure the relative frequency of letters in the ciphertext.
  • Measure the time it takes to encrypt your story. Make multiple measurements such that you have a statistically significant sample; include in your report the average, median and standard deviation of your measurements.
  • Repeat the previous three bullet points using GnuPG, the software you installed on your computer for HW#2.
  • Compress the plain text using gzip. What is the compression ratio?
  • Compress, using gzip, the ciphertext you obtained using the symmetric encryption software. What is the compression ratio?
  • Compress, using gzip, the ciphertext you obtained using GnuPG. What is the compression ratio?
  • In addition to the above you'll have to do a little bit of extra work on the plain text:
    • Find the most frequent words in the text; include them in your report with their relative frequency; limit to 64 entries.
    • Find the most frequent two-letter words; include them in your report with their relative frequency; limit to 32 entries.
    • Find the most frequent three-letter words; include them in your report with their relative frequency; limit to 32 entries.
    • Find the most frequent four-letter words; include them in your report with their relative frequency; limit to 32 entries.
    • Find the most frequent bigrams (not spanning across words); include them in your report with their relative frequency; limit to 32 entries.
    • Find the most frequent trigrams (not spanning across words); include them in your report with their relative frequency; limit to 32 entries.
  • Create a report (PDF format) that includes your findings and your comments on those findings. Don't forget to mention relevant statistics about the story you've used in your work, a link to the story, the name and version(s) of the encryption software, the cipher used for encryption, the operating system you're using and the hardware it's running on (make, model, CPU type, CPU clock rate, amount of main memory), a description of how you measured the letter frequency, etc. Compare and contrast the running times of the two encryption programs, and discuss your findings related to compression ratios. Also, don't forget to mention your sources. Attach a table with your running times to the report. Make sure your report follows the guidelines for all written work in this class, as described in the syllabus.
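
If you decide to script the frequency measurements, the following Python sketch shows one possible approach. It is only an illustration, not a required method. The function names (letter_frequencies, words_in, top_words, top_ngrams) are made up for this example. It assumes that a "word" is a maximal run of the letters a-z after lower-casing, so apostrophes and hyphens split words, and that the relative frequency of N-letter words is taken relative to all words of that length. State whichever definitions you actually use in your report.

```python
import re
from collections import Counter


def letter_frequencies(text):
    """Relative frequency of each letter a-z, ignoring case and non-letters."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    total = len(letters)
    if total == 0:
        return {}
    return {letter: n / total for letter, n in Counter(letters).items()}


def words_in(text):
    """Lower-cased words; ASSUMPTION: a word is a maximal run of a-z."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, length=None, limit=64):
    """Most frequent words with their relative frequency.

    If `length` is given, only words with exactly that many letters are
    counted, and frequencies are relative to that subset (an assumption).
    """
    words = words_in(text)
    if length is not None:
        words = [w for w in words if len(w) == length]
    total = len(words)
    return [(w, n / total) for w, n in Counter(words).most_common(limit)]


def top_ngrams(text, n, limit=32):
    """Most frequent letter n-grams, never spanning across words."""
    grams = Counter()
    for word in words_in(text):
        # Slide a window of width n inside each word separately.
        for i in range(len(word) - n + 1):
            grams[word[i:i + n]] += 1
    total = sum(grams.values())
    return [(g, c / total) for g, c in grams.most_common(limit)]
```

For example, top_words(story, length=3, limit=32) would list the three-letter words and top_ngrams(story, 2) the bigrams, where story is the text of your downloaded file. For the ciphertext you may prefer to count every character as it appears rather than folding case, since ASCII-encoded output usually distinguishes upper and lower case.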

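For the timing measurements, one possible way to collect a sample and summarize it is sketched below in Python. The names time_command and summarize are invented for this example, and the default of 30 runs is an arbitrary assumption, not a course requirement. The sketch measures wall-clock time for the whole command, including process start-up, which you should mention in your report if you use it.

```python
import statistics
import subprocess
import time


def time_command(command, runs=30):
    """Run `command` (a list of arguments) `runs` times.

    Returns the wall-clock duration of each run in seconds.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises an error if the command exits with a failure code.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Average, median and sample standard deviation of the measurements."""
    return {
        "average": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),  # needs at least two samples
    }
```

Pass time_command the exact command line you use for encryption, split into a list of arguments, and make sure the command can run without prompting (for example, for a passphrase or for permission to overwrite an existing output file). Otherwise the time you spend typing ends up in the measurements.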

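The compression ratios can be computed from the file sizes before and after running gzip. As a cross-check, here is a minimal Python sketch that uses the standard gzip module instead of the command-line tool. It assumes that "compression ratio" means uncompressed size divided by compressed size (some sources use the inverse, so say which one you report). The function names are hypothetical. Level 6 is chosen to match the default of the gzip command, but the resulting sizes can still differ from the command-line output by a few bytes because the file headers are not identical.

```python
import gzip


def compression_ratio(data, level=6):
    """Uncompressed size divided by gzip-compressed size for a bytes object."""
    compressed = gzip.compress(data, compresslevel=level)
    return len(data) / len(compressed)


def file_compression_ratio(path, level=6):
    """Same ratio, computed for the contents of the file at `path`."""
    with open(path, "rb") as handle:
        return compression_ratio(handle.read(), level)
```

Calling file_compression_ratio once on the plain text and once on each ciphertext file gives the three numbers the assignment asks you to compare.
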
$Id: hw3.html,v 1.4 2010/01/31 01:40:35 virgil Exp $