Overview

This course covers the systems, algorithms, and fundamental principles that enable distributed analysis of very large datasets using high-level languages, i.e., Modern Big Data Analytics.

  • Lecture: overview of content covered in the lectures: here
  • Project: information about the project: here
  • Literature review: information about the literature review: here

Important Dates

  • Select a paper to review: 09/05
  • Submit the review report: 11/15
  • Select a project: 09/12
  • Meet to discuss project design: 11/15
  • Finish project implementation: 12/03 12:00 pm - 1:00 pm

Workload and Grading Scheme

Grading Policy:

Grading scheme:

  • 80+ = A
  • 50+ = B
  • 35+ = C
  • <35 = E

Syllabus

Textbook

White. Hadoop: The Definitive Guide, 4th Edition, O’Reilly Media, ISBN-13: 978-1491901632

Depending on your background, a standard database textbook may be useful:

Elmasri and Navathe. Fundamentals of Database Systems, 6th Edition, Addison-Wesley, 2003

Ramakrishnan and Gehrke. Database Management Systems, 3rd Edition, McGraw-Hill, 2002

Silberschatz, Korth, and Sudarshan. Database System Concepts, 6th Edition, McGraw-Hill, 2010

Garcia-Molina, Ullman, and Widom. Database Systems: The Complete Book, 2nd Edition, Prentice Hall, 2008