Ioan Raicu

Illinois Institute of Technology

Argonne National Laboratory

CS495: Introduction to Distributed Computing

Semester: Fall 2012
Lecture Time: Tuesday/Thursday, 11:25AM - 12:40PM
Lecture Location: Stuart Building 239
Professor: Dr. Ioan Raicu (iraicu@cs.iit.edu, 1-312-567-5704)
Office Hours Time: Thursday, 12:45PM-1:45PM
Office Hours Location: Stuart Building 237D
Teaching Assistant: Tonglin Li (tli13@iit.edu)
Office Hours Time: Monday/Tuesday, 1PM-2PM
Office Hours Location: Stuart Building 006

Course Description
This course introduces general concepts in the design and implementation of distributed systems, covering all the major branches such as Cloud Computing, Grid Computing, Cluster Computing, Supercomputing, and Many-core Computing.

The specific topics that this course will cover are: scheduling in multiprocessors, memory hierarchies, synchronization, concurrency control, fault tolerance, data parallel programming models, scalability studies, distributed memory message passing systems, shared memory programming models, tasks, dependence graphs and program transformations, parallel I/O, applications, tools (CUDA, Swift, Globus, Condor, Amazon AWS, OpenStack, Cilk, gdb, threads, MPICH, OpenMP, Hadoop, FUSE), SIMD, MIMD, fundamental parallel algorithms, parallel programming exercises, parallel algorithm design techniques, interconnection topologies, heterogeneity, load balancing, memory consistency models, asynchronous computation, partitioning, determinacy, Amdahl's Law, scalability and performance studies, vectorization and parallelization, parallel programming languages, and power. Some of these topics are covered in more depth in the graduate courses focusing on specific sub-domains of distributed systems, such as Advanced Operating Systems (CS550), Parallel Computing (CS546), Cloud Computing (CS553), Data-Intensive Computing (CS554), Advanced Computer Architecture (CS570), and Fault Tolerant Computing (CS595).

While this CS495 course is not a prerequisite to any of the graduate level courses in distributed systems, both undergraduate and graduate students who wish to be better prepared for these courses could take this CS495 course. Undergraduate students are highly encouraged to take CS495 prior to any of the graduate level courses in distributed systems. Graduate students who have already taken CS546, CS550, CS553, CS554, CS570, or CS595 should not take this CS495 class. Furthermore, this CS495 class should not be taken concurrently with CS546, CS550, CS553, CS554, CS570, or CS595.

Many of these graduate courses are part of the Master of Computer Science Specialization in Distributed and Cloud Computing. This CS495 course is also a part of the Undergraduate Specialization in Data Science and the Specialization in Distributed and Cloud Computing.

The course syllabus can be found here.

Unique Opportunity -- Hands-on Practical Experience

An important component of learning is to gain hands-on experience that a textbook just cannot teach. A portion of this course will cover practical aspects of distributed systems. For example, enrolled students will participate in the design, assembly, configuration, and benchmarking of a real cluster. The students will be exposed to practical issues in real cluster design, such as hardware tradeoffs, different operating systems, local and distributed storage, networking, virtualization, and grid/cloud middleware. The students will work in teams to build workstations/servers from scratch. The software stack will include Linux, Xen, Globus, Condor, OpenStack, NFS, PVFS, MPI, Swift, and Hadoop. This new cluster will then be used in subsequent assignments.

Another set of assignments will deal with real cloud systems, such as Google App Engine, Amazon EC2/S3, and Hadoop (MapReduce framework).

Students will also get the opportunity to attend local conferences in Distributed Systems, specifically eScience 2012, UCC 2012, GlobusWorld 2013, and NSDI 2013.

I am also assembling a team of undergraduate students to compete in the Supercomputing 2013 Student Cluster Competition. The team has three slots (of the total 6 slots) available.

Finally, I am also looking for an undergraduate student to join my DataSys Laboratory for a paid assistantship. If you are thinking about graduate school, or are excited about the opportunity to work at some of the largest technology companies (e.g. Microsoft, Google, Amazon, Facebook, Twitter), then working in the DataSys Lab for several semesters will give you a significant advantage! Feel free to contact other students in my lab for feedback about the kinds of projects they are working on.

Schedule

Date | Topic | Reading | Assignments
08-21-2012 | Syllabus | Syllabus |
08-23-2012 | Introduction to Distributed Systems | DCC Ch. 1 |
08-28-2012 | Introduction to Distributed Systems | | Project #1 (Build Cluster)
08-30-2012 | Distributed System Models and Enabling Technologies | |
09-04-2012 | Distributed System Models and Enabling Technologies | |
09-06-2012 | Distributed System Models and Enabling Technologies | | Project #1 due; Project #2 (Benchmarking)
09-11-2012 | Distributed System Architectures | DSPD Ch. 2 |
09-13-2012 | Distributed System Architectures | |
09-18-2012 | Parallel Programming Systems and Models | |
09-20-2012 | Parallel Programming Systems and Models | |
09-25-2012 | Parallel Programming Systems and Models | | Project #2 due; Project #3 (MapReduce)
09-27-2012 | MapReduce | DCC Ch. 6; MapReduce; MapReduce vs DBMS |
10-02-2012 | MapReduce | |
10-04-2012 | Workflow Systems | DCC Ch. 6; Swift Workflow System; Parallel Scripting |
10-09-2012 | Exam #1 | DCC 1, 6, DSPD 2 + External Reading (MapReduce and Workflows) | Exam #1
10-11-2012 | Virtualization | DCC Ch. 3 |
10-16-2012 | Virtualization | DCC Ch. 3 |
10-18-2012 | Cloud Computing | DCC Ch. 4, 6 | Project #3 due; Project #4 (Amazon EC2/S3)
10-23-2012 | Cloud Computing | DCC Ch. 4, 6 |
10-25-2012 | Understanding the Cost of the Cloud -- Iman Sadooghi | |
10-30-2012 | Cloud Computing | DCC Ch. 4, 6 |
11-01-2012 | Cloud Computing | DCC Ch. 4, 6 |
11-06-2012 | Cloud Computing | DCC Ch. 4, 6 | Project #4 due
11-08-2012 | Filesystems & Networked file systems | DSPD Ch. 11 | Project #5 (Shared FUSE-based filesystem)
11-13-2012 | Parallel Filesystems | External Reading |
11-15-2012 | Distributed Filesystems | External Reading |
11-20-2012 | Distributed Filesystems | External Reading |
11-22-2012 | NO CLASS (Thanksgiving Break) | |
11-27-2012 | Distributed Hash Tables | DCC Ch. 8; External Reading | Project #5 due
11-29-2012 | Exam #2 | DCC 3, 4, 6, 8 (DHT), DSPD 7, 8, 11 + External Reading | Exam #2

Textbooks

We will be using the textbook Distributed and Cloud Computing: Clusters, Grids, Clouds, and the Future Internet (DCC) by Kai Hwang, Jack Dongarra & Geoffrey C. Fox (Required). This is the most modern book about distributed systems I have found. Some of the fundamental topics in this book are not covered in enough detail, so for some topics, we will use another textbook, Andrew S. Tanenbaum and Maarten van Steen. “Distributed Systems: Principles and Paradigms” (DSPD), Prentice Hall, 2nd Edition, 2007 (Optional). I encourage you to buy both textbooks as they are both excellent, but if you have to choose just one, please buy the first (DCC); the necessary reading material from the optional textbook will be provided to students in class.

Prerequisites
Systems Programming (CS351) or Operating Systems (CS450)

Mailing lists
There is a course mailing list; you can send mail to the list by sending email to cs495-f12@datasys.cs.iit.edu. Please see http://datasys.cs.iit.edu/mailman/listinfo/cs495-f12 for more information about the course mailing list.

Projects:

There will be 6 projects throughout the semester, each worth 10% of the total grade. The projects will be completed in teams of 2 students. The first project will be hands-on, while the others will be primarily programming projects. Some projects will require knowledge of Java, while others will require knowledge of C and/or C++. It is expected that students know the basics of both of these languages.

Late Policy:

Projects will be due at 11:59PM on the due date, through BlackBoard. There will be a 15 minute grace period. Any submission made after the grace period will be penalized 10% for each day it is late.

Exams:

There will be 2 exams, one covering the material from the first half of the class, and the second covering the material from the second half. The exams will be individual, but students will be allowed to use their textbooks and any notes they have (on paper). No electronic devices such as phones, eReaders, tablets, or laptops will be allowed. Simple calculators can be used.

The exams are scheduled on:

  • 10-09-2012 from 11:25AM - 1:25PM in SB239

  • 11-29-2012 from 11:25AM - 1:25PM in SB239

Please note that they extend for 45 minutes after the usual end of class, but this should not interfere with anyone's other classes due to the lunch period.

There will be no makeup exams.

Grading Policies:

  • Projects (6): 60%
  • Exam (2): 40%

The following grading scale will be used. The scale will be adjusted downwards based on the overall performance of the entire class. In my classes, the class average score typically falls in the B-grade range.

  • A: 90% ~ 100%

  • B: 80% ~ 89%

  • C: 70% ~ 79%

  • D: 60% ~ 69%

  • E: 0% ~ 59%
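
For students who want to estimate their standing, the weights and the unadjusted scale above can be sketched as a small calculation. This is only an illustration, not an official grading tool. It assumes that the 40% exam weight is split evenly between the two exams, that all scores are percentages in the 0-100 range, and that the letter cutoffs sit exactly at 90, 80, 70, and 60 with no rounding; the function names are made up for this example. It ignores the downward adjustment of the scale and any late penalties.

```python
# Weights in percentage points, taken from the grading policy:
# 6 projects at 10% each and 2 exams totaling 40%.
PROJECT_WEIGHT = 10
EXAM_WEIGHT = 20  # assumption: the 40% is split evenly between the two exams

# Lower bounds of the unadjusted scale, highest first.
# Assumption: cutoffs are exact, with no rounding of fractional scores.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def final_percentage(project_scores, exam_scores):
    """Return the weighted course percentage from per-item scores (0-100)."""
    if len(project_scores) != 6 or len(exam_scores) != 2:
        raise ValueError("expected 6 project scores and 2 exam scores")
    weighted = PROJECT_WEIGHT * sum(project_scores) + EXAM_WEIGHT * sum(exam_scores)
    return weighted / 100


def letter_grade(percentage):
    """Map a course percentage to a letter on the unadjusted scale."""
    for cutoff, letter in GRADE_CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "E"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    total = final_percentage([90, 85, 100, 70, 95, 80], [75, 85])
    print(total, letter_grade(total))
```

Because the scale is adjusted downwards based on class performance, the letter computed this way is a lower bound on the final letter grade under these assumptions.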

Next Semester Spring 2013

CS553: Cloud Computing