Ioan Raicu

Illinois Institute of Technology

Argonne National Laboratory

CS554: Data-Intensive Computing

Semester: Spring 2015

Lecture Time: Monday/Wednesday, 11:25AM - 12:40PM

Lecture Location: Stuart Building 113

Professor: Dr. Ioan Raicu (iraicu@cs.iit.edu)

Office Hours Time: Wednesday 12:45PM-1:45PM

Office Hours Location: Stuart Building 237D

Teaching Assistant: Ke Wang (kwang22@hawk.iit.edu)

Office Hours Time: Monday 10:15AM-11:15AM

Office Hours Location: Stuart Building 006

Teaching Assistant: Tonglin Li (tli13@iit.edu)

Office Hours Time: Tuesday 12:45PM-1:45PM

Office Hours Location: Stuart Building 006

Teaching Assistant: Dongfang Zhao (dzhao8@hawk.iit.edu)

Office Hours Time: Thursday 12:45PM-1:45PM

Office Hours Location: Stuart Building 006

 

 

This course is a tour through various research topics in distributed data-intensive computing, covering cluster computing, grid computing, supercomputing, and cloud computing. We will explore solutions and learn design principles for building large network-based computational systems to support data-intensive computing. This course is geared toward junior/senior level undergraduates and graduate students in computer science. Prerequisites: CS450; in addition, one or more of the following courses is recommended: CS451, CS546, CS550, CS552, CS553, or CS570.

We will be using Piazza to facilitate course discussions, at http://piazza.com/iit/spring2015/cs554/home 

To highlight some of the best projects from the class this year (11 of the 27 projects), I have posted some of the final reports below (for a complete list of project titles and students, click here):

  1. Arvind Shekar, Arihant Raj Nagarajan, Itua Ijagbone, Shivakumar Vinayagam. "Distributed Scheduling and monitoring service leveraging FaBRiQ as a building block for CloudKon+", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  2. Antonios Kougkas. "A Decoupled Execution Paradigm Programming Model", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  3. Kevin Brandstatter. "FusionFS: Enabling Distributed Indexing And Text Search", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  4. Tonglin Li, Chaoqi Ma, Jiabao Li, Ioan Raicu. "ZHT+: A Graph Database On ZHT", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  5. Alekya Thalari, Krishnaja Kethireddy, Nirmal Kumar Ravi, Prathamesh Mantri. "T-FUSE: Improving Hadoop through FusionFS", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  6. Vivek Viswanathan. "Hadoop Mapreduce OpenCL Plugin", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  7. Eric Faurie, Chaitanya Reddy Chatla. "JFusionFS: A Java Implementation of FusionFS", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  8. Thomas Dubucq, Tony Forlini, Virgile Landeiro Dos Reis, and Isabelle Santos. "MATRIX: Bench - Benchmarking the state-of-the-art Task Execution Frameworks of Many-Task Computing", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  9. Karl Stough, Serapheim Dimitropoulos, Poornima Nookala. "Evaluating the Support of MTC Applications on Intel Xeon Phi Many-Core Accelerators", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  10. Sughosh Divanji, Raghav Kapoor, Dongfang Zhao, Ioan Raicu. "PVFS simulation using CODES/ROSS simulator", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
  11. Gagan Munisiddha Gowda, Benjamin L. Miwa, Anirudh Sunkineni. "ZHT+: Design and Implementation of a Graph Database Using ZHT", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015

 

Schedule

Date Lecture Topic Reading (To be completed by posted date) Assignments
01-12-2015 Syllabus (Slides, PDF)    
01-14-2015 Introduction to Distributed Systems (Slides)    
01-19-2015 NO CLASS    
01-21-2015 Introduction to Distributed Systems 1. Foreword, by Gordon Bell
2. Jim Gray on eScience: A Transformed Scientific Method 
Quiz#1
01-26-2015 Introduction to Data-Intensive Distributed Computing (Slides)    
01-28-2015 ZHT: the Zero-hop Distributed Hash Table (Slides) -- Tonglin Li ZHT: A Light-weight Reliable Persistent Dynamic Scalable Zero-hop Distributed Hash Table, IEEE IPDPS 2013  
02-02-2015 FusionFS: the Fusion Distributed File System (Slides) -- Dongfang Zhao FusionFS: Towards Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems, IEEE BigData 2014

Optional:
Virtual Chunks: On Supporting Random Accesses to Scientific Data in Compressible Storage Systems, IEEE BigData 2014

HyCache+: Towards Scalable High-Performance Caching Middleware for Parallel File Systems, IEEE/ACM CCGrid 2014

Distributed Data Provenance for Large-Scale Data-Intensive Computing, IEEE Cluster 2013

Towards High Performance Key-Value Stores through GPU-Accelerated Coding (see BB)
 
02-04-2015 MATRIX: a Many-Task Computing Execution Fabric (Slides) -- Ke Wang Distributed Load-Balancing with Adaptive Work Stealing for Many-Task Computing on Billion-Core Systems (see BB)

Optional:
Optimizing Load Balancing and Data-Locality with Data-aware Scheduling, IEEE BigData 2014

SimMatrix: Simulator for MAny-Task computing execution fabRIc at eXascales, ACM HPC 2013
 
02-09-2015 Slurm++: a Distributed Workload Manager for High-Performance Computing (Slides) -- Ke Wang Slurm++: a Distributed Workload Manager for Extreme-Scale High-Performance Computing Systems (see BB)

Optional:
Next Generation Job Management Systems for Extreme Scale Ensemble Computing, ACM HPDC 2014 
 
02-11-2015 FaBRiQ: a Distributed Message Queuing System (Slides) -- Iman Sadooghi FaBRiQ: Leveraging Distributed Hash Tables towards Distributed Publish-Subscribe Message Queues (see BB)

Optional:
Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues, SCRAMBL 2014
Quiz#2
02-16-2015 GeMTC: ManyGPU-enabled Many-Task Computing (Slides) Design and Evaluation of the GeMTC Framework for GPU-enabled Many-Task Computing, ACM HPDC 2014  
02-18-2015 GeMTC: ManyGPU-enabled Many-Task Computing  Project Brainstorming Writeups Project Proposal Writeup
02-23-2015 CloudKon: a Cloud enabled Distributed tasK executiON framework (slides) -- Iman Sadooghi Achieving Efficient Distributed Scheduling with Message Queues in the Cloud for Many-Task Computing and High-Performance Computing, IEEE/ACM CCGrid 2014  
02-25-2015 Project Brainstorming (slides) -- Dongfang Zhao, Tonglin Li, Ke Wang, Iman Sadooghi    
03-02-2015 Project Brainstorming (slides)    
03-04-2015 Project Brainstorming    
03-06-2015     Group formation Due Project Proposal Due Quiz#3
03-09-2015 MapReduce (Slides) MapReduce: Simplified Data Processing on Large Clusters

Optional
MapReduce: a flexible data processing tool

Apache Hadoop YARN: yet another resource negotiator  

Google’s MapReduce programming model — Revisited
 
03-11-2015 Swift Workflow System (Slides) Swift/T: Large-scale Application Composition via Distributed-memory Dataflow Processing

Optional
Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications

Compiler Techniques for Massively Scalable Implicit Task Parallelism

Swift: A language for distributed parallel scripting
 
03-16-2015 NO CLASS (Spring Break)    
03-18-2015 NO CLASS (Spring Break)    
03-23-2015 Swift Workflow System    
03-25-2015 Swift Workflow System    
03-30-2015 A Berkeley View of Resource Management (Spark, Mesos, RDD, Shark, Sparrow) (Slides #1, Slides #2) Sparrow: distributed, low latency scheduling

Optional
Spark: cluster computing with working sets

Mesos: A platform for fine-grained resource sharing in the data center

Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing

Shark: fast data analysis using coarse-grained distributed memory
Project Midterm Report Writeup
04-01-2015  A Berkeley View of Resource Management   Quiz#4
Project Final Report Writeup
04-06-2015 Parallel File Systems (Slides #1, Slides #2, Slides #3) I/O Performance Challenges at Leadership Scale

Optional
GPFS: A Shared-Disk File System for Large Computing Clusters (PDF)

PVFS: A Parallel File System for Linux Clusters (PDF)

Lustre: Building a File System for 1,000-node Clusters (PDF)

Scalable Performance of the Panasas Parallel File System (PDF)
Project Midterm Progress Report Due
04-08-2015 Distributed File Systems (Slides) The Google File System Emulated PC Meeting Instructions
04-13-2015 Distributed File Systems (Slides #1, Slides #2) Ceph: A Scalable, High-Performance Distributed File System

Optional
Ceph as a scalable alternative to the Hadoop Distributed File System
 
04-15-2015 Distributed Databases Hive - a petabyte scale data warehouse using Hadoop

Optional 
Pig Latin: a not-so-foreign language for data processing

Dremel: interactive analysis of web-scale datasets

Spanner: Google's Globally-Distributed Database
 
04-20-2015 Emulated PC Meeting    
04-22-2015 Emulated PC Meeting   Quiz#5
04-27-2015 8AM-8PM NO CLASS (everyone attending GCASR 2015 at UIC)    
04-29-2015 10AM-4:30PM Final Presentations    
05-04-2015 NO CLASS   Project Final Reports Due


Next Semester Fall 2015

CS550