Ioan Raicu

Illinois Institute of Technology

Argonne National Laboratory

CS554: Data-Intensive Computing


Project Ideas

Please read all of the following documents to get ideas for viable projects. I encourage all of you to choose one of the following projects. I will allow other project ideas, but you must discuss them with me before writing them up (to make sure both relevance and difficulty are appropriate). I encourage groups of 2 for these projects, although some projects might work with 1 person, and some might require 3 people because of their ambitious nature. In certain cases, I will allow more than 3 students in a group, but this depends heavily on the project and the nature of the work.

Fabriq:Bench Benchmarking mainstream Distributed Message Queues
Fabriq:Sched Leveraging Fabriq as a building block for Distributed Scheduling with CloudKon+
FusionFS:CKPT Efficient Checkpointing with Distributed File Systems
FusionFS:Hadoop Improving Hadoop through FusionFS
FusionFS:Lib Improving FusionFS Performance through User-level Library Interfaces
FusionFS:Sched Data-Aware Scheduling for Distributed File Systems
GeMTC:Hadoop Supporting Hadoop Applications on Accelerators
GeMTC:MIC Supporting MTC Applications on Intel Xeon Phi Many-Core Accelerators
GeMTC:Mon Monitoring GeMTC on NVIDIA Hardware
MATRIX:Bench Benchmarking the state-of-the-art Task Execution Frameworks of Many-Task Computing
MATRIX:Hadoop Improving Hadoop Performance and Scalability through Distributed Data-Aware Scheduling
MATRIX:HadoopSim Understanding the Scalability of the Hadoop Framework through Simulations
MATRIX:Swift Accelerating Swift Many-Task Computing Applications through Distributed Data-Aware Scheduling
NET:MPNet Improving Network Throughput through Multipath Network Routing Systems
NET:MPSim Improving Network Throughput through Multipath Network Routing Simulations
OS:DistOS Light-weight Distributed Operating Systems
OS:Power Understanding power management and its impact in modern Intel Haswell Processors
Slurm:Bench Benchmarking of the state-of-the-art Job Management Systems for High Performance Computing
Slurm:Data Exploring Data-Aware HPC Scheduling with Slurm
ZHT:Bench Benchmarking mainstream NoSQL databases
ZHT:Graph Design and implement a graph database on top of ZHT
ZHT:Mon Investigation of the Communication Overheads for Distributed Monitoring with Aggregation Tree
ZHT:YCSB Yahoo Cloud Serving Benchmark on ZHT

Next Semester Fall 2015