
Current Projects


Hermes: Extending the HDF Library to Support Intelligent I/O Buffering for Deep Memory and Storage Hierarchy System

Modern high-performance computing (HPC) applications generate massive amounts of data. However, the performance of disk-based storage systems has improved much more slowly than that of memory, creating a significant Input/Output (I/O) performance gap. To narrow this gap, storage subsystems are undergoing extensive changes, adopting new technologies and adding more layers to the memory/storage hierarchy. With a deeper memory hierarchy, the data movement complexity of memory systems increases significantly ...
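The hierarchical buffering idea can be illustrated with a toy sketch: writes go to the fastest tier with free capacity and spill toward slower tiers as capacities fill. The tier names and capacities below are hypothetical, not Hermes's actual design.

```python
class TieredBuffer:
    def __init__(self):
        # Fastest tier first; capacities in "blocks" (illustrative numbers only).
        self.tiers = [("DRAM", 2, []), ("NVMe", 4, []), ("Disk", 10**9, [])]

    def put(self, block):
        # Place the block in the fastest tier that still has room,
        # spilling toward slower tiers as faster ones fill up.
        for name, capacity, data in self.tiers:
            if len(data) < capacity:
                data.append(block)
                return name
        raise RuntimeError("all tiers full")

buf = TieredBuffer()
placements = [buf.put(f"block{i}") for i in range(7)]
print(placements)  # first two land in DRAM, next four in NVMe, the rest on Disk
```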


IRIS: I/O Redirection Via Integrated Storage

There is an ocean of storage solutions available in modern high-performance and distributed systems. These solutions include Parallel File Systems (PFS) for traditional high-performance computing (HPC) systems and Object Stores for emerging cloud environments. More often than not, these storage solutions are tied to specific APIs and data models, and thus bind developers, applications, and entire computing facilities to certain interfaces. Each storage system is designed and optimized for certain applications but does not perform well for others. Furthermore, modern applications have become increasingly complex, consisting of a collection of phases with different...


Empower Data-Intensive Computing: the integrated data management approach

From the system point of view, there are two types of data: observational data, collected by electronic devices such as sensors, monitors, and cameras; and simulation data, generated by computation. In general, the latter is used in traditional scientific high-performance computing (HPC) and requires strong consistency for correctness. The former is common in newly emerged big data applications and does not require strong consistency. This difference in consistency leads to two kinds of file systems: data-intensive distributed file systems, represented by the Google File System (GFS) and the MapReduce-oriented Hadoop Distributed File System (HDFS)...


Utilizing Memory Parallelism for High Performance Data Processing

While advances in microprocessor design continue to increase computing speed, improvements in the data access speed of computing systems lag far behind. At the same time, data-intensive large-scale applications, such as information retrieval, computer animation, and big data analytics, are emerging. Data access delay has become the critical performance bottleneck of modern high-performance computing (HPC). Memory concurrency exists at each layer of modern memory hierarchies; however, conventional computing systems are primarily designed to improve CPU utilization and have inherent limitations in addressing ...


DEP: A Decoupled Execution Paradigm for Data-intensive High-End Computing

Large-scale applications in critical areas of science and technology have become more and more data intensive. I/O has become a vital performance bottleneck of modern high-end computing (HEC) practices. Conventional HEC execution paradigms, however, are computing-centric: they are designed to maximize CPU performance for computation-intensive applications and have inherent limitations in addressing the newly emerged data access and data management issues of HEC. In this project, we propose an innovative decoupled execution paradigm (DEP) based on the notion of separating computing-intensive and data-intensive operations....


Application-Specific Optimization via Server Push I/O Architecture 

As modern multicore architectures put ever more pressure on sluggish memory systems, computer applications become increasingly data intensive. Advanced memory hierarchies and parallel file systems have been developed in recent years; however, they deliver high performance only for well-formed data streams and fail to meet more general demands. I/O has become a crucial performance bottleneck of high-end computing (HEC), especially for data-intensive applications. New mechanisms and new I/O architectures need to be developed to solve the 'I/O-wall' problem. We propose a new I/O architecture for HEC. Unlike traditional I/O designs ...


FENCE: Fault awareness ENabled Computing Environment

Modern high-end computing (HEC) systems are powerful but unprecedentedly complex. Unfortunately, the more complex a system is, the less reliable it is. With HEC systems now designed with thousands of processors, fault tolerance has become a timely and important issue. The conventional solution, checkpointing, achieves fault tolerance by periodically saving system snapshots to disk or memory. With the growing gap between processor speed and data access speed, such a reactive approach can further widen the disparity between sustained performance and peak performance in HEC. Recently, the proactive fault tolerance approach has been proposed...
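The reactive checkpoint-and-restart idea described above can be sketched minimally: serialize application state at fixed intervals so execution can resume from the last snapshot after a failure. The file name, interval, and state layout here are illustrative only, not FENCE's actual mechanism.

```python
import os
import pickle
import tempfile

CKPT = os.path.join(tempfile.gettempdir(), "demo.ckpt")

def checkpoint(state):
    # Save a snapshot of the application state to stable storage.
    with open(CKPT, "wb") as f:
        pickle.dump(state, f)

def restore():
    # Reload the most recent snapshot after a failure.
    with open(CKPT, "rb") as f:
        return pickle.load(f)

state = {"step": 0, "total": 0}
for step in range(1, 11):
    state["step"], state["total"] = step, state["total"] + step
    if step % 5 == 0:  # checkpoint every 5 steps
        checkpoint(state)

# Simulate a crash after step 10 and restart from the last snapshot.
recovered = restore()
print(recovered)  # {'step': 10, 'total': 55}
```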


Multicore Scheduling and Data Access

The multicore microprocessor has fundamentally changed the landscape of (single-machine) computing by bringing task-level parallel processing into a single processor. On one hand, it further enlarges the performance gap between data processing and data access. On the other hand, it calls for a rethinking of system design to exploit the potential of the multicore architecture. We believe the key to utilizing multicore microprocessors is reducing data access delay. We rethink system scheduling and support from the viewpoint of data access in multicore environments. To this end, in this research we focus on the design and development of multicore-specific system support and mechanisms to reduce data access delay...


Empowering Data Management, Diagnosis, and Visualization of Cloud-Resolving Models by Cloud Library upon Spark and Hadoop

In the age of big data, scientific applications are generating large volumes of data, leading to an explosion in the requirements and complexity of processing them. In High Performance Computing (HPC), data management is traditionally supported by Parallel File Systems (PFS), such as Lustre, PVFS2, and GPFS. In big data environments, general-purpose analysis frameworks like MapReduce and Spark are popular and highly available, with data storage supported by distributed file systems such as the Hadoop Distributed File System...
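The MapReduce model behind these frameworks can be shown in a framework-free sketch: map each record to (key, value) pairs, shuffle the pairs by key, then reduce each group. This is a word count for illustration; Hadoop and Spark apply the same pattern across a cluster.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit a (word, 1) pair for every word in every record.
    for line in records:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    # Shuffle: group all values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's values.
    return {key: sum(values) for key, values in groups.items()}

records = ["cloud model data", "cloud data"]
counts = reduce_phase(shuffle(map_phase(records)))
print(counts)  # {'cloud': 2, 'model': 1, 'data': 2}
```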

Past Projects


Grid Harvest Service (GHS) 

Rapid advances in communication technology have changed the landscape of computing. New models of computing, such as business-on-demand, Web services, peer-to-peer networks, Grid computing, and Cloud computing, have emerged to harness distributed computing and network resources and provide powerful services. In such computing platforms, resources are shared and are likely to be remote, and therefore out of the user's control. Consequently, the resource availability seen by each user varies significantly over time due to resource sharing, system configuration changes, potential...


Workflow Research Project

Workflow management is a new area of distributed computing. It shares many characteristics with business workflow. However, with thousands of processes running in coordination in a widely distributed, shared network environment, the workflow of distributed computing is much more complex than conventional business workflow. Workflow supports task scheduling but is more than task scheduling. From the viewpoint of computing services, any user request is a service, which can be decomposed into a series of known basic services. These basic services may have inherent control...


Pervasive Computing

Computing aims to make human life easier, but in today's computing users must adapt to the computer rather than the other way around. With advances in mobile computing and wireless communication technologies, pervasive computing has emerged as a feasible approach to human-centered computing. Pervasive computing creates a ubiquitous environment that combines processors and sensors with network technologies (wireless and otherwise) and intelligent software to create an immersive environment that improves life, in which computers become an embedded ...

Dynamic Virtual Machine Project

The DVM system is a prototype middleware that provides applications with a secure, stable, and specialized computing environment in cyberspace. It encapsulates computational resources in a secure, isolated, and customized virtual machine environment, and enables transparent service mobility and service provisioning. In this research, a computing environment is modeled as a DVM, an abstract virtual machine, and is incarnated automatically on various virtualization platforms. To migrate a virtual machine, the DVM system needs to collect the runtime state of the VM. Its communication must be kept alive...

DistDLB: Dynamic Load Balancing of Scientific Applications on Parallel and Distributed Systems

Large-scale simulation is an important method in scientific and engineering research, complementing theory and experiment, and has been widely used to study complex phenomena in many disciplines. Many large-scale applications are adaptive in that their computational load varies throughout execution, causing uneven distribution of the workload at run-time. Dynamic load balancing (DLB) of adaptive applications involves efficiently partitioning the application and then migrating ...
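The partition-and-migrate step can be sketched with a simple greedy heuristic: compare each processor's load to the average and plan transfers from overloaded to underloaded processors. This is an illustration of the general idea, not DistDLB's actual algorithm.

```python
def rebalance(loads):
    """Return a list of (src, dst, amount) migrations that even out loads."""
    avg = sum(loads) // len(loads)
    # Processors above/below the average, with their surplus/deficit.
    surplus = [[i, load - avg] for i, load in enumerate(loads) if load > avg]
    deficit = [[i, avg - load] for i, load in enumerate(loads) if load < avg]
    moves = []
    while surplus and deficit:
        src, dst = surplus[0], deficit[0]
        amount = min(src[1], dst[1])          # move as much as both sides allow
        moves.append((src[0], dst[0], amount))
        src[1] -= amount
        dst[1] -= amount
        if src[1] == 0:
            surplus.pop(0)
        if dst[1] == 0:
            deficit.pop(0)
    return moves

print(rebalance([10, 2, 6, 2]))  # [(0, 1, 3), (0, 3, 2), (2, 3, 1)] -> all loads become 5
```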

Highly Accurate PArallel Numerical Simulations (HAPANS)

In this interdisciplinary research, we study scalable parallel algorithms and simulations based on new mathematical developments in adaptive wavelet methods. The driving force of this research is the need for next-generation large-scale simulation capability, as required by the intrinsic physical properties of industrial applications. Recent advances in both parallel wavelet methods and scalable parallel algorithms on new computer architectures have made such an endeavor a more realistic task. These ongoing research activities combine the...

High Performance Computing Mobility (HPCM) middleware

Mobility is a primary functionality of next-generation computing. Intensive research has been done in recent years on mobile agents and mobile computing...

Virtual Collaboratory for Numerical Simulation (VCNS) 

A suite of software systems, communication protocols, and tools that enable computer-based cooperative work. It is a sister project of SNOW.


Stuart Building 
Room 112i, Room 010
10 W. 31st Street
Chicago, Illinois 60616


Phone: +1 312 567 6885