SCS | Hermes Project

Modern high-performance computing (HPC) applications generate massive amounts of data. However, the performance of disk-based storage systems has improved much more slowly than that of memory, creating a significant Input/Output (I/O) performance gap. To reduce this gap, storage subsystems are undergoing extensive changes, adopting new technologies and adding more layers to the memory/storage hierarchy. With a deeper memory hierarchy, data movement becomes significantly more complex, making it harder to realize the potential of the deep memory and storage hierarchy (DMSH) design.

As we move towards the exascale era, the I/O bottleneck is a performance problem that the HPC community must solve. DMSHs with multiple levels of memory/storage layers offer a feasible solution but are very complex to use effectively. Ideally, the presence of multiple layers of storage should be transparent to applications without sacrificing I/O performance. There is a need to enhance and extend current software systems to support data access and movement transparently and effectively under DMSHs.

Hierarchical Data Format (HDF) technologies are a set of current I/O solutions addressing the problems of organizing, accessing, analyzing, and preserving data. The HDF5 library is widely popular within the scientific community. Among the high-level I/O libraries used in DOE labs, HDF5 is the undeniable leader with 99% of the share. HDF5 addresses the I/O bottleneck by hiding the complexity of performing coordinated I/O to single, shared files, and by encapsulating general-purpose optimizations. While HDF technologies, like other existing I/O middleware, are not designed to support DMSHs, the wide popularity and middleware nature of HDF5 make it an ideal candidate to enable, manage, and supervise I/O buffering under DMSHs.

Why Hermes?

Hermes will advance HDF5 core technology by developing new buffering algorithms and mechanisms to support:

01.

Vertical and Horizontal Buffering in DMSHs

Here, vertical means accessing data to/from different levels of the hierarchy locally, and horizontal means spreading/gathering data across remote compute nodes.

02.

Selective Buffering via HDF5

Here, selective means that some memory layers, e.g., NVMe, are used only for selected data.

03.

Dynamic Buffering via Online System Profiling

The buffering schema can be changed dynamically based on messaging traffic.

04.

Adaptive Buffering via Reinforcement Learning

By learning the application's access pattern, we can adapt prefetching algorithms and cache replacement policies at runtime.
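The vertical and horizontal data paths described in item 01 can be illustrated with a small placement routine. This is a minimal sketch under stated assumptions, not the Hermes implementation: the types `NodeBuffers` and `Placement` and the functions `first_fit_layer` and `place` are hypothetical names, and the policy (fill the local node top-down first, then fall back to peers) is only one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model: free capacity (bytes) of each buffering layer on one
// node, ordered fastest layer first (e.g. RAM, then NVMe).
struct NodeBuffers {
  std::vector<std::size_t> free_bytes;
};

struct Placement {
  std::size_t node;   // index into the cluster vector
  std::size_t layer;  // 0 = fastest layer on that node
};

// Vertical search: the first (fastest) layer on a node that can hold the data.
inline std::optional<std::size_t> first_fit_layer(const NodeBuffers& n,
                                                  std::size_t bytes) {
  for (std::size_t i = 0; i < n.free_bytes.size(); ++i) {
    if (n.free_bytes[i] >= bytes) return i;
  }
  return std::nullopt;
}

// Try the local node vertically; if no local layer fits, go horizontal and
// try the other nodes in index order. Reserves the space on success.
inline std::optional<Placement> place(std::vector<NodeBuffers>& cluster,
                                      std::size_t local, std::size_t bytes) {
  assert(local < cluster.size());
  auto reserve = [&](std::size_t node) -> std::optional<Placement> {
    if (auto layer = first_fit_layer(cluster[node], bytes)) {
      cluster[node].free_bytes[*layer] -= bytes;
      return Placement{node, *layer};
    }
    return std::nullopt;
  };
  if (auto p = reserve(local)) return p;  // vertical path
  for (std::size_t node = 0; node < cluster.size(); ++node) {
    if (node == local) continue;
    if (auto p = reserve(node)) return p;  // horizontal path
  }
  return std::nullopt;  // nothing fits anywhere
}
```

A request that fits in a local layer stays on the node; one that does not is sent to the first peer with room. A real policy would likely weigh network cost against local layer speed instead of always preferring local layers.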

Hermes Architecture

Hermes machine model

A large amount of RAM, local NVMe and/or SSD devices, shared burst buffers, and a remote disk-based parallel file system (PFS).

Two data paths

Vertical -> within node
Horizontal -> across nodes

Fully distributed

Fully scalable deployment on distributed clusters, consisting of node-local and remote shared storage layers.

Hierarchy based on

Access Latency
Data Throughput
Capacity.
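One way to picture a hierarchy ordered by these three properties is a list of tier descriptors sorted by latency, then throughput, then capacity. The sketch below is illustrative only: the `Tier` struct, `order_hierarchy`, and `example_machine` are hypothetical names, and every number is a made-up placeholder, not a measurement of any real device.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical descriptor for one layer of the machine model.
struct Tier {
  std::string name;
  double latency_us;       // access latency (placeholder value)
  double throughput_gbps;  // data throughput (placeholder value)
  double capacity_gb;      // capacity (placeholder value)
};

// Order tiers: lower latency first, then higher throughput, then larger
// capacity as the final tie-breaker.
inline void order_hierarchy(std::vector<Tier>& tiers) {
  std::sort(tiers.begin(), tiers.end(), [](const Tier& a, const Tier& b) {
    return std::make_tuple(a.latency_us, -a.throughput_gbps, -a.capacity_gb) <
           std::make_tuple(b.latency_us, -b.throughput_gbps, -b.capacity_gb);
  });
}

// The four layers named in the machine model, listed out of order on purpose.
// All figures are illustrative placeholders.
inline std::vector<Tier> example_machine() {
  std::vector<Tier> tiers = {
      {"PFS", 5000.0, 1.0, 1000000.0},
      {"RAM", 0.1, 10.0, 128.0},
      {"BurstBuffer", 500.0, 2.0, 50000.0},
      {"NVMe", 20.0, 3.0, 1000.0},
  };
  assert(!tiers.empty());
  order_hierarchy(tiers);
  return tiers;
}
```

Calling `example_machine()` returns the tiers from fastest to slowest, so index 0 is the top of the hierarchy and the last element is the remote PFS.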

Hermes Objectives

01.

Being application- and system-aware

02.

Maximizing productivity

03.

Increasing resource utilization

H. Devarajan, A. Kougkas, H. Zheng, V. Vishwanath, X.-H. Sun "Stimulus: Accelerate Data Management for Scientific AI applications in HPC," The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID'22), May 16-19, 2022
H. Devarajan, H. Zheng, A. Kougkas, X.-H. Sun, V. Vishwanath "DLIO: A Data-Centric Benchmark for Scientific Deep Learning Applications," The 2021 IEEE/ACM International Symposium in Cluster, Cloud, and Internet Computing (CCGrid'21), May 17 - 20, 2021
J. Cernuda, H. Devarajan, L. Logan, K. Bateman, N. Rajesh, J. Ye, A. Kougkas, X.-H. Sun "HFlow: A Dynamic and Elastic Multi-Layered Data Forwarder," The 2021 IEEE International Conference on Cluster Computing (CLUSTER'21), September 7-10, 2021
N. Rajesh, H. Devarajan, J. Cernuda Garcia, K. Bateman, L. Logan, J. Ye, A. Kougkas, X.-H. Sun "Apollo: An ML-assisted Real-Time Storage Resource Observer," The 30th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'21), June 21-25, 2021
H. Devarajan, A. Kougkas, K. Bateman, X.-H. Sun "HCL: Distributing Parallel Data Structures in Extreme Scales," IEEE International Conference on Cluster Computing (CLUSTER'20), Sept. 14-17, 2020
H. Devarajan, A. Kougkas, and X-H Sun. "HReplica: A Dynamic Data Replication Engine with Adaptive Compression for Multi-Tiered Storage," 2020 IEEE International Conference on Big Data (Big Data), 2020, pp. 256-265, doi: 10.1109/BigData50022.2020.9378167.
H. Devarajan, A. Kougkas, and X-H Sun. "A Dynamic Multi-Tiered Storage System for Extreme Scale Computing," Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'20), 2020, Atlanta, Georgia.
H. Devarajan, A. Kougkas and X. Sun, "HFetch: Hierarchical Data Prefetching for Scientific Workflows in Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, 2020, pp. 62-72, doi: 10.1109/IPDPS47924.2020.00017.
H. Devarajan, A. Kougkas, L. Logan and X. Sun, "HCompress: Hierarchical Data Compression for Multi-Tiered Storage Environments," 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), New Orleans, LA, USA, 2020, pp. 557-566, doi: 10.1109/IPDPS47924.2020.00064.
A. Kougkas, H. Devarajan, and X-H Sun. "I/O acceleration via multi-tiered data buffering and prefetching," Journal of Computer Science and Technology 35.1 (2020): 92-120.
H. Devarajan, A. Kougkas, and X-H Sun. "HFetch: Hierarchical Data Prefetching in Multi-Tiered Storage Environments (Poster)," Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'19), 2019.
A. Kougkas, H. Devarajan, J. Lofstead, and X-H. Sun, "LABIOS: A Distributed Label-Based I/O System," In the 28th International Symposium on High-Performance Parallel and Distributed Computing (HPDC’19), Phoenix, USA, Jun. 2019.
A. Kougkas, H. Devarajan, J. Lofstead, and X-H. Sun, "Harmonia: An Interference-Aware Dynamic I/O Scheduler," In the 2018 IEEE International Conference on Cluster Computing (Cluster’18), Belfast, Sept. 2018, pp. 290-301.
A. Kougkas, H. Devarajan and X.-H. Sun, "Hermes: A Heterogeneous-Aware Multi-Tiered Distributed I/O Buffering System," In the 27th ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC’18), NY, USA, Jun. 2018, pp. 219-230.
H. Devarajan, A. Kougkas, P. Challa, X-H. Sun, "Vidya: Performing Code-Block I/O Characterization for Data Access Optimization," In the IEEE International Conference on High Performance Computing, Data, and Analytics 2018 (HiPC’18), Bengaluru, India, Dec. 2018.
H. Devarajan, A. Kougkas, X-H. Sun, "An Intelligent, Adaptive, and Flexible Data Compression Framework," In the 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid’19), Larnaca, Cyprus, May 2019.

Our Team

Dr. Xian-He Sun

Principal Investigator
Illinois Tech

Dr. Anthony Kougkas

Technical Lead
Illinois Tech

Gerd Heber

Co-Principal Investigator
HDF Group

Hermes

Background

Today's multi-tiered environments demonstrate:

Complex data placement among the tiers of a deep memory and storage hierarchy

Independent management of each tier of the DMSH

Deep memory and storage hierarchy (DMSH) systems require:

1

Efficient and transparent data movement through the hierarchy

2

New data placement algorithms

3

Effective memory and metadata management

4

An efficient communication fabric

Hermes Project Synopsis


A new, multi-tiered, distributed buffering platform that is:

Hierarchical

Dynamic

Modular

Flexible

Hermes Contributions


Hermes Node Design

01.

Dedicated core for Hermes

02.

RDMA-capable communication

03.

Can also be deployed in I/O Forwarding Layer

04.

Multithreaded Node Manager

Hermes Components

01.

Middleware library written in C++: link it with applications (i.e., re-compile or use LD_PRELOAD) to wrap I/O calls.

02.

Modular, extensible, performance-oriented.

03.

Will support: POSIX, HDF5 and MPI-IO.

04.

Hinting mechanism to pass user’s operations.
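The LD_PRELOAD route mentioned in item 01 relies on standard symbol interposition on Linux: a preloaded library defines a function with the same name as a libc call and forwards to the original through `dlsym(RTLD_NEXT, ...)`. The sketch below shows that general technique for `write` only. It is not Hermes code; the `sketch` namespace and its counters are hypothetical, and the point where a buffering layer would act is marked with a comment.

```cpp
// Build as a shared object (hypothetical file name), then preload it:
//   g++ -std=c++17 -shared -fPIC intercept.cpp -o libintercept.so
//   LD_PRELOAD=./libintercept.so ./app
// On older glibc versions, add -ldl when linking.
#include <dlfcn.h>
#include <unistd.h>

#include <atomic>
#include <cerrno>
#include <cstddef>

namespace sketch {
// Hypothetical bookkeeping, standing in for whatever a buffering layer tracks.
std::atomic<std::size_t> g_write_calls{0};
std::atomic<std::size_t> g_write_bytes{0};
}  // namespace sketch

// Same name and signature as the libc function, so the dynamic linker
// resolves the application's write() calls to this definition first.
extern "C" ssize_t write(int fd, const void* buf, size_t count) {
  using write_fn = ssize_t (*)(int, const void*, size_t);
  // Look up the next definition of write in link order (normally libc's).
  static write_fn real_write =
      reinterpret_cast<write_fn>(dlsym(RTLD_NEXT, "write"));

  sketch::g_write_calls.fetch_add(1);

  // A buffering layer could decide here whether to stage the data in a
  // faster tier instead of forwarding it immediately.

  if (real_write == nullptr) {
    errno = ENOSYS;
    return -1;
  }
  ssize_t written = real_write(fd, buf, count);
  if (written > 0) {
    sketch::g_write_bytes.fetch_add(static_cast<std::size_t>(written));
  }
  return written;
}
```

Because the application binary is untouched, this approach needs no re-compilation; re-linking against the library directly is the alternative the list mentions. A fuller wrapper would cover the other entry points an application may use (for example `read`, `open`, `close`, and their variants).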

Hermes Objectives

01.

Being application- and system-aware

02.

Maximizing productivity

03.

Increasing resource utilization

04.

Abstracting data movement

05.

Maximizing performance

06.

Supporting a wide range of scientific applications and domains



National Science Foundation
(NSF OCI-1835764)