Cobalt: A High Performance, Multi-Dimensional Batch Scheduler for Pre-exascale and Beyond Systems

Batch scheduling is crucial to high-performance computing (HPC) for efficient application execution and resource utilization. However, HPC systems and applications are undergoing significant changes, and current batch scheduling approaches can no longer keep up because of ever-growing system scale and increasingly diverse workload requirements. Future batch schedulers will need to place greater emphasis on diverse workloads, on-demand availability, flexible resource sharing, and fine-grained resource needs. These needs are expressed not just as a one-size-fits-all number of cores and time duration, but also as multi-aspect requirements on communication, I/O, power, network bandwidth, etc.

The goal of this project is to add multi-dimensional scheduling capabilities to the Cobalt scheduler at ALCF. The dimensions include memory resources (e.g., on-chip and off-chip RAM, external RAM/NVRAM), network resources, I/O burst buffer resources, power, and possibly other resources. We will develop a general framework that can dynamically analyze platform state and application requirements, and adaptively make runtime decisions for job scheduling and resource allocation. Moreover, we intend to make the integration of future dimensions into Cobalt easier and more consistent. Areas of research include advanced learning techniques; methods for users and the scheduler to communicate and assess multi-aspect requirements; scheduling policies that support flexible, dynamic, and multi-dimensional resource constraints; and a more tightly coupled scheduling simulator to facilitate decision making and policy evaluation.
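To make the idea of a multi-dimensional resource constraint concrete, here is a minimal sketch of how a scheduler might check a job's request against platform state along several dimensions at once. It is not Cobalt code and does not reflect Cobalt's API: the dimension names, the `Job` and `Platform` classes, and the simple in-order greedy policy are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical resource dimensions, chosen for illustration only.
DIMENSIONS = ("nodes", "burst_buffer_gb", "power_kw")


@dataclass
class Job:
    name: str
    request: dict  # dimension name -> amount requested


@dataclass
class Platform:
    free: dict  # dimension name -> amount currently available

    def fits(self, job: Job) -> bool:
        # A job is feasible only if every dimension has enough free capacity.
        return all(
            job.request.get(d, 0) <= self.free.get(d, 0) for d in DIMENSIONS
        )

    def allocate(self, job: Job) -> None:
        # Deduct the job's request from the free capacity in each dimension.
        for d in DIMENSIONS:
            self.free[d] = self.free.get(d, 0) - job.request.get(d, 0)


def schedule(queue: list, platform: Platform) -> list:
    """Walk the queue in order and start every job that fits in all dimensions."""
    started = []
    for job in queue:
        if platform.fits(job):
            platform.allocate(job)
            started.append(job.name)
    return started


platform = Platform({"nodes": 100, "burst_buffer_gb": 500, "power_kw": 50})
queue = [
    Job("a", {"nodes": 60, "burst_buffer_gb": 100, "power_kw": 30}),
    Job("b", {"nodes": 30, "burst_buffer_gb": 100, "power_kw": 30}),
    Job("c", {"nodes": 40, "burst_buffer_gb": 300, "power_kw": 10}),
]
started = schedule(queue, platform)
```

In this example job "b" has enough nodes and burst buffer but exceeds the remaining power budget, so it is skipped while "c" starts. A scheduler that considers only node count would have started "b" instead, which is the kind of decision the additional dimensions are meant to change.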

  • Zhiling Lan at Illinois Tech
  • Yuping Fan (PhD student)
  • Bill Allcock at ALCF
  • Paul Rich at ALCF
  • Mike Papka at ALCF

  • Cobalt Scheduler at ALCF

    Key Publications:
  • S. Wallace, X. Yang, V. Vishwanath, W. Allcock, S. Coghlan, M. Papka, and Z. Lan, "A Data Driven Scheduling Approach for Power Management on HPC Systems", Proc. of SC16 (acceptance rate is 18%), 2016. [PDF]
  • Y. Fan, P. Rich, W. Allcock, M. Papka, and Z. Lan, "Trade-off Between Prediction Accuracy and Underestimation Rate in Job Runtime Estimates", Proc. of IEEE Cluster'17 (acceptance rate is 21.8%), 2017. [PDF]
  • W. Allcock, P. Rich, Y. Fan, and Z. Lan, "Experience and Practice of Batch Scheduling on Leadership Supercomputers at Argonne", Proc. of the 21st Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), 2017. [PDF]

  • Workload traces and RAS logs from Intrepid and Mira at ALCF [Link].
    Note: For the use of the logs, please acknowledge the Argonne Leadership Computing Facility and cite the following paper:
    W. Allcock, P. Rich, Y. Fan, and Z. Lan, "Experience and Practice of Batch Scheduling on Leadership Supercomputers at Argonne", Proc. of the 21st Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), held in conjunction with IPDPS'17, 2017. [PDF]

    Dr. Zhiling Lan (lan AT iit DOT edu)

    This project is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. Note: Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DOE.