
Current Publications

[2012]

  • W. Tang, N. Desai, V. Vishwanath, D. Buettner, and Z. Lan, "Multi-Domain Job Coscheduling on Leadership Computing Systems", Journal of Supercomputing (J Supercomput), 2012.
  • H. Song, Y. Yin, Y. Chen, X.-H. Sun, "Cost-intelligent Application-specific Data Layout Optimization for Parallel File Systems", Cluster Computing, pp. 1-14, Feb. 2012.
  • Y. Chen, H. Zhu, H. Jin, and X.-H. Sun, "Algorithm-level Feedback-controlled Adaptive Data Prefetcher: Accelerating Data Access for High-Performance Processors", accepted to appear in Parallel Computing (ParCo), 2012.
  • X.-H. Sun and D. Wang, "APC: A Performance Metric of Memory Systems", ACM SIGMETRICS Performance Evaluation Review, Volume 40, Issue 2, 2012.
  • J. He, J. Kowalkowski, M. Paterno, D. Holmgren, J. Simone, and X.-H. Sun, "Layout-aware Scientific Computing - A Case Study Using MILC", accepted to appear in a special issue of Journal of Computational Science.
  • H. Jin and X.-H. Sun, "Performance Comparison under Failures of MPI and MapReduce: An Analytical Approach", accepted to appear in 2nd International Workshop on Cloud Computing and Scientific Applications (CCSA), in conjunction with CCGrid 2012. Invited to a special issue of Future Generation Computer Systems (FGCS).
  • H. Song, H. Jin, J. He, X.-H. Sun and R. Thakur, "A Server-Level Adaptive Data Layout Strategy for Parallel File Systems", accepted to appear in the Proc. of 2012 International Workshop on High Performance Data Intensive Computing (HPDIC 2012), in conjunction with IEEE IPDPS 2012.
  • H. Jin, X. Yang, X.-H. Sun, and I. Raicu, "ADAPT: Availability-aware MapReduce Data Placement in Non-Dedicated Distributed Computing Environment", accepted to appear in Proc. of the 32nd International Conference on Distributed Computing Systems (ICDCS), 2012. (acceptance rate: 71/515=13%).
  • Y. Yu, D. Rudd, Z. Lan, N. Gnedin, A. Kravtsov, and J. Wu, "Improving Parallel IO Performance of Cell-based AMR Cosmology Applications", accepted to appear in Proc. of IPDPS'12, May, 2012.
  • H. Zou, X.-H. Sun, S. Ma, and X. Duan, "A Source-Aware Interrupt Scheduling for Modern Parallel I/O Systems", accepted to appear in Proc. of IEEE International Parallel and Distributed Processing Symposium (IPDPS'12), 2012.
  • H. Jin, T. Ke, Y. Chen and X.-H. Sun, "Checkpointing Orchestration: Toward a Scalable HPC Fault-Tolerant Environment", accepted to appear in Proc. of IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May, 2012. (acceptance rate: 83/302=27.5%).
  • Y. Yin, S. Byna, H. Song, X.-H. Sun, and R. Thakur, "Boosting Application-Specific Parallel I/O Optimization Using IOSIG", accepted to appear in Proc. of IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May, 2012. (acceptance rate: 83/302=27.5%).
  • R. Ge, X. Feng and X.-H. Sun, "SERA-IO: Integrating Energy Consciousness into Parallel I/O Middleware", accepted to appear in Proc. of IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), May, 2012. (acceptance rate: 83/302=27.5%).

[2011]

  • H. Song, Y. Yin, X.-H. Sun, R. Thakur, and S. Lang, "Server-Side I/O Coordination for Parallel File Systems", in Proc. of the ACM/IEEE SuperComputing Conference (SC'11), Nov. 2011. (acceptance rate: 74/352=21.0%).
  • X.-H. Sun and D. Wang, "Memory Access Cycle and the Measurement of Memory Systems," in the 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS'11), in conjunction with IEEE/ACM SuperComputing 2011, Nov. 2011.
  • J. He, H. Song, X.-H. Sun, Y. Yin and R. Thakur, "Pattern-aware File Reorganization in MPI-IO," in the 6th Parallel Data Storage Workshop (PDSW'11), in conjunction with ACM/IEEE SuperComputing 2011, Nov. 2011.
  • J. He, J. Kowalkowski, M. Paterno, D. Holmgren, J. Simone and X.-H. Sun, "Layout-aware Scientific Computing - A Case Study Using MILC," in the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA'11), in conjunction with ACM/IEEE SuperComputing 2011, Nov. 2011.
  • W. Tang, N. Desai, V. Vishwanath, D. Buettner, and Z. Lan, "Job Coscheduling on Coupled High-End Computing Systems", in Proc. of International Conference on Parallel Processing Workshops (ICPPW'11), Sept, 2011.
  • D. Wang, X.-H. Sun, N. Hu, and N. Sun, "EthSpeeder: A High-performance Scalable Fault-Tolerant Ethernet Network Architecture for Data Center", in Proc. of the 6th IEEE International Conference on Networking, Architecture, and Storage (NAS2011), pp.355-363, July 2011.
  • L. Yu, Z. Zheng, Z. Lan, and S. Coghlan, "Practical Online Failure Prediction for Blue Gene/P: Period-based vs Event-driven", in Proc. of Proactive Failure Avoidance, Recovery, and Maintenance workshop (in conjunction with DSN'11), June, 2011.
  • H. Song, Y. Yin, Y. Chen, X.-H. Sun, "A Cost-intelligent Application-specific Data Layout Scheme for Parallel File Systems", in Proc. of the 20th International ACM Symposium on High Performance Distributed Computing (HPDC'11), June 2011. (acceptance rate: 22/170=12.9%).
  • H. Song, X.-H. Sun, Y. Chen, "A Hybrid Shared-nothing/Shared-data Storage Scheme for Large-scale Data Processing", Best Paper Award, in Proc. of the 9th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA'11), May 2011.
  • H. Jin, K. Qiao, X.-H. Sun and Y. Li, "Performance under Failures of MapReduce Applications (Poster Presentation)", in Proc. of the 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'11), May 2011.
  • K. Zhang, Z. Wang, Y. Chen, H. Zhu and X.-H. Sun, "PAC-PLRU: A Cache Replacement Policy to Salvage Discarded Predictions from Hardware Prefetchers", Student Scholar Award, in the Proc. of the 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'11), May 2011. (acceptance rate: 55/189=29.1%).
  • H. Song, Y. Yin, X.-H. Sun, R. Thakur and S. Lang, "A Segment-Level Adaptive Data Layout Scheme for Improved Load Balance in Parallel File Systems", in the Proc. of the 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'11), May 2011. (acceptance rate: 55/189=29.1%).
  • H. Song, Y. Chen, X.-H. Sun, "A Hybrid Shared-nothing/Shared-data Storage Architecture for Large Scale Databases (Poster Presentation)", in Proc. of the 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'11), May 2011.
  • W. Tang, Z. Lan, N. Desai, D. Buettner, Y. Yu, "Reducing Fragmentation on Torus-Connected Supercomputers", in Proc. of IEEE International Parallel and Distributed Processing Symposium (IPDPS'11), May 2011.
  • Z. Zheng, L. Yu, W. Tang, Z. Lan, R. Gupta, N. Desai, S. Coghlan, and D. Buettner, "Co-Analysis of RAS Log and Job Log on Blue Gene/P", in Proc. of IEEE International Parallel and Distributed Processing Symposium (IPDPS'11), May 2011.
  • Y. Chen, X.-H. Sun, R. Thakur, P. C. Roth and W. Gropp, "LACIO: A New Collective I/O Strategy for Parallel I/O Systems," in Proc. of IEEE International Parallel and Distributed Processing Symposium (IPDPS'11), May 2011. (acceptance rate: 112/571=19.6%).

[2010]

  • Z. Lan, J. Gu, Z. Zheng, R. Thakur, and S. Coghlan, "A Study of Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems," in Journal of Parallel and Distributed Computing, vol. 70, pp. 630-643, 2010.
  • X.-H. Sun and Y. Chen, "Reevaluating Amdahl's Law in the Multicore Era," in Journal of Parallel and Distributed Computing, vol. 70, no. 2, pp. 183-188, Feb. 2010.
  • Z. Lan, Z. Zheng, and Y. Li, "Toward Automated Anomaly Identification in Large-Scale Systems," in IEEE Transactions on Parallel and Distributed Systems, vol. 21, no. 2, pp. 174-187, Feb. 2010.
  • H. Jin, X.-H. Sun, Y. Chen, T. Ke, "REMEM: REmote MEMory as Checkpointing Storage," in Proc. of the 2nd International Conference on Cloud Computing, Nov. 2010. (acceptance rate: <25%).
  • R. Ge, X. Feng, J. Hu, X.-H. Sun, "Assessing Energy Efficiency of Parallel I/O Systems (Poster Presentation)," in Proc. of the ACM/IEEE SuperComputing Conference (SC'10), Nov. 2010.
  • H. Song, X.-H. Sun, H. Jin, Y. Chen, "Trace-based Adaptive Data Layout Optimization for Parallel File Systems (Poster Presentation)," the 5th Petascale Data Storage Workshop, in conjunction with SuperComputing 2010, Nov. 2010.
  • Y. Chen, X.-H. Sun, R. Thakur, H. Song and H. Jin, "Improving Parallel I/O Performance with Data Layout Awareness," in Proc. of the IEEE International Conference on Cluster Computing 2010 (Cluster10), Sep. 2010. (acceptance rate: 33/107=30.8%).
  • H. Jin, Y. Chen, H. Zhu and X.-H. Sun, "Optimizing HPC Fault-Tolerant Environment: An Analytical Approach," in the 39th International Conference on Parallel Processing (ICPP'2010), Sep. 2010. (acceptance rate: 72/225=32%).
  • Y. Chen, H. Zhu, H. Jin and X.-H. Sun, "Improving the Effectiveness of Context-based Prefetching with Multi-order Analysis," in Proc. of the 3rd International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), Sep. 2010.
  • Z. Zheng, Z. Lan, R. Gupta, S. Coghlan and P. Beckman, "A Practical Failure Prediction with Location and Lead Time for Blue Gene/P", in Proc. of Fault-Tolerance at Extreme Scale workshop (in conjunction with DSN'10), 2010.
  • Y. Chen, H. Song, R. Thakur and X.-H. Sun, "A Layout-aware Optimization Strategy for Collective I/O," in Proc. of High Performance Distributed Computing (HPDC-2010) (short paper), June 2010.
  • H. Zhu, Y. Chen and X.-H. Sun, "Timing Local Streams: Improving Timeliness in Data Prefetching," in Proc. of the 24th International Conference on Supercomputing (ICS'10), June 2010. (acceptance rate: 32/180=17.8%).
  • Y. Chen, H. Zhu and X.-H. Sun, "An Adaptive Data Prefetcher for High-Performance Processors," in Proc. of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'10), May 2010. (acceptance rate: 51/219=23.3%).
  • R. Ge, X. Feng, S. Subramanya and X.-H. Sun, "Characterizing the Energy Efficiency of I/O Intensive Parallel Applications on Power-Aware Clusters," in the 6th Workshop on High Performance Power-Aware Computing, in conjunction with the 24th IEEE International Parallel and Distributed Processing Symposium, April 2010.
  • W. Tang, N. Desai, D. Buettner, and Z. Lan, "Analyzing and Adjusting User Runtime Estimates to Improve Job Scheduling on Blue Gene/P," Best Paper Award, in Proc. of IPDPS'10, April 2010.

[2009]

  • Grid Technologies and Utility Computing: Concepts for Managing Large-Scale Applications (Encyclopedia of Grid Computing Technologies and Applications), Chapter "QoS of Grid Computing," M. Wu and X.-H. Sun, IGI Global (May 1, 2009), ISBN-10: 1605661848, ISBN-13: 978-1605661841.
  • Grid Computing: Infrastructure, Service, and Application (Hardcover), Chapter II-1, "Virtual Machines in Grid Environments: Dynamic Virtual Machines," C. Du, P. Shukla, and X.-H. Sun, CRC (January 1, 2009), ISBN-10: 1420067664, ISBN-13: 978-1420067668.
  • H. Jin, X.-H. Sun, B. Xie and Y. Chen, "An Implementation and Evaluation of Memory-based Checkpointing (Poster Presentation)," in Proc. of the ACM/IEEE SuperComputing Conference (SC'09), Nov. 2009.
  • X.-H. Sun, S. Byna, D. Holmgren, "Modeling Data Access Contention in Multicore Architectures," in the Fifteenth International Conference on Parallel and Distributed Systems (ICPADS'09), Dec. 2009.
  • B. Xie, Y. Chen, X.-H. Sun and H. Jin, "Performance under Failure of Multi-tier Web Services," in Workshop on Internet-based Virtual Computing Environment (in conjunction with ICPADS'09), Dec. 2009.
  • X.-H. Sun, Y. Chen and Y. Yin, "Data Layout Optimization for Petascale File Systems," in Proc. of The 4th Petascale Data Storage Workshop (in conjunction with ACM/IEEE SC'09), Nov. 2009.
  • X.-H. Sun, C. Du, H. Zou, Y. Chen, and P. Shukla, "V-MCS: A Configuration System for Virtual Machines," in Proc. of Workshop on Web 2.0 on e-Research Infrastructure, Services and Applications (in conjunction with Cluster'09), Aug. 2009.
  • Z. Zheng and Z. Lan, "Reliability-Aware Scalability Models for High Performance Computing," in Proc. of IEEE Cluster'09, Aug. 2009.
  • W. Tang, Z. Lan, N. Desai, and D. Buettner, "Fault-Aware, Utility-Based Job Scheduling on Blue Gene/P Systems," in Proc. of IEEE Cluster'09, Aug. 2009.
  • Z. Zheng, Z. Lan, B.-H. Park, and A. Geist, "System Log Pre-processing to Improve Failure Prediction," in Proc. of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN'09), June 2009.
  • H. Jin, X.-H. Sun, Z. Zheng, Z. Lan, and B. Xie, "Performance under Failures of DAG-based Parallel Computing," in Proc. of IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid'09), May 2009. (acceptance rate: 57/271=21%).
  • Z. Fang, X.-H. Sun, Y. Chen, and S. Byna, "Core-Aware Memory Access Scheduling Schemes," in Proc. of IEEE International Parallel & Distributed Processing Symposium (IPDPS'09), May 2009. (acceptance rate: 100/440=22.7%).

[2008]

  • L. Piccoli, J. B. Kowalkowski, J. N. Simone, X.-H. Sun, H. Jin, D. J. Holmgren, N. Seenu, and A. G. Singh, "Lattice QCD Workflows: A Case Study," in 3rd International Workshop on Scientific Workflows and Business Workflow Standards in e-Science (SWBES), Dec. 2008.
  • Y. Chen, S. Byna, X.-H. Sun, R. Thakur, and W. Gropp, "Hiding I/O Latency with Pre-execution Prefetching for Parallel Applications," Best paper award finalist, in Proc. of the ACM/IEEE SuperComputing Conference (SC'08), Nov. 2008. (acceptance rate: 59/277=21.3%).
  • S. Byna, Y. Chen, X.-H. Sun, R. Thakur, and W. Gropp, "Parallel I/O Prefetching Using MPI File Caching and I/O Signatures," in Proc. of the ACM/IEEE SuperComputing Conference (SC'08), Nov. 2008. (acceptance rate: 59/277=21.3%).
  • X.-H. Sun, Y. Chen, and S. Byna, "Scalable Computing in Multicore Era," in the International Symposium on Parallel Algorithms, Architectures and Programming (PAAP'08), Sept. 2008.
  • Y. Chen, S. Byna, X.-H. Sun, R. Thakur, and W. Gropp, "Exploring Parallel I/O Concurrency with Speculative Prefetching," in Proc. 37th International Conference on Parallel Processing (ICPP'08), Sept. 2008. (acceptance rate: 81/263=30.8%).
  • J. Gu, Z. Zheng, Z. Lan, J. White, E. Hocks, and B.-H. Park, "Dynamic Meta-Learning for Failure Prediction in Large-scale Systems: A Case Study," in Proc. 37th International Conference on Parallel Processing (ICPP'08), Sept. 2008.
  • L. Piccoli, J. Simone, J. Kowalkowski, et al., "Tracking LQCD Workflows (Poster Presentation)," in Lattice 2008, July 2008.
  • Y. Li and Z. Lan, "A Fast Recovery Mechanism for Checkpointing in Networked Environments," in the Proc. of DSN'08, June, 2008.
  • S. Byna, Y. Chen, and X.-H. Sun, "A Taxonomy of Data Prefetching Mechanisms," in Proc. of the International Symposium on Parallel Architectures, Algorithms, and Networks (I-SPAN), May, 2008.
  • Z. Lan, Y. Li, Z. Zheng, and P. Gujrati, "Enhancing Application Robustness through Adaptive Fault Tolerance," in Proc. of the NSFNGS Workshop (in conjunction with IPDPS'08), April, 2008.
  • X.-H. Sun, Z. Lan, Y. Li, H. Jin, and Z. Zheng, "Towards a Fault-aware Computing Environment," in Proc. of the High Availability and Performance Computing Workshop (HAPCW), Mar. 2008.

[2007]

  • L. Piccoli, X.-H. Sun, J. Simone, et al., "The LQCD Workflow Experience: What We Have Learned (Poster Presentation)," in Proc. of the ACM/IEEE SuperComputing Conf. 2007 (SC'07), Nov. 2007.
  • M. Wu, X.-H. Sun, and H. Jin, "Performance under Failure of High-End Computing," in Proc. of the ACM/IEEE SuperComputing Conf. 2007 (SC'07), Nov. 2007. (acceptance rate: 54/268=20.1%).
  • Y. Chen, S. Byna, and X.-H. Sun, "Data Access History Cache and Associated Data Prefetching Mechanisms," in Proc. of the ACM/IEEE SuperComputing Conf. 2007 (SC'07), Nov. 2007. (acceptance rate: 54/268=20.1%).
  • Z. Zheng, Y. Li, and Z. Lan, "Anomaly Localization in Large-scale Clusters," in Proc. of IEEE Cluster'07, Sep. 2007.
  • P. Gujrati, Y. Li, Z. Lan, R. Thakur, and J. White, "Exploring Meta-learning to Improve Failure Prediction in Supercomputing Clusters," in Proc. of 2007 International Conference on Parallel Processing (ICPP'07), Sept. 2007.
  • Y. Li, P. Gujrati, Z. Lan, and X.-H. Sun, "Fault-Driven Re-Scheduling For Improving System-level Fault Resilience," in Proc. of 2007 International Conference on Parallel Processing (ICPP'07), Sept. 2007.
  • X.-H. Sun and M. Wu, "Quality of Service of Grid Computing: Resource Sharing," in Proc. of the 6th International Conference on Grid and Cooperative Computing (GCC'07), Aug. 2007.
  • Z. Lan, Y. Li, P. Gujrati, Z. Zheng, R. Thakur, and J. White, "A Fault Diagnosis and Prognosis Service for TeraGrid Clusters," in Proc. of TeraGrid'07, Jun. 2007.
  • Y. Li and Z. Lan, "Using Adaptive Fault Tolerance to Improve Application Robustness on the TeraGrid," in Proc. of TeraGrid'07, Jun. 2007.
  • K. Xiao, N. Chen, S. Ren, L. Shen, X.-H. Sun, K. Kwiat, and M. Macalik, "A Workflow-based Non-intrusive Approach for Enhancing the Survivability of Critical Infrastructures in Cyber Environment," in Proc. of the 3rd International Workshop on Software Engineering for Secure Systems (SESS'07), May 2007.
  • C. Du, X.-H. Sun, and M. Wu, "Dynamic Scheduling with Process Migration," in Proc. of IEEE International Symposium on Cluster Computing and the Grid 2007, Rio de Janeiro, Brazil, May 2007.
  • X.-H. Sun, S. Byna, and Y. Chen, "Improving Data Access Performance with Server Push Architecture," in Proc. of the NSF Next Generation Software Program Workshop (in conjunction with IPDPS '07), March 2007.

[2006]

[2005]

[2004]

Previous Publications



Illinois Institute of Technology
Copyright 1999-2009. All rights reserved. Last updated on Dec 17, 2009