The Paradigm of Power Bounded High-Performance Computing

Ge, Rong; Feng, Xizhou; Zou, Pengfei; Allen, Tyler

doi:10.1007/s11390-023-2885-7

Perspective
Published: 31 January 2023

Volume 38, pages 87–102 (2023)

Journal of Computer Science and Technology

Abstract

Modern computer systems are increasingly bounded by the available or permissible power at multiple layers from individual components to data centers. To cope with this reality, it is necessary to understand how power bounds impact performance, especially for systems built from high-end nodes, each consisting of multiple power hungry components. Because placing an inappropriate power bound on a node or a component can lead to severe performance loss, coordinating power allocation among nodes and components is mandatory to achieve desired performance given a total power budget. In this article, we describe the paradigm of power bounded high-performance computing, which considers coordinated power bound assignment to be a key factor in computer system performance analysis and optimization. We apply this paradigm to the problem of power coordination across multiple layers for both CPU and GPU computing. Using several case studies, we demonstrate how the principles of balanced power coordination can be applied and adapted to the interplay of workloads, hardware technology, and the available total power for performance improvement.

Author information

Authors and Affiliations

School of Computing, Clemson University, Clemson, SC, 29634, USA
Rong Ge
Meta Platform, Inc., Menlo Park, CA, 94025, USA
Xizhou Feng
Amazon, Inc., Seattle, WA, 98170, USA
Pengfei Zou
University of North Carolina at Charlotte, Charlotte, NC, 27599, USA
Tyler Allen

Corresponding author

Correspondence to Rong Ge.

Supplementary Information

ESM 1 (PDF 629 kb)

About this article

Cite this article

Ge, R., Feng, X., Zou, P. et al. The Paradigm of Power Bounded High-Performance Computing. J. Comput. Sci. Technol. 38, 87–102 (2023). https://doi.org/10.1007/s11390-023-2885-7

Received: 04 October 2022
Accepted: 02 January 2023
Published: 31 January 2023
Version of record: 31 January 2023
Issue date: February 2023
DOI: https://doi.org/10.1007/s11390-023-2885-7
