Green Data Analytics of Supercomputing from Massive Sensor Networks: Does Workload Distribution Matter?

Information Systems Research · 2023
Cited by 7
Journal lists: Renmin University of China A · FT50 · UTD24 · ABS 4*

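Summary

Using a unique data set from the massive sensor networks of a national supercomputing center, the study identifies the key factors that affect energy consumption, finds that workload distribution significantly affects energy efficiency, and develops dynamic resource management methods that achieve near-optimal energy efficiency.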

Abstract

Energy costs represent a significant share of the total cost of ownership in high-performance computing systems. Using a unique data set collected by massive sensor networks in a petascale national supercomputing center, we first present an explanatory model to identify key factors affecting energy consumption in supercomputing. Our analytic results show that workload distribution among the nodes has significant effects and could effectively be leveraged to improve energy efficiency. We then establish the high model performance using in-sample and out-of-sample analyses and develop prescriptive models for energy-optimal runtime workload management. We present four dynamic resource management methodologies (packing, load balancing, threshold-based switching, and energy optimization) and model their application at two levels (within-rack and cross-rack resource allocation). We also explore runtime resource redistribution policies for jobs under computational steering and compare strategies that use computational steering with those that do not. Our experimental results lead to a threshold strategy that yields near-optimal energy efficiency under all workload conditions. We further calibrate the energy-optimal resource allocations over the full range of workloads and present a bi-criteria evaluation to consider energy consumption and job performance tradeoffs. We conclude with implementation guidelines and policy insights into energy-efficient computing resource management in large supercomputing centers.
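To make the packing, load balancing, and threshold-based switching ideas named in the abstract more concrete, the following is a minimal illustrative sketch. It is not the paper's model: the power function `node_power`, its constants, the assumption that unused nodes draw no power, the 0.5 utilization threshold, and the choice to pack below the threshold and balance above it are all hypothetical and serve only to show why workload distribution can change total energy draw.

```python
from typing import List


def pack(total_load: float, n_nodes: int, capacity: float = 1.0) -> List[float]:
    """Packing: fill nodes to capacity one at a time; the rest stay unused."""
    loads = []
    remaining = total_load
    for _ in range(n_nodes):
        share = min(capacity, remaining)
        loads.append(share)
        remaining -= share
    return loads


def balance(total_load: float, n_nodes: int) -> List[float]:
    """Load balancing: spread the workload evenly over all nodes."""
    return [total_load / n_nodes] * n_nodes


def node_power(load: float, idle: float = 100.0, dynamic: float = 200.0) -> float:
    """Hypothetical per-node power in watts (an assumption, not from the paper).

    An unused node is assumed to sleep at 0 W; an active node draws a fixed
    idle cost plus a dynamic cost that grows with the square of its load.
    """
    if load == 0:
        return 0.0
    return idle + dynamic * load ** 2


def total_power(loads: List[float]) -> float:
    """Total power of an allocation under the hypothetical model."""
    return sum(node_power(load) for load in loads)


def threshold_switch(total_load: float, n_nodes: int,
                     threshold: float = 0.5) -> List[float]:
    """Threshold-based switching: pack at low utilization, balance otherwise.

    The threshold value and the direction of the switch are assumptions
    chosen to match the hypothetical power model above.
    """
    utilization = total_load / n_nodes
    if utilization < threshold:
        return pack(total_load, n_nodes)
    return balance(total_load, n_nodes)


if __name__ == "__main__":
    for load in (1.0, 3.0):
        print(load,
              total_power(pack(load, 4)),
              total_power(balance(load, 4)),
              total_power(threshold_switch(load, 4)))
```

Under these assumed constants, packing is cheaper at low utilization because fewer nodes pay the idle cost, while balancing is cheaper at high utilization because the dynamic cost grows faster than linearly with load. A real deployment would calibrate both the power model and the threshold from sensor data.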

Keywords: high-performance computing · energy efficiency · resource management · data analytics