SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving

Kaijie Fan, Marco D’Antonio, Lorenzo Carpentieri, Biagio Cosenza, Federico Ficarelli, Daniele Cesarini

Technische Universität Berlin, Germany

International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023

@article{fan2023synergy,
  title={SYnergy: Fine-grained Energy-Efficient Heterogeneous Computing for Scalable Energy Saving},
  author={Fan, Kaijie and D’Antonio, Marco and Carpentieri, Lorenzo and Cosenza, Biagio and Ficarelli, Federico and Cesarini, Daniele},
  journal={Memory},
  volume={900},
  pages={1000},
  year={2023}
}

Energy-efficient computing uses power management techniques such as frequency scaling to save energy. Implementing energy-efficient techniques on large-scale computing systems is challenging for several reasons. While most modern architectures, including GPUs, are capable of frequency scaling, these features are often not available on large systems. In addition, achieving higher energy savings requires precise energy tuning, because energy characteristics differ not only between applications but also between the kernels of a single application. We propose SYnergy, a novel energy-efficient approach that spans languages, compilers, runtimes, and job schedulers to achieve unprecedented fine-grained energy savings on large-scale heterogeneous clusters. SYnergy defines an extension to the SYCL programming model that allows programmers to define a specific energy goal for each kernel. For example, a kernel can aim to minimize well-known energy metrics such as the energy-delay product (EDP) and the energy-delay-squared product (ED2P), or to achieve a predefined energy-performance tradeoff, such as the best performance with 25% energy savings. Through compiler integration and a machine learning model, each kernel is statically optimized for its specific target. On large computing systems, a SLURM plugin allows SYnergy to run on all available devices in the cluster, providing scalable energy savings. The methodology is inherently portable and has been evaluated on both NVIDIA and AMD GPUs. Experimental results show unprecedented improvements in energy and energy-related metrics on real-world applications, as well as scalable energy savings on a 64-GPU cluster.
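
The per-kernel energy goals mentioned in the abstract can be illustrated with a small sketch. It uses the standard definitions EDP = energy × time and ED2P = energy × time², and treats "best performance with 25% energy savings" as "the fastest configuration that uses at most 75% of the baseline energy". The `Sample` type, the function names, and all numbers are hypothetical and chosen only for illustration; this is not the SYnergy API, and SYnergy selects frequencies with a machine learning model rather than by exhaustive measurement as done here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hypothetical measurement of a kernel at a fixed core frequency."""
    freq_mhz: int
    time_s: float
    energy_j: float


def edp(s: Sample) -> float:
    """Energy-delay product: energy * time."""
    return s.energy_j * s.time_s


def ed2p(s: Sample) -> float:
    """Energy-delay-squared product: energy * time^2 (weights performance more)."""
    return s.energy_j * s.time_s ** 2


def fastest_with_energy_saving(samples, baseline, saving):
    """Return the fastest sample using at most (1 - saving) of the baseline
    energy, or None if no sample meets the energy budget."""
    budget = (1.0 - saving) * baseline.energy_j
    eligible = [s for s in samples if s.energy_j <= budget]
    return min(eligible, key=lambda s: s.time_s) if eligible else None


# Made-up numbers for illustration only; not measurements from the paper.
samples = [
    Sample(freq_mhz=1500, time_s=1.00, energy_j=200.0),  # assumed default frequency
    Sample(freq_mhz=1200, time_s=1.15, energy_j=148.0),
    Sample(freq_mhz=900, time_s=1.30, energy_j=125.0),
    Sample(freq_mhz=600, time_s=2.00, energy_j=140.0),
]
baseline = samples[0]

best_edp = min(samples, key=edp)      # minimizes energy * time
best_ed2p = min(samples, key=ed2p)    # minimizes energy * time^2
best_25 = fastest_with_energy_saving(samples, baseline, saving=0.25)

print("min EDP  :", best_edp.freq_mhz, "MHz")
print("min ED2P :", best_ed2p.freq_mhz, "MHz")
print("25% save :", best_25.freq_mhz, "MHz")
```

With these made-up samples the three goals do not all select the same frequency, which is the reason a per-kernel target is useful: ED2P penalizes slowdown more heavily than EDP, so it tends to favor higher frequencies.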

Tags: AMD Radeon Instinct MI100, ATI, Computer science, Energy-efficient computing, GPU cluster, Heterogeneous systems, Machine learning, nVidia, nVidia V100, SYCL

August 13, 2023 by hgpu

