Exploring power efficiency and optimizations targeting heterogeneous applications

Yash Sanjeev Ukidave

Northeastern University

Northeastern University, 2012

Graphics processing units (GPUs) have become widely accepted as the computing platform of choice in many high-performance computing domains, because they can approach or exceed the performance of a large cluster of CPUs for many parallel applications. The availability of programming standards such as OpenCL has made GPUs even more popular as a way to exploit their inherent parallelism. However, given the power consumption of GPUs, some devices can exhaust power budgets quickly. Better solutions are needed to effectively exploit the power efficiency available on heterogeneous systems. In this work, we evaluate the power-performance trade-offs of different optimizations used on heterogeneous applications. More specifically, we compare the performance of different optimization techniques on Fast Fourier Transform (FFT) algorithms. Our study covers discrete GPUs and shared-memory GPUs (APUs) from AMD (Llano APUs and the Southern Islands GPU), Nvidia (Kepler) and Intel (Ivy Bridge) as test platforms. This thesis studies each optimization platform to categorize it as power-efficient or compute-efficient. The study identifies the architectural and algorithmic factors that affect power consumption. We observe up to a 27% increase in power consumption across optimizations that yield more than a 1.8X speedup. More importantly, we demonstrate that different optimizations used to improve the execution performance of a heterogeneous application can affect the power efficiency of the application. In addition, different algorithms implementing the same fundamental function (FFT) can perform vastly differently in terms of power-performance, depending on the target hardware and the optimizations used.
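The relationship between power and speedup quoted above can be made concrete with a small calculation. The following is a minimal sketch, not taken from the thesis: it assumes that energy is average power multiplied by execution time, and it pairs the 27% power increase with the 1.8X speedup purely for illustration, since the abstract gives these as bounds and does not state that they come from the same run. The function name `relative_energy` is hypothetical.

```python
def relative_energy(power_ratio: float, speedup: float) -> float:
    """Energy of an optimized run relative to its baseline.

    Assumes energy = average power * execution time, so
    E_opt / E_base = (P_opt / P_base) * (T_opt / T_base)
                   = power_ratio / speedup.
    """
    if power_ratio <= 0 or speedup <= 0:
        raise ValueError("power_ratio and speedup must be positive")
    return power_ratio / speedup


# Illustrative pairing of the abstract's bounds (an assumption, not a reported result):
# 27% more power (ratio 1.27) together with a 1.8X speedup.
ratio = relative_energy(power_ratio=1.27, speedup=1.8)
print(f"relative energy: {ratio:.3f}")
```

Under these assumptions a ratio below 1 means the optimized run uses less total energy than the baseline even though it draws more power, which is one way to read the distinction between power-efficient and compute-efficient optimizations.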

Tags: AMD Fusion, ATI, ATI Radeon HD 7770, Computer science, Energy-efficient computing, FFT, Heterogeneous systems, nVidia, nVidia GeForce GTX 680, OpenCL, Thesis

March 16, 2013 by hgpu
