Exploring power efficiency and optimizations targeting heterogeneous applications

Yash Sanjeev Ukidave
Northeastern University
Northeastern University, 2012







Graphics processing units (GPUs) have become widely accepted as the computing platform of choice in many high-performance computing domains, because for many parallel applications they can approach or exceed the performance of a large cluster of CPUs. Programming standards such as OpenCL have made GPUs even more popular by making their inherent parallelism easier to exploit. However, GPUs consume considerable power, and some devices can exhaust a power budget quickly. Better solutions are needed to exploit the power efficiency available on heterogeneous systems. In this work, we evaluate the power-performance trade-offs of different optimizations used in heterogeneous applications. More specifically, we compare the performance of different optimization techniques on Fast Fourier Transform (FFT) algorithms. Our test platforms cover discrete GPUs and shared-memory GPUs (APUs) from AMD (Llano APUs and the Southern Islands GPU), Nvidia (Kepler), and Intel (Ivy Bridge). This thesis studies each optimization platform to categorize it as power-efficient or compute-efficient, and identifies the architectural and algorithmic factors that affect power consumption. We observe up to a 27% increase in power consumption across optimizations that yield more than a 1.8X speedup. More importantly, we demonstrate that the optimizations used to improve the execution performance of a heterogeneous application can also affect its power efficiency. In addition, different algorithms implementing the same fundamental function (FFT) can differ vastly in power-performance, depending on the target hardware and the optimizations used.
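To make the power-performance trade-off concrete, the following is a minimal sketch (not taken from the thesis) of how a power increase and a speedup combine into energy per run. It assumes that energy equals average power times execution time, and that the two headline figures (27% more power, 1.8X speedup) apply to the same optimization. The abstract states them only as bounds ("up to" and "more than"), so the result is illustrative. The function name `relative_energy` is hypothetical.

```python
def relative_energy(power_increase: float, speedup: float) -> float:
    """Energy of an optimized run relative to the baseline (1.0 = unchanged).

    Assumption: energy = average power * execution time, so the ratio is
    (1 + power_increase) / speedup.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 + power_increase) / speedup


# Illustrative values taken from the abstract's headline figures.
ratio = relative_energy(power_increase=0.27, speedup=1.8)
print(f"relative energy per run: {ratio:.3f}")  # prints 0.706
```

Under these assumptions, an optimization that draws 27% more power but finishes 1.8 times faster uses roughly 29% less energy per run, which is why higher power draw and better energy efficiency can coexist.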

