Introducing Energy Efficiency into Graphics Processors
Department of Computer Science & Engineering, Indian Institute of Technology Delhi, New Delhi, India
International Symposium on Electronic System Design (ISED), 2010
@inproceedings{silpa2010introducing,
title={Introducing Energy Efficiency into Graphics Processors},
author={Silpa, B. V. N. and Panda, P. R.},
booktitle={2010 International Symposium on Electronic System Design},
pages={10},
year={2010},
organization={IEEE}
}
Graphics processor (GPU) architectures have evolved rapidly in recent years, driven by the increasing performance demanded by 3D graphics applications such as games. However, integrating complex GPUs into mobile devices is challenging because of power and energy constraints, motivating the need for energy efficiency in GPUs. While a significant amount of power optimisation research has concentrated on the CPU, GPU power efficiency is a relatively new and important area, since the power consumed by GPUs is comparable in magnitude to CPU power. Power and energy efficiency can be introduced into GPUs at many different levels:

1) Hardware component level: queue structures, caches, filter arithmetic units, interconnection networks, processor cores, etc., can be optimised for power.

2) Algorithm level: the deep and complex graphics processing pipeline can be modified to be energy aware, and shader programs written by the user can be transformed to be energy aware.

3) System level: co-ordination at the level of task allocation, voltage and frequency scaling, etc., requires knowledge and control of several different GPU system components.

We outline two strategies for applying energy optimisations at different levels of granularity in a GPU: (1) Texture Filter Memory (TFM), an energy-efficient addition to the texture memory hierarchy; and (2) a system-level Dynamic Voltage and Frequency Scaling (DVFS) based energy reduction strategy built on accurate tile-level slack prediction.

Texture Filter Memory is an augmentation of the standard GPU texture cache hierarchy. Instead of a regular data cache at the first level, we employ a small register-based structure that is optimised for the relatively predictable memory access stream of the texture filtering computation. The predictability of the access patterns lets us reduce power by using a simpler register access structure: power is saved by avoiding the expensive tag lookups and comparisons present in regular caches. Further, the texture filter memory is a very small structure whose access energy is much lower than that of a data cache of similar performance.

Dynamic Voltage and Frequency Scaling, an established energy management technique, can be applied in GPUs by first predicting the workload of a given frame and, where sufficient slack exists, lowering the voltage and frequency levels so as to save energy while still completing the work within the frame rendering deadline. We apply DVFS in a tiled graphics renderer, where workload prediction and voltage/frequency adjustment are performed at tile-level granularity. This creates opportunities for on-the-fly correction of prediction inaccuracies, ensuring high frame rates while still delivering low power. Tile-level energy management schemes are distinctly superior to schemes that work at the granularity of an entire frame, both in energy savings and in rendering quality measured in frames rendered per second. The prediction mechanism relies both on history, the workload observed for the tile in recent frames, and on rank, a quantification of the tile workload based on the frame structure, comprising the geometry, pixel shading, texture, and raster operations workloads.
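
As a rough behavioural illustration of the tag-free reuse idea described above, the C++ sketch below buffers a small block of texels in registers and serves overlapping bilinear-filter footprints from it, falling back to the regular texture cache only on a miss. This is a minimal sketch of the general technique, not the authors' hardware design; all names and parameters (TextureFilterMemory, Texel, backing_fetch, the 4x4 block size) are assumptions made for the example.

// Behavioural sketch of a small register-based texture filter memory (TFM).
// Illustrates the idea of reusing texels across overlapping bilinear-filter
// footprints with a single base-coordinate check instead of a multi-way
// tag lookup. Names and sizes are hypothetical, not taken from the paper.

#include <array>
#include <cstdint>
#include <functional>

struct Texel { uint8_t r, g, b, a; };

class TextureFilterMemory {
public:
    // backing_fetch models the next level of the texture memory hierarchy
    // (e.g. the regular texture cache), addressed by texel coordinates.
    explicit TextureFilterMemory(std::function<Texel(int, int)> backing_fetch)
        : fetch_(std::move(backing_fetch)) {}

    // Return the 2x2 footprint needed for bilinear filtering at (u, v).
    // Consecutive fragments walk the texture predictably, so the footprint
    // usually lies inside the previously buffered block and is served from
    // a handful of registers.
    std::array<Texel, 4> footprint(int u, int v) {
        if (!valid_ || u < base_u_ || u + 1 >= base_u_ + kBlock ||
            v < base_v_ || v + 1 >= base_v_ + kBlock) {
            refill(u, v);   // miss: reload the small register block
        }
        int du = u - base_u_, dv = v - base_v_;
        return { block_[dv][du],     block_[dv][du + 1],
                 block_[dv + 1][du], block_[dv + 1][du + 1] };
    }

private:
    static constexpr int kBlock = 4;   // 4x4 texel register block

    void refill(int u, int v) {
        base_u_ = u; base_v_ = v;
        for (int y = 0; y < kBlock; ++y)
            for (int x = 0; x < kBlock; ++x)
                block_[y][x] = fetch_(base_u_ + x, base_v_ + y);
        valid_ = true;
    }

    std::function<Texel(int, int)> fetch_;
    Texel block_[kBlock][kBlock]{};
    int base_u_ = 0, base_v_ = 0;
    bool valid_ = false;
};

The point of the sketch is the access path: a hit costs one range check and a register read, whereas a conventional set-associative texture cache would compare several tags and drive larger SRAM arrays on every access.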
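
Similarly, the following sketch shows, in simplified form, how tile-level DVFS with slack prediction might be organised: a per-tile workload prediction blends the tile's history with a rank estimate derived from the frame structure, and the lowest discrete voltage/frequency level that meets the tile's share of the frame deadline is selected. The operating points, the 50/50 history/rank blend, and the per-tile budget split are hypothetical choices for this sketch and are not taken from the paper.

// Illustrative sketch of tile-level DVFS driven by slack prediction.
// The frequency table, prediction blend and budget split are assumptions
// made for the example, not the authors' algorithm.

#include <algorithm>
#include <vector>

struct FreqLevel { double ghz; double volts; };

// Discrete voltage/frequency operating points (hypothetical values).
static const std::vector<FreqLevel> kLevels = {
    {0.2, 0.80}, {0.4, 0.90}, {0.6, 1.00}, {0.8, 1.10}, {1.0, 1.20}};

class TileDvfs {
public:
    explicit TileDvfs(int num_tiles) : history_(num_tiles, 0.0) {}

    // Predict the next workload (in cycles) for a tile by blending its
    // recent history with a rank term derived from the frame structure
    // (geometry / pixel shading / texture / raster estimates).
    double predict_cycles(int tile, double rank_cycles) const {
        return 0.5 * history_[tile] + 0.5 * rank_cycles;
    }

    // Choose the lowest frequency that still finishes the predicted work
    // within this tile's share of the remaining frame budget. Because the
    // decision is re-made per tile, a mispredicted (slow) tile shrinks the
    // remaining budget and later tiles are automatically sped up.
    FreqLevel choose_level(int tile, double rank_cycles,
                           double remaining_ms, int tiles_left) const {
        double tile_budget_ms = remaining_ms / std::max(tiles_left, 1);
        double cycles = predict_cycles(tile, rank_cycles);
        for (const FreqLevel& lvl : kLevels) {
            double time_ms = cycles / (lvl.ghz * 1e6);  // 1 GHz = 1e6 cycles/ms
            if (time_ms <= tile_budget_ms) return lvl;
        }
        return kLevels.back();  // no slack: run at the highest level
    }

    // After rendering a tile, record the observed cycles so the next
    // frame's prediction can be corrected.
    void observe(int tile, double actual_cycles) { history_[tile] = actual_cycles; }

private:
    std::vector<double> history_;  // cycles measured for each tile last frame
};

The on-the-fly correction described in the abstract corresponds here to recomputing the per-tile budget from the remaining frame time before every tile, so that the frame deadline is protected even when individual tile predictions are off.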