Power-Efficient Accelerators for High-Performance Applications

Ganesh Suryanarayan Dasika
University of Michigan
University of Michigan, 2011


   title={Power-Efficient Accelerators for High-Performance Applications},
   author={Dasika, G.S.},
   school={The University of Michigan}





Computers, regardless of their function, are always better if they can operate more quickly. The addition of computation resources allows for improved response times, greater functionality and more flexibility. The drawback with improving a computer’s performance, however, is that it often comes at the cost of power and energy consumption. For many platforms, this is not an issue: desktop computers, for instance, have been consuming an increasing amount of power, yet because they account for a small fraction of a household’s overall power consumption, this increase is tolerated. Power increases are not tolerated, however, in modern mobile devices. These devices do more and more each generation and are deployed in increasingly interesting environments, whether the device is a common smartphone, a portable medical imaging device or a complex sensor-processor used for disaster relief. Even in less portable circumstances, such as servers used for high-throughput computation, reducing power consumption is an important way to reduce the cost of operating the server. This thesis studies various applications from these and other domains that require ever-increasing compute ability but cannot tolerate a similar increase in power consumption. These domains include wireless signal processing, portable medical imaging and scientific computing. These applications are analyzed in detail to ascertain the most appropriate hardware substrates and microarchitectural structures to help deliver the best performance-per-power efficiency. The architectures evaluated included fixed-function and flexible loop accelerators, SIMD data engines, general-purpose graphics processing units and general-purpose processors. These architectures were improved in ways that increased programmability, instruction-level parallelism, memory bandwidth, and average throughput, and reduced control-divergence overhead, effective memory latency, and data-dependent stalls.
These modifications led to solutions with 50 times the performance-per-power efficiency of current designs.
