Stream computing on graphics hardware

Ian Buck
Stanford University
Stanford University, 2006


@phdthesis{buck2006stream,
   title={Stream computing on graphics hardware},
   author={Buck, I.},
   year={2006},
   school={Stanford University}
}





The raw compute performance of today’s graphics processors is truly amazing. With peak performance of over 60 GFLOPS, the compute power of the graphics processor (GPU) dwarfs that of today’s commodity CPU, at a price of only a few hundred dollars. As the programmability and performance of modern graphics hardware continue to increase, many researchers are looking to graphics hardware to solve computationally intensive problems previously handled by general-purpose CPUs. The challenge, however, is how to retarget these processors from game rendering to general computation, such as numerical modeling, scientific computing, or signal processing. Traditional graphics APIs abstract the GPU as a rendering device, involving textures, triangles, and pixels. Mapping an algorithm onto these primitives is not straightforward, even for the most advanced graphics developers. The result has been difficult and often unmanageable programming approaches, hindering the overall adoption of GPUs as mainstream computing devices. In this dissertation, we explore the concept of stream computing with GPUs. We describe the stream processor abstraction for the GPU and how this abstraction and its corresponding programming model can efficiently represent computation on the GPU. To formalize the model, we present Brook for GPUs, a programming system for general-purpose computation on programmable graphics hardware. Brook extends C to include simple data-parallel constructs, enabling the use of the GPU as a streaming co-processor. We present a compiler and runtime system that abstracts and virtualizes many aspects of graphics hardware. In addition, we present an analysis of the effectiveness of the GPU as a streaming processor and evaluate the performance of a collection of benchmark applications in comparison to their CPU implementations.
We also discuss some of the algorithmic decisions that are critical for efficient execution when using the stream programming model for the GPU. For a variety of the applications explored in this dissertation, we demonstrate that our Brook implementations not only perform comparably to hand-written GPU code but also run up to seven times faster than their CPU counterparts.
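
To make the stream programming model concrete, here is a minimal sketch in plain C. It is not Brook syntax and not the dissertation's compiler or runtime: the names `kernel2_fn`, `saxpy_kernel`, and `stream_map2` are hypothetical and chosen only for illustration. The idea it tries to capture is that a kernel is a side-effect-free function applied independently to corresponding elements of input streams, which is what lets a GPU run the elements in parallel.

```c
#include <stddef.h>

/* Hypothetical names throughout: plain C that mimics the stream model.
   This is NOT Brook syntax and NOT the dissertation's runtime. */

/* A "kernel": computes one output element from one element of each
   input stream plus a scalar. It touches no other state. */
typedef float (*kernel2_fn)(float a, float x, float y);

static float saxpy_kernel(float a, float x, float y)
{
    return a * x + y;
}

/* Apply a kernel across two input streams of length n.
   No iteration depends on another, so a GPU could evaluate all
   elements in parallel; this sketch simply loops on the CPU. */
static void stream_map2(kernel2_fn k, float a,
                        const float *x, const float *y,
                        float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = k(a, x[i], y[i]);
    }
}
```

Calling `stream_map2(saxpy_kernel, 2.0f, x, y, out, n)` fills `out[i]` with `2 * x[i] + y[i]`. In a real streaming system the loop would be replaced by the runtime dispatching the kernel over the stream on the graphics hardware; how Brook does that is described in the dissertation itself, not in this sketch.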