ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs

Budirijanto Purnomo, Norman Rubin, Michael Houston
Advanced Micro Devices, Inc.
SIGGRAPH ’10 ACM SIGGRAPH 2010 Posters, 2010


   title={ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs},
   author={Purnomo, B. and Rubin, N. and Houston, M.},
   booktitle={ACM SIGGRAPH 2010 Posters},








Modern GPUs have been shown to be highly efficient machines for data-parallel applications such as graphics, image and video processing, and physical simulation. For example, a single ATI Radeon HD 5870 GPU has a theoretical peak of 2.72 teraflops (1 teraflops = 10^12 floating-point operations per second) with a video memory bandwidth of 153.6 GB/s. While it is not difficult to port CPU algorithms to run on GPUs, it is extremely challenging to optimize them to achieve teraflops performance. Only a select few expert engineers with application domain expertise, a deep understanding of modern GPU architecture, and an intimate knowledge of shader compiler optimization can program GPUs close to their optimal capabilities. Many developers are content with a severalfold improvement over their optimized CPU implementations rather than an acceleration of one or more orders of magnitude.
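The two peak figures quoted in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only: the hardware parameters (1600 stream processors, an 850 MHz engine clock, one multiply-add counted as two floating-point operations per cycle, a 256-bit memory bus, and 4.8 Gbit/s per pin) are not stated in the abstract and are assumed here from the card's published specifications. The constant and function names are hypothetical.

```python
# Assumed Radeon HD 5870 parameters (not given in the abstract itself).
STREAM_PROCESSORS = 1600         # ALU lanes
CORE_CLOCK_HZ = 850e6            # 850 MHz engine clock
FLOPS_PER_LANE_PER_CYCLE = 2     # one multiply-add counted as two operations
BUS_WIDTH_BITS = 256             # memory interface width
DATA_RATE_PER_PIN = 4.8e9        # effective GDDR5 transfers per second per pin


def peak_teraflops(lanes, clock_hz, flops_per_cycle):
    """Theoretical peak: every lane issues a multiply-add on every cycle."""
    return lanes * clock_hz * flops_per_cycle / 1e12


def peak_bandwidth_gb_s(bus_width_bits, rate_per_pin):
    """Theoretical peak: bus width in bytes times the per-pin transfer rate."""
    return (bus_width_bits / 8) * rate_per_pin / 1e9


if __name__ == "__main__":
    tflops = peak_teraflops(
        STREAM_PROCESSORS, CORE_CLOCK_HZ, FLOPS_PER_LANE_PER_CYCLE
    )
    bandwidth = peak_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_PER_PIN)
    print(f"peak compute:   {tflops:.2f} teraflops")
    print(f"peak bandwidth: {bandwidth:.1f} GB/s")
```

Both numbers are upper bounds that assume every lane is busy on every cycle and every memory transfer is useful, which is why real kernels typically fall well short of them.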