Analyzing program flow within a many-kernel OpenCL application

Perhaad Mistry, Chris Gregg, Norman Rubin, David Kaeli, Kim Hazelwood
Northeastern University, Boston, MA
Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units (GPGPU-4), ACM, New York, NY, USA, 2011


   title={Analyzing program flow within a many-kernel OpenCL application},
   author={Mistry, P. and Gregg, C. and Rubin, N. and Kaeli, D. and Hazelwood, K.},
   booktitle={Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units},








Many developers have begun to realize that heterogeneous multi-core and many-core computer systems can provide significant performance opportunities for a range of applications. Typical applications possess multiple components that can be parallelized, so developers need proper performance tools to analyze program flow and identify application bottlenecks. In this paper, we analyze and profile the components of the Speeded Up Robust Features (SURF) computer vision algorithm written in OpenCL. Our profiling framework is developed using built-in OpenCL API function calls and needs no external profiler. We show that we can begin to identify the performance bottlenecks present in individual components on different hardware platforms. We demonstrate that run-time profiling with the facilities of the OpenCL specification gives an application developer a fine-grained look at performance, and that this information can be used to tailor performance improvements for specific platforms.
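As an illustration of the kind of per-kernel accounting the abstract describes, the following is a minimal sketch and not the authors' framework. It assumes the application has already read the four device timestamps (in nanoseconds) that OpenCL exposes through `clGetEventProfilingInfo` for each kernel event on a command queue created with `CL_QUEUE_PROFILING_ENABLE`. The `EventTimestamps` record, the `summarize` and `rank_by_exec_time` helpers, the kernel labels, and all timestamp values are hypothetical and exist only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTimestamps:
    """Device timestamps (ns) for one profiled command (hypothetical record)."""
    kernel: str  # label attached by the application, not by OpenCL
    queued: int  # CL_PROFILING_COMMAND_QUEUED
    submit: int  # CL_PROFILING_COMMAND_SUBMIT
    start: int   # CL_PROFILING_COMMAND_START
    end: int     # CL_PROFILING_COMMAND_END


def summarize(events):
    """Aggregate call count, execution time, and launch delay per kernel."""
    totals = defaultdict(lambda: {"calls": 0, "exec_ns": 0, "wait_ns": 0})
    for ev in events:
        row = totals[ev.kernel]
        row["calls"] += 1
        row["exec_ns"] += ev.end - ev.start     # time spent running on the device
        row["wait_ns"] += ev.start - ev.queued  # delay between enqueue and start
    return dict(totals)


def rank_by_exec_time(summary):
    """Order kernels from most to least total execution time."""
    return sorted(summary.items(), key=lambda kv: kv[1]["exec_ns"], reverse=True)


# Made-up timestamps for three hypothetical kernels of a multi-stage pipeline.
EXAMPLE_EVENTS = [
    EventTimestamps("integral_image", 0, 100, 1_000, 5_000),
    EventTimestamps("hessian_det", 5_000, 5_100, 6_000, 18_000),
    EventTimestamps("hessian_det", 18_000, 18_200, 19_000, 30_000),
    EventTimestamps("descriptor", 30_000, 30_100, 31_000, 33_000),
]

if __name__ == "__main__":
    for name, row in rank_by_exec_time(summarize(EXAMPLE_EVENTS)):
        print(f"{name:16s} calls={row['calls']} "
              f"exec={row['exec_ns']} ns wait={row['wait_ns']} ns")
```

Ranking kernels by accumulated execution time is one simple way to surface candidate bottlenecks. Tracking launch delay separately helps distinguish time spent computing on the device from time spent waiting in the queue, and both can differ across hardware platforms.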