clpeak – peak performance of your OpenCL device

Krishnaraj Bhat
Samsung R&D Institute India, Bangalore, India







clpeak is a benchmarking tool aimed at developers who want to fine-tune OpenCL kernels for a particular device or class of devices. It measures bandwidth and compute performance for different vector widths of a data type, for example float and float4. The traditional recommendation is to write scalar code and rely on the OpenCL compiler to auto-vectorize it, but in most cases the compiler is not able to vectorize scalar code. Hand-written vector code is the more efficient choice in performance-critical scenarios. The tool gives an idea of the internal architecture of the device and of which vector widths should be used to realize its full potential. It also measures host-to-device and device-to-host transfer bandwidths. Transfers can be done using enqueueWriteBuffer or enqueueMapBuffer. A map can go through pinned memory or, on some platforms, be zero-copy. The tool can indicate when a transfer is zero-copy and reports the memcpy bandwidth on zero-copied memory.
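As a rough illustration of how a bandwidth figure of this kind is derived, the sketch below times a bulk copy between two host buffers and converts bytes moved and elapsed time into GB/s. It is not taken from clpeak: the function names `bandwidth_gbps` and `measure_copy_bandwidth` are hypothetical, a plain Python `bytearray` stands in for a mapped (zero-copied) buffer, and no OpenCL device is involved.

```python
import time


def bandwidth_gbps(num_bytes, seconds):
    """Convert bytes moved and elapsed seconds into GB/s (1 GB = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes / seconds / 1e9


def measure_copy_bandwidth(size_bytes=32 * 1024 * 1024, iterations=5):
    """Time repeated host-side buffer copies and return the best GB/s seen.

    Assumption: the bytearray pair is only a stand-in for memory obtained
    by mapping an OpenCL buffer; a real measurement would copy through the
    pointer returned by the map call.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(iterations):
        start = time.perf_counter()
        dst[:] = src  # bulk copy of the whole buffer, similar in spirit to memcpy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)  # keep the fastest run to reduce timing noise
    return bandwidth_gbps(size_bytes, best)


if __name__ == "__main__":
    print(f"host copy bandwidth: {measure_copy_bandwidth():.2f} GB/s")
```

Taking the fastest of several iterations is one common way to reduce timing noise; averaging over the runs is an equally reasonable choice.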
