An adaptive performance modeling tool for GPU architectures

Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, William D. Gropp, Wen-mei W. Hwu

University of Illinois at Urbana-Champaign, Urbana, IL 61801

In PPoPP ’10: Proceedings of the 15th ACM SIGPLAN symposium on Principles and practice of parallel programming (2010), pp. 105-114.

DOI:10.1145/1693453.1693470

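@inproceedings{sanjay2010adaptive,
  title={An Adaptive Performance Modeling Tool for GPU Architectures},
  author={Baghsorkhi, Sara S. and Delahaye, Matthieu and Patel, Sanjay J. and Gropp, William D. and Hwu, Wen-mei W.},
  booktitle={PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming},
  pages={105--114},
  year={2010},
  doi={10.1145/1693453.1693470}
}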

This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information to an auto-tuning compiler and assist it in narrowing down the search to the more promising implementations. It can also be incorporated into a tool to help programmers better assess the performance bottlenecks in their code. We analyze each GPU kernel and identify how the kernel exercises major GPU microarchitecture features. To identify the performance bottlenecks accurately, we introduce an abstract interpretation of a GPU kernel, work flow graph, based on which we estimate the execution time of a GPU kernel. We validated our performance model on the NVIDIA GPUs using CUDA (Compute Unified Device Architecture). For this purpose, we used data parallel benchmarks that stress different GPU microarchitecture events such as uncoalesced memory accesses, scratch-pad memory bank conflicts, and control flow divergence, which must be accurately modeled but represent challenges to the analytical performance models. The proposed model captures full system complexity and shows high accuracy in predicting the performance trends of different optimized kernel implementations. We also describe our approach to extracting the performance model automatically from a kernel code.
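To make the three microarchitecture events named in the abstract concrete, here is a minimal Python sketch that counts them for a single group of threads. It is not the paper's model or its work flow graph: the function names, the 128-byte segment size, the 16-bank scratch-pad, and the counting rules are all simplifying assumptions chosen for illustration, and real hardware rules vary by GPU generation.

```python
# Toy counters for three GPU events. All names and parameters here are
# hypothetical illustrations, not taken from the paper.

def warp_addresses(base, stride_elems, elem_bytes=4, warp_size=32):
    """Byte address read by each lane when lane i accesses element i * stride."""
    return [base + lane * stride_elems * elem_bytes for lane in range(warp_size)]


def segments_touched(addresses, segment_bytes=128):
    """Count distinct aligned memory segments covered by the addresses.

    Assumption: one memory transaction per touched segment, so a higher
    count stands in for a less coalesced access.
    """
    return len({addr // segment_bytes for addr in addresses})


def bank_conflict_degree(word_indices, num_banks=16):
    """Largest number of distinct words that map to the same bank.

    Assumption: word w lives in bank w % num_banks, and lanes reading the
    same word do not conflict. A result of 1 means conflict-free.
    """
    per_bank = {}
    for word in set(word_indices):
        bank = word % num_banks
        per_bank[bank] = per_bank.get(bank, 0) + 1
    return max(per_bank.values())


def divergent_passes(branch_outcomes):
    """Serialized passes for one two-way branch: one per distinct outcome."""
    return len(set(branch_outcomes))


if __name__ == "__main__":
    print(segments_touched(warp_addresses(0, 1)))            # unit stride
    print(segments_touched(warp_addresses(0, 32)))           # large stride
    print(bank_conflict_degree([lane * 2 for lane in range(16)]))
    print(divergent_passes([lane % 2 == 0 for lane in range(32)]))
```

Under these assumptions, a unit-stride access by 32 lanes stays inside one segment, while a stride of 32 elements touches a separate segment per lane. An analytical model of the kind the abstract describes would need to turn counts like these into time estimates, which this sketch does not attempt.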

Tags: Analytical model, Computer science, CUDA, nVidia, nVidia GeForce 8800 GTX, Performance

October 28, 2010 by hgpu
