Exploring GPGPU workloads: Characterization methodology, analysis and microarchitecture evaluation implications

Nilanjan Goswami, Ramkumar Shankar, Madhura Joshi, Tao Li
Intelligent Design of Efficient Architecture Lab (IDEAL), University of Florida, Gainesville, Florida, USA
IEEE International Symposium on Workload Characterization (IISWC), 2010









GPUs are emerging as general-purpose high-performance computing devices. Growing GPGPU research has made numerous GPGPU workloads available. However, a systematic approach to characterizing these benchmarks and analyzing their implications for GPU microarchitecture design evaluation is still lacking. In this research, we propose a set of microarchitecture-agnostic GPGPU workload characteristics that represent the workloads in a microarchitecture-independent space. Correlated dimensionality reduction and clustering analysis are used to understand these workloads. In addition, we propose a set of evaluation metrics to accurately evaluate the GPGPU design space. With a growing number of GPGPU workloads, this analysis approach enables meaningful, accurate and thorough simulation of a proposed GPU architecture design choice. Architects also benefit by choosing a set of workloads that stress the intended functional block of the GPU microarchitecture. We present a diversity analysis of GPU benchmark suites such as Nvidia CUDA SDK, Parboil and Rodinia. Our results show that, with a large number of diverse kernels, workloads such as Similarity Score, Parallel Reduction, and Scan of Large Arrays exhibit diverse characteristics in different workload spaces. We have also explored diversity in different workload subspaces (e.g., memory coalescing and branch divergence). The Similarity Score, Scan of Large Arrays, MUMmerGPU, Hybrid Sort, and Nearest Neighbor workloads exhibit relatively large variation in branch divergence characteristics compared to the others. Memory coalescing behavior is diverse in Scan of Large Arrays, K-Means, Similarity Score and Parallel Reduction.
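
The dimensionality reduction and clustering step mentioned in the abstract can be illustrated with a small sketch. The abstract does not name the specific techniques, so this assumes principal component analysis (computed by power iteration, keeping only the first component for brevity) and single-linkage agglomerative clustering as stand-ins. The workload names, the three characteristics, and all values below are made up for illustration and are not measurements from the paper.

```python
import math

# Hypothetical workloads with three made-up characteristics each:
# [divergent-branch fraction, coalesced-access fraction, memory-instruction share]
FEATURES = {
    "wl_a": [0.05, 0.90, 0.30],
    "wl_b": [0.07, 0.88, 0.32],
    "wl_c": [0.60, 0.20, 0.70],
    "wl_d": [0.62, 0.25, 0.68],
}

def standardize(rows):
    """Scale every column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def top_component(rows, iters=200):
    """First principal direction via power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # assumed not orthogonal to the leading eigenvector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def agglomerate(scores, k):
    """Single-linkage agglomerative clustering of 1-D scores into k clusters."""
    clusters = [[i] for i in range(len(scores))]

    def dist(a, b):
        return min(abs(scores[i] - scores[j]) for i in a for j in b)

    while len(clusters) > k:
        # Merge the closest pair of clusters.
        _, a, b = min((dist(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] += clusters[b]
        del clusters[b]
    return clusters

def cluster_workloads(features, k):
    """Standardize, project onto the first principal component, then cluster."""
    names = list(features)
    rows = standardize([features[n] for n in names])
    pc = top_component(rows)
    scores = [sum(x * p for x, p in zip(r, pc)) for r in rows]
    return [sorted(names[i] for i in c) for c in agglomerate(scores, k)]

if __name__ == "__main__":
    print(cluster_workloads(FEATURES, 2))
```

Workloads that land in the same cluster behave similarly with respect to the chosen characteristics, so under these assumptions one representative per cluster could be simulated instead of the whole set.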