Towards Efficient GPU Sharing on Multicore Processors
ECE Department, George Washington University
The 2nd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems (PMBS11), 2011
Scalable systems that combine GPUs with CPUs are becoming increasingly prevalent in high-performance computing (HPC). The presence of such accelerators introduces significant challenges and complexities for both language developers and end users. This paper provides a close study of efficient coordination mechanisms for handling parallel requests from multiple hosts of control to a GPU under hybrid programming. Using a set of microbenchmarks and applications on a GPU cluster, we show that thread-based and process-based context hosting have different tradeoffs. Experimental results on application benchmarks suggest that thread-based context funneling and process-based context switching perform similarly when used natively on the latest Fermi GPU, while manually guided context funneling is currently the best way to achieve optimal performance.
November 19, 2011 by hgpu