Performance Evaluation of CPU-GPU communication Depending on the Characteristic of Co-Located Workloads

Dongyou Seo, Shin-gyu Kim, Hyeonsang Eom, Heon Y. Yeom
School of Computer Science and Engineering, Seoul National University, Seoul, Korea
International Journal on Computer Science and Engineering (IJCSE), Vol. 5 No. 05, 2013








Many recent studies exploit the high computational throughput of GPUs for complex computation and big data processing. The Tesla K20X, recently announced by NVIDIA, provides 3.95 TFLOPS of peak single-precision floating-point performance [1], about 10 times that of Intel's high-end CPUs. Owing to this performance, the K20X was adopted in Titan, the top-ranked supercomputer in the world [2][3]. However, GPU computing requires additional steps that CPU-only computation does not. The data needed on the GPU has to be moved from main memory to the GPU's global memory before the computation starts, and the results produced on the GPU have to be written back to main memory. This data movement is called CPU-GPU communication. It accounts for a large part of GPU computing, so many studies have tried to optimize it [4][5]. In this paper, we evaluate the performance of CPU-GPU communication depending on the co-located workloads and show which workloads severely degrade it.
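The round trip described in the abstract (main memory to GPU global memory, computation, then back to main memory) can be illustrated with a minimal CUDA sketch. This is not the benchmark used in the paper: the kernel `scale`, the helper `run_round_trip`, the `CHECK` macro, and the buffer size are hypothetical names and values chosen only for illustration. The two `cudaMemcpy` calls are the CPU-GPU communication steps, and CUDA events time each of them.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails (illustrative helper).
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",            \
                         cudaGetErrorString(err_), __FILE__, __LINE__); \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

// Hypothetical kernel: doubles every element in place.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Performs one host -> device -> host round trip and prints the copy times.
// Returns 0 if every element came back doubled, 1 otherwise.
int run_round_trip(int n) {
    size_t bytes = (size_t)n * sizeof(float);
    float *host = (float *)std::malloc(bytes);   // buffer in main memory
    if (!host) return 1;
    for (int i = 0; i < n; ++i) host[i] = (float)i;

    float *dev = nullptr;                        // buffer in GPU global memory
    CHECK(cudaMalloc(&dev, bytes));

    cudaEvent_t t0, t1, t2, t3;
    CHECK(cudaEventCreate(&t0));
    CHECK(cudaEventCreate(&t1));
    CHECK(cudaEventCreate(&t2));
    CHECK(cudaEventCreate(&t3));

    // Step 1: move the input from main memory to GPU global memory.
    CHECK(cudaEventRecord(t0));
    CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(t1));

    // Step 2: compute on the GPU.
    scale<<<(n + 255) / 256, 256>>>(dev, n);
    CHECK(cudaGetLastError());

    // Step 3: write the results back to main memory.
    CHECK(cudaEventRecord(t2));
    CHECK(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(t3));
    CHECK(cudaEventSynchronize(t3));

    float h2d_ms = 0.0f, d2h_ms = 0.0f;
    CHECK(cudaEventElapsedTime(&h2d_ms, t0, t1));
    CHECK(cudaEventElapsedTime(&d2h_ms, t2, t3));
    std::printf("host->device: %.3f ms, device->host: %.3f ms\n",
                h2d_ms, d2h_ms);

    // Verify the round trip: each value should be exactly twice its index.
    int failed = 0;
    for (int i = 0; i < n; ++i)
        if (host[i] != 2.0f * (float)i) failed = 1;

    CHECK(cudaEventDestroy(t0));
    CHECK(cudaEventDestroy(t1));
    CHECK(cudaEventDestroy(t2));
    CHECK(cudaEventDestroy(t3));
    CHECK(cudaFree(dev));
    std::free(host);
    return failed;
}

int main() {
    return run_round_trip(1 << 20);  // about one million floats (4 MiB)
}
```

Running this program alone and then again next to a co-located workload on the same host would give a rough feel for the kind of comparison the paper makes, although the paper's actual workloads and measurement setup are not reproduced here.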