B. van Werkhoven, J. Maassen, F.J. Seinstra, H.E. Bal
Many GPU applications transfer data to and from GPU memory at regular intervals, for example because the data does not fit into GPU memory or because of inter-node communication at the end of each time step. Overlapping GPU computation with CPU-GPU communication can reduce the cost of moving data. Several different techniques exist […]
Jaewoong Sim, Aniruddha Dasgupta, Hyesoon Kim, and Richard Vuduc
Tuning code for GPGPU and other emerging many-core platforms is a challenge because few models or tools can precisely pinpoint the root cause of performance bottlenecks. In this paper, we present a performance analysis framework that can help shed light on such bottlenecks for GPGPU applications. Although a handful of GPGPU profiling tools exist, most […]

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors
