Optimized Data Transfers Based on the OpenCL Event Management Mechanism

Hiroyuki Takizawa, Shoichi Hirasawa, Makoto Sugawara, Isaac Gelado, Hiroaki Kobayashi, Wen-mei W. Hwu
Tohoku University/JST CREST
Scientific Programming, 2014








In standard OpenCL programming, a host such as a CPU controls its compute devices such as GPUs. Because compute devices are dedicated to kernel computation, only the host can perform certain kinds of data transfers, such as inter-node communication and file access. These transfers require the host and devices to collaborate, so a single host must play two or more roles at once. Code for such data transfers tends to be system-specific, which lowers both functional portability and performance portability. This paper proposes an OpenCL extension, clDataTransfer, that incorporates such data transfers into the OpenCL event management mechanism. The extension lets a programmer think as if a compute device could transfer its device memory data to or from a file or a remote node without any help from the host. Unlike in the current OpenCL standard, the main thread running on the host is not blocked in order to serialize dependent operations. Hence, an application can easily exploit opportunities to overlap the parallel activities of hosts and compute devices. In addition, the implementation details of the data transfers are hidden behind the extension, so application programmers can use optimized data transfers without resorting to tricky programming techniques. As a result, the extension can improve not only performance but also performance portability across different system configurations. The evaluation results show that the proposed extension can use an optimized data transfer implementation and thereby increase sustained data transfer performance by about 18% for a real application that accesses a big data file.
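The idea of chaining a file read and a dependent computation through events, while the host thread keeps working, can be illustrated with a minimal sketch. This is not the clDataTransfer API and it uses no OpenCL calls: `enqueue`, `read_file`, and `demo` are hypothetical names, Python futures stand in for OpenCL event objects, and a thread pool stands in for the runtime that executes queued commands.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the runtime that executes queued commands.
# Not part of OpenCL or clDataTransfer.
_runtime = ThreadPoolExecutor(max_workers=2)


def enqueue(command, wait_for=None):
    """Queue `command`; it runs only after the `wait_for` event completes.

    Returns a future that plays the role of an event object. The caller
    is never blocked here: the wait happens on a runtime worker thread.
    """
    def run():
        upstream = wait_for.result() if wait_for is not None else None
        return command(upstream)
    return _runtime.submit(run)


def read_file(path):
    """Assumed 'file to buffer' transfer: read the whole file as bytes."""
    with open(path, "rb") as f:
        return f.read()


def demo():
    # Create a small input file holding the bytes 0..7.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(8)))
    try:
        # Transfer command: its completion is signalled by read_ev.
        read_ev = enqueue(lambda _: read_file(path))
        # Dependent command: a stand-in "kernel" that sums the buffer.
        kernel_ev = enqueue(lambda data: sum(data), wait_for=read_ev)
        # The host thread continues with unrelated work in the meantime.
        host_work = sum(range(1000))
        # Single synchronization point, only when the result is needed.
        result = kernel_ev.result()
    finally:
        os.remove(path)
    return host_work, result
```

Calling `demo()` queues both commands immediately and synchronizes only once, at `kernel_ev.result()`. The dependency between the read and the computation is expressed through the event rather than through a blocking call on the main thread, which is the property the abstract attributes to the extension.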
