
Formalizing Address Spaces with application to Cuda, OpenCL, and beyond

Benedict R. Gaster
Advanced Micro Devices, One AMD Place, Sunnyvale, CA, USA
6th Annual Workshop on General Purpose Processing with Graphics Processing Units (GPGPU 6), 2013
@inproceedings{gaster2013formalizing,
   title={Formalizing Address Spaces with application to Cuda, OpenCL, and beyond},
   author={Gaster, Benedict R.},
   booktitle={6th Annual Workshop on General Purpose Processing with Graphics Processing Units (GPGPU-6)},
   year={2013}
}


Cuda and OpenCL are aimed at programmers developing parallel applications targeting GPUs and embedded microprocessors. These systems often have explicitly managed memories that are exposed directly through a notion of disjoint address spaces. OpenCL address spaces are based on a similar concept found in Embedded C. A limitation of OpenCL is that each pointer must be assigned to a particular address space, so a function, for example, must state which address space each of its pointer arguments points into. This leads to a loss of composability and can force the programmer to implement multiple versions of the same function. The problem is compounded in the OpenCL C++ variant, where a class' implicit this pointer can be applied to multiple address spaces. Modern GPUs, such as AMD's Graphics Core Next and Nvidia's Fermi, support an additional generic address space that dynamically determines an address' disjoint address space and issues the correct load/store operation to the corresponding memory subsystem. Generic address spaces allow dynamic casting between generic and non-generic address spaces, similar to the dynamic subtyping found in object-oriented languages. The advantage of the generic address space is that it simplifies the programming model, but sometimes at the cost of decreased performance, both at run time and in the optimizations a compiler can safely perform. This paper describes a new type system for inferring Cuda- and OpenCL-style address spaces, and we show that these address spaces can be inferred automatically. We extend this base system with a notion of generic address space, including dynamic casting, and show that a static translation exists for architectures without support for generic address spaces, albeit at a potential performance cost. This cost can be reclaimed when an architecture directly supports the generic address space.
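
A minimal OpenCL C sketch of the problem and of the generic address space described above. This is not code from the paper: the names are invented, and the generic-pointer half assumes an OpenCL 2.x compiler, whose generic address space and to_global dynamic cast mirror the mechanism the paper formalizes.

// OpenCL C 1.x: each pointer parameter is pinned to one address space,
// so a helper used on both __global and __local data is written twice.
float sum3_global(__global const float *p) { return p[0] + p[1] + p[2]; }
float sum3_local (__local  const float *p) { return p[0] + p[1] + p[2]; }

// Generic address space: one definition accepts pointers into any of the
// disjoint spaces; each dereference resolves the actual space at run time
// (or statically, when the compiler can infer it).
float sum3(const float *p) { return p[0] + p[1] + p[2]; }

__kernel void example(__global const float *in, __global float *out)
{
    __local float tile[3];
    int lid = get_local_id(0);
    if (lid < 3) tile[lid] = in[lid];
    barrier(CLK_LOCAL_MEM_FENCE);

    // Both calls type-check against the single generic definition; under
    // OpenCL C 1.x the _global/_local copies above would be required.
    out[get_global_id(0)] = sum3(in) + sum3(tile);

    // Dynamic cast from generic back to a named space: to_global yields
    // the __global view of p, or NULL when p does not point into global
    // memory -- analogous to a failing dynamic downcast.
    const float *p = (lid & 1) ? (const float *)in : (const float *)tile;
    __global const float *g = to_global(p);
    if (g == NULL) {
        // p referred to the __local tile on this work-item.
    }
}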

* * *

Free GPU computing nodes at hgpu.org

Registered users can now run their OpenCL applications at hgpu.org. We provide 1 minute of compute time per run on two nodes equipped with AMD and nVidia graphics processing units, as listed below. There are no restrictions on the number of runs.

The platforms are:

Node 1
  • GPU device 0: AMD/ATI Radeon HD 5870 2GB, 850MHz
  • GPU device 1: AMD/ATI Radeon HD 6970 2GB, 880MHz
  • CPU: AMD Phenom II X6 1055T @ 2.8GHz
  • RAM: 12GB
  • OS: OpenSUSE 13.1
  • SDK: AMD APP SDK 2.9
Node 2
  • GPU device 0: AMD/ATI Radeon HD 7970 3GB, 1000MHz
  • GPU device 1: nVidia GeForce GTX 560 Ti 2GB, 822MHz
  • CPU: Intel Core i7-2600 @ 3.4GHz
  • RAM: 16GB
  • OS: OpenSUSE 12.2
  • SDK: nVidia CUDA Toolkit 6.0.1, AMD APP SDK 2.9

Completed OpenCL projects should be uploaded via the User dashboard (see instructions and an example there); compilation and execution terminal output logs will be provided to the user.

The information sent to hgpu.org will be treated according to our Privacy Policy.
