GPU Parallel Collections For Scala
Faculty of the Graduate School, The University of Texas, Arlington
The University of Texas, 2011
@article{rajanna2011gpu,
title={GPU Parallel Collections For Scala},
author={Rajanna, K. D. K.},
year={2011},
publisher={Computer Science \& Engineering}
}
A decade ago, graphics processing units (GPUs) were used specifically for high-speed graphics. More recently, they have become popular as general-purpose parallel processors. With the release of CUDA, ATI Stream, and OpenCL, programmers can split program execution between the CPU and the GPU where appropriate, which can yield large performance gains. The cost of GPUs is declining, and their performance is improving faster than that of CPUs. Although parallelizing code can bring enormous performance gains, identifying the right candidates for GPU execution is tricky, and coding in OpenCL is difficult because of the complexity of GPU memory management. We are developing Firepile, a library and compiler for GPU programming in Scala. Firepile makes it easier to port parallelizable code to the GPU. As part of the library, we have been working on parallel collections that hide memory management and code complexity from the end user. The library provides many basic collection types, such as Array, Map, and Matrix, along with parallelized implementations of functions such as map, reduce, scan, sort, find, and filter that run on the GPU. Our experiments show that the library achieves performance similar to that of OpenCL implementations with much shorter, easier-to-understand code. Many of the inner details of the GPU architecture are hidden within the library, so programmers need not understand the GPU architecture to achieve high performance. Scala blends the functional and object-oriented paradigms and is interoperable with Java; hence it was selected over Java for the Firepile compiler.
October 19, 2011 by hgpu