Scalable and Parallel Implementation of a Financial Application on a GPU: With Focus on Out-of-Core Case
Dept. of Comput. Sci. & Eng., Myong Ji Univ., Yong In, South Korea
IEEE 10th International Conference on Computer and Information Technology (CIT), 2010
The architecture of the latest Graphics Processing Units (GPUs) consists of a number of uniform programmable units integrated on the same chip, which facilitates general-purpose computing beyond graphics processing. With multiple programmable units executing in parallel, the latest GPUs show superior performance for many non-graphics applications. Furthermore, programmers can directly control the GPU pipeline using easy-to-use parallel programming environments. These advances in hardware and software have made General-Purpose GPU computing (GPGPU) widespread. In this paper, we parallelize a computationally demanding financial application and optimize its performance on a recent GPU. We also analyze the performance results against those obtained using the CPU only. Experimental results show that the GPU achieves a speedup of more than 190x over the CPU-only case when the data fits in the graphics memory. We also address performance in the out-of-core case, where the data cannot fit in the device memory on the GPU. In that case, a streaming technique helps recover the performance lost to the overhead of transferring data from the CPU side to the GPU DRAM.
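The abstract does not give the details of the streaming scheme, so the following is only a rough timing sketch of why overlapping transfers with computation can help in the out-of-core case. It assumes the data is split into equal chunks, that copying a chunk to the GPU takes a fixed time, and that processing a chunk takes a fixed time. The function names and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def serial_time(n_chunks, t_transfer, t_compute):
    """Total time when each chunk is copied and then processed, with no overlap."""
    return n_chunks * (t_transfer + t_compute)


def streamed_time(n_chunks, t_transfer, t_compute):
    """Total time for an idealized two-stage pipeline (assumed model).

    The copy of chunk i+1 is assumed to overlap fully with the
    computation on chunk i.
    """
    if n_chunks == 0:
        return 0
    # The first copy and the last computation cannot overlap with anything.
    # Every step in between costs as much as the slower of the two stages.
    return t_transfer + (n_chunks - 1) * max(t_transfer, t_compute) + t_compute


def overlap_gain(n_chunks, t_transfer, t_compute):
    """Time saved by streaming compared with the non-overlapped schedule."""
    return serial_time(n_chunks, t_transfer, t_compute) - streamed_time(
        n_chunks, t_transfer, t_compute
    )
```

For example, with 4 chunks, a hypothetical transfer cost of 2 time units, and a compute cost of 3 time units per chunk, `serial_time(4, 2, 3)` is 20 while `streamed_time(4, 2, 3)` is 14. Under this model the saving grows with the number of chunks and is bounded by the cost of the faster stage, which is why streaming can recover only part of the transfer overhead when transfers dominate.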
March 27, 2011 by hgpu