3D Recursive Gaussian IIR on GPU and FPGAs: A Case Study for Accelerating Bandwidth-Bounded Applications

Jason Cong, Muhuan Huang, Yi Zou
Computer Science Department, University of California, Los Angeles
9th IEEE Symposium on Application Specific Processors (SASP 2011), 2011








GPU devices typically have higher off-chip bandwidth than FPGA-based systems, so GPUs would be expected to perform better on bandwidth-bounded, massively parallel applications. In this paper we present our implementations of a 3D recursive Gaussian IIR filter on multicore CPU, many-core GPU, and multi-FPGA platforms. Our baseline implementation on the CPU features the smallest arithmetic computation (2 MADDs per dimension). Since this application is clearly bandwidth bounded, we show that differences in the memory subsystems of the platforms require different bandwidth optimization techniques. Our implementations on the GPU and FPGA platforms show 26X and 33X speedups, respectively, over the optimized single-threaded code on the CPU.
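To make the "2 MADDs per dimension" figure concrete, here is a minimal sketch of a separable recursive smoother on a flat 3D array. It is not the authors' implementation. It assumes a first-order forward pass and a first-order backward pass per axis (one multiply-add each), with the normalization deferred to a single final scaling. A first-order cascade only roughly approximates a Gaussian, and the names `smooth_line`, `smooth_volume`, and the coefficient `b` are hypothetical.

```python
def smooth_line(data, start, stride, count, b):
    """Filter one line of a flat volume in place (assumed first-order form)."""
    # Forward (causal) pass: one multiply-add per sample.
    for i in range(1, count):
        idx = start + i * stride
        data[idx] += b * data[idx - stride]
    # Backward (anticausal) pass: one multiply-add per sample.
    for i in range(count - 2, -1, -1):
        idx = start + i * stride
        data[idx] += b * data[idx + stride]


def smooth_volume(data, nx, ny, nz, b):
    """Apply the two-pass filter along x, y, and z; return a scaled copy."""
    out = list(data)
    # x lines are contiguous in memory (stride 1).
    for z in range(nz):
        for y in range(ny):
            smooth_line(out, (z * ny + y) * nx, 1, nx, b)
    # y lines step through memory with stride nx.
    for z in range(nz):
        for x in range(nx):
            smooth_line(out, z * ny * nx + x, nx, ny, b)
    # z lines step through memory with stride nx * ny.
    for y in range(ny):
        for x in range(nx):
            smooth_line(out, y * nx + x, nx * ny, nz, b)
    # Each pass has DC gain 1 / (1 - b); six passes in total.
    gain = (1.0 - b) ** 6
    return [v * gain for v in out]


if __name__ == "__main__":
    volume = [0.0] * 27
    volume[13] = 1.0  # unit impulse at the center of a 3x3x3 volume
    result = smooth_volume(volume, 3, 3, 3, 0.5)
    print(result[13])
```

Each sample is read and written once per pass while receiving only a single multiply-add, and the y and z passes walk memory with large strides. That ratio of arithmetic to memory traffic is why a filter of this shape tends to be limited by bandwidth rather than by computation.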
