A parallel error diffusion implementation on a GPU

Yao Zhang, John Ludd Recker, Robert Ulichney, Giordano B. Beretta, Ingeborg Tastl, I-Jong Lin, and John D. Owens
University of California, Davis, One Shields Avenue, Davis, CA, USA
IS&T/SPIE Electronic Imaging 2011 / Parallel Processing for Imaging Applications, volume 7872, pages 78720K:1-9, 2011


   author={Yao Zhang and John Ludd Recker and Robert Ulichney and Giordano B. Beretta and Ingeborg Tastl and I-Jong Lin and John D. Owens},

   title={A Parallel Error Diffusion Implementation on a {GPU}},

   booktitle={Proceedings of SPIE: IS&T/SPIE Electronic Imaging 2011 / Parallel Processing for Imaging Applications},






In this paper, we investigate the suitability of the GPU for a parallel implementation of pinwheel error diffusion. We demonstrate a high-performance GPU implementation by efficiently parallelizing and unrolling the image processing algorithm. Our GPU implementation achieves a 10–30x speedup over a two-threaded CPU error diffusion implementation with comparable image quality. We have conducted experiments to study the performance and quality tradeoffs of different image block sizes. We also present a performance analysis at the assembly level to understand the performance bottlenecks.
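
To make the block-size tradeoff concrete, the following is a minimal sketch of block-wise error diffusion. It is not the pinwheel algorithm from the paper: it swaps in the standard Floyd-Steinberg filter and simply halftones each block independently, which is an assumption made for illustration. The function names `floyd_steinberg_block` and `halftone_by_blocks` are hypothetical. Because no error crosses a block boundary here, each block could in principle be handled by a separate thread group, while smaller blocks give more parallelism at the likely cost of visible seams.

```python
def floyd_steinberg_block(block):
    """Halftone one block (rows of floats in [0, 1]) to 0/1 values.

    Uses the standard Floyd-Steinberg weights; this is an illustrative
    stand-in, not the pinwheel filter described in the paper.
    """
    h, w = len(block), len(block[0])
    work = [row[:] for row in block]  # copy so the input is not modified
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error to unprocessed neighbours
            # with weights 7/16, 3/16, 5/16 and 1/16.
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


def halftone_by_blocks(image, block_size):
    """Halftone an image by processing block_size x block_size tiles
    independently (assumption: no error is passed between tiles)."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, block_size):
        for bx in range(0, w, block_size):
            block = [row[bx:bx + block_size]
                     for row in image[by:by + block_size]]
            result = floyd_steinberg_block(block)
            # Copy the halftoned tile back into the output image.
            for dy, row in enumerate(result):
                out[by + dy][bx:bx + len(row)] = row
    return out
```

Calling `halftone_by_blocks(image, 4)` and `halftone_by_blocks(image, 16)` on the same grayscale input is one way to compare how block size affects the result in this simplified setting.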
