Simple optimizations for an applicative array language for graphics processors

Bradford Larsen
Department of Computer Science, Tufts University
Proceedings of the sixth workshop on Declarative aspects of multicore programming, DAMP ’11, 2011

@inproceedings{larsen2011simple,
   title={Simple optimizations for an applicative array language for graphics processors},
   author={Larsen, B.},
   booktitle={Proceedings of the sixth workshop on Declarative aspects of multicore programming},
   pages={25--34},
   year={2011},
   organization={ACM}
}

Graphics processors (GPUs) are highly parallel devices that promise high performance, and they are now flexible enough to be used for general-purpose computing. A programming language based on implicitly data-parallel collective array operations can permit high-level, effective programming of GPUs. I describe three optimizations for such a language: automatic use of GPU shared memory cache, array fusion, and hoisting of nested parallel constructs. These optimizations are simple to implement because of the design of the language to which they are applied but can result in large run-time speedups.
