
Tradeoffs in designing accelerator architectures for visual computing

Aqeel Mahesri, Daniel Johnson, Neal Crago, Sanjay J. Patel
Center for Reliable and High-Performance Computing, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Champaign, IL
41st IEEE/ACM International Symposium on Microarchitecture (MICRO-41), 2008

@inproceedings{mahesri2008tradeoffs,
   title={Tradeoffs in designing accelerator architectures for visual computing},
   author={Mahesri, A. and Johnson, D. and Crago, N. and Patel, S.J.},
   booktitle={Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture},
   pages={164--175},
   year={2008},
   organization={IEEE Computer Society}
}


Visualization, interaction, and simulation (VIS) constitute a class of applications that is growing in importance. This class includes applications such as graphics rendering, video encoding, simulation, and computer vision. These applications are ideally suited for accelerators because of their parallelizability and demand for high throughput. We compile a benchmark suite, VISBench, to serve as a proxy for this application class. We use VISBench to examine some important high-level decisions for an accelerator architecture. We propose a highly parallel base architecture. We examine the need for synchronization and data communication. We also examine GPU-style SIMD execution and find that a MIMD architecture usually performs better. Given these high-level choices, we use VISBench to explore the microarchitectural design space. We analyze area versus performance tradeoffs in designing individual cores and the memory hierarchy. We find that a design made of small, simple cores achieves much higher throughput than a general-purpose uniprocessor. Further, we find that a limited amount of support for ILP within each core aids overall performance. We find that fine-grained multithreading improves performance, but only up to a point. We find that word-level (SSE-style) SIMD provides a poor performance-to-area ratio. Finally, we find that sufficient memory and cache bandwidth is essential to performance.
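One way to see why GPU-style SIMD can lose to MIMD on branchy code is a toy cycle count. The sketch below is an illustration only, not the paper's simulator or methodology: the path names, the cycle costs, and the eight-thread group size are all hypothetical. It assumes a lockstep SIMD group serially executes every distinct control path taken by any of its lanes, while independent MIMD cores each run only their own thread's path.

```python
# Toy model of branch divergence; all numbers below are hypothetical.

def simd_cycles(paths, cost):
    """Lockstep group: each distinct path taken by any lane is run serially."""
    return sum(cost[p] for p in set(paths))


def mimd_cycles(paths, cost):
    """One independent core per thread: done when the slowest thread is done."""
    return max(cost[p] for p in paths)


def utilization(paths, cost, cycles):
    """Fraction of lane-cycles (or core-cycles) spent on useful work."""
    useful = sum(cost[p] for p in paths)
    return useful / (len(paths) * cycles)


# Hypothetical per-path cycle costs for a two-way branch.
cost = {"A": 10, "B": 30}

uniform = ["A"] * 8                  # all threads take the same path
divergent = ["A"] * 4 + ["B"] * 4    # threads split across both paths

for name, paths in [("uniform", uniform), ("divergent", divergent)]:
    s = simd_cycles(paths, cost)
    m = mimd_cycles(paths, cost)
    print(name, "SIMD cycles:", s, "MIMD cycles:", m,
          "SIMD util:", round(utilization(paths, cost, s), 3),
          "MIMD util:", round(utilization(paths, cost, m), 3))
```

With uniform control flow the two organizations take the same number of cycles in this model. The gap appears only when threads diverge. The model deliberately ignores area: MIMD cores each need their own instruction fetch and decode hardware, which is the cost side of the tradeoff the abstract describes.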
