10×10: A General-purpose Architectural Approach to Heterogeneity and Energy Efficiency
University of California, San Diego and San Diego Supercomputer Center, 9500 Gilman Drive, La Jolla, CA 92093, USA
International Conference on Computational Science, ICCS 2011
@inproceedings{chien201110,
title={10$\times$10: A general-purpose architectural approach to heterogeneity and energy efficiency},
author={Chien, A. and Snavely, A. and Gahagan, M.},
booktitle={The Third Workshop on Emerging Parallel Architectures at the International Conference on Computational Science},
year={2011}
}
Two decades of microprocessor architecture driven by quantitative 90/10 optimization have delivered an extraordinary 1000-fold improvement in microprocessor performance, enabled by transistor scaling, which improved density, speed, and energy. Recent generations of technology have produced limited benefits in transistor speed and power, so the industry has turned to multicore parallelism for performance scaling. Long-range studies [1, 2] indicate that radical approaches are needed in the coming decade: extreme parallelism, near-threshold voltage scaling (and the resulting poor single-thread performance), and tolerance of extreme variability are required to maximize energy efficiency and compute density. These changes create major new challenges in architecture and software. As a result, the performance and energy-efficiency advantages of heterogeneous architectures are increasingly attractive. However, computing has lacked an optimization paradigm in which to systematically analyze, assess, and implement heterogeneous computing. We propose a new paradigm, "10×10", which clusters applications into a set of less frequent cases (i.e., 10% cases) and creates customized architecture, implementation, and software solutions for each of these clusters, achieving significantly better energy efficiency and performance. We call this approach "10×10" because it is exemplified by optimizing ten different 10% cases, reflecting a shift away from the 90/10 optimization paradigm framed by Amdahl’s law [3]. We describe the 10×10 approach, explain how it solves the major obstacles to widespread adoption of heterogeneous architectures, and present a preliminary 10×10 clustering, strawman architecture, and software tool chain approach.
October 19, 2011 by hgpu