high performance computing on graphics processing units: hgpu.org

10×10: A General-purpose Architectural Approach to Heterogeneity and Energy Efficiency

Andrew A. Chien, Allan Snavely, Mark Gahagan

University of California, San Diego and San Diego Supercomputer Center, 9500 Gilman Drive, La Jolla, CA 92093, USA

International Conference on Computational Science, ICCS 2011

Two decades of microprocessor architecture driven by quantitative 90/10 optimization have delivered an extraordinary 1000-fold improvement in microprocessor performance, enabled by transistor scaling that improved density, speed, and energy. Recent generations of technology have produced limited benefits in transistor speed and power, so the industry has turned to multicore parallelism for performance scaling. Long-range studies [1, 2] indicate that radical approaches are needed in the coming decade: extreme parallelism, near-threshold voltage scaling (and the resulting poor single-thread performance), and tolerance of extreme variability are required to maximize energy efficiency and compute density. These changes create major new challenges in architecture and software. As a result, the performance and energy-efficiency advantages of heterogeneous architectures are increasingly attractive. However, computing has lacked an optimization paradigm in which to systematically analyze, assess, and implement heterogeneous computing. We propose a new paradigm, "10×10", which clusters applications into a set of less frequent cases (i.e., 10% cases) and creates customized architecture, implementation, and software solutions for each of these clusters, achieving significantly better energy efficiency and performance. We call this approach "10×10" because it is exemplified by optimizing ten different 10% cases, reflecting a shift away from the 90/10 optimization paradigm framed by Amdahl’s law [3]. We describe the 10×10 approach, explain how it solves the major obstacles to widespread adoption of heterogeneous architectures, and present a preliminary 10×10 clustering, strawman architecture, and software tool chain approach.
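
The contrast between 90/10 optimization and ten 10% cases can be made concrete with a generalized form of Amdahl's law, in which the overall speedup is 1 / sum(f_i / s_i) for workload fractions f_i and per-fraction speedups s_i. The sketch below is only an illustration and is not taken from the paper: the function name `overall_speedup` and the speedup factors (2x for the common case, 5x for each customized unit) are hypothetical values chosen for the arithmetic, and the example models execution time only, not energy.

```python
def overall_speedup(fractions, speedups):
    """Generalized Amdahl's law: 1 / sum(f_i / s_i).

    fractions: share of baseline execution time spent in each case (sums to 1).
    speedups:  speedup applied to each case (1.0 means unchanged).
    """
    if len(fractions) != len(speedups):
        raise ValueError("fractions and speedups must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return 1.0 / sum(f / s for f, s in zip(fractions, speedups))


# 90/10 style: speed up the common 90% case by a hypothetical 2x and
# leave the remaining 10% unchanged.
ninety_ten = overall_speedup([0.9, 0.1], [2.0, 1.0])

# 10x10 style: ten clusters of 10% each, every one served by a
# customized unit with a hypothetical 5x speedup.
ten_by_ten = overall_speedup([0.1] * 10, [5.0] * 10)

print(f"90/10 overall speedup: {ninety_ten:.2f}")
print(f"10x10 overall speedup: {ten_by_ten:.2f}")
```

Under these assumed factors, the untouched 10% caps the benefit of the 90/10 case, while the 10×10 case has no unaccelerated remainder. Real gains would depend on how well applications actually cluster and on the cost of each customized unit, which this sketch does not model.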

Tags: Clustering, Computer science, Heterogeneous systems, Optimization

October 19, 2011 by hgpu
