high performance computing on graphics processing units: hgpu.org

A Domain-Specific Approach To Heterogeneous Parallelism

Hassan Chafi, Arvind K. Sujeeth, Kevin J. Brown, HyoukJoong Lee, Anand R. Atreya, Kunle Olukotun

Pervasive Parallelism Laboratory, Stanford University

Proceedings of the 16th ACM symposium on Principles and practice of parallel programming, 2011, p.35-46

@conference{chafi2011domain,
  title={A domain-specific approach to heterogeneous parallelism},
  author={Chafi, H. and Sujeeth, A.K. and Brown, K.J. and Lee, H.J. and Atreya, A.R. and Olukotun, K.},
  booktitle={Proceedings of the 16th ACM symposium on Principles and practice of parallel programming},
  pages={35--46},
  year={2011},
  organization={ACM}
}

Exploiting heterogeneous parallel hardware currently requires mapping application code to multiple disparate programming models. Unfortunately, general-purpose programming models available today can yield high performance but are too low-level to be accessible to the average programmer. We propose leveraging domain-specific languages (DSLs) to map high-level application code to heterogeneous devices. To demonstrate the potential of this approach we present OptiML, a DSL for machine learning. OptiML programs are implicitly parallel and can achieve high performance on heterogeneous hardware with no modification required to the source code. For such a DSL-based approach to be tractable at large scales, better tools are required for DSL authors to simplify language creation and parallelization. To address this concern, we introduce Delite, a system designed specifically for DSLs that is both a framework for creating an implicitly parallel DSL as well as a dynamic runtime providing automated targeting to heterogeneous parallel hardware. We show that OptiML running on Delite achieves single-threaded, parallel, and GPU performance superior to explicitly parallelized MATLAB code in nearly all cases.
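The general idea in the abstract, application code that states what to compute while a runtime decides where to run it, can be illustrated with a toy sketch. This is not OptiML or Delite code and does not reflect their APIs: `Vector`, `Runtime`, `run_map`, and `sum_of_squares` are hypothetical names invented for illustration, and a thread pool stands in for a second hardware target.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


class Runtime:
    """Hypothetical runtime that picks an execution target per operation."""

    def __init__(self, target: str = "sequential", workers: int = 4):
        self.target = target
        self.workers = workers

    def run_map(self, fn: Callable[[float], float], data: List[float]) -> List[float]:
        if self.target == "threads":
            # Thread pool stands in for a parallel device; map() keeps input order.
            with ThreadPoolExecutor(max_workers=self.workers) as pool:
                return list(pool.map(fn, data))
        # Default target: plain sequential execution.
        return [fn(x) for x in data]


class Vector:
    """Hypothetical DSL data type; user code never mentions threads or devices."""

    def __init__(self, data: Sequence[float], runtime: Runtime):
        self.data = list(data)
        self.runtime = runtime

    def map(self, fn: Callable[[float], float]) -> "Vector":
        # The DSL operation only describes the computation; the runtime schedules it.
        return Vector(self.runtime.run_map(fn, self.data), self.runtime)

    def sum(self) -> float:
        return sum(self.data)


def sum_of_squares(runtime: Runtime) -> float:
    # "Application code": identical regardless of the chosen target.
    v = Vector([1.0, 2.0, 3.0, 4.0], runtime)
    return v.map(lambda x: x * x).sum()


if __name__ == "__main__":
    print(sum_of_squares(Runtime("sequential")))
    print(sum_of_squares(Runtime("threads")))
```

In this sketch only the `Runtime` argument changes between targets; the body of `sum_of_squares` stays the same, which is the property the abstract describes as requiring no modification to the source code.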

Tags: Code generation, Compilers, Computer science, CUDA, Heterogeneous systems, High-level Languages, nVidia, nVidia GeForce GTX 275

February 27, 2011 by hgpu
