high performance computing on graphics processing units: hgpu.org

hgpu.org » Applications » Computer science » HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

Yao Chen, Xin Long, Jiong He, Yuhang Chen, Hongshi Tan, Zhenxiang Zhang, Marianne Winslett, Deming Chen

Advanced Digital Sciences Center

arXiv:2005.08466 [cs.DC], (18 May 2020)

BibTeX

Download (PDF)

View

Source

1844

views

The pervasive adoption of Deep Learning (DL) and Graph Processing (GP) makes it a de facto requirement to build large-scale clusters of heterogeneous accelerators including GPUs and FPGAs. The OpenCL programming framework can be used on the individual nodes of such clusters but is not intended for deployment in a distributed manner. Fortunately, the original OpenCL semantics naturally fit into the programming environment of heterogeneous clusters. In this paper, we propose a heterogeneity-aware OpenCL-like (HaoCL) programming framework to facilitate the programming of a wide range of scientific applications including DL and GP workloads on large-scale heterogeneous clusters. With HaoCL, existing applications can be directly deployed on heterogeneous clusters without any modifications to the original OpenCL source code and without awareness of the underlying hardware topologies and configurations. Our experiments show that HaoCL imposes a negligible overhead in a distributed environment, and provides near-linear speedups on standard benchmarks when computation or data size exceeds the capacity of a single node. The system design and the evaluations are presented in this demo paper.

Tags: Computer science, Deep learning, FPGA, Heterogeneous systems, Machine learning, nVidia, OpenCL, Tesla P4

May 24, 2020 by hgpu

No votes yet.

Please wait...

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

high performance computing on graphics processing units: hgpu.org

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

Recent source codes

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

SYCL Container

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

Most viewed papers (last 30 days)

HaoCL: Harnessing Large-scale Heterogeneous Processors Made Easy

Share this:

Recent source codes

Most viewed papers (last 30 days)