high performance computing on graphics processing units: hgpu.org

A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures

Enrico Calore, Sebastiano Fabio Schifano, Raffaele Tripiccione

INFN, Ferrara, Italy

Procedia Computer Science, Volume 29, Pages 40-49, 2014

DOI:10.1016/j.procs.2014.05.004

@article{calore2014portable,
  title={A Portable OpenCL Lattice Boltzmann Code for Multi- and Many-core Processor Architectures},
  author={Calore, Enrico and Schifano, Sebastiano Fabio and Tripiccione, Raffaele},
  journal={Procedia Computer Science},
  volume={29},
  pages={40--49},
  year={2014},
  publisher={Elsevier},
  doi={10.1016/j.procs.2014.05.004}
}

The architecture of high performance computing systems is becoming more and more heterogeneous, as accelerators play an increasingly important role alongside traditional CPUs. Programming heterogeneous systems efficiently is a complex task that often requires the use of specific programming environments. Programming frameworks that support code portable across different high performance architectures have recently appeared, but one must carefully assess the relative costs of portability versus computing efficiency and find a reasonable tradeoff point. In this paper we address precisely this issue, using as a test bench a Lattice Boltzmann code implemented in OpenCL. We analyze its performance on several different state-of-the-art processors: NVIDIA GPUs and Intel Xeon-Phi many-core accelerators, as well as more traditional Ivy Bridge and Opteron multi-core commodity CPUs. We also compare these results with those obtained from codes specifically optimized for each of these systems. Our work shows that a properly structured OpenCL code runs on many different systems, reaching performance levels close to those obtained by architecture-tuned CUDA or C codes.

Tags: CUDA, Fluid dynamics, Heterogeneous systems, Intel Xeon Phi, Lattice Boltzmann model, nVidia, OpenCL, Tesla K20

June 17, 2014 by hgpu
