high performance computing on graphics processing units: hgpu.org

Platform-independent parallelization of the Lattice Boltzmann method with OpenCL

Carolin Wolf

Department Informatik, Lehrstuhl für Informatik 2, Programmiersysteme, Friedrich-Alexander-Universität Erlangen-Nürnberg

Friedrich-Alexander-Universität Erlangen-Nürnberg, 2012

Simulations such as those in fluid dynamics are computationally very intensive. Since the Lattice Boltzmann method uses a discrete grid of cells to simulate the flow, there are no dependencies between individual cells during the computation of one time step, so the computation can easily be done in parallel. In recent years, multi-CPU computers have become available, which led to many algorithms being re-implemented as multithreaded applications. As a consequence, results in computational fluid dynamics could be obtained much faster. While the multi-CPU approach has already been implemented, there is now another way to achieve fast results: the Open Computing Language (OpenCL) has been released, which makes the data-parallel computing capacity of GPUs, until now mainly limited to rendering graphics, available for computationally intensive problems as well. In addition, OpenCL allows multiple devices to be used for computation, which means that a higher level of parallelism can be reached. This thesis examines the potential of OpenCL for fluid dynamics calculations. In particular, it investigates whether the code has to be changed for performance reasons when it runs on different hardware components or OpenCL platforms (such as those currently provided by NVIDIA, AMD or IBM), and whether implementing the Lattice Boltzmann method in OpenCL brings any further advantages for fast computing in general. The results show that OpenCL is quite capable: high computation speed can be achieved with it, at least to some extent.
Furthermore, a programming strategy for efficient OpenCL programs was developed during implementation, testing and measurement: short kernel functions, which promise little synchronization delay and can be translated quickly by the OpenCL just-in-time compiler, combined with many work-items that execute the kernel code simultaneously, produce efficient OpenCL programs that use the device's compute units to capacity.
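
To illustrate why the lack of dependencies between cells within one time step makes the method easy to parallelize, here is a minimal sketch in plain Python rather than OpenCL. The abstract does not say which lattice or collision operator the thesis uses, so the D2Q9 lattice, the BGK collision rule, and the names `collide_cell` and `collide_grid` are assumptions made purely for illustration. Each call to `collide_cell` plays the role of one short kernel invocation by a single work-item: it reads and writes only its own cell's nine values.

```python
# Assumption: D2Q9 lattice (9 discrete velocities) with its standard weights.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def collide_cell(f, omega=1.0):
    """BGK collision for one cell (hypothetical helper).

    Reads only the 9 distribution values of this cell and returns 9 new
    values, so no other cell is touched.
    """
    rho = sum(f)  # local density
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    usq = ux * ux + uy * uy
    out = []
    for fi, w, (cx, cy) in zip(f, WEIGHTS, VELOCITIES):
        cu = cx * ux + cy * uy
        # Second-order equilibrium distribution for this direction.
        feq = w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq)
        out.append(fi + omega * (feq - fi))  # relax towards equilibrium
    return out


def collide_grid(cells, omega=1.0, order=None):
    """Apply collide_cell to every cell, visiting cells in any order.

    Results go into a separate buffer, so the visiting order cannot
    influence the outcome; on a GPU each index would be one work-item.
    """
    order = range(len(cells)) if order is None else order
    result = [None] * len(cells)
    for i in order:
        result[i] = collide_cell(cells[i], omega)
    return result
```

Calling `collide_grid(cells)` and `collide_grid(cells, order=reversed(range(len(cells))))` on the same input gives identical results, which is the property that lets an OpenCL runtime schedule the work-items in any order or all at once. A full solver would also need a streaming step that exchanges values between neighbouring cells; that step is omitted here.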

Tags: Algorithms, ATI, ATI Mobility Radeon HD 5470, ATI Radeon HD 4870, Fluid dynamics, Lattice Boltzmann model, nVidia, OpenCL, Tesla M1060, Thesis

October 17, 2012 by hgpu
