
Posts

Sep, 27

E(A+M)PEC – An OpenCL Atomic and Molecular Plasma Emission Code For Interstellar Medium Simulations

E(A+M)PEC traces the ionization structure, cooling and emission spectra of plasmas. It is written in OpenCL, runs on NVIDIA graphics processing units (GPUs) and can be coupled to any HD or MHD code to follow the dynamical and thermal evolution of a plasma in, e.g., the interstellar medium (ISM).
Sep, 26

From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming

In this work, we evaluate OpenCL as a programming tool for developing performance-portable applications for GPGPU. While the Khronos Group developed OpenCL with programming portability in mind, performance is not necessarily portable. OpenCL requires performance-impacting initializations that do not exist in other languages such as CUDA. Understanding these implications allows us to provide a […]
Sep, 25

Optimizing OpenCL Kernels for Iterative Statistical Applications on GPUs

We present a study of three important kernels that occur frequently in iterative statistical applications: K-Means, Multi-Dimensional Scaling (MDS), and PageRank. We implemented each kernel using OpenCL and evaluated its performance on an NVIDIA Tesla GPGPU card. By examining the underlying algorithms and empirically measuring the performance of various components of the kernel, we explored […]
Sep, 24

Utilising OpenCL Framework for Ray-Tracing Acceleration

Modern graphics accelerators no longer serve only to accelerate graphics computation for classic computer games. Their highly parallel architectures enable their use in a broad spectrum of calculations. Because of the release of the OpenCL library and our interest in ray-tracing, we decided to show that ray-tracing is feasible not only on a multi-core […]
Sep, 24

A portable implementation of the radix sort algorithm in OpenCL

We present a portable OpenCL implementation of the radix sort algorithm. We test it on several GPUs and CPUs to assess its performance on different hardware. We also apply our implementation to Particle-In-Cell (PIC) sorting, which is useful in plasma physics simulations.
Sep, 24

Analyzing Use of OpenCL on the Cell Broadband Engine and a Proposal for OpenCL Extensions

Current processor architectures are diverse and heterogeneous. Examples include multicore chips, GPUs and the Cell Broadband Engine (CBE). The recent Open Computing Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on […]
Sep, 24

Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs

As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portability of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-source tool that […]
Sep, 24

OpenCL: a viable solution for high-performance medical image reconstruction?

Reconstruction of 3-D volumetric data from C-arm CT projections is a computationally demanding task. For interventional image reconstruction, hardware optimization is mandatory. Manufacturers of medical equipment use a variety of high-performance computing (HPC) platforms, like FPGAs, graphics cards, or multi-core CPUs. A problem of this diversity is that many different frameworks and (vendor-specific) programming languages […]
Sep, 24

Single Scattering of Aspherical Particles in DDA Calculations on GPUs Using OpenCL

The global distribution and climatology of ice clouds are among the main uncertainties in climate modelling and prediction. In order to retrieve ice cloud properties from remote sensing measurements, the scattering properties of all cloud ice particle types must be known. The Discrete Dipole Approximation (DDA) simulates scattering of radiation by arbitrarily shaped particles and […]
Sep, 24

Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators

OpenCL is an emerging open framework for parallel programming in heterogeneous systems. OpenCL accelerators need to schedule the execution of submitted jobs with no (or only very imprecise) estimates of execution times, while respecting the dependencies among them, which are given in the form of a directed acyclic graph. This problem is known as stochastic taskgraph scheduling, […]
Sep, 24

CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-core Architectures

The use of graphics processing units (GPUs) in high-performance parallel computing continues to become more prevalent, often as part of a heterogeneous system. For years, CUDA has been the de facto programming environment for nearly all general-purpose GPU (GPGPU) applications. In spite of this, the framework is available only on NVIDIA GPUs, traditionally requiring reimplementation […]
Sep, 24

An MDE Approach for Automatic Code Generation from MARTE to OpenCL

Advanced engineering and scientific communities have used parallel programming to solve their large-scale complex problems. Achieving high performance is the main advantage of this choice. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
