Posts
Sep, 25
Algorithm Acceleration from GPGPUs for the ATLAS Upgrade
Feasibility studies into the use of GPUs have been performed on two key algorithms in the ATLAS High Level Trigger. A GPU-based version of the Z-finder routine was found to give a speedup of up to 35 times in the best-case scenario, while a speedup of over 5 times was observed in a GPU-based Kalman Filter […]
Sep, 25
Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU
The high performance and relatively low cost of GPU-based platforms make them an attractive alternative for general-purpose high performance computing (HPC). However, emerging HPC applications usually have stricter output correctness requirements than typical GPU applications (i.e., 3D graphics). This paper first analyzes the error resiliency of GPGPU platforms using a fault injection tool we have […]
Sep, 24
Utilising OpenCL Framework for Ray-Tracing Acceleration
Modern graphics accelerators no longer serve only to accelerate graphics computation for classic computer games. Their highly parallel architectures enable their use in a broad spectrum of calculations. Because of the release of the OpenCL library and our interest in ray-tracing, we decided to show that ray-tracing is feasible not only on a multi-core […]
Sep, 24
A portable implementation of the radix sort algorithm in OpenCL
We present a portable OpenCL implementation of the radix sort algorithm. We test it on several GPUs and CPUs to assess its performance on different hardware. We also apply our implementation to Particle-In-Cell (PIC) sorting, which is useful in plasma physics simulations.
Sep, 24
Analyzing Use of OpenCL on the Cell Broadband Engine and a Proposal for OpenCL Extensions
Current processor architectures are diverse and heterogeneous. Examples include multicore chips, GPUs and the Cell Broadband Engine (CBE). The recent Open Computing Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on […]
Sep, 24
Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portability of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-source tool that […]
Sep, 24
OpenCL: a viable solution for high-performance medical image reconstruction?
Reconstruction of 3-D volumetric data from C-arm CT projections is a computationally demanding task. For interventional image reconstruction, hardware optimization is mandatory. Manufacturers of medical equipment use a variety of high-performance computing (HPC) platforms, like FPGAs, graphics cards, or multi-core CPUs. A problem with this diversity is that many different frameworks and (vendor-specific) programming languages […]
Sep, 24
Single Scattering of Aspherical Particles in DDA Calculations on GPUs Using OpenCL
The global distribution and climatology of ice clouds are among the main uncertainties in climate modelling and prediction. In order to retrieve ice cloud properties from remote sensing measurements, the scattering properties of all cloud ice particle types must be known. The Discrete Dipole Approximation (DDA) simulates scattering of radiation by arbitrarily shaped particles and […]
Sep, 24
Functional Signal Processing with Pure and Faust Using the LLVM Toolkit
Pure and Faust are two functional programming languages useful for programming computer music and other multimedia applications. Faust is a domain-specific language specifically designed for synchronous signal processing, while Pure is a general-purpose language which aims to facilitate symbolic processing of complicated data structures in a variety of application areas. Pure is based on the […]
Sep, 24
Dynamic Data Structures for Taskgraph Scheduling Policies with Applications in OpenCL Accelerators
OpenCL is an emerging open framework for parallel programming in heterogeneous systems. OpenCL accelerators need to schedule the execution of submitted jobs with no (or only very imprecise) estimates of execution times, while respecting the dependencies among them, which are given in the form of a directed acyclic graph. This problem is known as stochastic taskgraph scheduling, […]
Sep, 24
CU2CL: A CUDA-to-OpenCL Translator for Multi- and Many-core Architectures
The use of graphics processing units (GPUs) in high-performance parallel computing continues to become more prevalent, often as part of a heterogeneous system. For years, CUDA has been the de facto programming environment for nearly all general-purpose GPU (GPGPU) applications. In spite of this, the framework is available only on NVIDIA GPUs, traditionally requiring reimplementation […]
Sep, 24
An MDE Approach for Automatic Code Generation from MARTE to OpenCL
Advanced engineering and scientific communities have used parallel programming to solve their large-scale complex problems. Achieving high performance is the main advantage of this choice. However, as parallel programming requires a non-trivial distribution of tasks and data, developers find it hard to implement their applications effectively. Thus, in order to reduce design complexity, we […]