Posts
Sep, 25
GPGPU workload analysis and media performance studies
This project was done with the Mobile Microprocessor Group at Intel Corporation as part of a six-month internship. The primary objective of this project was to study the performance of GPGPUs (General-Purpose computation on Graphics Processing Units) for various benchmark applications. GPGPUs have gained widespread importance in recent years because of […]
Sep, 25
Numerical Accuracy Differences in CPU and GPGPU Codes
This thesis presents an analysis of numerical accuracy issues that are found in many scientific GPU applications due to floating-point computation. Two widely held myths about floating-point on GPUs are that the CPU’s answer is more precise than the GPU version and that computations on the GPU are unavoidably different from the same computations on […]
Sep, 25
The Test and Evaluation Uses of Heterogeneous Computing: GPGPUs and Other Approaches
The test and evaluation community faces conflicting pressures: Provide more computing power and reduce electrical power requirements, both on the range and in the laboratory. The authors present some quantifiable benefits from the implementation of General Purpose Graphics Processing Units (GPGPUs) as heterogeneous processors. This produces power, space, cooling, and maintenance benefits that they have […]
Sep, 25
GPU-Based Acceleration of the MLEM Algorithm for SPECT Parallel Imaging with Attenuation Correction and Compensation for Detector Response
Parallel-projection-based Single Photon Emission Computed Tomography (SPECT) remains one of the most widely used nuclear imaging techniques. Serious artefacts are produced in the reconstructed images due to the non-homogeneous attenuation medium and the distance-dependent spatial resolution (DDSR) of the parallel imaging. Effective non-uniform attenuation correction and DDSR reduction procedures should […]
Sep, 25
Algorithm Acceleration from GPGPUs for the ATLAS Upgrade
Feasibility studies into the use of GPUs have been performed on two key algorithms in the ATLAS High Level Trigger. A GPU-based version of the Z-finder routine was found to give a speedup of up to 35 times in the best-case scenario, while a speedup of over 5 times was observed in a GPU-based Kalman Filter […]
Sep, 25
Hauberk: Lightweight Silent Data Corruption Error Detector for GPGPU
The high performance and relatively low cost of GPU-based platforms provide an attractive alternative for general-purpose high performance computing (HPC). However, emerging HPC applications usually have stricter output correctness requirements than typical GPU applications (i.e., 3D graphics). This paper first analyzes the error resiliency of GPGPU platforms using a fault injection tool we have […]
Sep, 24
Utilising OpenCL Framework for Ray-Tracing Acceleration
Modern graphics accelerators no longer serve only to accelerate graphics computation for classic computer games. Their highly parallel architectures enable their use in a broad spectrum of calculations. Because of the release of the OpenCL library and our interest in ray-tracing, we decided to show that ray-tracing is feasible not only on a multi-core […]
Sep, 24
A portable implementation of the radix sort algorithm in OpenCL
We present a portable OpenCL implementation of the radix sort algorithm. We test it on several GPUs and CPUs to assess its performance on different hardware. We also apply our implementation to Particle-In-Cell (PIC) sorting, which is useful in plasma physics simulations.
Sep, 24
Analyzing Use of OpenCL on the Cell Broadband Engine and a Proposal for OpenCL Extensions
Current processor architectures are diverse and heterogeneous. Examples include multicore chips, GPUs and the Cell Broadband Engine (CBE). The recent Open Computing Language (OpenCL) standard aims at efficiency and portability. This paper explores its efficiency when implemented on the CBE, without using CBE-specific features such as explicit asynchronous memory transfers. We based our experiments on […]
Sep, 24
Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portability of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-source tool that […]
Sep, 24
OpenCL: a viable solution for high-performance medical image reconstruction?
Reconstruction of 3-D volumetric data from C-arm CT projections is a computationally demanding task. For interventional image reconstruction, hardware optimization is mandatory. Manufacturers of medical equipment use a variety of high-performance computing (HPC) platforms, like FPGAs, graphics cards, or multi-core CPUs. A problem of this diversity is that many different frameworks and (vendor-specific) programming languages […]
Sep, 24
Single Scattering of Aspherical Particles in DDA Calculations on GPUs Using OpenCL
The global distribution and climatology of ice clouds are among the main uncertainties in climate modelling and prediction. In order to retrieve ice cloud properties from remote sensing measurements, the scattering properties of all cloud ice particle types must be known. The Discrete Dipole Approximation (DDA) simulates scattering of radiation by arbitrarily shaped particles and […]