Posts

Sep, 26

From CUDA to OpenCL: Towards a Performance-portable Solution for Multi-platform GPU Programming

In this work, we evaluate OpenCL as a programming tool for developing performance-portable applications for GPGPU. While the Khronos group developed OpenCL with programming portability in mind, performance is not necessarily portable. OpenCL requires performance-impacting initializations that do not exist in other languages such as CUDA. Understanding these implications allows us to provide a […]
Sep, 26

Identifying scalar behavior in CUDA kernels

We propose a compiler analysis pass for programs expressed in the Single Program, Multiple Data (SPMD) programming model. It statically identifies several kinds of regular patterns that can occur between adjacent threads, including common computations, memory accesses at consecutive locations or at the same location, and uniform control flow. This knowledge can be exploited by […]
Sep, 26

Putting Automatic Polyhedral Compilation for GPGPU to Work

Automatic parallelization is becoming more important as parallelism becomes ubiquitous. The first step for achieving automation is to develop a theoretical foundation, for example, the polyhedron model. The second step is to implement the algorithms studied in the theoretical framework and get them to work in a compiler that can be used to parallelize real […]
Sep, 26

Running unstructured grid-based CFD solvers on modern graphics hardware

Techniques used to implement an unstructured grid solver on modern graphics hardware are described. The three-dimensional Euler equations for inviscid, compressible flow are considered. Effective memory bandwidth is improved by reducing total global memory access and overlapping redundant computation, as well as using an appropriate numbering scheme and data layout. The applicability of per-block shared […]
Sep, 25

Optimizing OpenCL Kernels for Iterative Statistical Applications on GPUs

We present a study of three important kernels that occur frequently in iterative statistical applications: K-Means, Multi-Dimensional Scaling (MDS), and PageRank. We implemented each kernel using OpenCL and evaluated its performance on an NVIDIA Tesla GPGPU card. By examining the underlying algorithms and empirically measuring the performance of various components of the kernel, we explored […]
Sep, 25

Exploiting Heterogeneous Computing Platforms By Cataloging Best Solutions For Resource Intensive Seismic Applications

Large heterogeneous data centers of today lack methods to appraise the best fitting solutions regarding, among others, hardware acquisition cost, development time, and performance. Especially resource intensive applications benefit from increased data center utilization to leverage heterogeneous resources and accelerators. In this paper, we implement various methods to accelerate a seismic modeling application, which is […]
Sep, 25

Harnessing the Power of GPUs without Losing Abstractions in SaC and ArrayOL: A Comparative Study

Over recent years, using Graphics Processing Units (GPUs) has become an effective method for increasing the performance of many applications. However, these performance benefits from GPUs come at a price. Firstly, extensive programming expertise and intimate knowledge of the underlying hardware are essential for gaining good speedups. Secondly, the expressibility of GPU-based programs is […]
Sep, 25

Accelerating image recognition on mobile devices using GPGPU

The future multi-modal user interfaces of battery-powered mobile devices are expected to require computationally costly image analysis techniques. Graphics Processing Units are very well suited to parallel processing, and the addition of programmable stages and high-precision arithmetic provides opportunities to implement energy-efficient complete algorithms. At the moment the […]
Sep, 25

GPGPU workload analysis and media performance studies

This project was done with the Mobile Microprocessor Group at Intel Corporation as part of a six-month internship. The primary objective of this project was to study the performance of GPGPUs (General-purpose computation on Graphics Processing Units) for various benchmark applications. GPGPUs have gained widespread importance in recent years because of […]
Sep, 25

Numerical Accuracy Differences in CPU and GPGPU Codes

This thesis presents an analysis of numerical accuracy issues that are found in many scientific GPU applications due to floating-point computation. Two widely held myths about floating-point on GPUs are that the CPU’s answer is more precise than the GPU version and that computations on the GPU are unavoidably different from the same computations on […]
Sep, 25

The Test and Evaluation Uses of Heterogeneous Computing: GPGPUs and Other Approaches

The test and evaluation community faces conflicting pressures: Provide more computing power and reduce electrical power requirements, both on the range and in the laboratory. The authors present some quantifiable benefits from the implementation of General Purpose Graphics Processing Units (GPGPUs) as heterogeneous processors. This produces power, space, cooling, and maintenance benefits that they have […]
Sep, 25

GPU-Based Acceleration of the MLEM Algorithm for SPECT Parallel Imaging with Attenuation Correction and Compensation for Detector Response

Parallel projection-based Single Photon Emission Computed Tomography (SPECT) remains one of the most widely used nuclear imaging techniques. Serious artefacts are produced in the reconstructed images due to the non-homogeneous attenuation medium and the distance-dependent spatial resolution (DDSR) of the parallel imaging. Effective non-uniform attenuation correction and DDSR reduction procedures should […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
