high performance computing on graphics processing units: hgpu.org

Posts

Aug, 21

A Computational Realization of a Semi-Lagrangian Method for Solving the Advection Equation

A parallel implementation of a semi-Lagrangian method for the advection equation on a hybrid-architecture computing system is discussed. The difference scheme with variable stencil is constructed on the basis of an integral equality between the neighboring time levels. The proposed approach allows one to avoid the Courant-Friedrichs-Lewy restriction on the relation […]

CUDA

Aug, 21

Volumetric Rendering Techniques for Scientific Visualization

Direct volume rendering is widely used in many applications where the inside of a transparent or a partially transparent material should be visualized. We have explored several aspects of the problem. First, we proposed a view-dependent selective refinement scheme in order to reduce the high computational requirements without affecting the image quality significantly. Then, we […]

CUDA

Aug, 21

Error Resilience Evaluation on GPGPU Applications

While graphics processing units (GPUs) have gained wide adoption as accelerators for general-purpose applications (GPGPU), the end-to-end reliability implications of their use have not been quantified. Fault injection is a widely used method for evaluating the reliability of applications. However, building a fault injector for GPGPU applications is challenging due to their massive parallelism, which […]

CUDA

Aug, 19

GPU Accelerated Range Trees with Applications

Range searching is a primal problem in computational geometry with applications to database systems, mobile computing, geographical information systems, and the like. Defined simply, the problem is to preprocess a given set of points in a d-dimensional space so that the points that lie inside an orthogonal query rectangle can be efficiently reported. Many […]

CUDA

Aug, 19

Parallel Graph Mining with GPUs

Frequent graph mining is an important though computationally hard problem because it requires enumerating possibly an exponential number of candidate subgraph patterns, and checking their presence in a database of graphs. In this paper, we propose a novel approach for parallel graph mining on GPUs, which have emerged as a relatively cheap but powerful architecture […]

CUDA

Aug, 19

Parallel Outlier Detection on Uncertain Data for GPUs

Outlier detection, also known as anomaly detection, is a common data mining task that identifies data points that fall outside expected patterns in a given dataset. It has useful applications such as detecting network intrusions, system faults, and fraudulent activity. In addition, real-world data are uncertain in nature and they may be represented as uncertain […]

OpenCL

Aug, 19

Practical Symbolic Race Checking of GPU Programs

Even the careful GPU programmer can inadvertently introduce data races while writing and optimizing code. Currently available GPU race checking methods fall short in terms of their formal guarantees, ease of use, or practicality. Existing symbolic methods: (1) do not fully support existing CUDA kernels; (2) may require user-specified assertions or invariants; (3) often […]

CUDA

Aug, 19

An Efficient Cell List Implementation for Monte Carlo Simulation on GPUs

Maximizing the performance potential of the modern-day GPU architecture requires judicious utilization of available parallel resources. Although dramatic reductions can often be obtained through straightforward mappings, further performance improvements often require algorithmic redesigns to more closely exploit the target architecture. In this paper, we focus on efficient molecular simulations for the GPU and propose […]

CUDA

Aug, 19

Accelerated composite distribution function methods for computational fluid dynamics using GPU

The Kinetic Theory of Gases has long been established as a useful tool for the solution of modern Computational Fluid Dynamics (CFD) problems. Together with the Finite Volume Method, such approaches have been popular in CFD for over 30 years, with techniques such as the Equilibrium Flux Method (EFM) or Kinetic Flux Vector Splitting (KFVS), […]

CUDA

Aug, 18

An OpenCL implementation of a forward sampling algorithm for CP-logic

We present an approximate query answering algorithm for the Probabilistic Logic Programming language CP-logic. It complements existing sampling algorithms by using the rules from body to head instead of in the other direction. We present an implementation in OpenCL, which is able to exploit the multicore architecture of modern GPUs to compute a large number […]

OpenCL

Aug, 18

Ocelot/HyPE: Optimized Data Processing on Heterogeneous Hardware

The past years saw the emergence of highly heterogeneous server architectures that feature multiple accelerators in addition to the main processor. Efficiently exploiting these systems for data processing is a challenging research problem that comprises many facets, including how to find an optimal operator placement strategy, how to estimate runtime costs across different hardware architectures, […]

OpenCL

Aug, 18

Quantum Boolean Image Denoising

A quantum Boolean image processing methodology is presented in this work, with special emphasis on image denoising. A new approach for internal image representation is outlined together with two new interfaces: classical-to-quantum and quantum-to-classical. The new quantum-Boolean image denoising method, called the quantum Boolean mean filter (QBMF), works exclusively with computational basis states (CBS). To achieve this, […]

OpenCL