high performance computing on graphics processing units: hgpu.org

Posts

Feb, 15

Comparing Many-Core Accelerator Frameworks

GPUs are already well adopted as general-purpose processors in scientific and high-performance computing. Their steadily increasing success has led vendors other than GPU hardware makers to work on many-core processors as hardware accelerators. With CUDA and OpenCL, two frameworks are available for GPU programming. Apart from potential compatibility problems with the upcoming hardware, both […]

CUDA • OpenCL

Feb, 11

SPIRE, a Sequential to Parallel Intermediate Representation Extension

SPIRE is a new, generic, parallel extension for the intermediate representations used in compilation frameworks of sequential languages; it intends to easily leverage their existing infrastructure to address both control and data parallel languages. Since the efficiency and power of the transformations and optimizations performed by compilers are closely related to the presence of a […]

OpenCL

Feb, 11

Task Parallelism and Synchronization: An Overview of Explicit Parallel Programming Languages

Programming parallel machines as effectively as sequential ones would ideally require a language that provides high-level programming constructs in order to avoid the programming errors frequent when expressing parallelism. Since task parallelism is often considered more error-prone than data parallelism, we survey six popular and efficient parallel programming languages that tackle this difficult issue: Cilk, […]

OpenCL

Feb, 11

High-throughput protein crystallization on the World Community Grid and the GPU

We have developed CPU and GPU versions of an automated image analysis and classification system for protein crystallization trial images from the Hauptman Woodward Institute’s High-Throughput Screening lab. The analysis step computes 12,375 numerical features per image. Using these features, we have trained a classifier that distinguishes 11 different crystallization outcomes, recognizing 80% of all […]

OpenCL

Feb, 11

Characterizing and Evaluating a Key-value Store Application on Heterogeneous CPU-GPU Systems

The recent use of graphics processing units (GPUs) in several top supercomputers demonstrates their ability to consistently deliver positive results in high-performance computing (HPC). GPU support for significant amounts of parallelism would seem to make them strong candidates for non-HPC applications as well. Server workloads are inherently parallel; however, at first glance they may not […]

OpenCL

Feb, 10

Automatic Performance Optimization in ViennaCL for GPUs

Highly parallel computing architectures such as graphics processing units (GPUs) pose several new challenges for scientific computing that were absent on single-core CPUs. However, a transition from existing serial code to parallel code for GPUs often requires a considerable amount of effort. The Vienna Computing Library (ViennaCL) presented in the beginning of this […]

OpenCL

Feb, 10

Implementing Molecular Dynamics on Hybrid High Performance Computers – Particle-Particle Particle-Mesh

The use of accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), are now becoming more […]

CUDA • OpenCL

Feb, 7

Enabling Traceability in MDE to Improve Performance of GPU Applications

Graphics Processor Units (GPUs) are known for offering high performance and power efficiency for processing algorithms that are well suited to their massively parallel architecture. Unfortunately, as parallel programming for this kind of architecture requires a complex distribution of tasks and data, developers find it difficult to implement their applications effectively. Although approaches based on source-to-source […]

OpenCL

Feb, 5

Accelerating Outlier Detection with Uncertain Data using Graphics Processors

Outlier detection (also known as anomaly detection) is a common data mining task in which data points that lie outside expected patterns in a given dataset are identified. This is useful in areas such as fault detection, intrusion detection and in pre-processing before further analysis. There are many approaches already in use for outlier detection, […]

OpenCL

Jan, 31

Decompilation of LLVM IR

Recently, in many important domains, high-level languages have become the code representations with the widest platform support, surpassing any low-level language in their area with respect to completeness and importance as an exchange format (e.g. OpenCL for data-parallel computing, GLSL/HLSL for shader programs, JavaScript for the web). The code representations of many actively developed compiler frameworks [JVM, LLVM, FIRM] are […]

OpenCL

Jan, 30

Performance Evaluation of Query Processing Algorithms on GPGPUs

Modern Graphical Processing Units (GPUs) can perform general-purpose computing in addition to standard graphical processing. Open frameworks, such as the OpenCL standard by the Khronos Group, enable developers to easily harness the computational power of GPUs. While in certain aspects these are more powerful than standard CPUs, the latter are still a more suitable solution […]

OpenCL

Jan, 30

Towards a Tunable Multi-Backend Skeleton Programming Framework for Multi-GPU Systems

SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. […]

CUDA • OpenCL

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

gpu_tracker: Python package for tracking and profiling GPU utilization in both desktop and high-performance computing environments

* * *

Recent source codes

QArray

Celerity: High-level C++ for Accelerator Clusters

CIFAR-10 Airbench: 94% on CIFAR-10 in 3.29 second

gpu_tracker: Context manager and CLI that tracks the computational-resource-usage of a code block or shell command, particularly the GPU usage

LOOPer: a polyhedral compiler for expressing fast and portable data parallel algorithms

OpenMC Monte Carlo Code

Polygeist: C/C++ frontend for MLIR

Parallel Gaussian process with kernel approximation in CUDA

Optical flow algorithms for SYCL

OpenMP5-Offload-OpenMC-Intel-PVC

Most viewed papers (last 30 days)