high performance computing on graphics processing units: hgpu.org

Posts

Dec, 11

Displacement Mapping on the GPU – State of the Art

This paper reviews the latest developments in displacement mapping algorithms implemented on the vertex, geometry, and fragment shaders of graphics cards. Displacement mapping algorithms are classified as per-vertex or per-pixel methods. Per-pixel approaches are further divided into safe algorithms, which aim at correct solutions in all cases, and unsafe techniques, which may fail in extreme […]

Dec, 11

GPU-boosted online image matching

Matching feature points between images is a key step in many computer vision tasks. As the number of images increases, this rapidly becomes a bottleneck. Here we show how to use the power of GPUs to match one thousand points in typically 20 ms. This speedup makes applications like interactive image matching […]

OpenGL

Dec, 11

Future graphics architectures

Graphics architectures are in the midst of a major transition. In the past, these were specialized architectures designed to support a single rendering algorithm: the standard Z buffer. Realtime 3D graphics has now advanced to the point where the Z-buffer algorithm has serious shortcomings for generating the next generation of higher-quality visual effects demanded by […]

Dec, 11

The impact of accelerator processors for high-throughput molecular modeling and simulation

The recent introduction of cost-effective accelerator processors (APs), such as the IBM Cell processor and Nvidia’s graphics processing units (GPUs), represents an important technological innovation which promises to unleash the full potential of atomistic molecular modeling and simulation for the biotechnology industry. Present APs can deliver over an order of magnitude more floating-point operations per […]

Dec, 10

The 20th International ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2011

HPDC is the premier computer science conference for presenting new results relating to large scale high performance and distributed systems used in science and industry. For twenty years, HPDC has been at the center of new discoveries in clusters, grids, clouds, and parallel and multicore computers.

Dec, 10

3rd Workshop on using Emerging Parallel Architectures (WEPA) in conjunction with International Conference on Computational Science, ICCS 2011

The computing landscape has undergone significant transformation with the emergence of more powerful processing elements such as GPUs, FPGAs, Cell B.E., multi-cores, etc. On the multi-core front, Moore’s Law has moved beyond the single-processor boundary, with the prediction that the number of cores will double every 18 months. Going forward, the primary method of […]

Dec, 10

Biomedical image analysis on a cooperative cluster of GPUs and multicores

We are currently witnessing the emergence of two paradigms in parallel computing: streaming processing and multi-core CPUs. Represented by solid commercial products widely available in commodity PCs, GPUs and multi-core CPUs bring together an unprecedented combination of high performance at low cost. The scientific computing community needs to keep pace with application models and middleware […]

CUDA

Dec, 10

Merge: a programming model for heterogeneous multi-core systems

In this paper we propose the Merge framework, a general purpose programming model for heterogeneous multi-core systems. The Merge framework replaces current ad hoc approaches to parallel programming on heterogeneous platforms with a rigorous, library-based methodology that can automatically distribute computation across heterogeneous cores to achieve increased energy and performance efficiency. The Merge framework provides […]

Dec, 10

Real-Time Simulation of Medical Ultrasound from CT Images

Medical ultrasound interpretation requires a great deal of experience. Real-time simulation of medical ultrasound provides a cost-effective tool for training and easy access to a variety of cases and exercises. However, fully synthetic and realistic simulation of ultrasound is complex and extremely time-consuming. In this paper, we present a novel method for simulation of ultrasound […]

Dec, 10

Cost-effective medical image reconstruction: from clusters to graphics processing units

We demonstrate that for modern medical imaging applications, parallel implementations on traditional parallel architectures (clusters and multiprocessor servers) can be outperformed, both in terms of speed and cost-effectiveness, by new implementations on next-generation architectures like GPUs (Graphics Processing Units). Although, compared to clusters and multiprocessor servers, GPUs are rather small and much less expensive, they […]

Dec, 10

Larrabee: a many-core x86 architecture for visual computing

This paper presents a many-core visual computing architecture code-named Larrabee, a new software rendering pipeline, a many-core programming model, and performance analysis for several applications. Larrabee uses multiple in-order x86 CPU cores that are augmented by a wide vector processor unit, as well as some fixed-function logic blocks. This provides dramatically higher performance […]

Dec, 10

Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration

General-purpose computing on graphics processing units (GPGPU) is shown to dramatically increase the speed of Monte Carlo simulations of photon migration. In a standard simulation of time-resolved photon migration in a semi-infinite geometry, the proposed methodology executed on a low-cost graphics processing unit (GPU) is a factor of 1000 faster than a simulation performed on a single […]

CUDA
