high performance computing on graphics processing units: hgpu.org

Posts

Apr, 25

Practical craniofacial surgery simulator based on GPU accelerated lattice shape matching

This paper presents an intuitive and practical craniofacial surgery simulation system, which is suitable for daily clinical practice. The key component of the system is a GPU accelerated lattice shape matching method, to estimate individual patient post-operative appearance interactively. The lattice model can be set up from individual CT data in an easy and robust […]

Apr, 25

GPU-friendly shape interpolation based on trajectory warping

In this paper, we propose a GPU-friendly shape interpolation method. In contrast with state-of-the-art interpolation algorithms, our method computes the trajectory of each vertex independently instead of solving large linear systems in every interpolation step. Given two poses being interpolated, we find trajectory parameters for each vertex by optimization with the consideration of the key […]

CUDA

Apr, 25

Interactive soft-fabrics watering simulation on GPU

Physics-based simulation is usually complex and time consuming, and consequently not suitable for real-time applications. In this paper, we propose efficient dynamics models for the real-time simulation of soft fabrics interacting with water, including multi-soaking effect and underwater dynamics. The multi-soaking effect of soft fabrics is modeled based on the physics processes. Further, we […]

CUDA

Apr, 25

GPU simulations for risk assessment in CO2 geologic sequestration

A main concern for any CO2 sequestration system is whether it may leak CO2 over a long-term time horizon. The outcome depends on the competition between sequestration and leakage processes. Leakages may occur from failure of manmade material or through faults in the formations above the reservoir. A simple and computationally efficient simulator was […]

Apr, 25

Optimized HPL for AMD GPU and multi-core CPU usage

The installation of the LOEWE-CSC (http://csc.uni-frankfurt.de/csc/?51) supercomputer at the Goethe University in Frankfurt led to the development of a Linpack which can fully utilize the installed AMD Cypress GPUs. At its core, a fast DGEMM for combined GPU and CPU usage was created. The DGEMM library is tuned to hide all DMA transfer times and […]

Apr, 25

Tracking and Clustering Salient Features in Image Sequences

Many applications in media production need information about moving objects in the scene, e.g. insertion of computer-generated objects, association of sound sources to these objects or visualization of object trajectories in broadcasting. We present a GPU accelerated approach for detecting and tracking salient features in image sequences and we propose an algorithm for clustering the […]

CUDA

Apr, 23

Software-Based Algorithm for Modeling and Correction of Gradient Nonlinearity Distortions in Magnetic Resonance Imaging

Functional radiosurgery is a noninvasive stereotactic technique that requires magnetic resonance image (MRI) sets with high spatial resolution. Gradient nonlinearities introduce geometric distortions that compromise the accuracy of MRI-based stereotactic localization. We present a gradient nonlinearity correction method based on a cubic phantom MRI data set. The approach utilizes a sum of spherical harmonics to […]

CUDA

Apr, 23

Towards smart-pixel-based implementation of wideband active sonar echolocation system for multi-target detection

A modified adaptive noise cancelling (ANC) hybrid algorithm incorporating techniques of optimal wideband FIR filtering operation and the trimmed mean (TM)-levelization, a dynamic level-based thresholding process, is described in this paper for wideband active sonar echolocation system. This novel technique performs effective suppression of unwanted part of the training data thus localizing potentials of target […]

Apr, 23

GPU-S2S: A Compiler for Source-to-Source Translation on GPU

CUDA facilitates the development of General Purpose computing on Graphics Processing Units (GPGPU); however, its complex memory system, thread-level structure, and data transmission control between memories have brought great challenges for programming on GPU. In order to facilitate the development of parallel programs on GPU and reuse existing sequential codes, in this paper we propose […]

CUDA

Apr, 23

Multimodal collaboration and human-computer interaction

The research effort at Microsoft Research on multimodal collaboration and human-computer interaction aims at developing tools that allow people across geographically distributed sites to interact collaboratively with immersive experience. Our prototype systems consist of cameras, displays, speakers, microphones, computer controllable lights, and/or input devices such as touch sensitive surface, stylus, keyboard, and mouse. They require […]

Apr, 23

Physically-Based Interactive Flow Visualization Based on Schlieren and Interferometry Experimental Techniques

Understanding fluid flow is a difficult problem and of increasing importance as computational fluid dynamics produces an abundance of simulation data. Experimental flow analysis has employed techniques such as shadowgraph, interferometry and schlieren imaging for centuries which allow empirical observation of inhomogeneous flows. Shadowgraphs provide an intuitive way of looking at small changes in flow […]

CUDA

Apr, 23

Introduction to GPU Computing and CUDA Programming: A Case Study on FDTD [EM Programmer’s Notebook]

The recent advent of general-purpose graphics-processing units (GPGPUs) as inexpensive arithmetic-processing units brings a relevant amount of computing power to modern desktop PCs. This provides an interesting pathway to the acceleration of several numerical electromagnetic methods. In this paper, we explain how to exploit GPGPU features by examining how the computational time of the […]

CUDA

Recent source codes

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

SYCL Container

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

CFAL-bench

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?
