high performance computing on graphics processing units: hgpu.org

Posts

May, 16

Fast GPU-Based Automatic Time Gain Compensation for Ultrasound Imaging

The Compute Unified Device Architecture (CUDA) is a new programming platform making use of the unified shader design of the most current Graphics Processing Units (GPUs) from NVIDIA. In this paper, we apply this revolutionary new technology to implement the automatic time gain compensation (ATGC) for medical ultrasound imaging. The parallel box filtering method and […]

CUDA

May, 16

Muscle pushing based skin deformation on GPU

Muscle-based skin deformation can produce detailed visual quality. However, it always requires tedious manual work, and the rendering speed is usually slow. In this paper, we propose a fast skin deformation method based on the common three-layer model, comprising skeleton, muscle, and skin layers. First, a skeleton-driven smooth skinning technique is adopted to get the basic […]

May, 16

CUDA Based GPU Programming to Simulate 3D Tissue Deformation

Medical training systems based on virtual simulation are highly desirable now that minimally invasive surgical techniques have become popular with patients. Such a training system helps surgical trainees acquire, practice, and evaluate their surgical skills, and its key component is the simulation of dynamic procedures such as 3D biological tissue deformation […]

CUDA

May, 16

Shape-merging and interpolation using class estimation for unseen voxels with a GPU-based efficient implementation

The merging of multiple range images obtained by 3D measurement systems for generating a single polygon mesh, and processing for filling holes caused by unmeasured data or insufficient range images are essential processes for CAD, digital archiving of shapes, and CG rendering. Many of the existing processes that have been proposed for merging and interpolating […]

May, 16

The GPU enters computing’s mainstream

The Siggraph/Eurographics Graphics Hardware 2003 workshop, held in San Diego, will likely be remembered as a turning point in modern computing. In one of those rare moments when a new paradigm visibly begins changing general-purpose computing’s course, what has traditionally been a graphics-centric workshop shifted its attention to the nongraphics applications of the graphics processing […]

May, 16

Incremental Raycasting of Piecewise Quadratic Surfaces on the GPU

To overcome the limitations of triangle and point based surfaces several authors have recently investigated surface representations that are based on higher order primitives. Among these are MPU, SLIM surfaces, dynamic skin surfaces and higher order iso-surfaces. Up to now these representations were not suitable for interactive applications because of the lack of an efficient […]

OpenGL

May, 15

High dimensional pricing of exotic European contracts on a GPU Cluster, and comparison to a CPU cluster

The aim of this paper is the efficient use of CPU and GPU clusters for general path-dependent exotic European pricing, and their comparison in terms of speed and energy consumption. To reach our goal, we propose a parallel random number generator that is well suited to the parallelization paradigm; then we implement a multidimensional […]

CUDA

May, 15

A parallel Ant Colony Optimization algorithm with GPU-acceleration based on All-In-Roulette selection

Ant Colony Optimization is computationally expensive when applied to complex problems. The Jacket toolbox allows MATLAB programs to run on a Graphics Processing Unit (GPU). This paper presents and implements a parallel MAX-MIN Ant System (MMAS) on a GPU+CPU hardware platform, under the MATLAB environment with the Jacket toolbox, to solve the Traveling Salesman Problem (TSP). […]

May, 15

K3 Moore’s Law in the Era of GPU Computing

The history of humanity is that we strive to use better tools and knowledge to build even better tools and to extend the border of knowledge further. In the past 50 years, the CPU, as the dominant paradigm for computing, has provided exponential growth, as predicted by Moore’s Law with remarkable accuracy. We have been leveraging CPUs […]

May, 15

Object oriented framework for real-time image processing on GPU

In this paper, we present a framework for efficiently integrating programming resources of both GPU and CPU. We introduce an object oriented framework for GPGPU-based image processing. We illustrate a set of classes exploiting the design and programming advantages of an object oriented language, such as code reusability/extensibility, flexibility, information hiding, and complexity hiding. This […]

CUDA

May, 15

Fermi GF100 GPU Architecture

The Fermi GF100 is a GPU architecture that provides several new capabilities beyond the Nvidia GT200 or Tesla architecture. The Fermi architecture offers up to 512 CUDA cores and special features for gaming and high-performance computing. This article describes the GPU’s new capabilities for tessellation, physics processing, and computational graphics.

CUDA

May, 15

Investigating the use of GPU-accelerated nodes for SAR image formation

The computation of an electromagnetic reflectivity image from a set of radar returns is a computationally intensive process. Therefore, the use of high performance computing is required to form images from radar signals in a short time frame. This paper explores the use of distributed memory cluster computers and accelerator technologies such as GPUs for […]

Recent source codes

CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization

LC Framework

pplx-garden: Perplexity open source garden for inference technology

Atlas CLI: Machine Learning (ML) Lifecycle & Transparency Manager

transformers_tvm: Implementation of Encoder Decoder transformer on TVM

OpScanner

INT v.s. FP: A framework to compare low-bit integer and float-point formats

AutoDock-GPU: AutoDock for GPUs and other accelerators

NCCLX: collective communication framework

Tutoring LLM into a Better CUDA Optimizer

Most viewed papers (last 30 days)