high performance computing on graphics processing units: hgpu.org

Posts

May, 8

GPU Acceleration of Multilevel Solvers for Analysis of Microwave Components With Finite Element Method

The letter discusses a fast implementation of the conjugate gradient iterative method with an E-field multilevel preconditioner, applied to solving real, symmetric, sparse systems obtained with the vector finite element method. To accelerate computations, a graphics processing unit (GPU) was used, and a significant speed-up (2.61-fold) was achieved compared to a central processing unit […]

CUDA

May, 8

GPU-based real-time simulation and rendering of unbounded ocean surface

We present a multi-resolution mesh model of the ocean surface based on a straightforward terrain LOD scheme, the tiled quad-tree, with which the region of ocean surface can be extended without limit and readily adapted for GPU acceleration. We introduce the concept of a wrapped fractal surface (WFS) for generating the height field map of the ocean. Through […]

May, 8

A GPU-inspired soft processor for high-throughput acceleration (thesis)

In this thesis, a soft processor programming model and architecture are proposed that are inspired by graphics processing units (GPUs) and well-matched to the strengths of FPGAs, namely highly-parallel and pipelinable computation. The proposed soft processor architecture exploits multithreading, vector operations, and predication to supply a floating-point pipeline of up to 60 stages via hardware […]

May, 8

A GPU-inspired soft processor for high-throughput acceleration

There is building interest in using FPGAs as accelerators for high-performance computing, but existing systems for programming them are so far inadequate. In this paper we propose a soft processor programming model and architecture inspired by graphics processing units (GPUs) that are well-matched to the strengths of FPGAs, namely highly-parallel and pipelinable computation. In particular, […]

May, 8

GPU acceleration of the iterative physical optics (IPO) method

In this paper, we employ the programmable graphics processing unit (GPU) to accelerate the IPO computation for analyzing the scattering of open cavities. Since the iterative strategy accounts for multiple reflections on the inner wall, the IPO method provides a more accurate solution than the other high frequency asymptotic methods. However, it suffers from a […]

May, 8

GPU Acceleration of 2D-DWT Image Compression in MATLAB with CUDA

This article presents the details of accelerating 2D wavelet-based medical data (image) compression in MATLAB with CUDA. Diagnostic materials (mostly as a certain type of image) are increasingly acquired in a digital format. Therefore, the common need to manipulate huge amounts of data daily has brought about the issue of compression […]

CUDA

May, 8

Experiments with Single Core, Multi-core, and GPU Based Computation of Cellular Automata

Cellular automata are a well-known modeling formalism exploited in a wide range of application areas. In many of those, the complexity of models hampers a thorough analysis of the system under study. Therefore, efficient simulation algorithms are required. We present here a comparison of seven different simulation algorithms for cellular automata: the classical “full” simulator, […]

May, 8

A GPU-based architecture for real-time data assessment at synchrotron experiments

Current imaging experiments at synchrotron beam lines often lack a real-time data assessment. X-ray imaging cameras installed at synchrotron facilities like ANKA provide millions of pixels, each with a resolution of 12 bits or more, and take up to several thousand frames per second. A given experiment can produce data sets of multiple gigabytes in […]

May, 7

Parallel ID Shadow-Map Decompression on GPU

ID shadow-maps are used for robust real-time rendering of shadows. The primary disadvantage of using shadow-maps is their excessive size for large scenes in case high quality shadows are needed. To eliminate large memory requirements and texture-size limitations of the current generation GPUs, texture compression is an important tool. We present a framework where compressed […]

OpenGL

May, 7

Cluster versus GPU implementation of an Orthogonal Target Detection Algorithm for Remotely Sensed Hyperspectral Images

Remotely sensed hyperspectral imaging instruments provide high-dimensional data containing rich information in both the spatial and the spectral domain. In many surveillance applications, detecting objects (targets) is a very important task. In particular, algorithms for detecting (moving or static) targets, or targets that could expand their size (such as propagating fires) often require timely responses […]

CUDA

May, 7

Visual cortex on the GPU: Biologically inspired classifier and feature descriptor for rapid recognition

We present a biologically motivated classifier and feature descriptors that are designed for execution on single instruction multi data hardware and are applied to high speed multiclass object recognition. Our feature extractor uses a cellular tuning approach to select the optimal Gabor filters to process a given input, followed by the computation of scale and […]

OpenGL

May, 7

GPU Based Parallel Computing on Blast Program

Sequence alignment is one of the most fundamental and important operations in bioinformatics. Among the many sequence alignment tools, Blast is one of the most popular. In this paper, we describe the primary strategy of GPU-based parallel computing for the Blast program.

* * *

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
