high performance computing on graphics processing units: hgpu.org

Posts

Jul, 12

Harnessing the power of idle GPUs for acceleration of biological sequence alignment

This paper presents a parallel system capable of accelerating biological sequence alignment on the graphics processing unit (GPU) grid. The GPU grid in this paper is a desktop grid system that utilizes idle GPUs and CPUs in the office and home. Our parallel implementation employs a master-worker paradigm to accelerate Liu’s OpenGL-based algorithm that runs […]

CUDA • OpenGL

Jul, 12

Design and performance evaluation of a digital wideband receiver on a hybrid computing platform

Designing and implementing a modern radar receiver that can rapidly search a large frequency range with maximum sensitivity in real time is a challenge. Such a receiver has stringent operational requirements, such as high instantaneous dynamic range (IDR), multiple signal detection capability, wide bandwidth, and high frequency resolution. Currently, operating […]

Jul, 12

GPU-based acoustic feature extraction for electronic media processing

Multicore architectures are frequently used when very high computational power is required. At the same time, current multicore graphics processing units (GPUs), designed for parallel data processing, have become applicable to general-purpose computation. Thus, current research projects examine the use of GPUs for a variety of applications. Thereby, GPUs are attractive for […]

Jul, 12

Cloudlet-screen computing: A multi-core-based, cloud-computing-oriented, traditional-computing-compatible parallel computing Paradigm for the masses

This paper proposes a computing paradigm where many users share a host platform consisting of one or more multicore CPU(s) and GPU(s). Each user is connected to the host by a link transferring primarily compressed compound video of screen, mouse and keyboard data. All computation and processing tasks including but not limited to 2D/3D graphics […]

Jul, 12

Towards a robust, real-time face processing system using CUDA-enabled GPUs

Processing of human faces finds application in various domains such as law enforcement and surveillance, entertainment (interactive video games), information security, and smart cards. Several of these applications are interactive and require reliable and fast face processing. A generic face processing system may comprise face detection, recognition, tracking and rendering. In this paper, we develop […]

CUDA

Jul, 12

A Program Behavior Study of Block Cryptography Algorithms on GPGPU

Recently, many studies have mapped cryptography algorithms onto graphics processors (GPUs) and achieved large performance gains. This paper focuses not on the performance of a specific program tuned with all kinds of algorithmic optimization methods, but on the intrinsic reasons for this performance improvement, which lie in GPU architectural features. Thus we […]

CUDA

Jul, 12

Improved Real-Time Stereo on Commodity Graphics Hardware

This paper presents a detailed description of an advanced real-time correlation-based stereo algorithm running completely on the graphics processing unit (GPU). This is important since it frees up the main processor for other tasks, including high-level interpretation of the stereo results. Compared to previous GPU-based stereo implementations, our implementation includes some advanced features […]

OpenGL

Jul, 12

Graphics processing unit parallel accelerated solution of the discrete ordinates for photon transport in biological tissues

As a widely used numerical solution for the radiation transport equation (RTE), the discrete ordinates can predict the propagation of photons through biological tissues more accurately than the diffusion equation. The discrete ordinates reduce the RTE to a series of differential equations that can be solved by source iteration (SI). However, the tremendous time […]

CUDA

Jul, 12

Real-time simulation of a spiking neural network model of the basal ganglia circuitry using general-purpose computing on graphics processing units

Real-time simulation of a biologically realistic spiking neural network is necessary for evaluation of its capacity to interact with real environments. However, the real-time simulation of such a neural network is difficult due to its high computational costs that arise from two factors: (1) vast network size and (2) the complicated dynamics of biologically realistic […]

Jul, 11

MATLAB Parallelization through Scalarization

While the popularity of high-level programming languages such as MATLAB for scientific and engineering applications continues to grow, MATLAB's poor performance compared to traditional languages such as Fortran or C continues to impede its deployment in full-scale simulations and data analysis. Additionally, its poor memory performance limits its overall performance. To improve performance, we have […]

CUDA

Jul, 11

Interactive free form deformer for point-based objects by GPU acceleration

The point-based representation is a recent and useful alternative for modeling very large 3D polygonal models, which usually contain millions of faces. However, to allow artists and designers to work with such a representation, new mechanisms for FFD (Free Form Deformation) must be offered, especially with an […]

Jul, 11

Pseudo-Random Number Generation on GP-GPU

Random number generation is a key element of stochastic simulations. It has been widely studied for sequential applications, enabling us to reliably use pseudo-random numbers in this case. Unfortunately, we cannot be so enthusiastic when dealing with parallel stochastic simulations. Many applications still neglect random stream parallelization, leading to potentially biased results. Particular parallel […]
