high performance computing on graphics processing units: hgpu.org

Posts

Apr, 25

A GPU implementation for two MIMO-OFDM detectors

Two real-valued signal models based on selective spanning with fast enumeration (SSFE) and layered orthogonal lattice detector (LORD) algorithms are implemented on an Nvidia graphics processing unit (GPU). A 2×2 multiple-input multiple-output (MIMO) antenna system with 16-quadrature amplitude modulation (16-QAM) is assumed. The chosen level update vector for SSFE is based on computer simulation results […]

Apr, 25

Parallel 3D Finite Difference Time Domain Simulations on Graphics Processors with Cuda

The parallel Finite Difference Time Domain (FDTD) method has been explored over the past few years because of the expensive computation its application requires. General-Purpose Graphics Processing Units (GPGPU), especially the Compute Unified Device Architecture (CUDA) model, have offered an efficient and simple solution. This paper analyzes the parallel FDTD method and the CUDA architecture, presents […]

CUDA

Apr, 25

MultiGPU computing using MPI or OpenMP

GPU computing follows the trend of GPGPU, driven by innovations in both hardware and programming languages made available to non-graphics programmers. Since some problems take a long time to solve or involve data volumes that do not fit on a single GPU, the logical continuation was to make use of multiple GPUs. In order […]

Apr, 25

A real time Breast Microwave Radar imaging reconstruction technique using simt based interpolation

Breast Microwave Radar (BMR) is a novel imaging modality that is capable of producing high-contrast images and can detect tumors of at least 4 mm. To properly visualize the responses from the breast structures, BMR data sets must be reconstructed. In this paper, a real-time BMR image formation technique is proposed. This approach is based […]

CUDA

Apr, 25

Improving numerical reproducibility and stability in large-scale numerical simulations on GPUs

The advent of general-purpose graphics processing units (GPGPUs) brings about a whole new platform for running numerically intensive applications at high speeds. Their multi-core architectures enable large degrees of parallelism via a massively multi-threaded environment. Molecular dynamics (MD) simulations are particularly well-suited for GPUs because their computations are easily parallelizable. Significant performance improvements are […]

CUDA

Apr, 25

Parallelizing Motion JPEG 2000 with CUDA

Due to the rapid growth of graphics processing unit (GPU) processing capability, using the GPU as a coprocessor to assist the CPU in processing massive data has become indispensable. Nvidia’s CUDA general-purpose graphics processing unit (GPGPU) architecture can greatly benefit computationally expensive programs written in a single-instruction, multiple-thread (SIMT) style. Video encoding, to an extent, is an […]

CUDA

Apr, 25

Financial Derivatives Modeling Using GPU’s

The architecture of the latest graphics processing units (GPUs) has surpassed the previous application-specific stream architecture. This has led to an architecture consisting of a number of uniform programmable units integrated on the same chip, which facilitate general-purpose computing beyond graphics processing. With the multiple programmable units executing in parallel, the latest GPU […]

Apr, 25

GPU-Based Background Illumination Correction for Blue Screen Matting

Separation of foreground objects from an almost constant backing color for video applications is still a common problem ([1]). For non-realtime situations there is a wide variety of different powerful mathematical approaches that can deal with most of the matting problems. For SD/HD studio realtime keyers most solutions are not applicable due to their algorithm […]

OpenGL

Apr, 25

Scalable Software Defined FM-radio receiver running on desktop computers

Software Defined Radios (SDRs) are increasingly attractive as replacements for common hardware solutions. Current SDRs are mostly part of communication systems using hardware front ends containing DSPs or FPGAs. CPU-only processing is uncommon because of the huge amount of processing resources required, which most current CPUs cannot provide. The goal […]

Apr, 25

Flexible Pixel Compositor for Plug-and-Play Multi-Projector Displays

In summary, we are developing the next generation compositor to satisfy the demanding needs from emerging applications. It can be used beyond multi-projector displays. The first is auto-stereoscopic (multi-view) displays, in particular lenticular-based displays. These 3D displays in fact display many views simultaneously and therefore require orders of magnitude more pixels to provide an observer […]

Apr, 25

Exploiting SPMD Horizontal Locality to Improve Memory Efficiency

In this paper, we analyze a particular spatial locality case (called horizontal locality) inherent to manycore accelerator architectures employing barrel execution of SPMD kernels, such as GPUs. We then propose an adaptive memory access granularity framework to exploit and enforce the horizontal locality in order to reduce the interference among accelerator cores' memory accesses and […]

Apr, 25

High Performance Computing via a GPU

Graphics processing units (GPUs), such as the AMD FireStream series, offer tremendous computing power, frequently an order of magnitude greater than that of even the most modern multi-core CPUs, making them an attractive platform for high performance computing due to their low cost relative to conventional PC clusters. General-purpose computing on GPUs (GPGPU) is […]

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

Most viewed papers (last 30 days)