high performance computing on graphics processing units: hgpu.org

Posts

Jul, 12

Real-time simulation of a spiking neural network model of the basal ganglia circuitry using general-purpose computing on graphics processing units

Real-time simulation of a biologically realistic spiking neural network is necessary for evaluation of its capacity to interact with real environments. However, the real-time simulation of such a neural network is difficult due to its high computational costs that arise from two factors: (1) vast network size and (2) the complicated dynamics of biologically realistic […]

Jul, 11

MATLAB Parallelization through Scalarization

While the popularity of high-level programming languages such as MATLAB for scientific and engineering applications continues to grow, MATLAB's poor performance compared to traditional languages such as Fortran or C continues to impede its deployment in full-scale simulations and data analysis. Its poor memory performance is a further bottleneck. To ameliorate performance, we have […]

CUDA

Jul, 11

Interactive free form deformer for point-based objects by GPU acceleration

The point-based representation is a recent and useful alternative for modeling very large 3D polygonal models, which usually contain millions of faces. However, to allow artists and designers to work with such a representation, new mechanisms for FFD (Free Form Deformation) must be offered, especially with an […]

Jul, 11

Pseudo-Random Number Generation on GP-GPU

Random number generation is a key element of stochastic simulations. It has been widely studied for sequential applications, enabling us to reliably use pseudo-random numbers in this case. Unfortunately, we cannot be so enthusiastic when dealing with parallel stochastic simulations. Many applications still neglect random stream parallelization, leading to potentially biased results. Particular parallel […]

Jul, 11

GPU-accelerated MoM-based broadband simulations using Stoer-Bulirsch algorithm

This communication introduces a CUDA-enabled, GPU-accelerated technique for broadband analysis of arbitrary 3D conducting body-wire radiating/scattering structures. The solution is based on the integral equation discretized by the method of moments (MoM) with the use of RWG-type basis functions. Wide-band data are generated from MoM simulation employing an adaptive frequency sampling of the observed […]

CUDA

Jul, 11

Parallelizing of digital signal processing with using GPU

In this paper we show how to parallelize a class of algorithms used in digital signal processing. We present this approach using the example of the popular LMS algorithm, which is used in noise reduction, echo cancellation, and digital signal processing in general. We propose an approach which uses a GPGPU […]

CUDA

Jul, 11

3D vision of electromagnetic fields in antenna and microwave technique

Two approaches, Virtual Reality Modeling Language (VRML) and the OpenGL graphics library, are used for active stereoscopic vision of electromagnetic fields surrounding selected antenna or microwave elements. Input data are generated by analytical relations or acquired by electromagnetic field simulators and visualized in the form of vector fields and field lines by active stereoscopy. In case […]

OpenGL

Jul, 11

The GeForce 6800

Graphics processing units (GPUs) continue to take on increasing computational workloads and support interactive rendering that approaches cinematic quality. The architectural drivers for GPUs are programmability, parallelism, bandwidth, and memory characteristics. This article describes how one team approached the design problem.

Jul, 11

Implementation of large-scale FIR adaptive filters on NVIDIA GeForce graphics processing unit

This paper presents implementations of an FIR adaptive filter with a large number of taps on an NVIDIA GeForce graphics processing unit (GPU) using the CUDA software development environment. In order to overcome the long access latency of slow off-chip memory, reduction of memory accesses by re-ordering and vector load/store operations and an increase of the […]

CUDA

Jul, 11

The computer graphics wars heat up

The author describes the present status of computer graphics hardware. While console games have flourished, PC game sales have plummeted. The plans of Nvidia and ATI, the 3D graphics world’s two dominant players, are described.

Jul, 11

Accelerating Lossless Data Compression with GPUs

Huffman compression is a statistical, lossless data compression algorithm that compresses data by assigning variable-length codes to symbols, with the more frequently appearing symbols given shorter codes than the less frequent ones. This work is a modification of the Huffman algorithm that permits uncompressed data to be decomposed into independently compressible and decompressible blocks, allowing for […]

CUDA
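
The Huffman abstract above describes two ideas: frequent symbols get shorter codes, and the input can be cut into blocks that are coded independently. The Python sketch below illustrates both under stated assumptions. It is not the paper's method: the abstract is truncated before it says how blocks are formed, so the fixed-size split and the per-block code table used here are assumptions, and the function names `huffman_code_lengths` and `block_costs` are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code built over `data`."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}

    # Heap entries: (subtree weight, tie-breaker, symbols in the subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol inside them
        # moves one level deeper, so its code grows by one bit.
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


def block_costs(data, block_size):
    """Encoded size in bits of each fixed-size block, one code table per block.

    Assumption: fixed-size blocks with independent tables. Because no block
    depends on another, each could be handled by a separate GPU thread block.
    Table storage overhead is ignored here.
    """
    costs = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        lengths = huffman_code_lengths(block)
        costs.append(sum(lengths[b] for b in block))
    return costs
```

For example, `huffman_code_lengths(b"aaaabbc")` gives the most frequent byte `a` a 1-bit code and `b` and `c` 2-bit codes. Independent blocks trade some compression ratio (each block carries its own table) for the ability to compress and decompress blocks in parallel.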

Jul, 10

Testing Tesla architecture for scientific computing: The performance of matrix-vector product

The paper presents results of several experiments evaluating the performance of NVIDIA processors, implementing the new Tesla architecture, in matrix-vector multiplication. Three matrix forms, dense, banded and sparse, are considered together with three hardware platforms: an NVIDIA Tesla C870 computing board, an NVIDIA GeForce 8800 GTX graphics card and one of the newest Intel Xeon processors, E5462, […]

CUDA

