high performance computing on graphics processing units: hgpu.org

Posts

Feb, 16

GpuC: Data parallel language extension to CUDA

In recent years, Graphics Processing Units (GPUs) have emerged as powerful accelerators for general-purpose computations. Current approaches to programming GPUs still rely on relatively low-level programming models such as Compute Unified Device Architecture (CUDA), a programming model from NVIDIA, and Open Computing Language (OpenCL), created by Apple in cooperation with others. These two programming models […]

CUDA

Feb, 16

Enhancing the simulation of P systems for the SAT problem on GPUs

GPUs nowadays constitute a solid alternative for high performance computing, and the advent of CUDA/OpenCL gives programmers a friendly model to accelerate a broad range of applications. The way GPUs exploit parallelism differs from that of multi-core CPUs, which raises new challenges in taking advantage of their tremendous computing power. In this respect, P systems or Membrane […]

CUDA

Feb, 16

Accelerating the Stochastic Simulation Algorithm using Emerging Architectures

In order for scientists to learn more about molecular biology, it is imperative that they have the ability to construct and evaluate models. Model statistics consistent with the chemical master equation can be obtained using Gillespie’s stochastic simulation algorithm (SSA). Due to the stochastic nature of the Monte Carlo simulations, large numbers of simulations must […]

CUDA

Feb, 16

GPU Accelerated Stochastic Simulation

Through computational methods, biologists are able to learn more about molecular biology by building accurate models. These models represent and predict the reactions among species populations within a system. One popular way to develop predictive models is to use a stochastic Monte Carlo method developed by Gillespie called the stochastic simulation algorithm (SSA). Since this algorithm […]

CUDA

Feb, 16

A GPU-based Flood Simulation Framework

We present a multi-core, GPU-based framework for simulation and visualization of two-dimensional floods, based on a full implementation of the Saint-Venant equations. A validated CPU-based flood model was converted to NVIDIA’s CUDA architecture. The model was run on two different NVIDIA graphics cards, a GeForce 8400 GS and a Tesla T10. The model was tested […]

CUDA

Feb, 16

Static Memory Access Pattern Analysis on a Massively Parallel GPU

The performance of data-parallel processing can be highly sensitive to any contention in memory. In contrast to multi-core CPUs, which employ a number of memory latency minimization techniques such as multi-level caching and prefetching, Graphics Processing Units (GPUs) require that the data-parallel computations reference memory in a deterministic pattern in order to reap the benefits […]

OpenCL

Feb, 16

Using Graphics Processors to Accelerate Synthetic Aperture Sonar Imaging via Backpropagation

This paper describes the use of graphics processors to accelerate the backpropagation method of forming images in Synthetic Aperture Sonar (SAS) systems. SAS systems coherently process multiple pulses to provide a higher level of detail in the resolved image than is otherwise possible with a single pulse. Several models are available to resolve an image […]

CUDA

Feb, 16

An experimental study on performance portability of OpenCL kernels

Accelerator processors allow energy-efficient computation at high performance, especially for computation-intensive applications. There exists a plethora of different accelerator architectures, such as GPUs and the Cell Broadband Engine. Each accelerator has its own programming language, but the recently introduced OpenCL language unifies accelerator programming languages. In this way, OpenCL achieves functional portability, making it possible to reduce the development […]

OpenCL

Feb, 16

Multi-agent traffic simulation with CUDA

Today’s graphics processing units (GPUs) have tremendous resources when it comes to raw computing power. The simulation of large groups of agents in transport simulation demands a great deal of computation time. Therefore, it seems reasonable to try to harvest this computing power for traffic simulation. Unfortunately, simulating a network of traffic is inherently connected […]

CUDA

Feb, 16

MuMax: a new high-performance micromagnetic simulation tool

We present MuMax, a general-purpose micromagnetic simulation tool running on Graphical Processing Units (GPUs). MuMax is designed for high performance computations and specifically targets large simulations. In that case, speedups of more than 100x can easily be obtained compared to the CPU-based OOMMF program developed at NIST. MuMax aims to be general and broadly […]

CUDA

Feb, 15

Cyclic Reduction Tridiagonal Solvers on GPUs Applied to Mixed-Precision Multigrid

We have previously suggested mixed precision iterative solvers specifically tailored to the iterative solution of sparse linear equation systems as they typically arise in the finite element discretization of partial differential equations. These schemes have been evaluated for a number of hardware platforms, in particular, single-precision GPUs as accelerators to the general purpose CPU. This […]

CUDA

Feb, 15

Accelerating Cosmological Data Analysis with Graphics Processors

In this paper we describe a successful effort to accelerate the two-point angular correlation function, a basic statistical tool used in the field of cosmology to characterize the distribution of matter and energy in the Universe, by using an NVIDIA GPU-based system. We demonstrate the use of GPUs to accelerate the calculation of histograms of angular […]

CUDA
