high performance computing on graphics processing units: hgpu.org

Posts

Oct, 17

Numerical Accuracy Analysis Based on the Discrete Stochastic Arithmetic on Multiprocessor Platforms

Simulating the real world has become one of the most widely used techniques in engineering today. Multiprocessor platforms play a key role in this development since bigger and bigger problems need more computing power to be solved. When the floating point standard was adopted in the early eighties of the 20th century, the amount of […]

CUDA

Oct, 17

Data-Driven Programming Abstractions and Optimization for Multi-Core Platforms

Multi-core platforms have spread to all corners of the computing industry, and trends in design and power indicate that the shift to multi-core will become even more widespread in the future. As the number of cores on a chip rises, the complexity of memory systems and on-chip interconnects increases drastically. The programmer inherits this complexity in […]

CUDA

Oct, 17

Implementing Stereo Vision of GPU-Accelerated Scientific Simulations using Commodity Hardware

Stereo vision technology is becoming increasingly commonplace in the movie and gaming industries. It also has applications in many other fields; one of these is viewing scientific data. We develop a stereo vision system using commodity-priced hardware and portable graphics software. Hardware and software details are described, as well as some […]

OpenGL

Oct, 17

Magneto-hydrodynamics simulation in astrophysics

Magnetohydrodynamics (MHD) studies the dynamics of an electrically conducting fluid under the influence of a magnetic field. Many astrophysical phenomena are related to MHD, and computer simulations are used to model these dynamics. In this thesis, we conduct MHD simulations of non-radiative black hole accretion as well as fast magnetic reconnection. By performing large scale […]

CUDA • OpenCL

Oct, 17

The Lattice Boltzmann Simulation on Multi-GPU Systems

The Lattice Boltzmann Method (LBM) is widely used to simulate different types of flow, such as water, oil and gas in porous reservoirs. In the oil industry it is commonly used to estimate petrophysical properties of porous rocks, such as the permeability. To achieve the required accuracy it is necessary to use big simulation models […]

OpenCL

Oct, 17

An Optimization for Fast Generation of Digital Hologram

Digital hologram generation methods commonly use a computer-generated hologram (CGH) algorithm. However, this algorithm requires complicated computation. This paper therefore proposes an optimization method for fast generation of digital holograms. The proposed method uses CUDA and OpenMP for multi-GPU systems. It also applies various optimization methods (variable fixation, vectorization, and loop unrolling) to a CGH algorithm. […]

CUDA

Oct, 17

Dynamic Fine-Grain Scheduling of Pipeline Parallelism

Scheduling pipeline-parallel programs, defined as a graph of stages that communicate explicitly through queues, is challenging. When the application is regular and the underlying architecture can guarantee predictable execution times, several techniques exist to compute highly optimized static schedules. However, these schedules do not admit run-time load balancing, so variability introduced by the application or […]

CUDA

Oct, 17

Programming with Explicit Dependencies. A Framework for Portable Parallel Programming

Computational devices are rapidly evolving into massively parallel systems. Multicore processors are already standard; high performance processors such as the Cell/BE processor, graphics processing units (GPUs) featuring hundreds of on-chip processors, and reconfigurable devices such as FPGAs are all developed to deliver high computing power. They make parallelism commonplace, not only the privilege of expensive […]

CUDA

Oct, 17

A High Performance Parallel Sparse Linear Equation Solver Using CUDA

The management of electric power systems requires continuously computing the powerflow of a power system in real-time. For large power systems, this task is often beyond the capabilities of modern CPUs. Concurrent computation is an attractive approach to accelerating it. However, the powerflow computation requires solving a large system of sparse linear equations. This problem […]

CUDA

Oct, 16

Hard-Sphere Collision Simulations with Multiple GPUs, PCIe Extension Buses and GPU-GPU Communications

Simulating particle collisions is an important application for physics calculations as well as for various effects in computer games and movie animations. Increasing demand for physical correctness, and hence visual realism, requires higher-order time-integration methods and more sophisticated collision management algorithms. We report on the use of single and multiple Graphical Processing Units (GPUs) […]

CUDA

Oct, 16

Bit-Packed Damaged Lattice Potts Model Simulations with CUDA and GPUs

Models such as the Ising and Potts systems lend themselves well to simulating the phase transitions that commonly arise in materials science. A particularly interesting variation is when the material being modelled has lattice defects, dislocations or broken bonds and the material experiences a Griffiths phase. The damaged Potts system consists of a set of […]

CUDA

Oct, 16

Asynchronous Communication for Finite-Difference Simulations on GPU Clusters using CUDA and MPI

Graphical Processing Units (GPUs) are finding widespread use as accelerators in computer clusters. It is not yet trivial to program applications that use multiple GPU-enabled cluster nodes efficiently. A key aspect of this is managing effective communication between GPU memory on separate devices on separate nodes. We develop an algorithmic framework for Finite-Difference numerical simulations […]

CUDA

