high performance computing on graphics processing units: hgpu.org

Posts

Apr, 23

Multimodal collaboration and human-computer interaction

The research effort at Microsoft Research on multimodal collaboration and human-computer interaction aims at developing tools that allow people across geographically distributed sites to interact collaboratively with an immersive experience. Our prototype systems consist of cameras, displays, speakers, microphones, computer-controllable lights, and/or input devices such as touch-sensitive surfaces, styluses, keyboards, and mice. They require […]

Apr, 23

Physically-Based Interactive Flow Visualization Based on Schlieren and Interferometry Experimental Techniques

Understanding fluid flow is a difficult problem and of increasing importance as computational fluid dynamics produces an abundance of simulation data. For centuries, experimental flow analysis has employed techniques such as shadowgraph, interferometry, and schlieren imaging, which allow empirical observation of inhomogeneous flows. Shadowgraphs provide an intuitive way of looking at small changes in flow […]

CUDA

Apr, 23

OpenMPC: Extended OpenMP Programming and Tuning for GPUs

General-Purpose Graphics Processing Units (GPGPUs) are promising parallel platforms for high performance computing. The CUDA (Compute Unified Device Architecture) programming model provides improved programmability for general computing on GPGPUs. However, its unique execution model and memory model still pose significant challenges for developers of efficient GPGPU code. This paper proposes a new programming interface, called […]

CUDA

Apr, 23

Introduction to GPU Computing and CUDA Programming: A Case Study on FDTD [EM Programmer’s Notebook]

The recent advent of general-purpose graphics-processing units (GPGPUs) as inexpensive arithmetic-processing units brings a relevant amount of computing power to modern desktop PCs. This provides an interesting pathway to the acceleration of several numerical electromagnetic methods. In this paper, we explain how to exploit GPGPU features by examining how the computational time of the […]

CUDA

Apr, 23

Accelerating Multi-Sensor Image Fusion Using Graphics Hardware

This paper presents approaches to accelerating pixel-level image fusion using graphics hardware. As new sensors and improved sensing technology make it possible to improve visibility by maximizing the information collected, not only the development of new fusion algorithms but also the speed of the fusion process is becoming increasingly important. Though specialized fusion boards for real […]

CUDA

Apr, 23

Design optimization of automotive electronic control unit using the analysis of common-mode current by fast electromagnetic field solver

In this paper, we propose an optimization system based on a fast electromagnetic field solver and metaheuristics for reducing electromagnetic interference (EMI) in electronic control units (ECUs). We adopt simulated annealing (SA), a genetic algorithm (GA), and taboo search (TS) to seek optimal solutions, and the finite difference time domain (FDTD) method with general purpose computing […]

Apr, 23

Using parallel GPU architecture for simulation of planar I/F networks

Our work describes the simulation of a planar network of spiking I/F neurons on graphics processing hardware. The described approach adds to the fast-growing field of general-purpose computation on GPUs (GPGPU). We provide an in-depth explanation of the steps involved in implementing the network using programmable shading hardware. We replicated simulation results by Hopfield et […]

Apr, 22

Scalable software defined receivers running on desktop computers using General Purpose Graphics Processing Units

Software defined radios (SDRs) are increasingly attractive to replace common hardware solutions. Using desktop computers for executing software defined radio receivers in real time is not common or even achievable due to the huge amount of processing resources required. In this paper we present some measurement results using an already available general purpose graphics processing […]

Apr, 22

High-Speed Private Information Retrieval Computation on GPU

A Private Information Retrieval (PIR) scheme is a protocol in which a user retrieves a record out of n from a replicated database, while hiding from the database which record has been retrieved, as long as the different replicas do not collude. An especially interesting sub-field of research, called single-database PIR, deals with the schemes […]

CUDA

Apr, 22

Challenges of mapping financial analytics to many-core architecture

Summary form only given. In the past 20 years there has been explosive growth in the variety of traded financial instruments, from European and American options to more complex, alas ill-fated, credit derivatives. The rapid increase in computational power coupled with the use of mathematical tools for valuing these instruments and estimating the […]

Apr, 22

MITHRA: Multiple data independent tasks on a heterogeneous resource architecture

With the advent of high-performance COTS clusters, there is a need for a simple, scalable and fault-tolerant parallel programming and execution paradigm. In this paper, we show that the popular MapReduce programming model can be utilized to solve many interesting scientific simulation problems with much higher performance than regular cluster computers by leveraging GPGPU accelerators […]

CUDA

Apr, 22

Fast generating of a digital hologram using general-purpose computation on graphics processing units

In this paper, we propose a method for quickly generating a digital hologram using General-Purpose Computation on Graphics Processing Units (GPGPU). This method reduces the computational time of generating a digital hologram by using parallel processing with CUDA. We demonstrate the effectiveness of our algorithm through a variety of experiments.

CUDA

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations

microSYCL: SYCL micro-benchmarks repository

Exploring SYCL as a Portability Layer for High-Performance Computing on CPUs

Recent source codes

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

Most viewed papers (last 30 days)