high performance computing on graphics processing units: hgpu.org

Posts

Apr, 11

GPU-computing in econophysics and statistical physics

A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today’s GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction to the field of GPU computing and includes examples. In […]

CUDA

Apr, 11

Exact and complete short read alignment to microbial genomes using GPU programming

MOTIVATION: The introduction of next generation sequencing techniques and especially the high-throughput systems Solexa (Illumina Inc.) and SOLiD (ABI) made the mapping of short reads to reference sequences a standard application in modern bioinformatics. Short read alignment is needed for reference based re-sequencing of complete genomes as well as for gene expression analysis based on […]

CUDA

Apr, 11

DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI

BACKGROUND: Next-generation sequencing technologies have led to the high-throughput production of sequence data (reads) at low cost. However, these reads are significantly shorter and more error-prone than conventional Sanger shotgun reads. This poses a challenge for the de novo assembly in terms of assembly quality and scalability for large-scale short read datasets. RESULTS: We present […]

CUDA

Apr, 11

Simulation of bevel gear cutting with GPGPUs-performance and productivity

The desire for general purpose computation on graphics processing units caused the advance of new programming paradigms, e.g. OpenCL C/C++, CUDA C or the PGI Accelerator Model. In this paper, we apply these programming approaches to the software KegelSpan for simulating bevel gear cutting. This engineering application simulates an important manufacturing process in the automotive […]

CUDA • OpenCL

Apr, 11

4th Workshop at ISCA’11 Emerging Applications and Many-core Architectures, EAMA

The goal of the workshop is to bring together application domain experts and computer architects to discuss emerging applications as well as their implications on current- and next-generation many-core architectures. The workshop focuses on the following two areas: Emerging application domains such as recognition/mining/synthesis (RMS), medical imaging, bioinformatics, visual computing, Web3D, datacenter workloads, business analytics, […]

Apr, 11

First ADBIS workshop on GPUs In Databases, GID 2011

The GPUs in Databases workshop is devoted to sharing knowledge related to applying GPUs in database environments and to discussing possible future development of this application domain. The list of topics of the GID workshop includes (but is not limited to): 1. Data compression on GPUs * lossless/lossy compression and decompression * real time compression and […]

Apr, 10

GPU Accelerated Adams-Bashforth Multirate Discontinuous Galerkin FEM Simulation of High-Frequency Electromagnetic Fields

A multirate Adams-Bashforth (AB) scheme for simulation of electromagnetic wave propagation using the discontinuous Galerkin finite element method (DG-FEM) is presented. The algorithm is adapted such that single-instruction multiple-thread (SIMT) characteristic for the implementation on a graphics processing unit (GPU) is preserved. A domain decomposition strategy respecting the multirate classification for computation on multiple GPUs […]

CUDA

Apr, 10

GPU acceleration of the dynamics routine in the HIRLAM weather forecast model

Programmable graphics processing units (GPUs) nowadays offer very high performance computing power at relatively low hardware cost and power consumption. In this paper, we present the implementation of the dynamics routine of the HIRLAM weather forecast model on the NVIDIA GeForce 9800 GX2 GPU card using the Compute Unified Device Architecture (CUDA) as parallel programming […]

CUDA

Apr, 10

Power-Efficient Work Distribution Method for CPU-GPU Heterogeneous System

As systems continue to scale up, the power consumption of high performance computing (HPC) systems becomes a more severe problem. Heterogeneous systems, which integrate two or more kinds of processors, can adapt better to heterogeneity in applications and in theory provide much higher energy efficiency. Many studies have shown that heterogeneous systems are preferable on energy […]

Apr, 10

Real-time stereo matching: A cross-based local approach

We propose an area-based local stereo matching algorithm that yields accurate disparity estimates, while achieving the real-time speed completely on the graphics processing unit (GPU). For a local stereo method, the key challenge is to decide an appropriate support window for the pixel under consideration. Our stereo method starts with computing an upright local cross […]

Apr, 10

Accelerating global sequence alignment using CUDA compatible multi-core GPU

The Graphics Processing Unit (GPU) has become a competitive general purpose computational hardware platform in the last few years. Recent improvements in GPUs' highly parallel programming capabilities, such as Compute Unified Device Architecture (CUDA), have led to a variety of complex applications with tremendous performance improvements. Genetic sequence alignment is considered to be one of the […]

CUDA

Apr, 10

A Real-Time Soft Shadow Rendering Algorithm by Occluder-Discretization

This paper presents a real-time soft shadow rendering algorithm based on the shadow-mapping technique. The key idea of this algorithm is to use only a single shadow map for a flat extended light source. The algorithm also uses the single shadow map to discretize the occluders to many flat patches which are parallel with the […]

* * *

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
