high performance computing on graphics processing units: hgpu.org

Posts

Jul, 23

GPU-accelerated atom and dynamic bond visualization using hyperballs: A unified algorithm for balls, sticks, and hyperboloids

Ray casting on graphics processing units (GPUs) opens new possibilities for molecular visualization. We describe the implementation and calculation of diverse molecular representations such as licorice, ball-and-stick, space-filling van der Waals spheres, and approximated solvent-accessible surfaces using GPUs. We introduce HyperBalls, an improved ball-and-stick representation replacing tubes, linking the atom spheres by hyperboloids that can […]

Jul, 22

High-Performance Reverse Time Migration on GPU

Partial Differential Equations (PDE) are at the heart of most simulations in many scientific fields, from Fluid Mechanics to Astrophysics. One of the most popular mathematical schemes to solve a PDE is Finite Difference (FD). In this work we map a PDE-FD algorithm called Reverse Time Migration to a GPU using CUDA. This seismic imaging (Geophysics) algorithm […]

CUDA

Jul, 22

Living Flows: Enhanced Exploration of Edge-Bundled Graphs Based on GPU-Intensive Edge Rendering

This paper describes an approach exploiting the full capabilities of GPUs to enhance the usability of edge bundling in real applications. Edge bundling, as well as other edge clustering approaches, relies on the use of high-quality edge rerouting. The typical approach for drawing an edge-bundled graph is to render edges as curves. But curve generation can […]

OpenGL

Jul, 22

A novel parallel Tier-1 coder for JPEG2000 using GPUs

The JPEG2000 image compression standard provides superior features to the popular JPEG standard; however, the slow performance of software implementation of JPEG2000 has kept it from being widely adopted. More than 80% of the execution time for JPEG2000 is spent on the Tier-1 coding engine. While much effort over the past decade has been devoted […]

Jul, 22

Parallel implementation of Multi-dimensional Ensemble Empirical Mode Decomposition

In this paper, we propose and evaluate two parallel implementations of Multi-dimensional Ensemble Empirical Mode Decomposition (MEEMD) for multi-core (CPU) and many-core (GPU) architectures. Relative to a sequential C implementation, our double precision GPU implementation, using the CUDA programming model, achieves up to 48.6x speedup on NVIDIA Tesla C2050. Our multi-core CPU implementation, using the […]

CUDA

Jul, 22

GPGPU-Accelerated Parallel and Fast Simulation of Thousand-Core Platforms

The multicore revolution and the ever-increasing complexity of computing systems are dramatically changing system design, analysis and programming of computing platforms. Future architectures will feature hundreds to thousands of simple processors and on-chip memories connected through a network-on-chip. Architectural simulators will remain primary tools for design space exploration, software development and performance evaluation of these […]

CUDA

Jul, 22

A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads

The recently released Rodinia benchmark suite enables users to evaluate heterogeneous systems including both accelerators, such as GPUs, and multicore CPUs. As Rodinia sees higher levels of acceptance, it becomes important that researchers understand this new set of benchmarks, especially in how they differ from previous work. In this paper, we present recent extensions to […]

CUDA

Jul, 22

Real time image reconstruction using GPUs for a surgical PET imaging probe system

We present an on-line list-mode image reconstruction system using GPUs for a surgical PET imaging probe system. We used the nVidia GeForce 9800GTX+ and CUDA to reconstruct images. The proposed system can generate a three-dimensional image from simulated data in 70 msec. We also compared the processing time with respect to the number of LORs […]

CUDA

Jul, 22

Vendors Draw up a New Graphics-Hardware Approach

For the past five years, there have been two major approaches to providing graphics hardware in PCs, notebooks, game consoles, and workstations. The older technique has been to put graphics processing units (GPUs) on video cards. For example, Nvidia places its G92 GPU on its GeForce GTS 250 card, and AMD places its RV770 GPU […]

Jul, 22

Real-time 3D video synthesis from binocular capture system based on commodity graphic hardware

In this paper, a real-time 3D video synthesis method suitable for implementation on commodity graphic hardware is presented. The system consists of pre-calibrated binocular stereo cameras and an NVIDIA GeForce 8 Series graphic card. Recently, most research has focused on improving the quality of depth maps, which is usually time-consuming and unsuitable for real-time reconstruction. […]

Jul, 22

AES Encryption Implementation on CUDA GPU and Its Analysis

The GPU has a good performance ratio and is capable of running applications with a high level of parallelism despite its low price. The support for integer and logical instructions on the latest generation of GPUs makes it easier to implement cipher algorithms with the same instructions. However, decisions such as parallel processing granularity or memory […]

CUDA

Jul, 22

Large Scale Simulations of the Euler Equations on GPU Clusters

The paper investigates the scalability of a parallel Euler solver, using the Vijayasundaram method, on a GPU cluster with 32 Nvidia GeForce GTX 295 boards. The aim of this research is to enable large scale fluid dynamics simulations with up to one billion elements. We investigate communication protocols for the GPU cluster to compensate for […]

CUDA

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler
