high performance computing on graphics processing units: hgpu.org

Posts

Mar, 27

HCW 2009 keynote talk: GPU computing: Heterogeneous computing for future systems

Over the last decade, commodity graphics processors (GPUs) have evolved from fixed-function graphics units into powerful, programmable data-parallel processors. Today’s GPU is capable of sustaining computation rates substantially greater than today’s modern CPUs, with technology trends indicating a widening gap in the future. Researchers in the rapidly evolving field of GPU computing have demonstrated mappings […]

Mar, 27

Scalable and Parallel Implementation of a Financial Application on a GPU: With Focus on Out-of-Core Case

The architecture of the latest Graphics Processing Unit (GPU) consists of a number of uniform programmable units integrated on the same chip, which facilitate general-purpose computing beyond graphics processing. With the multiple programmable units executing in parallel, the latest GPU shows superior performance for many non-graphics applications. Furthermore, programmers can have a direct […]

Mar, 25

Introduction to GPU Programming with GLSL

One of the challenging developments in computer science in recent years was the fast evolution of parallel processors, especially the GPU (graphics processing unit). GPUs today play a major role in many computational environments, most notably those regarding real-time graphics applications, such as games. The digital game industry is one of the main driving […]

OpenGL

Mar, 25

Efficient implementation for MD5-RC4 encryption using GPU with CUDA

Benefiting from the novel compute unified device architecture (CUDA) introduced by NVIDIA, the graphics processing unit (GPU) turns out to be a promising solution for cryptography applications. In this paper we present an efficient implementation for MD5-RC4 encryption using an NVIDIA GPU with the novel CUDA programming framework. The MD5-RC4 encryption algorithm was implemented on NVIDIA GeForce 9800GTX […]

CUDA

Mar, 25

A GPU Implementation of Fast Parallel Markov Clustering in Bioinformatics Using EllPACK-R Sparse Data Format

Massively parallel computing using graphics processing units (GPUs), which is based on tens of thousands of parallel threads within hundreds of the GPU’s streaming processors, has gained broad popularity and attracted researchers in a wide range of application areas from finance, computer aided engineering, computational fluid dynamics, game physics, numerics, science, medical imaging, life science, and […]

CUDA

Mar, 25

GPU Acceleration of High-Speed Collision Molecular Dynamics Simulation

We discuss the implementation and optimization of a GPU-accelerated Molecular Dynamics (MD) simulation of a high-speed collision molecular model in the NVIDIA CUDA language. A series of optimization methods are presented: spatial decomposition, use of shared memory, and use of a blockcell-link structure. These optimization methods effectively improve the performance by reducing data transfer time between CPU and GPU […]

CUDA

Mar, 25

Fast forwarding table lookup exploiting GPU memory architecture

As Internet traffic increases and diversifies, the need for a fast, flexible router has led researchers to work on software routers. The existing software router systems may utilize the cluster structure of multiple machines or GPU systems. In particular, Packet Shader, which uses the GPU to exploit its extensive parallelism, shows higher performance compared […]

Mar, 25

HiAL-Ckpt: A hierarchical application-level checkpointing for CPU-GPU hybrid systems

In light of its powerful computing capacity and high energy efficiency, the GPU (graphics processing unit) has become a focus in the research field of HPC (High Performance Computing). CPU-GPU heterogeneous parallel systems have become a new development trend in supercomputers. However, the inherent unreliability of GPU hardware degrades the reliability of supercomputers. We have […]

Mar, 25

Fast Virus Signature Matching Based on the High Performance Computing of GPU

Because traditional eigenvalue-based virus signature matching takes too long to meet the needs of information security, a method of fast virus signature matching on the GPU is proposed. The system uses the GPU as a fast filter to quickly identify possible virus signatures for thousands of data objects in […]

Mar, 25

GPU-based real-time small displacement estimation with ultrasound

General purpose computing on graphics processing units (GPUs) has been previously shown to speed up computationally intensive data processing and image reconstruction algorithms for computed tomography (CT), magnetic resonance (MR), and ultrasound images. Although some algorithms in ultrasound have been converted to GPU processing, many investigative ultrasound research systems still use serial processing on a […]

Mar, 25

A Data Communication Scheduler for Stream Programs on CPU-GPU Platform

In recent years, heterogeneous parallel systems have become a research focus in the high performance computing field. Generally, in a heterogeneous parallel system, the CPU provides the basic computing environment and a special-purpose accelerator (the GPU in this paper) provides high computing performance. However, the overall performance of the system is prone to be limited by the […]

Mar, 25

Optimization and Implementation of LBM Benchmark on Multithreaded GPU

With the fast development of transistor technology, the Graphics Processing Unit (GPU) is increasingly used in non-graphics applications, and major GPU hardware vendors have introduced software stacks for their own GPUs, such as Brook+ for AMD GPUs. Compared with traditional parallel systems, heterogeneous systems integrating stream-based multi-threaded GPUs provide higher parallel computing capabilities at lower cost. […]

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

Most viewed papers (last 30 days)