high performance computing on graphics processing units: hgpu.org

Posts

Jul, 7

Comparing performance and energy efficiency of FPGAs and GPUs for high productivity computing

This paper provides the first comparison of performance and energy efficiency of high productivity computing systems based on FPGA (Field-Programmable Gate Array) and GPU (Graphics Processing Unit) technologies. The search for higher performance compute solutions has recently led to great interest in heterogeneous systems containing FPGA and GPU accelerators. While these accelerators can provide significant […]

CUDA

Jul, 7

Optimization of a FDTD code for graphical processing units

Modern graphics processing units (GPUs) provide high computational power which can significantly decrease simulation time. We present two implementations of the FDTD algorithm on GPU and compare their performance with the CPU version.

Jul, 7

Pricing of cross-currency interest rate derivatives on Graphics Processing Units

We present a Graphics Processing Unit (GPU) parallelization of the computation of the price of cross-currency interest rate derivatives via a Partial Differential Equation (PDE) approach. In particular, we focus on the GPU-based parallel computation of the price of long-dated foreign exchange interest rate hybrids, namely Power Reverse Dual Currency (PRDC) swaps with Bermudan cancelable […]

CUDA

Jul, 7

High precision integer multiplication with a graphics processing unit

In this paper we evaluate the potential for using an NVIDIA graphics processing unit (GPU) to accelerate high precision integer multiplication. The reported peak vector performance for a typical GPU appears to offer considerable potential for accelerating such a regular computation. Because of limitations in the on-chip memory, the high cost of kernel launches, and […]

Jul, 7

Introducing Energy Efficiency into Graphics Processors

Graphics processor (GPU) architectures have evolved rapidly in recent years with increasing performance demanded by 3D graphics applications such as games. However, challenges exist in integrating complex GPUs into mobile devices because of power and energy constraints, motivating the need for energy efficiency in GPUs. While a significant amount of power optimisation research effort has […]

Jul, 7

An Adaptive Hybrid Multiprocessor technique for bioinformatics sequence alignment

Sequence alignment algorithms such as the Smith-Waterman algorithm are among the most important applications in the development of bioinformatics. Sequence alignment algorithms must process large amounts of data which may take a long time. Here, we introduce our Adaptive Hybrid Multiprocessor technique to accelerate the implementation of the Smith-Waterman algorithm. Our technique utilizes both the […]

CUDA

Jul, 7

Graphics Processing Unit Accelerated O(N) Micromagnetic Solver

An efficient micromagnetic solver running on graphics processing units (GPUs) is demonstrated. The solver implements a nonuniform grid interpolation method (NGIM) to compute the superposition integral for the magnetostatic field with O(N) operations and memory requirements. The NGIM divides the computational domain into a hierarchy of boxes containing sources and observers, and it uses spatial interpolation […]

CUDA

Jul, 6

Accelerating Wavelet Lifting on Graphics Hardware Using CUDA

The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory- and computation-efficient way on modern, programmable GPUs, which can be regarded as massively parallel coprocessors through NVIDIA’s CUDA compute […]

CUDA

Jul, 6

Option pricing with COS method on graphics processing units

In this paper, acceleration on the GPU for option pricing by the COS method is demonstrated. In particular, both European and Bermudan options will be discussed in detail. For Bermudan options, we consider both the Black-Scholes model and Levy processes of infinite activity. Moreover, the influence of the number of terms in the Fourier-cosine expansion, […]

Jul, 6

Fast face recognition approach using a graphical processing unit (GPU)

In this manuscript, we present an implementation of a correlation method for a face recognition application on a GPU. Our correlator is based on the famous "4f" setup and the use of a Phase Only Filter (POF). Traditionally, the correlation method is implemented using optical components for real-time applications. Unfortunately, optical implementation is complex and has exorbitant […]

Jul, 6

Tuning A Hybrid GPU-CPU V-cycle Multilevel Preconditioner for Solving Large Real and Complex Systems of FEM Equations

This paper presents techniques for tuning an accelerated preconditioned conjugate gradient solver with a multilevel preconditioner. The solver is optimized for a fast solution of sparse systems of equations arising in computational electromagnetics in a finite element method using higher order elements. The goal of the tuning is to increase the throughput while at the […]

Jul, 6

Flexible OpenCL accelerated disparity estimation for video communication applications

Due to widespread broadband connections in normal households, the use of video chats via the Internet is no longer limited to business meetings. However, the camera configuration usually makes it impossible to achieve direct eye contact between the conversational partners. This effect can be compensated using virtual view synthesis methods based on disparity maps. The virtual […]

OpenCL

