high performance computing on graphics processing units: hgpu.org

Posts

Aug, 9

Real-Time All-in-Focus Video-Based Rendering Using A Network Camera Array

We present a real-time video-based rendering system using a network camera array. Our system consists of 64 commodity network cameras connected to a single PC through Gigabit Ethernet. To render a high-quality novel view, we estimate a view-dependent per-pixel depth map in real time using a layered representation. The rendering algorithm is […]

OpenGL

Aug, 9

Graphics Processing Units for Handhelds

During the past few years, mobile phones and other handheld devices have gone from only handling dull text-based menu systems to, on an increasing number of models, being able to render high-quality three-dimensional graphics at high frame rates. This paper is a survey of the special considerations that must be taken when designing graphics processing […]

Aug, 9

Geospatial visualization using hardware accelerated real-time volume rendering

We present a visualization framework using direct volume rendering techniques that achieves real-time performance and high image quality. The visualization program runs on a desktop as well as in an immersive environment. The application is named HurricaneVis, and it uses OpenGL, GLSL and VTK. For immersive visualization VRJuggler is added. To achieve real-time rendering rates […]

OpenGL

Aug, 9

Performance Evaluation of Feature Extraction Algorithm on GPGPU

Nvidia’s GPGPU-based Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on GPUs. It provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. This model has proven quite successful at programming multithreaded many-core GPUs and scales transparently to hundreds of cores: many industry […]
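The three abstractions named in this excerpt can be illustrated with a small CPU-side emulation. This is only a sketch in Python, not CUDA code and not taken from the paper: the function name `block_reduce_sum`, the `block_dim` parameter, and the use of `threading.Barrier` to stand in for a block-level barrier are all assumptions made for illustration. Each "block" owns a private shared list, and its threads cooperate on a tree reduction, synchronizing between steps.

```python
import threading


def block_reduce_sum(data, block_dim=4):
    """Sum `data` using emulated thread blocks (block_dim must be a power of two)."""
    grid_dim = (len(data) + block_dim - 1) // block_dim
    partial = [0] * grid_dim  # one partial sum per block (stands in for global memory)

    def run_block(block_idx):
        shared = [0] * block_dim                # per-block shared memory (emulated)
        barrier = threading.Barrier(block_dim)  # per-block barrier (emulated)

        def thread_body(tid):
            gid = block_idx * block_dim + tid   # global index from block and thread ids
            shared[tid] = data[gid] if gid < len(data) else 0
            barrier.wait()                      # all loads finish before reducing
            stride = block_dim // 2
            while stride > 0:
                if tid < stride:
                    shared[tid] += shared[tid + stride]
                barrier.wait()                  # every thread waits at each step
                stride //= 2
            if tid == 0:
                partial[block_idx] = shared[0]

        threads = [threading.Thread(target=thread_body, args=(t,))
                   for t in range(block_dim)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    # Blocks are independent of each other, so they may run in any order.
    for b in range(grid_dim):
        run_block(b)
    return sum(partial)
```

Because blocks share no state with one another, the same decomposition can in principle be spread over however many cores are available, which is the sense in which the excerpt describes the model as scaling transparently.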

CUDA

Aug, 9

Cache Miss Analysis for GPU Programs Based on Stack Distance Profile

Using the graphics processing unit (GPU) to accelerate general-purpose computation has attracted much attention from both academia and industry due to the GPU’s powerful computing capacity. Thus, optimization of GPU programs has become a popular research direction. In order to support general-purpose computing more efficiently, the GPU has integrated the general data […]
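For readers unfamiliar with the term in the title, the stack distance of a memory access is the number of distinct addresses touched since the previous access to the same address. The sketch below is a generic textbook formulation in Python, not the paper's method (the excerpt is truncated before any method is described); the function names and the fully associative LRU cache model are assumptions for illustration.

```python
def stack_distances(trace):
    """Return the stack distance of each access, or None for a first-time access."""
    stack = []       # addresses ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            # Count the distinct addresses used more recently than addr.
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(None)  # cold (compulsory) miss
        stack.append(addr)
    return distances


def lru_misses(trace, capacity):
    """Misses of a fully associative LRU cache holding `capacity` lines."""
    return sum(1 for d in stack_distances(trace) if d is None or d >= capacity)
```

One profile of distances answers the miss count for every cache capacity at once: an access hits exactly when its distance is smaller than the capacity.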

Aug, 9

Matrix Multiplication on GPUs with On-Line Fault Tolerance

Commercial graphics processing units (GPUs) have proven attractive and inexpensive for high-performance scientific applications. However, recent research through Folding@home demonstrates that two-thirds of the tested GPUs exhibit a detectable, pattern-sensitive rate of memory soft errors for GPGPU. Fault tolerance has been viewed as critical to the effective use of these GPUs. In this […]
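The excerpt is cut off before the paper's scheme is described, so the following is not a reconstruction of it. As background, one classic way to make matrix multiplication tolerate soft errors is checksum-based algorithm-based fault tolerance: append column sums to A and row sums to B, multiply, and check that the result's checksums still agree. The Python sketch below assumes exact integer arithmetic and a single corrupted element; `abft_matmul` and its `inject` parameter are hypothetical names used only for this illustration.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def abft_matmul(a, b, inject=None):
    """Multiply with checksums; detect and correct one corrupted result element.

    `inject` is an optional (row, col, delta) used here to simulate a soft error.
    """
    a_c = a + [[sum(col) for col in zip(*a)]]  # extra row: column sums of A
    b_r = [row + [sum(row)] for row in b]      # extra column: row sums of B
    c = matmul(a_c, b_r)                       # result carries both checksums
    if inject is not None:
        i, j, delta = inject
        c[i][j] += delta                       # simulated fault
    n, m = len(a), len(b[0])
    bad_rows = [i for i in range(n) if sum(c[i][:m]) != c[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c[i][j] for i in range(n)) != c[n][j]]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        c[i][j] -= sum(c[i][:m]) - c[i][m]     # subtract the row discrepancy
    return [row[:m] for row in c[:n]], bad_rows, bad_cols
```

With floating-point data the equality checks would need a tolerance, since rounding makes the checksums differ slightly even without a fault.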

Aug, 9

Optimization of parallel Genetic Algorithms for nVidia GPUs

Led by General Purpose computing over Graphical Processing Units (GPGPUs), the parallel computing area is witnessing a rapid change in dominant parallel systems. A major hurdle in this switch is the Single Instruction Multiple Thread (SIMT) architecture of GPUs which is usually not suitable for the design of legacy parallel algorithms. Genetic Algorithms (GAs) is […]

Aug, 9

In-process optical characterization method for sub-100-nm nanostructures

Optical measurements based on laser light scattering by nanostructures provide fast and contactless measurement of the surface of nanostructures for defects. In this paper, a novel in-process measurement method based on coherent laser light scattering by sub-100-nm structures is presented. It is shown that nanostructure defects can be identified by their unique scattering pattern. This […]

Aug, 8

High-Performance Diagnostic Fault Simulation on GPUs

In this paper, we present an efficient diagnostic fault simulator based on a state-of-the-art graphics processing unit (GPU). Diagnostic fault simulation plays an important role in identifying and locating the causes of circuit failures. However, today’s complex VLSI circuits pose ever higher computational demands for such simulators. Our GPU-based diagnostic fault simulator (GDSim) is […]

Aug, 8

Performance Comparison with OpenMP Parallelization for Multi-core Systems

Today, multi-core processors hold a growing share of the market, and programmers must face the disruption brought by the multi-core revolution. Semiconductor scaling limits and the associated power and thermal challenges limit performance growth for single-core microprocessors. This has led many microprocessor vendors to turn instead to multi-core chip organizations. […]

CUDA • OpenCL

Aug, 8

GPU Computing in EGI Environment Using a Cloud Approach

Recently, GPU computing, namely the possibility of using the vector processors of graphics cards as general-purpose computational units in High Performance Computing environments, has generated considerable interest in the scientific community. Some communities in the European Grid Infrastructure (EGI) are reshaping their applications to exploit this new programming paradigm. Each EGI community, called Virtual Organization […]

Aug, 8

AES finalists implementation for GPU and multi-core CPU based on OpenCL

Thanks to OpenCL (Open Computing Language), applications can be easily ported among different GPUs, multi-core CPUs, and other processors. In this paper, we present an implementation of the AES finalists (Rijndael, Serpent, Twofish) in XTS mode, based on OpenCL. Benchmark testing is performed on 4 mainstream GPUs and multi-core CPUs. The results are also compared with […]

CUDA • OpenCL

