high performance computing on graphics processing units: hgpu.org

Posts

Jun, 6

Parallel centerline extraction on the GPU

Centerline extraction is important in a variety of visualization applications including shape analysis, geometry processing, and virtual endoscopy. Centerlines allow accurate measurements of length along winding tubular structures, assist automatic virtual navigation, and provide a path-planning system to control the movement and orientation of a virtual camera. However, efficiently computing centerlines with the desired accuracy […]

CUDA

Jun, 6

High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System

We have developed a graphics processing unit (GPU)-based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of the 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses the Broyden approach for updating the Jacobian matrix and thereby updating the […]

CUDA
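The abstract above mentions a Broyden approach for updating the Jacobian matrix. As a rough illustration only, here is the textbook rank-one ("good") Broyden update, J_new = J + ((dy - J dx) dx^T) / (dx^T dx), in plain Python. The function name `broyden_update` and the list-of-lists matrix layout are assumptions made for this sketch; the truncated abstract does not say which Broyden variant the authors use or how their GPU code is organized.

```python
def broyden_update(J, dx, dy):
    """Textbook rank-one Broyden update (illustrative; not the paper's code).

    J  : current Jacobian estimate, as a list of rows
    dx : change in the parameters between two iterations
    dy : observed change in the model output over the same step
    """
    denom = sum(v * v for v in dx)  # dx^T dx
    if denom == 0.0:
        raise ValueError("dx must be non-zero")
    # Change predicted by the current Jacobian: J @ dx.
    predicted = [sum(jij * dxj for jij, dxj in zip(row, dx)) for row in J]
    # Mismatch between the observed and the predicted change.
    residual = [dyi - pi for dyi, pi in zip(dy, predicted)]
    # Add the outer product (residual * dx^T) / (dx^T dx) to J.
    return [[jij + ri * dxj / denom for jij, dxj in zip(row, dx)]
            for row, ri in zip(J, residual)]


J_new = broyden_update([[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0], [3.0, 4.0])
# The updated matrix satisfies the secant condition J_new @ dx == dy.
secant = [sum(a * b for a, b in zip(row, [1.0, 2.0])) for row in J_new]
```

The appeal of such an update is that it reuses the existing Jacobian estimate instead of recomputing it from scratch at every iteration.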

Jun, 6

GPGPU opportunities for the LHCb trigger

This note presents arguments for studying the use of general-purpose graphics processing units to improve the performance of the LHCb trigger, describes the current developments in the integration into the Gaudi framework and the implementation of algorithms, and points towards possible R&D directions.

CUDA

Jun, 5

Performance models for CPU-GPU data transfers

Many GPU applications perform data transfers to and from GPU memory at regular intervals, for example because the data does not fit into GPU memory or because of inter-node communication at the end of each time step. Overlapping GPU computation with CPU-GPU communication can reduce the costs of moving data. Several different techniques exist […]

CUDA
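To make the idea of overlapping computation with CPU-GPU communication concrete, the following is a minimal two-stage pipeline estimate. It is not the performance model from the paper, whose abstract is truncated here. It assumes the work splits into `n_chunks` equal chunks, each needing `t_transfer` seconds to copy and `t_compute` seconds to process, and that one copy can run concurrently with one kernel. All names are hypothetical.

```python
def serial_time(n_chunks, t_transfer, t_compute):
    """No overlap: every chunk is copied, then processed, one after another."""
    return n_chunks * (t_transfer + t_compute)


def overlapped_time(n_chunks, t_transfer, t_compute):
    """Idealized two-stage pipeline: copy chunk i+1 while computing chunk i.

    The first copy and the last compute cannot be hidden; in between, each
    step costs as much as the slower of the two stages.
    """
    if n_chunks <= 0:
        return 0.0
    return t_transfer + t_compute + (n_chunks - 1) * max(t_transfer, t_compute)


no_overlap = serial_time(4, 2.0, 3.0)        # 4 * (2 + 3) = 20.0
with_overlap = overlapped_time(4, 2.0, 3.0)  # 2 + 3 + 3 * 3 = 14.0
```

Under these assumptions the benefit is largest when transfer and compute times per chunk are similar, and it shrinks as one stage dominates the other.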

Jun, 5

Real-time Model-based Articulated Object Pose Detection and Tracking with Variable Rigidity Constraints

We introduce a real-time system for recognizing and tracking the position and orientation of a large number of complex real-world objects, together with an articulated robotic manipulator operating upon them. The proposed system is fast, accurate and reliable and yet does not require precise camera calibration. The key to this high level of performance is […]

CUDA

•

OpenGL

Jun, 5

GPU Accelerated Nature Inspired Methods for Modelling Large Scale Bi-Directional Pedestrian Movement

Pedestrian movement, although ubiquitous and well-studied, is still not well understood due to the complicating nature of the embedded social dynamics. Interest among researchers in simulating the nature of pedestrian movement and interactions has grown significantly, in part due to increased computational and visualization capabilities afforded by high power computing. Different approaches have been […]

CUDA

Jun, 5

Mapping parallel programs to heterogeneous multi-core systems

Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-performance computing. They promise to deliver increased performance at lower energy cost than purely homogeneous, CPU-based systems. In recent years GPU-based heterogeneous systems have become increasingly popular. They combine a programmable GPU with a multi-core CPU. GPUs have become flexible enough to […]

OpenCL

Jun, 5

A Comparative Study of Game Tree Searching Methods

This paper presents a comprehensive survey of game tree searching methods that can be used to find the best move in two-player zero-sum computer games. The purpose of this paper is to discuss, compare, and analyze various sequential and parallel game tree algorithms, including some enhancements to them. Furthermore, a number of […]

CUDA

Jun, 5

Adventures in the microlensing cloud: large datasets, eResearch tools, and GPUs

As astronomy enters the petascale data era, astronomers are faced with new challenges relating to storage, access and management of data. A shift from the traditional approach of combining data and analysis at the desktop to the use of remote services, pushing the computation to the data, is now underway. In the field of cosmological […]

CUDA

Jun, 4

3rd International Conference on Communication and Electronics Information, ICCEI 2015

Submission Deadline: 2014-08-25 Publication: All ICCEI 2015 papers will be published in the International Journal of Information and Electronics Engineering (ISSN: 2010-3719), and all papers will be indexed by Google Scholar, EBSCO, Electronic Journals Library, Engineering & Technology Digital Library, Crossref and ProQuest, DOAJ, Ei (INSPEC, IET). Call for Paper: Electronics and Communications Engineering Telecommunication […]

Jun, 4

7th International Conference on Computer and Automation Engineering, ICCAE 2015

Submission Deadline: 2014-08-20 Publication: All papers for the ICCAE 2015 will be published in Journal of Automation and Control Engineering (JOACE) (ISSN: 2301-3702) as one volume, and will be indexed by EI (INSPEC, IET), Ulrich’s Periodicals Directory, Google Scholar, EBSCO, Engineering & Technology Digital Library and Electronic Journals Digital Library. Topic: T1: Modern and Advanced […]

Jun, 3

Utilizing Graphics Processing Units for Rapid Facial Recognition Using Video Input

Facial recognition is an active research area that provides a real-time application of pattern recognition techniques. Input can be provided to recognition algorithms using both static images and video data. However, there are significant challenges to working with live streaming data as the recognition method needs to keep up with the frame rate of the […]

CUDA

* * *

Recent source codes

Large Language Model Powered C-to-CUDA Code Translation: A Novel Auto-Parallelization Framework

GigaAPI: a user-space API that simplifies multi-GPU programming, bridging the gap between the capabilities of parallel GPU systems and the ability of developers to harness their full potential

Coccinelle: a C code transformation engine using SmPL for matches, refactorings, and bug fixing

DuoReduce: MLIR's benchmark

Shamrock: Multi-GPU hydrodynamics for astrophysics

LLMPerf: GPU Performance Modeling meets Large Language Models

Hercules: A Compiler for Productive Programming of Heterogeneous Systems

Celerity Runtime: High-level C++ for Accelerator Clusters

wgpy: WebGL accelerated numpy-compatible array library for web browser

Microbenchmarking OpenMP target offload with Catch2
