high performance computing on graphics processing units: hgpu.org

Posts

Dec, 17

OpenCL Accelerated Multi-GPU Cone-Beam Reconstruction

Volume reconstruction in cone-beam CT is a computationally demanding task. In recent years, reconstruction has been accelerated by utilizing Graphics Processing Units (GPUs). Frameworks for General Purpose Computations on GPUs are a proven tool to access the resources of graphics cards. With the Open Computing Language (OpenCL) the first open standard for cross-vendor and cross-platform programming […]

OpenCL

Dec, 17

High Performance Computing Image Analysis for Radiotherapy Planning

The Edinburgh Cancer Centre at the Western General Hospital in Edinburgh is researching image analysis for predicting radiation-induced lung fibrosis as part of a treatment plan. They are developing MATLAB code to analyse three-dimensional computed tomography (CT) images of patients but, because a standard three-dimensional CT image is […]

CUDA

Dec, 17

Parallel Firewalls on General-Purpose Graphics Processing Units

Firewalls use a rule database to decide which packets will be allowed from one network onto another, thereby implementing a security policy. In high-speed networks, as the inter-arrival rate of packets decreases, the latency incurred by a firewall increases. In such a scenario, a single firewall becomes a bottleneck and reduces the overall throughput of […]

CUDA

Dec, 16

Ray-Traced Collision Detection: Interpenetration Control and Multi-GPU Performance

We proposed [LGA13] an iterative ray-traced collision detection algorithm (IRTCD) that exploits spatial and temporal coherency and proved to be computationally efficient but at the price of some geometrical approximations that allow more interpenetration than needed. In this paper, we present two methods to efficiently control and reduce the interpenetration without noticeable computation overhead. The […]

OpenCL

Dec, 16

Exploiting parallel features of modern computer architectures in bioinformatics

The exponential growth in bioinformatics data generation and the stagnation of processor frequencies in modern processors stress the need for efficient implementations that fully exploit the parallel capabilities offered by modern computers. This thesis focuses on parallel algorithms and implementations for bioinformatics problems. Various types of parallelism are described and exploited. This thesis presents applications […]

CUDA

Dec, 16

A Cloud Computing Service Architecture of a Parallel Algorithm Oriented to Scientific Computing with CUDA and Monte Carlo

GPGPU (General Purpose Graphics Processing Units) has become a whole new area of research due to the fast development of GPU hardware and programming tools such as CUDA (Compute Unified Device Architecture). Here we have researched CUDA and its applications in the field of scientific computation, such as organ electronics. We propose […]

CUDA

Dec, 16

Parallel GPU Implementation of Hough Transform for Circles

The Hough transform is one of the most widely used algorithms in image processing. The major problems of the Hough transform are its time consumption and its heavy demand for computational resources. In this paper, we try to solve this problem by parallelizing the algorithm and implementing it on GPUs (Graphics Processing Units) using CUDA (Compute Unified […]

CUDA

Dec, 16

Finding the Force – Consistent Particle Seeding for Satellite Aerodynamics

When calculating satellite trajectories in low-earth orbit, engineers need to adequately estimate aerodynamic forces. But to this day, obtaining the drag acting on the complicated shapes of modern spacecraft suffers from many sources of error. While part of the problem is the uncertain density in the upper atmosphere, this work focuses on improving the modeling […]

CUDA

Dec, 15

A GPU Implementation of Dynamic Programming for the Optimal Polygon Triangulation

This paper presents a GPU (Graphics Processing Unit) implementation of dynamic programming for optimal polygon triangulation. GPUs can now be used for general-purpose parallel computation. Users can develop parallel programs running on GPUs using the programming architecture called CUDA (Compute Unified Device Architecture) provided by NVIDIA. The optimal polygon triangulation problem for a convex […]

CUDA

Dec, 15

A Distributed Approximation Algorithm for Mixed Packing-Covering Linear Programs

Mixed packing-covering linear programs capture a simple but expressive subclass of linear programs. They commonly arise as linear programming relaxations of a number of important combinatorial problems, including various network design and generalized matching problems. In this paper, we propose an efficient distributed approximation algorithm for solving mixed packing-covering problems which requires a poly-logarithmic number of […]

CUDA

Dec, 15

A Parallel Method for Impulsive Image Noise Removal on Hybrid CPU/GPU Systems

A parallel algorithm for image noise removal is proposed. The algorithm is based on the peer group concept and uses a fuzzy metric. An optimization study on the use of the CUDA platform to remove impulsive noise using this algorithm is presented. Moreover, an implementation of the algorithm on multi-core platforms using OpenMP is presented. Performance […]

CUDA

Dec, 15

Real time Multi-GPU-based Event Detection in High Definition Videos

Video processing algorithms are a very important tool for many applications in the computer vision domain, such as motion tracking, video indexing, robot navigation and event detection. However, with the new video standards, especially high definition, current implementations, even running on modern hardware, no longer meet the needs of real-time processing. In […]

CUDA

* * *

Recent source codes

Specx: Speculative task-based runtime system

Mutual-Supervised Learning for Sequential-to-Parallel Code Translation

KISim: Kubernetes Intelligent Scheduling Simulator

Hardware Compute Partitioning on NVIDIA GPUs for Composable Systems

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler
