high performance computing on graphics processing units: hgpu.org

Posts

Nov, 25

Fast Hough transform on GPUs: exploration of algorithm trade-offs

The Hough transform is a commonly used algorithm to detect lines and other features in images. It is robust to noise and occlusion, but has a large computational cost. This paper introduces two new implementations of the Hough transform for lines on a GPU. One focuses on minimizing processing time, while the other has an […]

CUDA

Nov, 25

Decryption-decompression of AES protected ZIP files on GPUs

AES is a strong encryption system, so decryption-decompression of AES-encrypted ZIP files requires very large computing power and techniques for reducing the password space. This makes implementing such techniques on common computing systems impractical. In [1], we reduced the original very large password search space to a much smaller one that surely contains […]

CUDA

Nov, 25

Heterogeneous Computing and Load Balancing Techniques for Monte Carlo Simulation in a Distributed Environment

CPU-GPU clusters have emerged as a dominant HPC platform, with three of the four fastest supercomputers in the world falling in this category. The reasons for the popularity of these environments include their cost-effectiveness and energy efficiency. The need for exploiting both the CPU and GPU on each node of such platforms has created […]

CUDA

Nov, 25

Coupler Design and Optimization by GPU-Accelerated DG-FEM

In this contribution we show how consumer graphics processors (GPUs), in conjunction with a suitable numerical method, can be used for the inexpensive design and optimization of small and medium-sized RF components. The underlying scheme is implemented in an open-source framework which is readily available. As an application example we present simulations of […]

CUDA

Nov, 25

High Rayleigh Number Mantle Convection on GPU

We implemented two- and three-dimensional Rayleigh-Bénard convection on NVIDIA GPUs using a 2nd-order finite difference method. By exploiting the massive parallelism of the GPU using both CUDA for C and optimized CUBLAS routines, we have run simulations on a single Fermi GPU at Rayleigh numbers up to 6×10^10 (on a mesh of 2000×4000 uniform grid […]

CUDA

Nov, 25

Fine-grained parallelization of a Vlasov-Poisson application on GPU

Understanding turbulent transport in magnetised plasmas is a subject of major importance to optimise experiments in tokamak fusion reactors. Also, simulations of fusion plasma consume a great amount of CPU time on today’s supercomputers. The Vlasov equation provides a useful framework to model such plasma. In this paper, we focus on the parallelization of a […]

CUDA

Nov, 24

Symphony: A Scheduler for Client-Server Applications on Coprocessor-based Heterogeneous Clusters

Coprocessors such as GPUs are increasingly being deployed in clusters to process scientific and compute-intensive jobs. In this work, we study whether GPU-based heterogeneous clusters can benefit client-server applications. Specifically, we consider the practical situation where multiple client-server applications share a heterogeneous cluster (multi-tenancy) and experience unpredictable variations in incoming client request rates, including steep […]

CUDA

Nov, 24

Automatic generation of heterogeneous spectrometers for radio astronomy

We have developed a software package to automatically generate spectrometers with minimal user input. Spectrometer design is often done by building the instrument from scratch. We have automated this design, creating a parameterized spectrometer that only requires a recompile to implement a change in specification. This spectrometer combines FPGAs and GPUs, doing coarse channelization on […]

CUDA

Nov, 24

An Explicit Algorithm for Porous Media Flow Simulation using GPUs

The proposed approach is aimed at implementation by explicit difference schemes having a simple structure. By analogy with the kinetically-consistent finite difference schemes and the quasi-gas dynamic system of equations [1,2], the classical model of slightly compressible fluid flows in porous media is modified taking into account the minimal scales of averaging on space […]

CUDA

Nov, 24

Parallelized agent-based simulation on CPU and graphics hardware for spatial and stochastic models in biology

The complexity of biological systems is enormous, even when considering a single cell where a multitude of highly parallel and intertwined processes take place on the molecular level. This paper focuses on the parallel simulation of signal transduction processes within a cell carried out solely on the graphics processing unit (GPU). Each signaling molecule is […]

CUDA

Nov, 24

Parallel fuzzy connected image segmentation on GPU

PURPOSE: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors […]

CUDA

Nov, 24

The K-Anonymity Approach in Preserving the Privacy of E-Services that Implement Data Mining

In this paper, we first describe the concept of k-anonymity and different approaches to its implementation, formalizing the main theoretical notions. Afterwards, we analyze, based on a practical example, how the k-anonymity approach applies to the data-mining process in order to protect the identity and privacy of the clients to whom the data refers. […]

CUDA
