high performance computing on graphics processing units: hgpu.org

Posts

Dec, 25

Bioinformatics Sequence Comparisons on Manycore Processors

Searching for similarities between sequences is a fundamental operation in bioinformatics, providing insight into biological functions as well as tools for high-throughput data. Algorithms are needed that can efficiently process billions of sequences. To look for approximate similarities, a common heuristic is to consider short words that appear exactly in both sequences, […]

OpenCL
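
The excerpt above mentions the common heuristic of looking for short words that occur exactly in both sequences. A minimal CPU-only sketch of that idea follows; it is not the paper's manycore implementation, and the function name `shared_kmers`, the word length `k`, and the sample sequences are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def shared_kmers(a: str, b: str, k: int) -> dict[str, list[tuple[int, int]]]:
    """Map each length-k word found in both a and b to its (pos_in_a, pos_in_b) pairs."""
    # Index every length-k word of the first sequence by its start positions.
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)

    # Scan the second sequence and record every exact word match (a "seed").
    hits = defaultdict(list)
    for j in range(len(b) - k + 1):
        word = b[j:j + k]
        for i in index.get(word, ()):
            hits[word].append((i, j))
    return dict(hits)


# Hypothetical toy sequences, chosen only to demonstrate the lookup.
seeds = shared_kmers("GATTACAGATT", "TTACAGG", k=4)
for word, positions in seeds.items():
    print(word, positions)
```

In this toy input every seed has the same offset between its two positions, which is the kind of signal a seed-based method would typically go on to examine with a more expensive alignment step.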

Dec, 23

Password Cracking in the Cloud

Cloud computing is a great resource for applications that require computing capacity for a short time but do not justify long-term investment in fixed capital. As a result, it can be used for many attacks, such as cracking passwords or keys, and other forms of brute-force attack that are computationally expensive but […]

CUDA

Dec, 23

Employing GPU Accelerators for Efficient Enforcement of Data Integrity in Outsourced Data

Cloud computing provides on-demand web-based software, middleware, and computing resources. It is a service-oriented model, and one of its services is Data as a Service (DaaS), also known as the Outsourced Database (ODB) model. Although DaaS solves the problem of storing terabytes of data, the security of the data is a major concern for all the […]

CUDA

Dec, 23

Toward GPU-accelerated Traffic Simulation and Its Real-Time Challenge

Traffic simulation is a growing domain of computational physics. Many everyday and industrial applications would benefit from traffic simulation to establish reliable transportation systems. A core challenge of this research, however, is its unbounded scale of computation. This paper explores the advantage of using the graphics processing unit (GPU) for this computational challenge. We […]

CUDA

Dec, 23

Coulomb, Landau and Maximally Abelian Gauge Fixing in Lattice QCD with Multi-GPUs

A lattice gauge theory framework for simulations on graphics processing units (GPUs) using NVIDIA’s CUDA is presented. The code comprises template classes that take care of an optimal data pattern to ensure coalesced reading from device memory to achieve maximum performance. In this work we concentrate on applications for lattice gauge fixing in 3+1 dimensional […]

CUDA

Dec, 23

Implementation of Motion Estimation Based on Heterogeneous Parallel Computing System with OpenCL

Heterogeneous computing systems increase the performance of parallel computing in many domains of general-purpose computing with CPUs, GPUs and other accelerators. Open Computing Language (OpenCL) is the first open, royalty-free standard for heterogeneous computing on multiple hardware platforms. In this paper, we propose a parallel Motion Estimation (ME) algorithm implemented using OpenCL and present […]

OpenCL

Dec, 21

Multicore and GPU Programming Models, Languages and Compilers Workshop, PLC 2013

Co-located with the 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2013). This workshop aims to bring the programming community together to explore and discuss various options to make programming heterogeneous systems less challenging and more interesting. The workshop seeks to explore programming methodologies in the form of directive-based approaches, language extensions, novel tools and […]

Dec, 20

KFusion: Obtaining Modularity and Performance with Regards to General Purpose GPU Computing and Co-processors

Concurrency has recently come to the forefront of computing as multi-core processors become more and more common. General-purpose graphics processing unit computing brings with it new language support for dealing with co-processor environments, such as OpenCL and CUDA. Programming language support for multi-core architectures introduces a fundamentally new mechanism for modularity – a kernel. […]

OpenCL

Dec, 20

A Parallelized Algorithm for Hyperspectral Biometrics

The parallelized algorithm for hyperspectral biometrics uses the processing power of a GPU (Graphics Processing Unit) to compare hyperspectral images of people’s faces. The feature extraction algorithm first retrieves uniquely identifiable features from raw hyperspectral data from 64 bands and creates both a database and individual target files. Using these files, the comparison algorithm written […]

CUDA

Dec, 20

Track finding in ATLAS using GPUs

The reconstruction and simulation of collision events is a major task in modern HEP experiments, involving several tens of thousands of standard CPUs. On the other hand, graphics processors (GPUs) have become much more powerful and by far outperform standard CPUs in terms of floating-point operations due to their massively parallel approach. […]

CUDA

Dec, 20

GPU Environmental Delegation of Agent Perceptions for MABS

Considering the digital simulation of complex systems, General-Purpose Computing on Graphics Processing Units (GPGPU) is a relevant approach for addressing scalability issues. However, GPU programming is a very specific approach that strongly limits both the accessibility and the re-usability of the frameworks developed using GPGPU. This paper presents our approach for the integration of GPU […]

CUDA

Dec, 20

GPUs: An Oasis in the Supercomputing Desert

A novel metric is introduced to compare the supercomputing resources available to academic researchers on a national basis. Data from the supercomputing Top 500 and the top 500 universities in the Academic Ranking of World Universities (ARWU) are combined to form the proposed "500/500" score for a given country. Australia scores poorly in the 500/500 […]

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations

microSYCL: SYCL micro-benchmarks repository

Exploring SYCL as a Portability Layer for High-Performance Computing on CPUs

Recent source codes

Efficient GPU Implementation of Multi-Precision Integer Division

ParEval: A Parallel Code Evaluation Benchmark

FlashSparse: Minimizing Computation Redundancy for Fast Sparse Matrix Multiplications on Tensor Cores

exa-AMD: Exascale Accelerated Materials Discovery

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

