high performance computing on graphics processing units: hgpu.org

Posts

Oct, 19

A Modular Framework for Deformation and Fracture using GPU Shaders

Advances in the graphical realism of modern video games have been achieved mainly through the development of the GPU (Graphics Processing Unit), providing a dedicated graphics co-processor and framebuffer. The most recent GPU’s are extremely capable and so flexible that it is now possible to implement a wide range of algorithms on graphics hardware that […]

Oct, 19

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation

The main goal of this paper is to discuss new & faster methods for testing the strength of security used in today’s wireless networks. This paper discusses probabilistic password generation methodsfor testing the real security of networks protected by WPA/WPA2 PSK (Pre-shared key) standard (also known as Personal mode). The main advantage of using those […]

CUDA

Oct, 19

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

The main advantage of a distributed computing system over standalone computer is an ability to share the workload between cores, processors and computers. In our paper we present a hybrid cluster system – a novel computing architecture with multi-core CPUs working together with many-core GPUs. It integrates two types of CPU, i.e., Intel and AMD […]

OpenCL

Oct, 19

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Computational inference of causal relationships underlying complex networks, such as gene-regulatory pathways, is NP-complete due to its combinatorial nature when permuting all possible interactions. Markov chain Monte Carlo (MCMC) has been introduced to sample only part of the combinations while still guaranteeing convergence and traversability, which therefore becomes widely used. However, MCMC is not able […]

CUDA

Oct, 18

Comparative Evaluation of Binary Features

Performance evaluation of salient features has a long-standing tradition in computer vision. In this paper, we fill the gap of evaluation for the recent wave of binary feature descriptors, which aim to provide robustness while achieving high computational efficiency. We use established metrics to embed our assessment into the body of existing evaluations, allowing us […]

Oct, 18

Computing Reachable Sets via Barrier Methods on SIMD Architectures

We consider the problem of computing reachable sets of ODE-based control systems parallely on CUDA hardware. To this end, we modify an existing algorithm based on solving optimal control problems. The idea is to simplify the optimal control problems to pure feasibility problems instead of minimizing an objective function. We show that an interior point […]

CUDA

Oct, 18

Strain Visualization of Ultra Sound Signals Processed by General Purpose Graphic Process Unit

Medical Imaging is a technique aimed to develop tools to improve medical diagnosis and procedures. Medical imaging applications use different types of signals to create images of the inside of the human body. It is a very important technique that decreases the complication rate during procedures and provides with more information for more accurate diagnosis. […]

CUDA

Oct, 18

Efficient 2D Software Rendering

The market of computer graphics is dominated by GPU based technologies. However today’s fast central processing units (CPU) based on modern architectural design offer new opportunities in the field of classical software rendering. Because the technological development of the GPU architecture almost reached the limits in the field of the programming model, the CPU-based solutions […]

OpenGL

Oct, 18

GPU-based Batched Spatial Query Processing on R-Trees

R-trees are popular spatial indexing techniques that have been widely used in many geospatial applications. The increasingly available Graphics Processing Units (GPUs) resources for general computing have attracted considerable research interests in applying the massive data parallel technologies to index and query geospatial data based on R-trees. In this paper, we investigate on the potentials […]

CUDA

Oct, 17

Platform-independent parallelization of the Lattice Boltzmann method with OpenCL

Simulations, like fluid dynamics, are very computationally intensive problems. Since the Lattice Boltzmann method uses a discrete grid of cells for simulating the flow, there are no dependencies between the single cells during the computation for one time step. Therefore, the computing can easily be done in parallel. During the last years, multi-CPU computers have […]

OpenCL

Oct, 17

Design and Performance Evaluation of a Software Framework for Multi-Physics Simulations on Heterogeneous Supercomputers

Despite the experience of several decades the numerical simulation of computational fluid dynamics is still an enormously challenging and active research field. Most simulation tasks of scientific and industrial relevance require the modeling of multiple physical effects, complex numerical algorithms, and have to be executed on supercomputers due to their high computational demands. Facing these […]

CUDA

Oct, 17

Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

This work focuses on compiler and run-time techniques for improving the productivity and the performance portability of general-purpose parallel programming. More specifically, we focus on shared-memory task-parallel languages, where the programmer explicitly exposes parallelism in the form of short tasks that may outnumber the cores by orders of magnitude. The compiler, the run-time, and the […]

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

* * *

high performance computing on graphics processing units: hgpu.org

Posts

A Modular Framework for Deformation and Fracture using GPU Shaders

Accelerated GPU Powered Methods for Auditing Security of Wireless Networks Using Probabilistic Password Generation

A Hybrid CPU/GPU Cluster for Encryption and Decryption of Large Amounts of Data

A Novel Learning Algorithm for Bayesian Network and Its Efficient Implementation on GPU

Comparative Evaluation of Binary Features

Computing Reachable Sets via Barrier Methods on SIMD Architectures

Strain Visualization of Ultra Sound Signals Processed by General Purpose Graphic Process Unit

Efficient 2D Software Rendering

GPU-based Batched Spatial Query Processing on R-Trees

Platform-independent parallelization of the Lattice Boltzmann method with OpenCL

Design and Performance Evaluation of a Software Framework for Multi-Physics Simulations on Heterogeneous Supercomputers

Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

SYCL Container

Most viewed papers (last 30 days)