high performance computing on graphics processing units: hgpu.org

Posts

Oct, 2

A state-of-the-art password strength analysis demonstrator

Due to recent developments (leaks of large lists of user passwords, e.g. LinkedIn; new probabilistic password cracking techniques; and the introduction of password cracking using GPUs), passwords can now be cracked faster than ever before. The leaked password lists have been analyzed by hackers and common patterns found inside the passwords are being exploited to […]

OpenCL

Oct, 2

GPU-powered Simulation Methodologies for Biological Systems

The study of biological systems has witnessed a pervasive cross-fertilization between experimental investigation and computational methods. This gave rise to the development of new methodologies able to tackle the complexity of biological systems in a quantitative manner. Computer algorithms make it possible to faithfully reproduce the dynamics of the corresponding biological system, and, at the price of a […]

CUDA

Oct, 2

A graphics processor-based intranuclear cascade and evaporation simulation

Monte Carlo simulations of the transport of protons in human tissue have been deployed on graphics processing units (GPUs) with impressive results. To provide a more complete treatment of non-elastic nuclear interactions in these simulations, we developed a fast intranuclear cascade-evaporation simulation for the GPU. This can be used to model non-elastic proton collisions on […]

CUDA

Oct, 1

GPU-Accelerated Real-Time Surveillance De-Weathering

A fully automatic de-weathering system to increase visibility and stability in surveillance applications during bad weather has been developed. Rain, snow and haze during daylight are handled in real time, accelerated by algorithms implemented in CUDA. Video from fixed cameras is processed on a PC with no need for special hardware except an NVIDIA GPU. The […]

CUDA

Oct, 1

Head Pose Tracking Using GPU Based Real-time 3D Registration

Head pose tracking is one of the important criteria for improving human-computer and human-robot interaction. With the improvement of low-cost consumer depth cameras, a lot of research attention has been drawn to 3D-based head pose estimation, which is more accurate and more robust to environmental conditions. […]

CUDA

Oct, 1

Optimizing RDF stores by coupling General-purpose Graphics Processing Units and Central Processing Units

From our experience in using RDF stores as a backend for social media streams, we pinpoint three shortcomings of current RDF stores in terms of aggregation speed, constraint checking and large-scale reasoning. Parallel algorithms are being proposed to scale reasoning on RDF graphs. However, the current efforts focus on the closure computation using High Performance […]

CUDA

Oct, 1

Optimizing Real Time GPU Kernels Using Fuzzy Inference System

CPU technology is slowly reaching its threshold; however, Moore’s Law still holds true for GPUs. With the increasing scope for GPGPU computing, more and more applications are being ported to the GPU framework. One of the application areas best suited to GPGPU computing is image processing and computer vision. The high performance given by GPUs […]

CUDA

Oct, 1

Template Library for Multi-GPU Pseudorandom Number Recursion-based Generators

The aim of the paper is to show how to design and implement fast parallel algorithms for Linear Congruential, Lagged Fibonacci and Wichmann-Hill pseudorandom number generators. The new algorithms employ the divide-and-conquer approach for solving linear recurrence systems. They are implemented on multi GPU-accelerated systems using CUDA. Numerical experiments performed on a computer system with […]

CUDA
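
The abstract above is truncated and does not show the paper's algorithms, so the following is only a minimal sketch of the general idea behind parallelizing a linear congruential generator: a k-step jump of the recurrence x_{n+1} = (a*x_n + c) mod m is itself an affine map, which can be built by repeated squaring and used to give each worker its own starting state. The function names and the example constants are assumptions chosen for illustration, not taken from the paper or its template library.

```python
def lcg_step(x, a, c, m):
    """One step of the recurrence x -> (a*x + c) mod m."""
    return (a * x + c) % m


def lcg_jump(a, c, m, k):
    """Return (A, C) such that k LCG steps equal x -> (A*x + C) mod m.

    Built by repeated squaring of the affine map, so it needs
    O(log k) multiplications instead of k sequential steps.
    """
    A, C = 1, 0              # identity map (zero steps)
    cur_a, cur_c = a, c      # map for 2**i steps, starting at i = 0
    while k > 0:
        if k & 1:
            # Compose: apply the accumulated map, then the 2**i-step map.
            A, C = (cur_a * A) % m, (cur_a * C + cur_c) % m
        # Square the 2**i-step map to get the 2**(i+1)-step map.
        cur_a, cur_c = (cur_a * cur_a) % m, (cur_a * cur_c + cur_c) % m
        k >>= 1
    return A, C


def lcg_blocks(seed, a, c, m, n_blocks, block_len):
    """Split the stream into n_blocks independent blocks of block_len values.

    Each block starts from a state obtained with the jump map, so the
    inner loops have no dependency on each other and could run on
    separate devices. Here they run sequentially for clarity.
    """
    A, C = lcg_jump(a, c, m, block_len)
    starts = [seed]
    for _ in range(n_blocks - 1):
        starts.append((A * starts[-1] + C) % m)

    blocks = []
    for start in starts:
        x = start
        block = []
        for _ in range(block_len):
            x = lcg_step(x, a, c, m)
            block.append(x)
        blocks.append(block)
    return blocks


def lcg_sequential(seed, a, c, m, n):
    """Reference implementation: n values generated one after another."""
    out = []
    x = seed
    for _ in range(n):
        x = lcg_step(x, a, c, m)
        out.append(x)
    return out
```

Concatenating the blocks from `lcg_blocks(seed, a, c, m, n_blocks, block_len)` should reproduce `lcg_sequential(seed, a, c, m, n_blocks * block_len)`, which is a convenient check before moving the per-block loop to a GPU kernel.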

Sep, 30

Exploring Programming Multi-GPUs using OpenMP & OpenACC-based Hybrid Model

Heterogeneous computing comes with tremendous potential and is a leading candidate for scientific applications that are becoming more and more complex. Accelerators such as GPUs, whose computing momentum is growing faster than ever, offer application performance when compute-intensive portions of an application are offloaded to them. It is quite evident that future computing architectures […]

Sep, 30

Compiling a High-level Directive-Based Programming Model for GPGPUs

OpenACC is an emerging directive-based programming model for programming accelerators that typically enables non-expert programmers to achieve portable and productive performance of their applications. In this paper, we present the research and development challenges, and our solutions, in creating an open-source OpenACC compiler in a mainstream compiler framework (OpenUH, a branch of Open64). […]

Sep, 30

Data-parallel Acceleration of PARSEC Black-Scholes Benchmark

The way programmers have relied on processor improvements to gain speedup in their applications is no longer applicable in the same fashion. Programmers usually have to parallelize their code to utilize the CPU cores in the system to gain a significant speedup. To accelerate parallel applications further, there are a couple of techniques available. […]

Sep, 30

Approximate dynamic programming with post-decision states as a solution method for dynamic economic models

I introduce and evaluate a new stochastic simulation method for dynamic economic models. It is based on recent work in the operations research and engineering literatures (Van Roy et al., 1997; Powell, 2007; Bertsekas, 2011). The baseline method involves rewriting the household’s dynamic program in terms of post-decision states. This makes it possible to choose […]

CUDA
