high performance computing on graphics processing units: hgpu.org

Posts

Jun, 8

Efficient Parallel Proximity Queries and an Application to Highly Complex Motion Planning Problems with Many Narrow Passages

In industrial manufacturing, such as in the automotive industry, digital mock-ups are used to design complex machinery with the help of computer systems. In this field, motion planning algorithms play an important role in ensuring the (de-)composability of the digital prototypes. In the last decades, sampling-based motion planning algorithms have shown themselves to be practical in this […]

CUDA

Jun, 8

Accelerated Dynamic Programming on GPU: A Study of Speed Up and Programming Approach

GPUs (graphics processing units) can be used for general-purpose parallel computation. Developers can write parallel programs for GPUs using different computing architectures such as CUDA or OpenCL. The Optimal Matrix Chain Multiplication problem is an optimization problem to find the optimal order for multiplying a chain of matrices. The optimal order of multiplication depends […]

CUDA

Jun, 8

Modernizing the core quantum chemistry algorithms

This document covers the basics of computational chemistry and how, using modern programming techniques, the theory can be efficiently implemented on digital computers. The computer implementations are developed from the core two-electron integrals to many-body and coupled cluster algorithms. Particular attention is paid to the physical constraints of the computer resources and the […]

CUDA

Jun, 7

How a Single Chip Causes Massive Power Bills. GPUSimPow: A GPGPU Power Simulator

Modern GPUs are true power houses in every meaning of the word: While they offer general-purpose (GPGPU) compute performance an order of magnitude higher than that of conventional CPUs, they have also been rapidly approaching the infamous "power wall", as a single chip sometimes consumes more than 300W. Thus, the design space of GPGPU microarchitecture […]

CUDA

Jun, 7

Genetic Programming using the Karva Gene Expression Language on Graphical Processing Units

Genetic Programming (GP) has been employed in many problem domains, and as a result, it has been the subject of much scientific inquiry. The extensive body of GP literature has reported applications in algorithm discovery, image enhancement and cooperative multi-agent systems, as well as many other areas and disciplines, such as agent-based modelling in Geography […]

CUDA

Jun, 7

Parallel Dynamic Solidification Model of Continuous Steel Casting on GPU

Nowadays, dynamic solidification models of continuously cast steel are commonly used in steelworks around the world to control the casting process and to monitor steel production. Moreover, these models of the transient temperature field can also be utilized for optimization of continuous casting, its on-line regulation, or may help operators to solve non-standard or breakdown […]

CUDA

Jun, 7

Scientific Computing on Hybrid Architectures

Modern computer architectures, with multicore CPUs and GPUs or other accelerators, make stronger demands than ever on writers of scientific code. Normally, the most efficient program has to be written, with substantial effort, by expert programmers for a certain application on a particular computer. This thesis deals with several algorithmic and technical […]

CUDA

Jun, 7

CUDA Based Performance Evaluation of the Computational Efficiency of the DCT Image Compression Technique on Both the CPU and GPU

Recent advances in computing, such as massively parallel GPUs (Graphical Processing Units), coupled with the need to store and deliver large quantities of digital data, especially images, have brought a number of challenges for computer scientists, the research community and other stakeholders. These challenges, such as prohibitively large costs to manipulate the digital data amongst […]

CUDA

Jun, 6

Parallel Implementation of Finite Element Codes using CUDA

The purpose of this work is to study the performance of parallel computation of the Finite Element Method using NVIDIA's CUDA. The numerical experiments are performed only on the stiffness matrix using the conjugate gradient method. In addition, the generalized minimal residual method is considered to solve the Stokes problem using both PETSc and CUDA. […]

CUDA

Jun, 6

Development of an explicit pressure-based unstructured solver for three-dimensional incompressible flows with graphics hardware acceleration

In this research, a numerical algorithm was developed to solve the incompressible Navier-Stokes equations using explicit time stepping. The goal of this research was to develop an unsteady SIMPLER-based algorithm with lower computational overhead. The new explicit algorithm uses a four-stage Runge-Kutta scheme to update the velocities and eliminates the need for the […]

CUDA

Jun, 6

Parallelization & checkpointing of GPU applications through program transformation

GPUs have emerged as a powerful tool for accelerating general-purpose applications. The availability of programming languages that make writing general-purpose applications for GPUs tractable has consolidated GPUs as an alternative for accelerating general-purpose applications. Among the areas that have benefited from GPU acceleration are: signal and image processing, computational fluid dynamics, quantum chemistry, […]

OpenCL

Jun, 6

A MapReduce Framework for Heterogeneous Computing Architectures

Nowadays, an increasing number of computational systems are equipped with heterogeneous compute resources, i.e., resources following different architectures. This applies to the level of a single chip, a single node and even supercomputers and large-scale clusters. With their impressive price-to-performance ratio as well as power efficiency compared to traditional multicore processors, graphics processing units (GPUs) have […]

CUDA, OpenCL

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluser of smartphones for edge computing application using TensorFlow

SYCL Container
