high performance computing on graphics processing units: hgpu.org

Posts

Jun, 23

GPU Implementation of the DP code

The main goal of this PRACE project was to evaluate how GPUs could speed up the DP code, a linear response TDDFT code. The code was profiled to identify computational bottlenecks to be delegated to the GPU. In order to speed up this code using GPUs, two different strategies have been […]

CUDA

Jun, 22

CUDA Enhanced Simulated Annealing for Chip Layout Problem

This paper introduces a parallel solution to the chip layout problem, implemented on the NVIDIA CUDA framework. The experiment allows for varying chip sizes, interconnecting signals, and three chip transformations: rotate, swap, and translate. Total signal distance is minimized as the system converges toward an optimal solution using simulated annealing. Lee’s maze routing […]

CUDA • OpenGL

Jun, 22

Exploring GPGPUs Workload Characteristics and Power Consumption

While general-purpose computing on GPUs continues to enjoy higher computing performance with every new generation, the high power consumption of GPUs is an increasingly important concern. To create power-efficient GPUs, it is important to thoroughly study their power consumption. The power consumption of GPUs varies significantly with workloads. Therefore, in this work we study […]

CUDA

Jun, 22

Virtualization and Migration with GPGPUs

Recently, cloud computing providers have started to offer virtual machines specifically for high performance computing as a service (HPCaaS). The cloud computing providers usually employ virtualization as an abstraction layer between the application software and the underlying hardware. Virtualization allows flexible migration between physical systems, which is a requirement for many load balancing techniques. In […]

CUDA

Jun, 21

GPU Optimized Code for Long Term Simulations of Beam-beam Effects in Colliders

We report on the development of a new code for long-term simulation of beam-beam effects in particle colliders. The underlying physical model relies on a matrix-based arbitrary-order symplectic particle tracking for beam transport and the Bassetti-Erskine approximation for the beam-beam interaction. The computations are accelerated through a parallel implementation on a hybrid GPU/CPU platform. With […]

CUDA

Jun, 21

Parallel Language Programming In Different Platforms

The need to speed up computing has sparked interest in exploring parallelism in algorithms and parallel programming. Technology is evolving fast, but the computing power of sequential execution is no longer increasing as quickly as it once did, while CPUs contain more and more parallel computing resources. However, parallel algorithms may not be able to exploit all the parallelism […]

CUDA • OpenCL

Jun, 21

Beam Dynamics Simulations with a GPU-accelerated Version of ELEGANT

Large scale beam dynamics simulations can derive significant benefit from efficient implementation of general-purpose particle tracking on GPUs. We present the latest results of our work on accelerating Argonne National Lab’s accelerator simulation code ELEGANT, using CUDA-enabled GPUs. We summarize the performance of beamline elements ported to GPU, and discuss optimization techniques for some core […]

CUDA

Jun, 21

Applying the “Simple Accelerator Modelling in MATLAB” (SAMM) Code to High Luminosity LHC Upgrade

The “Simple Accelerator Modelling in MATLAB” (SAMM) code is a set of MATLAB routines for modelling beam dynamics in high energy particle accelerators. It includes a set of CUDA codes that can be run on a graphics processing unit. These can be called from SAMM and can potentially give a significant increase in tracking speed. […]

CUDA

Jun, 21

A Numerical Study of Continuous Data Assimilation for the 2D-NS Equations Using Nodal Points

This thesis conducts a number of numerical experiments using massively parallel GPU computations to study a new continuous data assimilation algorithm. We test the algorithm on two-dimensional incompressible fluid flows given by the Navier-Stokes equations. In this context, observations of the Eulerian velocity field given at a finite resolution of nodal points in space may […]

CUDA

Jun, 21

libCudaOptimize: an Open Source Library of GPU-based Metaheuristics

Evolutionary Computation techniques and other metaheuristics have been increasingly used in recent years for solving many real-world tasks that can be formulated as optimization problems. Among their numerous strengths, a major one is their natural predisposition to parallelization. In this paper, we introduce libCudaOptimize, an open source library which implements some metaheuristics for continuous […]

CUDA

Jun, 21

CFMDS: CUDA-based fast multidimensional scaling for genome-scale data

BACKGROUND: Multidimensional scaling (MDS) is a widely used approach to dimensionality reduction. It has been applied to feature selection and visualization in various areas. Among diverse MDS methods, the classical MDS is a simple and theoretically sound solution for projecting data objects onto a low dimensional space while preserving the original distances among them as […]

CUDA

Jun, 21

Artificial Neural Network Simulation on CUDA

The advent of low-cost GPU hardware and user-friendly parallel programming APIs, such as NVIDIA CUDA, means that affordable, programmable, high-performance computing environments are now attainable for the development of scientific simulations. In this paper the authors present the MineHunter program, a parallel simulation of neural networks on NVIDIA CUDA. The simulation consists […]

CUDA

Recent source codes

WiLLM: An Open Wireless LLM Communication System

Vcc: the Vulkan Clang Compiler

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

microSYCL: SYCL micro-benchmarks repository

XaaS containers

CASS: Cuda-Amd aSSembly

Cluster of smartphones for edge computing application using TensorFlow

SYCL Container
