high performance computing on graphics processing units: hgpu.org

Engineering Supercomputing Platforms for Biomolecular Applications

Robert Welch, Charles Laughton, Oliver Henrich, Tom Burnley, Daniel Cole, Alan Real, Sarah Harris, James Gebbie-Rayet

Tags: AMD Radeon Instinct MI250X, AMD Radeon Instinct MI300X, ATI, Benchmarking, Biology, Biomolecules, Computational biology, CUDA, HPC, Molecular dynamics, nVidia, nVidia A100, nVidia GH200, nVidia H100, Package, Physics, ROCm, Tesla V100

June 22, 2025 by hgpu

The Shamrock code: I- Smoothed Particle Hydrodynamics on GPUs

Timothée David--Cléris, Guillaume Laibe, Yona Lapeyre

Tags: AMD, AMD Radeon Instinct MI250X, Astrophysics, CUDA, MPI, nVidia, nVidia A100, OpenMP, Package, Physics, PTX, ROCm, SYCL

March 23, 2025 by hgpu

CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads

Radostin Stoyanov, Viktória Spišaková, Jesus Ramos, Steven Gurfinkel, Andrei Vagin, Adrian Reber, Wesley Armour, Rodrigo Bruno

Tags: AMD Radeon Instinct MI210, ATI, Computer science, CUDA, Deep learning, nVidia, nVidia A100, nVidia H100, nVidia RTX A6000, Package, ROCm

March 3, 2025 by hgpu

hpcbench: A set of benchmarking utilities for biomolecular simulation tools

Engineering Supercomputing Platforms for Biomolecular Applications

HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration

chemtrain: Training Molecular Dynamics Potentials in JAX

chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations

microSYCL: SYCL micro-benchmarks repository

Exploring SYCL as a Portability Layer for High-Performance Computing on CPUs

XaaS containers

Acceleration as a Service (XaaS) Source Containers

CASS: Cuda-Amd aSSembly

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

Cluster of smartphones for edge computing application using TensorFlow

Low-cost edge computing using upcycled smartphones

SYCL Container

Exploring SYCL for batched kernels with memory allocations

Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration

Can Large Language Models Predict Parallel Code Performance?

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org