high performance computing on graphics processing units: hgpu.org

An implementation of tensor product patch smoothers on GPU

Cu Cui, Paul Grosse-Bley, Guido Kanschat, Robert Strzodka

Tags: CUDA, FEM, Finite element method, Mathematics, Numerical Analysis, nVidia, nVidia A100

June 2, 2024 by hgpu

Performant low-order matrix-free finite element kernels on GPU architectures

Randolph R. Settgast, Yohann Dudouit, Nicola Castelletto, William R. Tobin, Benjamin C. Corbett, Sergey Klevtsov

Tags: AMD Radeon Instinct MI250X, ATI, FEM, Finite element method, Mathematics, Numerical Analysis, nVidia, nVidia A100, nVidia V100, Package, Sparse

August 28, 2023 by hgpu

Explicit caching HYB: a new high-performance SpMV framework on GPGPU

Chong Chen

Tags: Computer science, CUDA, FEM, Finite element method, nVidia, Package, Sparse matrix, Tesla V100

April 17, 2022 by hgpu

Portable high-order finite element kernels I: Streaming Operations

Noel Chalmers, Tim Warburton

Tags: AMD, AMD Radeon Instinct MI60, AMD Radeon VII, Computer science, CUDA, FEM, HIP, Mathematical Software, nVidia, nVidia GeForce GTX Titan V, OCCA, Portability, Tesla V100

October 18, 2020 by hgpu

GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices

Utpal Kiran, Sachin Singh Gautam, Deepak Sharma

Tags: Computer science, CUDA, FEM, Finite element method, nVidia, Sparse matrix, Tesla K40

June 28, 2020 by hgpu

GPU Accelerated Finite Element Assembly with Runtime Compilation

Tao Cui, Xiaohu Guo, Hui Liu

Tags: Computer science, CUDA, Differential equations, FEM, Finite element method, Mathematical Software, Numerical Analysis, nVidia, Partial differential equations, PDEs, Symbolic Computation, Tesla K20, Tesla M2090, Tesla V100

February 15, 2018 by hgpu

Accelerated Sparse Matrix Operations in Nonlinear Least Squares Solvers

Lukas Polok

Tags: Algorithms, Computer science, Differential equations, Factorization, FEM, Finite element method, Linear Algebra, nVidia, nVidia GeForce GTX 680, OpenCL, Partial differential equations, PDEs, Sparse matrix, Tesla K40, Thesis

December 19, 2017 by hgpu

Acceleration of tensor-product operations for high-order finite element methods

Kasia Swirydowicz, Noel Chalmers, Ali Karakus, Timothy Warburton

Tags: Computer science, CUDA, FEM, Finite element method, Mathematical Software, Numerical Analysis, nVidia, Overview, Performance, Tesla P100

November 7, 2017 by hgpu

Simulating the Cardinal Movements of Childbirth Using Finite Element Analysis on the Graphics Processing Unit

Zelimkhan Gerikhanov

Tags: AMD Radeon HD 8970 M, ATI, Biology, Biomechanics, FEM, Finite element method, Medicine, MRI, OpenCL, Thesis

August 17, 2017 by hgpu

GPU performance analysis of a nodal discontinuous Galerkin method for acoustic and elastic models

Axel Modave, Amik St-Cyr, Tim Warburton

Tags: BLAS, Computational Physics, CUBLAS, CUDA, FEM, Finite element method, Geoscience, Linear Algebra, nVidia, nVidia GeForce GTX 980, OCCA, Physics, Profiling, Seismic modeling, Seismology

March 1, 2016 by hgpu

High-Performance Tensor Contractions for GPUs

A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, Tz. Kolev, I. Masliah, S. Tomov

Tags: Algorithms, Code generation, Computer science, CUDA, FEM, Finite element method, Linear Algebra, Matrix multiplication, nVidia, OpenMP, Tesla K40

January 29, 2016 by hgpu

Parallel Explicit FEM Algorithms Using GPU’s

Seyed Parsa Banihashemi

Tags: Algorithms, AMD Radeon R9 280X, ATI, ATI Radeon HD 7970, Computer science, FEM, Finite element method, nVidia, nVidia GeForce GTX Titan, OpenCL, Particle swarm optimization, Performance, Tesla M2090, Thesis

January 22, 2016 by hgpu
