high performance computing on graphics processing units: hgpu.org


Accuracy, Memory, and Speed Strategies in GPU-Based Finite-Element Matrix-Generation

Adam Dziekonski, Piotr Sypek, Adam Lamecki, Michal Mrozowski

Department of Microwave and Antenna Engineering, CUDA Research Center for Computational Electromagnetics, Gdansk University of Technology, Gdansk, Poland

IEEE Antennas and Wireless Propagation Letters, Volume: 11, Page(s): 1346-1349, 2012

DOI:10.1109/LAWP.2012.2227449

@article{dziekonski2012accuracy,
  title={Accuracy, Memory, and Speed Strategies in GPU-Based Finite-Element Matrix-Generation},
  author={Dziekonski, A. and Sypek, P. and Lamecki, A. and Mrozowski, M.},
  journal={Antennas and Wireless Propagation Letters, IEEE},
  volume={11},
  pages={1346--1349},
  year={2012},
  publisher={IEEE}
}


This letter presents strategies for optimizing graphics processing unit (GPU)-based finite-element matrix generation in the finite element method (FEM) with higher-order curvilinear elements. The goal of the optimization is to speed up the evaluation and assembly of large finite-element matrices on a single GPU while keeping the accuracy of numerical integration at the desired level. To this end, the choice of optimal Gaussian quadratures for curvilinear finite elements is discussed with respect to accuracy, memory usage, and runtime of numerical integration. Moreover, we show how to exploit the symmetry of local mass and stiffness matrices efficiently on a GPU in the numerical integration step. The performance results, obtained on a workstation equipped with one Tesla C2075, indicate that the proposed strategies retain the accuracy of computations, allow generation of larger sparse linear systems, and provide a 2.5-fold acceleration of GPU-based finite-element matrix generation.
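The two ideas named in the abstract, matching the quadrature order to the integrand and integrating only one triangle of a symmetric local matrix, can be illustrated with a small CPU sketch. This is not the authors' CUDA implementation: it assumes a one-dimensional quadratic Lagrange element on the reference interval [-1, 1] rather than higher-order curvilinear elements, and all names (`GAUSS_RULES`, `packed_index`, `local_mass_packed`, `unpack_symmetric`) are hypothetical.

```python
import math

# Gauss-Legendre rules on [-1, 1] as (points, weights).
# An n-point rule is exact for polynomials of degree <= 2n - 1.
GAUSS_RULES = {
    2: ([-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)],
        [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]),
}

N_BASIS = 3  # quadratic element: three shape functions


def shape_functions(xi):
    """Quadratic Lagrange basis on [-1, 1] with nodes at -1, 0, 1."""
    return [0.5 * xi * (xi - 1.0), 1.0 - xi * xi, 0.5 * xi * (xi + 1.0)]


def packed_index(i, j, n):
    """Offset of entry (i, j), i <= j, in row-major upper-triangle storage."""
    return i * n - i * (i - 1) // 2 + (j - i)


def local_mass_packed(num_points):
    """Integrate only the n(n+1)/2 upper-triangle entries of the mass matrix."""
    points, weights = GAUSS_RULES[num_points]
    packed = [0.0] * (N_BASIS * (N_BASIS + 1) // 2)
    for xi, w in zip(points, weights):
        phi = shape_functions(xi)
        for i in range(N_BASIS):
            for j in range(i, N_BASIS):  # skip the lower triangle
                packed[packed_index(i, j, N_BASIS)] += w * phi[i] * phi[j]
    return packed


def unpack_symmetric(packed, n):
    """Mirror the packed upper triangle into a full symmetric matrix."""
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            full[i][j] = full[j][i] = packed[packed_index(i, j, n)]
    return full


if __name__ == "__main__":
    exact = unpack_symmetric(local_mass_packed(3), N_BASIS)
    coarse = unpack_symmetric(local_mass_packed(2), N_BASIS)
    print("3-point rule:", exact)
    print("2-point rule:", coarse)
```

In this toy setting the integrand has degree 4, so the 3-point rule reproduces the mass matrix exactly while the 2-point rule does not, and packed storage holds 6 entries instead of 9. The same trade-off between quadrature size, memory, and accuracy is what the letter studies for curvilinear elements on a GPU, where the integrands and the achievable savings differ.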

Tags: CUDA, Electrodynamics, FEM, Finite element method, nVidia, Optimization, Tesla C2075

December 16, 2012 by hgpu

