Accelerating the FDTD Method Using SSE and Graphics Processing Units

Matthew J. Livesey

Faculty of Engineering and Physical Sciences, University of Manchester

University of Manchester, 2011

The Finite-Difference Time-Domain (FDTD) method is a computational technique for modelling the behaviour of electromagnetic waves in three-dimensional space. When applied to real-world problems, the FDTD method is characterised by long execution times and large amounts of data organised into matrices. Because the method exhibits ample data parallelism, parallel computing techniques are frequently used to reduce the execution time of FDTD computations. This project explores the opportunities for accelerating the FDTD method using two types of commonly available parallel hardware. First, the Streaming SIMD Extensions (SSE) capability of the x86 processor architecture is considered. SSE allows a single instruction to be applied to multiple data elements simultaneously, and this project investigates whether it can accelerate the FDTD method by performing the calculations for multiple matrix elements with a single instruction. Second, general-purpose computing on graphics processing units (GPUs) is considered. As the hardware used in GPUs has become less specialised, GPUs have come to resemble general-purpose, massively parallel processors. Typical modern GPUs have hundreds of processor cores which operate concurrently, offering the potential to greatly accelerate computations that expose sufficient parallelism. This project investigates whether GPU computing can accelerate the FDTD method in this way. Both hardware technologies are evaluated for the speedup they offer to the FDTD method, as well as for the complexity of their implementation.
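To make the SSE idea in the abstract concrete, the following is a minimal sketch in C and is not taken from the thesis. It assumes a simplified one-dimensional grid with single-precision fields, whereas the thesis addresses three-dimensional problems. The function names `update_h_scalar` and `update_h_sse`, the array names `hy` and `ez`, and the coefficient `ch` are hypothetical and chosen only for illustration. The sketch shows a magnetic-field update of the form `hy[i] += ch * (ez[i+1] - ez[i])`, written first as a scalar loop and then with SSE intrinsics that process four grid cells per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128 and the _mm_*_ps functions */

/* Scalar reference: hy[i] += ch * (ez[i+1] - ez[i]) for i = 0 .. n-2.
 * The 1D layout and all names are illustrative assumptions. */
void update_h_scalar(float *hy, const float *ez, size_t n, float ch)
{
    assert(n >= 1);
    for (size_t i = 0; i + 1 < n; ++i)
        hy[i] += ch * (ez[i + 1] - ez[i]);
}

/* SSE version of the same update: four cells per loop iteration. */
void update_h_sse(float *hy, const float *ez, size_t n, float ch)
{
    assert(n >= 1);
    const __m128 coeff = _mm_set1_ps(ch);  /* broadcast ch to all 4 lanes */
    size_t i = 0;

    /* Each iteration reads ez[i .. i+4] and updates hy[i .. i+3],
     * so it is only valid while i + 4 <= n - 1. */
    for (; i + 4 < n; i += 4) {
        __m128 e_lo = _mm_loadu_ps(ez + i);      /* ez[i .. i+3]   */
        __m128 e_hi = _mm_loadu_ps(ez + i + 1);  /* ez[i+1 .. i+4] */
        __m128 h    = _mm_loadu_ps(hy + i);      /* hy[i .. i+3]   */
        h = _mm_add_ps(h, _mm_mul_ps(coeff, _mm_sub_ps(e_hi, e_lo)));
        _mm_storeu_ps(hy + i, h);
    }

    /* Scalar tail for the cells that do not fill a whole SSE register. */
    for (; i + 1 < n; ++i)
        hy[i] += ch * (ez[i + 1] - ez[i]);
}
```

The sketch uses unaligned loads and stores (`_mm_loadu_ps`, `_mm_storeu_ps`) because the `ez + i + 1` access can never share the alignment of `ez + i`. Whether aligned accesses, shuffles, or a different data layout perform better in a full three-dimensional code is not something this example settles.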

Tags: Computer science, CUDA, Data parallelism, FDTD, Finite-difference time-domain, nVidia, Tesla M2050, Tesla T10, Thesis

March 28, 2012 by hgpu
