high performance computing on graphics processing units: hgpu.org

Spectral Method Characterization on FPGA and GPU Accelerators

Karl Pereira, Peter Athanas, Heshan Lin, Wu Feng

Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, USA

International Conference on ReConFigurable Computing and FPGAs (ReConFig’11), 2011

DOI:10.1109/ReConFig.2011.83

@inproceedings{pereira2011spectral,
  title={Spectral Method Characterization on FPGA and GPU Accelerators},
  author={Pereira, K. and Athanas, P. and Lin, H. and Feng, W.},
  booktitle={2011 International Conference on Reconfigurable Computing and FPGAs},
  pages={487--492},
  year={2011},
  organization={IEEE}
}

As CPU clock frequencies plateau and the doubling of CPU cores per processor exacerbates the memory wall, hybrid-core computing, which augments CPUs with FPGAs and/or GPUs, holds the promise of addressing high-performance computing demands, particularly with respect to performance, power, and productivity. This paper compares the sustained performance of a complex, single-precision, floating-point, 1D Fast Fourier Transform (FFT) implementation on state-of-the-art FPGA and GPU accelerators. As the results show, FPGA floating-point performance is highly sensitive to the mix of dedicated FPGA resources, in particular DSP48E slices, block RAMs, and FPGA I/O banks. Estimated results show that for the floating-point FFT benchmark on FPGAs, these resources are the performance-limiting factor. For fixed-point FFTs, however, FPGAs exploit a flexible data-path width to trade off circuit cost against speed of computation in applications requiring less precision, improving performance, power, and device utilization. GPUs cannot fully take advantage of this because they have a fixed data-width architecture.
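The precision side of the fixed-point trade-off mentioned above can be illustrated with a small software model. The following is a minimal sketch and not the paper's implementation: it runs a textbook radix-2 FFT and rounds every intermediate value to a grid of 2^-frac_bits, so the numerical error can be compared for narrower and wider data paths. The names `quantize`, `fft`, `max_error`, and `frac_bits` are hypothetical, and the model ignores overflow, word growth, and hardware cost.

```python
import cmath
import math


def quantize(z, frac_bits):
    """Round real and imaginary parts to multiples of 2**-frac_bits.

    frac_bits=None means no rounding (full double precision).
    This is an assumed, simplified fixed-point model.
    """
    if frac_bits is None:
        return z
    scale = 1 << frac_bits
    return complex(round(z.real * scale) / scale,
                   round(z.imag * scale) / scale)


def fft(x, frac_bits=None):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [quantize(x[0], frac_bits)]
    even = fft(x[0::2], frac_bits)
    odd = fft(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor, also stored at the reduced precision.
        w = quantize(cmath.exp(-2j * math.pi * k / n), frac_bits)
        t = w * odd[k]
        out[k] = quantize(even[k] + t, frac_bits)
        out[k + n // 2] = quantize(even[k] - t, frac_bits)
    return out


def max_error(n, frac_bits):
    """Largest deviation of the quantized FFT from the unquantized one."""
    x = [complex(math.sin(0.3 * i), math.cos(0.7 * i)) for i in range(n)]
    ref = fft(x)
    approx = fft(x, frac_bits)
    return max(abs(a - b) for a, b in zip(ref, approx))


if __name__ == "__main__":
    for bits in (8, 12, 16, 24):
        print(bits, max_error(64, bits))
```

Running the script prints the maximum error for several fractional widths. In this model the error shrinks as `frac_bits` grows, which is the precision side of the trade-off. The circuit-cost side depends on the target device and is not modeled here.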

Tags: Computer science, CUDA, DSP, FFT, FPGA, nVidia, nVidia GeForce GTX 280, Performance, Tesla C2050

January 15, 2012 by hgpu
