Parallelizing a high-order WENO scheme for complicated flow structures on GPU and MIC


Liang Deng, Fang Wang, Han-Li Bai, Qing-Xin Xu

Computational Aerodynamics Institute, China Aerodynamics Research Development Center, No. 278, west section of Jianmen Road, Fucheng District, Mianyang City, Sichuan Province, China

2015 International Conference on Computational Science and Engineering (ICCSE), 2015

DOI:10.2991/iccse-15.2015.86

@article{deng2015parallelizing,
  title={Parallelizing a high-order WENO scheme for complicated flow structures on GPU and MIC},
  author={DENG, Liang and WANG, Fang and BAI, Han-Li and XU, Qing-Xin},
  year={2015}
}


As conservative, high-order accurate, shock-capturing methods, weighted essentially non-oscillatory (WENO) schemes have been widely used to resolve complicated flow structures in computational fluid dynamics (CFD) simulations. However, a high-order WENO scheme can be highly time-consuming, which greatly limits the performance of CFD applications. In this paper, we present various parallelization strategies based on the latest many-core platforms, such as the NVIDIA Fermi GPU, the NVIDIA Kepler GPU, and the Intel MIC coprocessor, to accelerate a high-order WENO scheme. A comparison of the two GPU generations, Fermi and Kepler, and a cross-platform performance analysis (focusing on the Kepler GPU and MIC) are also discussed in detail. The experiments show that the Kepler GPU offers a clear advantage over the previous-generation Fermi GPU when running exactly the same source code. Furthermore, while the Kepler GPU can be several times faster than MIC when the SIMD computing power of the Vector Processing Unit (VPU) is not used, MIC can provide computing capability equivalent to the Kepler GPU when the VPU is utilized. Our implementations and optimization techniques can serve as case studies for parallelizing high-order schemes on many-core architectures.
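For readers unfamiliar with WENO, the following is a minimal sketch of a single reconstruction step. The abstract does not state which WENO variant or order the authors use, so the classic fifth-order Jiang-Shu formulation is assumed here purely for illustration; the function names `weno5_left` and `weno5_interfaces` are hypothetical and are not taken from the paper. Each interface value depends only on a five-point stencil, which is the kind of independence that would let a GPU thread or a SIMD lane handle one interface.

```python
# Illustrative sketch only: classic fifth-order WENO (Jiang-Shu) reconstruction.
# This is an assumed variant, not the authors' implementation.

EPS = 1e-6  # small constant that avoids division by zero in the weights


def weno5_left(fm2, fm1, f0, fp1, fp2):
    """Left-biased value at interface i+1/2 from cell averages f[i-2..i+2]."""
    # Third-order candidate reconstructions on the three sub-stencils.
    q0 = (2.0 * fm2 - 7.0 * fm1 + 11.0 * f0) / 6.0
    q1 = (-fm1 + 5.0 * f0 + 2.0 * fp1) / 6.0
    q2 = (2.0 * f0 + 5.0 * fp1 - fp2) / 6.0

    # Smoothness indicators: large where a sub-stencil crosses a discontinuity.
    b0 = 13.0 / 12.0 * (fm2 - 2.0 * fm1 + f0) ** 2 + 0.25 * (fm2 - 4.0 * fm1 + 3.0 * f0) ** 2
    b1 = 13.0 / 12.0 * (fm1 - 2.0 * f0 + fp1) ** 2 + 0.25 * (fm1 - fp1) ** 2
    b2 = 13.0 / 12.0 * (f0 - 2.0 * fp1 + fp2) ** 2 + 0.25 * (3.0 * f0 - 4.0 * fp1 + fp2) ** 2

    # Nonlinear weights built from the linear (optimal) weights 1/10, 6/10, 3/10.
    a0 = 0.1 / (EPS + b0) ** 2
    a1 = 0.6 / (EPS + b1) ** 2
    a2 = 0.3 / (EPS + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * q0 + a1 * q1 + a2 * q2) / total


def weno5_interfaces(f):
    """Reconstruct every interior interface; iterations are mutually independent."""
    return [
        weno5_left(f[i - 2], f[i - 1], f[i], f[i + 1], f[i + 2])
        for i in range(2, len(f) - 2)
    ]
```

In a parallel port, the loop in `weno5_interfaces` is the part that would be distributed across threads or vector lanes, since no iteration reads a value written by another.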

Tags: CUDA, Fluid dynamics, Intel Xeon Phi, nVidia, Tesla K20, Tesla M2050

August 18, 2015 by hgpu

