Parallel Processing of the Building-Cube Method on a GPU Platform

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, Kazuhiro Nakahashi
Cyberscience Center, Tohoku University, Sendai 980-8578, Japan
Computers & Fluids (06 January 2011)

@article{Komatsu2011,
   title={Parallel Processing of the Building-Cube Method on a GPU Platform},
   journal={Computers \& Fluids},
   volume={In Press},
   number={},
   pages={-},
   year={2011},
   note={},
   issn={0045-7930},
   doi={10.1016/j.compfluid.2010.12.019},
   url={http://www.sciencedirect.com/science/article/B6V26-51WD0G3-1/2/7a998c6fef2b8b626815c2016661f2b0},
   author={Kazuhiko Komatsu and Takashi Soga and Ryusuke Egawa and Hiroyuki Takizawa and Hiroaki Kobayashi and Shun Takahashi and Daisuke Sasaki and Kazuhiro Nakahashi},
   keywords={Building-Cube Method, GPGPU, Multiple GPUs}
}

The Building-Cube Method (BCM), which is based on equally spaced Cartesian meshes, has been proposed as a next-generation CFD method. Because its meshes are equally spaced, it is well suited to highly parallel computation. This paper proposes a parallel implementation scheme of BCM on a GPU cluster system, which requires efficient hierarchical parallel processing to exploit the potential of the cluster. The proposed scheme employs the Red-Black SOR method for the pressure calculations, which are the most time-consuming part of BCM, to expose massive data parallelism. By exploiting both the coarse-grain and the fine-grain parallelism of BCM, the proposed scheme hierarchically assigns equally divided tasks across the GPU cluster system. Furthermore, to exploit the computational power of the GPUs in the cluster, the proposed scheme employs efficient data management techniques such as coalesced data transfers and data reuse in on-chip memory. Experimental results show that the single-GPU implementation achieves about three times higher performance than the single-CPU one. Moreover, the multiple-GPU implementation achieves almost ideal scalability. Finally, the possibility of further accelerating not only the pressure calculation but also the whole of BCM is discussed.
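
To illustrate why Red-Black SOR exposes data parallelism, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes a 2D Poisson problem on a single square grid with zero boundary values, whereas the paper works with BCM cubes on GPUs. The names `red_black_sor` and `max_residual`, the relaxation factor, and the sweep count are all hypothetical choices made for this example. The key point is that every "red" cell (where `i + j` is even) depends only on "black" neighbours and vice versa, so all cells of one colour could be updated concurrently.

```python
def red_black_sor(f, h, omega=1.5, sweeps=200):
    """Approximately solve laplacian(p) = f on a square grid, p = 0 on the boundary.

    f      : n x n list of lists holding the right-hand side
    h      : grid spacing (equal in both directions)
    omega  : SOR relaxation factor, 1 < omega < 2 (assumed value, not from the paper)
    sweeps : number of full red+black sweeps (assumed value)
    """
    n = len(f)
    p = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = red cells, 1 = black cells
            for i in range(1, n - 1):
                # First interior column j on this row with (i + j) % 2 == colour.
                start = 1 + ((i + 1 + colour) % 2)
                for j in range(start, n - 1, 2):
                    # Gauss-Seidel value from the four neighbours, which all
                    # have the opposite colour and are not modified in this pass.
                    gs = 0.25 * (p[i - 1][j] + p[i + 1][j]
                                 + p[i][j - 1] + p[i][j + 1]
                                 - h * h * f[i][j])
                    # Over-relax towards the Gauss-Seidel value.
                    p[i][j] += omega * (gs - p[i][j])
    return p


def max_residual(p, f, h):
    """Largest absolute residual of the 5-point discrete Poisson equation."""
    n = len(p)
    return max(
        abs((p[i - 1][j] + p[i + 1][j] + p[i][j - 1] + p[i][j + 1]
             - 4.0 * p[i][j]) / (h * h) - f[i][j])
        for i in range(1, n - 1)
        for j in range(1, n - 1)
    )
```

As a usage example, calling `red_black_sor(f, 1.0 / 16)` with a 17 x 17 grid `f` of ones and then checking `max_residual` shows the iteration converging. In a GPU version, the two inner loops of one colour pass would plausibly map to one kernel launch with one thread per cell, but that mapping is an assumption here rather than a description of the paper's code.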