Acceleration of a Full-scale Industrial CFD Application with OP2

Istvan Z. Reguly, Gihan R. Mudalige, Carlo Bertolli, Michael B. Giles, Adam Betts, Paul H. J. Kelly, David Radford
Oxford e-Research Centre, University of Oxford
arXiv:1403.7209 [cs.CE], (27 Mar 2014)

Keywords: Computer Science – Computational Engineering, Finance, and Science; Computer Science – Performance; C.4


Hydra is a full-scale industrial CFD application used for the design of turbomachinery at Rolls-Royce plc. It consists of over 300 parallel loops with a code base exceeding 50K lines and is capable of performing complex simulations over highly detailed unstructured mesh geometries. Unlike simpler structured-mesh applications, which achieve high speed-ups when accelerated by modern multi-core and many-core processor architectures, Hydra presents major challenges in data organization and movement that need to be overcome for continued high performance on emerging platforms. We present research in achieving this goal through the OP2 domain-specific high-level framework. OP2 targets the domain of unstructured mesh problems and follows the design of an active library, using source-to-source translation and compilation to generate multiple parallel implementations from a single high-level application source for execution on a range of back-end hardware platforms. We chart the conversion of Hydra from its original hand-tuned production version to one that utilizes OP2, and map out the key difficulties encountered in the process. To our knowledge, this research presents the first application of such a high-level framework to a full-scale production code. Specifically, we show (1) how different parallel implementations can be achieved with an active library framework, even for a highly complicated industrial application such as Hydra, and (2) how different optimizations targeting contrasting parallel architectures can be applied to the whole application, seamlessly, reducing developer effort and increasing code longevity. Performance results demonstrate that the runtime performance of the hand-tuned original production code can not only be matched, but also significantly improved upon on conventional processor systems. Additionally, we achieve further acceleration by exploiting many-core parallelism, particularly on GPU systems.
Our results provide evidence of how high-level frameworks such as OP2 enable portability across a wide range of contrasting platforms, and of their significant utility in achieving near-optimal performance without the intervention of the application programmer.
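
To make the notion of a parallel loop over an unstructured mesh more concrete, the following is a minimal sketch in plain Python. It is not OP2's API and is not taken from Hydra: the mesh, the kernel, and all names (`edge_to_node`, `flux_kernel`, `loop_over_edges`) are hypothetical and chosen only for illustration. It shows the general pattern the abstract alludes to, in which a per-element kernel reads and updates data on neighbouring elements indirectly through a connectivity mapping.

```python
# Illustrative sketch only: hypothetical names, not the OP2 API.

# A tiny hypothetical mesh: 4 nodes in a line, joined by 3 edges.
edge_to_node = [(0, 1), (1, 2), (2, 3)]  # mapping: edge -> its two end nodes
node_value = [1.0, 2.0, 4.0, 8.0]        # data defined on nodes
edge_weight = [0.5, 0.5, 0.5]            # data defined on edges


def flux_kernel(weight, left, right):
    """Per-edge kernel: return the contributions to the left and right node."""
    flux = weight * (right - left)
    return flux, -flux


def loop_over_edges(kernel, mapping, weights, values):
    """Apply `kernel` to every edge, accumulating results onto nodes.

    Node data is reached indirectly through `mapping`. Two edges may share
    a node, so a parallel version would have to order or colour the edges
    to avoid conflicting updates; this sketch simply runs sequentially.
    """
    residual = [0.0] * len(values)
    for edge, (n0, n1) in enumerate(mapping):
        d0, d1 = kernel(weights[edge], values[n0], values[n1])
        residual[n0] += d0
        residual[n1] += d1
    return residual


node_residual = loop_over_edges(flux_kernel, edge_to_node, edge_weight, node_value)
print(node_residual)
```

Separating the kernel from the loop that drives it is what allows a framework to substitute a different loop implementation for each target platform while the kernel source stays unchanged; in this sketch that substitution would amount to replacing `loop_over_edges`.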