Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators
Computer and Information Sciences Department, University of Delaware, Newark, Delaware, United States of America
PLoS ONE 9(1): e86484, 2014
@article{wang2014fast,
title={Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators},
author={Wang, Wei and Xu, Lifan and Cavazos, John and Huang, Howie H and Kay, Matthew},
journal={PLOS ONE},
volume={9},
number={1},
pages={e86484},
year={2014},
publisher={Public Library of Science}
}
Recent developments in modern computational accelerators such as Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as an Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than 150x speedup over the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation on GPUs ran at least 200x faster than the sequential implementation and 30x faster than a parallelized OpenMP implementation. An OpenMP implementation on the Intel MIC coprocessor provided speedups of 120x with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs.
Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media.
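To illustrate what "only a few OpenACC pragmas" can look like in practice, here is a minimal sketch in C. It is not the authors' code and does not implement the cardiac action potential model. It assumes a generic explicit 2D diffusion step (5-point Laplacian) as a stand-in for the spatial coupling term of a wave propagation simulation. The function name diffuse_step, the grid size NX x NY, and the parameters d_coef, dt, and dx are hypothetical names chosen for this example.

```c
#include <assert.h>

#define NX 64   /* hypothetical grid height */
#define NY 64   /* hypothetical grid width  */

/* One explicit Euler step of 2D diffusion on an NX x NY grid stored
 * row-major. Illustrative stand-in only, not the model from the paper.
 * Interior cells are updated; boundary cells of u_new are left untouched. */
static void diffuse_step(const double *restrict u, double *restrict u_new,
                         double d_coef, double dt, double dx)
{
    /* The stencil reads neighbours from u, so an in-place update is invalid. */
    assert(u != u_new);

    const double k = d_coef * dt / (dx * dx);

    /* The single pragma below is the only OpenACC-specific line.
     * A compiler without OpenACC support ignores it and runs the
     * loops sequentially, so the same source serves both cases. */
    #pragma acc parallel loop collapse(2) copyin(u[0:NX*NY]) copy(u_new[0:NX*NY])
    for (int i = 1; i < NX - 1; i++) {
        for (int j = 1; j < NY - 1; j++) {
            int c = i * NY + j;
            double lap = u[c - NY] + u[c + NY] + u[c - 1] + u[c + 1]
                       - 4.0 * u[c];
            u_new[c] = u[c] + k * lap;
        }
    }
}
```

A caller would allocate two NX*NY arrays, call diffuse_step once per time step, and swap the two pointers between steps. In a real simulation the per-step copyin/copy clauses would likely be replaced by an enclosing data region so the arrays stay on the accelerator across time steps. That choice is an assumption about good practice, not something stated in the abstract.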
February 4, 2014 by hgpu