Embedded Ensemble Propagation for Improving Performance, Portability and Scalability of Uncertainty Quantification on Emerging Computational Architectures



E. Phipps, M. D’Elia, H. C. Edwards, M. Hoemmen, J. Hu, S. Rajamanickam

Center for Computing Research, Sandia National Laboratories, Albuquerque, NM and Livermore, CA

arXiv:1511.03703 [cs.MS], (11 Nov 2015)

@article{phipps2015embedded,
  title={Embedded Ensemble Propagation for Improving Performance, Portability and Scalability of Uncertainty Quantification on Emerging Computational Architectures},
  author={Phipps, E. and D’Elia, M. and Edwards, H. C. and Hoemmen, M. and Hu, J. and Rajamanickam, S.},
  year={2015},
  month={nov},
  archivePrefix={arXiv},
  primaryClass={cs.MS}
}


Quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).
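The abstract describes the template-based technique only in words. As a rough illustration, the sketch below shows one way the idea could look in plain C++: a simulation kernel is templated on its scalar type, and a small fixed-size ensemble type replaces `double` so that several samples advance through the same code path together. The names `Ensemble` and `decay` and the toy decay model are assumptions made up for this example; they are not the Trilinos interface or the authors' implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical ensemble scalar (illustration only, not a Trilinos type):
// N sample values stored contiguously and combined element-wise.
template <typename T, std::size_t N>
struct Ensemble {
  std::array<T, N> v{};

  Ensemble() = default;
  Ensemble(T c) { v.fill(c); }  // broadcast one value to every sample

  T& operator[](std::size_t i) { return v[i]; }
  const T& operator[](std::size_t i) const { return v[i]; }

  // Hidden friends: each operator is a short loop over the samples,
  // which gives the compiler a contiguous, vectorizable inner loop.
  friend Ensemble operator-(const Ensemble& a, const Ensemble& b) {
    Ensemble r;
    for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] - b.v[i];
    return r;
  }
  friend Ensemble operator*(const Ensemble& a, const Ensemble& b) {
    Ensemble r;
    for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] * b.v[i];
    return r;
  }
};

// Toy "simulation" templated on the scalar type: explicit Euler steps for
// du/dt = -k u with u(0) = 1. The time-step loop and its control flow are
// shared by all samples when Scalar is an ensemble.
template <typename Scalar>
Scalar decay(Scalar k, int steps, double dt) {
  Scalar u(1.0);
  for (int s = 0; s < steps; ++s) {
    u = u - Scalar(dt) * k * u;
  }
  return u;
}

// Usage check: propagate four samples of the uncertain rate k at once and
// compare against four separate scalar runs of the same kernel.
inline void run_example() {
  Ensemble<double, 4> k;
  k[0] = 0.5; k[1] = 1.0; k[2] = 1.5; k[3] = 2.0;

  const Ensemble<double, 4> u = decay(k, 10, 0.01);
  for (std::size_t i = 0; i < 4; ++i) {
    const double ref = decay(k[i], 10, 0.01);
    assert(std::fabs(u[i] - ref) < 1e-12);
  }
}
```

Calling `run_example()` from a `main` function exercises both instantiations. The design point this is meant to convey is that the kernel source is written once, and swapping the scalar type moves the loop over samples to the innermost level, where sample values sit next to each other in memory.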

Tags: Computer science, CUDA, Differential equations, nVidia, OpenMP, Partial differential equations, PDEs, Tesla K20

November 24, 2015 by hgpu


