Large-Scale Compute-Intensive Analysis via a Combined In-Situ and Co-Scheduling Workflow Approach
CCS-7, Los Alamos National Laboratory, Los Alamos, NM 87545
International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’15), 2015
@inproceedings{Sewell:2015b,
  author    = {Sewell, Christopher and Heitmann, Katrin and Finkel, Hal and Zagaris, George and Parete-Koon, Suzanne and Fasel, Patricia and Pope, Adrian and Frontiere, Nicholas and Lo, Li-Ta and Messer, Bronson and Habib, Salman and Ahrens, James},
  title     = {Large-Scale Compute-Intensive Analysis via a Combined In-situ and Co-scheduling Workflow Approach},
  booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
  series    = {SC '15},
  year      = {2015},
  location  = {Austin, Texas},
  numpages  = {11},
  publisher = {IEEE Press},
  address   = {Piscataway, NJ, USA}
}
Large-scale simulations can produce hundreds of terabytes to petabytes of data, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Emerging techniques instead perform the analysis in situ, utilizing the same resources as the simulation, and/or off-load subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in-situ and co-scheduling approaches for handling petabyte-scale outputs. We compare different analysis setups ranging from purely off-line, to purely in-situ, to in-situ/co-scheduling. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm to simultaneously target a variety of GPU, multi-core, and many-core architectures.
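To illustrate the portable data-parallel style the abstract describes, here is a minimal sketch (not from the paper) of an analysis kernel written once against Thrust, the data-parallel library PISTON is layered on. The KineticEnergy functor and the placeholder velocity data are hypothetical stand-ins for a real analysis routine; the point is that a single transform_reduce expression can be retargeted at compile time, much as PISTON/VTK-m let one algorithm implementation run on GPU, multi-core, and many-core hardware.

#include <thrust/device_vector.h>
#include <thrust/transform_reduce.h>
#include <thrust/functional.h>
#include <cstdio>

// Hypothetical per-particle quantity standing in for a real analysis step.
struct KineticEnergy {
    __host__ __device__
    float operator()(float v) const { return 0.5f * v * v; }
};

int main() {
    // Placeholder data; a real workflow would ingest simulation particle arrays.
    thrust::device_vector<float> velocity(1 << 20, 2.0f);

    // The analysis is expressed once as data-parallel primitives; the Thrust
    // backend selected at compile time decides whether this reduction runs on
    // a GPU or on multicore CPUs.
    float total = thrust::transform_reduce(velocity.begin(), velocity.end(),
                                           KineticEnergy(), 0.0f,
                                           thrust::plus<float>());

    std::printf("total kinetic energy: %f\n", total);
    return 0;
}

Compiled with nvcc, the reduction runs on the GPU; compiled with g++ -fopenmp -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_OMP, the same source uses a multicore OpenMP backend. This write-once, target-many property is what the abstract attributes to the PISTON/VTK-m framework.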
December 12, 2015 by hgpu