A Reproducible Research Methodology for Designing and Conducting Faithful Simulations of Dynamic HPC Applications

Luka Stanisic
LIG – Laboratoire d’Informatique de Grenoble
tel-01248109 (21 January 2016)


   title={A Reproducible Research Methodology for Designing and Conducting Faithful Simulations of Dynamic HPC Applications},

   author={Stanisic, Luka},


   school={Universit{‘e} Grenoble Alpes}


Download Download (PDF)   View View   Source Source   



The evolution of High-Performance Computing systems has taken asharp turn in the last decade. Due to the enormous energyconsumption of modern platforms, miniaturization and frequencyscaling of processors have reached a limit. The energy constraintshas forced hardware manufacturers to develop alternative computerarchitecture solutions in order to manage answering the ever-growingneed of performance imposed by the scientists and thesociety. However, efficiently programming such diversity ofplatforms and fully exploiting the potentials of the numerousdifferent resources they offer is extremely challenging. Thepreviously dominant trend for designing high performanceapplications, which was based on large monolithic codes offeringmany optimization opportunities, has thus become more and moredifficult to apply since implementing and maintaining such complexcodes is very difficult. Therefore, application developersincreasingly consider modular approaches and dynamic applicationexecutions. A popular approach is to implement the application at ahigh level independently of the hardware architecture as DirectedAcyclic Graphs of tasks, each task corresponding to carefullyoptimized computation kernels for each architecture. A runtimesystem can then be used to dynamically schedule those tasks on thedifferent computing resources.Developing such solutions and ensuring their good performance on awide range of setups is however very challenging. Due to the highcomplexity of the hardware, to the duration variability of theoperations performed on a machine and to the dynamic scheduling ofthe tasks, the application executions are non-deterministic and theperformance evaluation of such systems is extremelydifficult. Therefore, there is a definite need for systematic andreproducible methods for conducting such research as well asreliable performance evaluation techniques for studying thesecomplex systems.In this thesis, we show that it is possible to perform a clean,coherent, reproducible study, using simulation, of dynamic HPCapplications. We propose a unique workflow based on two well-knownand widely-used tools, Git and Org-mode, for conducting areproducible experimental research. This simple workflow allows forpragmatically addressing issues such as provenance tracking and dataanalysis replication. Our contribution to the performance evaluationof dynamic HPC applications consists in the design and validation ofa coarse-grain hybrid simulation/emulation of StarPU, a dynamictask-based runtime for hybrid architectures, over SimGrid, aversatile simulator for distributed systems. We present how thistool can achieve faithful performance predictions of nativeexecutions on a wide range of heterogeneous machines and for twodifferent classes of programs, dense and sparse linear algebraapplications, that are a good representative of the real scientificapplications.
No votes yet.
Please wait...

* * *

* * *

* * *

HGPU group © 2010-2022 hgpu.org

All rights belong to the respective authors

Contact us: