Locality optimization on a NUMA architecture for hybrid LU factorization

Adrien Remy, Marc Baboulin, Masha Sosonkina, Brigitte Rozoy
Inria and Universite Paris-Sud, France
hal-00957673 (10 March 2014)




   title={Locality optimization on a NUMA architecture for hybrid LU factorization},
   author={R{\'e}my, Adrien and Baboulin, Marc and Sosonkina, Masha and Rozoy, Brigitte},
   keywords={ccNUMA; thread placement; dense linear systems; LU factorization; MAGMA library},
   affiliation={Laboratoire de Recherche en Informatique – LRI, POSTALE – INRIA Saclay – Ile de France, Old Dominion University – ODU},
   type={Rapport de recherche},










We study the impact of non-uniform memory accesses (NUMA) on the solution of dense general linear systems using an LU factorization algorithm. In particular, we illustrate how an appropriate placement of the threads and memory on a NUMA architecture can improve the performance of the panel factorization and consequently accelerate the global LU factorization. We apply these placement strategies and present performance results for a hybrid multicore/GPU LU algorithm as it is implemented in the public domain library MAGMA.
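To make the idea of thread placement concrete, here is a minimal sketch of two generic ways to map threads to cores on a NUMA machine. It is an illustration only: the topology (4 nodes with 6 cores each, numbered contiguously), the function names, and the two policies ("compact" and "scatter") are assumptions made for this example and are not taken from the report, which may evaluate different strategies.

```python
from typing import List


def core_ids(num_nodes: int, cores_per_node: int) -> List[List[int]]:
    """Hypothetical topology: node k owns a contiguous range of core ids."""
    return [
        list(range(k * cores_per_node, (k + 1) * cores_per_node))
        for k in range(num_nodes)
    ]


def compact_placement(num_threads: int, topology: List[List[int]]) -> List[int]:
    """Fill one NUMA node completely before using the next one."""
    flat = [core for node in topology for core in node]
    if num_threads > len(flat):
        raise ValueError("more threads than cores")
    return flat[:num_threads]


def scatter_placement(num_threads: int, topology: List[List[int]]) -> List[int]:
    """Distribute threads round-robin over the NUMA nodes."""
    total = sum(len(node) for node in topology)
    if num_threads > total:
        raise ValueError("more threads than cores")
    num_nodes = len(topology)
    # Thread t goes to node (t mod num_nodes), in that node's next free slot.
    return [topology[t % num_nodes][t // num_nodes] for t in range(num_threads)]


def nodes_used(placement: List[int], cores_per_node: int) -> List[int]:
    """NUMA nodes touched by a placement (valid for the contiguous numbering above)."""
    return sorted({core // cores_per_node for core in placement})


topology = core_ids(num_nodes=4, cores_per_node=6)
compact = compact_placement(6, topology)   # all threads share one node's memory
scatter = scatter_placement(6, topology)   # threads spread over all four nodes
```

With six threads, the compact policy keeps every thread on node 0, so data allocated on that node is always local, while the scatter policy touches all four nodes and gains aggregate memory bandwidth at the cost of remote accesses. On Linux, each core id in such a list could be passed to `os.sched_setaffinity` from the corresponding thread to pin it, although that step is omitted here to keep the sketch portable.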
