
Accelerating linpack with CUDA on heterogenous clusters

Massimiliano Fatica
NVIDIA Corporation, Santa Clara, CA
In GPGPU-2: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units (2009), pp. 46-51.

@conference{fatica2009accelerating,
   title={Accelerating linpack with CUDA on heterogenous clusters},
   author={Fatica, M.},
   booktitle={Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units},
   pages={46--51},
   year={2009},
   organization={ACM}
}


This paper describes the use of CUDA to accelerate the Linpack benchmark on heterogeneous clusters, where CPUs and GPUs work together and the original source code needs minor or no modifications. A host library intercepts the calls to DGEMM and DTRSM and executes them simultaneously on both the GPUs and the CPU cores. An 8U cluster sustains more than a teraflop using a CUDA-accelerated version of HPL.
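The idea of running one DGEMM call on two kinds of processor at once can be illustrated with a small sketch. It is not the paper's library: it assumes the work is divided by splitting the columns of B and C, and it simulates both the "GPU" and the "CPU" with the same pure-Python routine running in two threads. The names `dgemm`, `cols`, `split_dgemm` and the `gpu_fraction` parameter are hypothetical and chosen only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def dgemm(alpha, A, B, beta, C):
    """Reference C_out = alpha*A*B + beta*C on row-major lists of lists."""
    m, k = len(A), len(B)
    n = len(B[0]) if k else 0
    return [
        [alpha * sum(A[i][p] * B[p][j] for p in range(k)) + beta * C[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def cols(M, lo, hi):
    """Return the column slice M[:, lo:hi] as a new matrix."""
    return [row[lo:hi] for row in M]


def split_dgemm(alpha, A, B, beta, C, gpu_fraction=0.75):
    """Split one DGEMM by columns of B and C and run both parts concurrently.

    Assumption: the first n_gpu columns stand in for the GPU share and the
    rest for the CPU share. Both shares are computed by the same Python
    routine here; a real interception library would hand the first share
    to a GPU BLAS and the second to a host BLAS.
    """
    n = len(B[0])
    n_gpu = int(round(n * gpu_fraction))
    with ThreadPoolExecutor(max_workers=2) as pool:
        gpu_part = pool.submit(dgemm, alpha, A, cols(B, 0, n_gpu),
                               beta, cols(C, 0, n_gpu))
        cpu_part = pool.submit(dgemm, alpha, A, cols(B, n_gpu, n),
                               beta, cols(C, n_gpu, n))
        left, right = gpu_part.result(), cpu_part.result()
    # Stitch the two column blocks back together, row by row.
    return [l + r for l, r in zip(left, right)]


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6, 7, 8], [9, 10, 11, 12]]
    C = [[0, 0, 0, 0], [0, 0, 0, 0]]
    print(split_dgemm(1, A, B, 0, C, gpu_fraction=0.5))
```

Because each column of the result depends only on A and the matching column of B, the two shares are independent and can run at the same time. In this sketch `gpu_fraction` is a fixed guess; in practice the split would be tuned to the relative speed of the two processors.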
