On testing GPU memory for hard and soft errors

Guochun Shi, Jeremy Enos, Michael Showerman, Volodymyr Kindratenko
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Proc. Symposium on Application Accelerators in High-Performance Computing – SAAHPC’09, 2009

@conference{shi2009testing,
   title={On testing GPU memory for hard and soft errors},
   author={Shi, G. and Enos, J. and Showerman, M. and Kindratenko, V.},
   booktitle={Proc. Symposium on Application Accelerators in High-Performance Computing},
   year={2009}
}

NVIDIA GPUs are becoming increasingly popular in scientific computation as a way to accelerate the execution of computationally demanding codes. The graphics memory used in GPUs is not protected against soft errors that may be caused by cosmic radiation, and thus is a source of concern for the scientific computing community. In this short paper, we report on an attempt to test GPU memory for both hard (permanent) errors, caused by manufacturing defects and prolonged use, and soft errors, caused by single radiation events. We present a new GPU memory test methodology and show results of error measurements on two large GPU clusters.