https://hgpu.org/?p=1048
A hardware redundancy and recovery mechanism for reliable scientific computation on graphics processors