High speed cipher cracking: the case of Keeloq on CUDA

Giovanni Agosta, Alessandro Barenghi, Gerardo Pelosi
Dipartimento di Elettronica e Informazione (DEI), Politecnico di Milano, Via G. Ponzio 34/5, 20133 Milan, Italy
3rd Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures (PARMA 2012), 2012


   title={High speed cipher cracking: the case of Keeloq on CUDA},

   author={Agosta, Giovanni and Barenghi, Alessandro and Pelosi, Gerardo},

   booktitle={3rd Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures (PARMA 2012), 2012},





Download Download (PDF)   View View   Source Source   



Graphic Processing Units (GPU) are increasingly popular in the field of high-performance computing for their ability to provide computational power for massively parallel problems at a reduced cost. However, the programming model exposed by the GPGPU software development tools is often insufficient to achieve full performance, and a major rethinking of algorithmic choices is needed. In this paper, we showcase such an effect on a case study drawn from the cryptography application domain. The pervasive use of cryptographic primitives in modern embedded systems is a growing trend. Small, efficient cryptosystems have been effectively employed to design and implement keyless password-based access control systems in various wireless authentication applications. The security margin provided by these lightweight ciphers should be accurately examined in light of the speed and area constraints imposed by the target environment. Research on this subject has lead to careful cryptanalyses of ciphers such as CRYPTO-1 (used for micropayment purposes) and KEELOQ (employed in vehicle keys). We present a re-design of the ASIC-oriented KEELOQ implementation to perform efficient exhaustive key search attacks while fitting tightly the parallel programming model exposed by modern GPUs. Indeed, the bitslicing technique allows the intrinsic parallelism offered by word-oriented SIMD computations to be effectively exploited. Through proper adaptation of the algorithm implementation to a platform radically different from the one it was designed for, we achieved a x40 speedup in the computation time w.r.t. a single-core CPU bruteforce attack, employing only consumer grade hardware. The outstanding speedup obtainable points to a significant weakening of the security margin of the cipher, since it proves that anyone with off-the-shelf hardware is able to circumvent the security measures in place.
No votes yet.
Please wait...

* * *

* * *

HGPU group © 2010-2020 hgpu.org

All rights belong to the respective authors

Contact us: