GMP implementation on CUDA – A Backward Compatible Design With Performance Tuning
Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto
University of Toronto, 2011
@article{liu2011gmp,
title={GMP implementation on CUDA – A Backward Compatible Design With Performance Tuning},
author={Liu, Hao Jun and Tong, Chu},
year={2011}
}
The goal of this project is to implement the GMP library in CUDA and evaluate its performance. GMP (GNU Multiple Precision) is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers. There is no practical limit to the precision except the one implied by the available memory of the machine GMP runs on. GMP has a rich set of functions, and the functions have a regular interface. The main target applications for GMP are cryptography applications and research, Internet security applications, algebra systems, computational algebra research, etc. GMP was carefully designed to be as fast as possible, both for small operands and for huge operands. The speed is achieved by using fullwords as the basic arithmetic type, by using fast algorithms, by providing highly optimized assembly code for the most common inner loops on many CPUs, and by a general emphasis on speed. [1] With the emergence of relatively cheap, high-performance vector processors, i.e. NVIDIA CUDA devices, we are able to implement some of the operations supported in GMP using CUDA while attempting to maintain a programming interface similar to that of the original library. The project implements addition, subtraction, multiplication, bit manipulation, and bitwise logic for integers. The function prototypes have been modified only where necessary, to preserve backward compatibility while maintaining the best performance of the library. We conducted extensive functional and performance tests on our library and present the results.
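To make "fullwords as the basic arithmetic type" concrete, here is a minimal sketch in plain C of adding two multi-word integers one word (limb) at a time. It is an illustration only and is not the authors' CUDA code or GMP's own implementation; the names `limb_t` and `limb_add_n` and the choice of 32-bit limbs are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch: one 32-bit word per limb. */
typedef uint32_t limb_t;

/* Add two n-limb numbers stored least significant limb first.
   Writes n limbs to rp and returns the carry out (0 or 1). */
static limb_t limb_add_n(limb_t *rp, const limb_t *ap,
                         const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    assert(rp != NULL && ap != NULL && bp != NULL);
    for (size_t i = 0; i < n; i++) {
        limb_t s  = ap[i] + bp[i];   /* wraps modulo 2^32 */
        limb_t c1 = s < ap[i];       /* carry produced by a + b */
        limb_t r  = s + carry;       /* add the incoming carry */
        limb_t c2 = r < s;           /* carry produced by that step */
        rp[i] = r;
        carry = c1 | c2;             /* at most one of c1, c2 is set */
    }
    return carry;
}
```

Each iteration depends on the carry from the previous one, so this loop is inherently sequential. A GPU version would need some other way of resolving carries across limbs; the abstract does not say which approach the authors took.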
February 7, 2012 by hgpu