Computing Spectral Transforms Used in Digital Logic on the GPU
Dept. of Computer Science, Faculty of Electronic Engineering, University of Nis, Nis, Serbia
Chapter 2 in book "GPU Computing with Applications in Digital Logic", 2012
@article{gajic2012computing,
title={Computing Spectral Transforms Used in Digital Logic on the GPU},
author={Gajic, D. and Stankovic, R.S.},
journal={GPU Computing with Applications in Digital Logic},
pages={25},
year={2012}
}
GPU computing originated in the opening of the graphics processing units (GPUs), which are devices intended to produce computer graphics, for general purpose computations. Since computer graphics is based on matrix operations, GPUs are purposely designed to implement such operations efficiently. Spectral transforms are defined in terms of sets of basis functions, which can be conveniently arranged as columns of certain matrices whose entries could be complex numbers, real numbers, integers, or elements of some finite fields. In this way, spectral transforms are defined by transform matrices and the determining spectra of discrete functions, which are conveniently defined by function vectors, reduces to matrix-vector operations. In many cases, basis functions and due to that also the related transform matrices, express some specific properties enabling us to derive fast computation algorithms by exploiting instruction parallelism. The matrix specification and the parallelism in related algorithms make computing the spectral transform a task well suited for implementation on GPUs. A direct mapping of existing fast algorithms, however, does not lead to implementations that take full advantage of all GPU features regarding computational power and memory bandwidth. Therefore, a suitable reformulation of the existing algorithms is necessary. Most of the related work is devoted to the implementation of the Fast Fourier Transform (FFT), that is an algorithm for efficient computation of the Discrete Fourier Transform (DFT), due to its numerous applications. Since computing of DFT involves complex number arithmetic, which in programming implementations doubles the computing requirements (for the real and the imaginary part), the advantages of using GPUs are impressive. In this chapter, we point out that a considerable speedup in computing can be achieved even in the case of integer valued transforms or Boolean valued transforms such as transforms that are used in digital logic for processing logic (Boolean) functions. The chapter discusses the implementation of the Walsh, the Reed-Muller, and the arithmetic transforms, as examples of the Kronecker product representable transforms, and the Haar transform, as an example of the layered-Kronecker transforms. In the reformulation of the algorithms to make them well suited for computation on GPUs, special attention is paid to the organization of the computations, impact of the integer and Boolean arithmetic, different structure of fast algorithms, memory transfers reduction, and related issues. Furthermore, the chapter presents a discussion of the implementation of two kinds of FFT-like algorithms, the Cooley-Tukey algorithms and the so-called algorithms with constant geometry. Performances of GPU implementations are compared with the classical C/C++ implementations on the Central Processing Unit (CPU). Experiments show that, even though the spectral transforms considered involve arithmetic over integers (in the case of the Walsh, the arithmetic, and the Haar transforms) and Boolean values in the case of the Reed-Muller transform, significant speedups can be achieved by implementing the algorithms in OpenCL and performing them on the GPU. As an illustration of the possible applications, the procedures developed for the Walsh transform are used to compute the correlation and autocorrelation of discrete functions in terms of Boolean variables viewed as functions on finite dyadic groups.
December 30, 2012 by hgpu