CPU/GPGPU/HW comparison of an Eigenfaces face recognition system
Departamento de Automatica, Ingenieria Electronica e Informatica Industrial, Escuela Tecnica Superior de Ingenieros Industriales, Universidad Politecnica de Madrid
Universidad Politecnica de Madrid, 2014
@thesis{mateo2014cpu,
title={CPU/GPGPU/HW comparison of an Eigenfaces face recognition system},
author={Mateo, Julio Camarero},
year={2014}
}
This Master's thesis establishes the specifications for developing a face recognition system on several platforms at once: MATLAB running on a personal computer, C code on an embedded microprocessor (MicroBlaze), a simpler reconfigurable hardware design for an FPGA-based platform, a flexible hardware design for higher performance, and finally C and CUDA code for a laptop and a personal computer. After studying the opportunities that recent solutions offer for the design of a novel face recognition system, the feasibility of the chosen algorithm was demonstrated with a first version built with MATLAB tools. The CUDA code, in turn, could be ported to most of the architectures tested throughout this project once the flow proposed in FastCUDA is fully developed and settled, which offers a strong opportunity for a one-effort-fits-all approach to SW-, GPU- and HW-based systems. In addition to the image processing task required for building a database of faces and a set of identities, two applications were developed to verify the main concepts of the theory behind the Eigenfaces algorithm. One software and two hardware versions of a reconfigurable face recognition system were implemented on a real FPGA-based WSN node, and each version then went through successive rounds of verification and testing to check its correctness. In addition, the information required to design two more systems, one for a laptop and one for a desktop, was provided, and their development was supervised. Finally, a verification system was created that eases debugging when designing a new recognition system and tests the numerical precision of every solution by comparing its results with a golden reference.
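The golden-reference check mentioned above can be illustrated with a minimal sketch. It is not the thesis's verification system: the function name `compare_with_golden`, the tolerance values, and the sample vectors are all assumptions chosen for illustration. It only shows the general idea of comparing one implementation's numeric output element by element against a trusted reference.

```python
import math


def compare_with_golden(candidate, golden, rel_tol=1e-4, abs_tol=1e-6):
    """Compare a candidate output vector against a golden reference.

    Hypothetical helper: returns (passed, max_abs_error, worst_index).
    worst_index is -1 when every element matches exactly.
    """
    if len(candidate) != len(golden):
        raise ValueError("candidate and golden must have the same length")

    passed = True
    max_err = 0.0
    worst = -1
    for i, (c, g) in enumerate(zip(candidate, golden)):
        err = abs(c - g)
        if err > max_err:
            # Track the largest deviation to help locate precision problems.
            max_err, worst = err, i
        if not math.isclose(c, g, rel_tol=rel_tol, abs_tol=abs_tol):
            passed = False
    return passed, max_err, worst


# Illustrative values only: e.g. distances computed by a reference model
# versus a reduced-precision implementation.
golden = [12.5, 3.0, 7.25]
candidate = [12.5, 3.0001, 7.25]
print(compare_with_golden(candidate, golden))
```

Reporting the index of the worst element alongside the pass/fail flag is one way such a check can double as a debugging aid, since it points at the stage of the computation where a port starts to diverge.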
The results are uneven, in the sense that some implementation details that might look like second-order decisions have a critical impact on performance, so detailed knowledge of the architecture, and of what is accelerated and what is not, is required. The results of all the systems are presented in a summary table. The hardware version requires 3.65 times less energy than the best of the other solutions. Nevertheless, the methodology and design flow for this type of solution are much more complex and time-consuming. As an extension to the study, novel approaches such as MATLAB HDL Coder, Vivado HLS and the FastCUDA tool flow were explored. These techniques should make hardware design easier and faster while allowing software programmers to take advantage of FPGAs, so a large expansion in this direction is expected.
March 17, 2014 by hgpu