Graphics processing units (GPUs) have recently gained a lot of interest as an efficient platform for general-purpose computation. The Cellular Automata approach, which is inherently parallel, offers an opportunity to implement high-performance simulations. This paper presents how shared memory in the GPU can be used to improve performance for Cellular Automata models. In […]

December 6, 2013 by hgpu

Real-time stereo matching, which is important in many applications such as self-driving cars and 3-D scene reconstruction, requires large computation capability and high memory bandwidth. The most time-consuming part of stereo matching algorithms is the aggregation of information (i.e., costs) over local image regions. In this paper, we present a generic representation and suitable implementations for three […]

October 13, 2012 by hgpu

The goal of this thesis is to design and implement a hyper neural network whose topology limits the number of inputs to each neuron and which uses genetic programming as the learning algorithm. The network is parallelized using the OpenCL standard, which allows it to run on a wide range of devices. From […]

January 6, 2012 by hgpu

In this paper, we have shown that exploiting the GPU’s massively parallel architecture can dramatically increase the speed of MoM calculations. While the code can certainly be improved, matrix fill speed-up factors are already commonly found to be between 150X and 260X. The conjugate gradient solver stands to be improved at this writing but still results […]

September 1, 2011 by hgpu

This work describes a parallelizable optical flow field estimator based upon a modified batch version of the Self-Organizing Map (SOM). This estimator handles the ill-posedness in gradient-based motion estimation via a novel combination of regression and self-organization. The aperture problem is treated using an algebraic framework that partitions motion estimates obtained from regression into two […]

July 22, 2011 by hgpu
