Posts
Sep, 20
Evolutionary Clustering on CUDA
Unsupervised clustering of large data sets is a complicated task. Due to its complexity, various meta-heuristic machine learning algorithms have been used to automate the clustering process. Genetic and evolutionary algorithms have been deployed successfully to find clusters in data sets. GPU computing is a recent programming paradigm introducing high-performance parallel computing […]
Sep, 20
Binaural Simulations Using Audio Rate FDTD Schemes and CUDA
Three-dimensional finite difference time domain schemes can be used as an approach to spatial audio simulation. By embedding a model of the human head in a 3D computational space, such simulations can emulate binaural sound localisation. This approach normally relies on using high sample rates to give finely detailed models, and is computationally intensive. […]
Sep, 20
Forecasting high frequency financial time series using parallel FFN with CUDA and ZeroMQ
Feed forward neural networks (FFNs) are powerful data-modelling tools that have been used in many fields of science. Specifically in financial applications, due to the number of factors affecting the market, models with a large quantity of input features, hidden and output neurons can be obtained. In financial problems, the response time is crucial and […]
Sep, 20
GPU-Acceleration of Linear Algebra using OpenCL
In this report we’ve created a linear algebra API using OpenCL, for use with MATLAB. We’ve demonstrated that the individual linear algebra components can be faster on the GPU than on the CPU. We found that the API is heavily memory bound, but still faster than MATLAB in our test case. The API components […]
Sep, 19
Direct GPU/FPGA Communication Via PCI Express
Parallel processing has hit mainstream computing in the form of CPUs, GPUs and FPGAs. While explorations proceed with all three platforms individually and with the CPU-GPU pair, little exploration has been performed with the synergy of GPU-FPGA. This is due in part to the cumbersome nature of communication between the two. This paper presents a […]
Sep, 19
Simulating spiking neural networks on GPU
Modern graphics cards contain hundreds of cores that can be programmed for intensive calculations. They are beginning to be used for spiking neural network simulations. The goal is to make parallel simulation of spiking neural networks available to a large audience, without the requirements of a cluster. We review the ongoing efforts towards this goal, […]
Sep, 19
Parallelization of a Block-Matching Algorithm
In this work we present a parallelization technique, together with its GPU implementation, for the full-search block-matching algorithm. This problem consists in finding the block that best matches a given reference template in terms of some photometric measure within a predefined search area. Block matching is a fundamental processing step for many signal-processing applications. Its […]
Sep, 19
Beauty And The Beast: Exploiting GPUs In Haskell
In this paper we compare a Haskell system that exploits a GPU back end using Obsidian against a number of other GPU/parallel processing systems. Our examples demonstrate two major results. Firstly they show that the Haskell system allows the applications programmer to exploit GPUs in a manner that eases the development of parallel code by […]
Sep, 19
Gauge fixing using overrelaxation and simulated annealing on GPUs
We adopt CUDA-capable Graphics Processing Units (GPUs) for Coulomb, Landau and maximally Abelian gauge fixing in 3+1 dimensional SU(3) lattice gauge field theories. The local overrelaxation algorithm is perfectly suited for highly parallel architectures. Simulated annealing preconditioning strongly increases the probability of reaching the global maximum of the gauge functional. We give performance results for […]
Sep, 18
Implementation of QR Updating Algorithms on the GPU
The least squares problem is an extremely useful device to represent an approximate solution to overdetermined systems, and a QR factorisation is a common method for solving least squares problems. It is often the case that multiple least squares solutions have to be computed with only minor changes in the underlying data. In this case, […]
Sep, 18
The Architecture and Evolution of CPU-GPU Systems for General Purpose Computing
GPU computing has emerged in recent years as a viable execution platform for throughput-oriented applications or regions of code. GPUs started out as independent units for program execution but there are clear trends towards tight-knit CPU-GPU integration. In this work, we will examine existing research directions and future opportunities for chip-integrated CPU-GPU systems. […]
Sep, 18
Quasi-real-time analysis of dynamic near field scattering data using a graphics processing unit
We present an implementation of the analysis of dynamic near field scattering (NFS) data using a graphics processing unit (GPU). We introduce an optimized data management scheme thereby limiting the number of operations required. Overall, we reduce the processing time from hours to minutes, for typical experimental conditions. Previously the limiting step in such experiments, […]

