Posts
Aug, 31
A parallel algorithm for implicit depletant simulations
We present an algorithm to simulate the many-body depletion interaction between anisotropic colloids in an implicit way, integrating out the degrees of freedom of the depletants, which we treat as an ideal gas. Because the depletant particles are statistically independent and the depletion interaction is short-ranged, depletants are randomly inserted in parallel into the excluded […]
Aug, 31
An Asynchronous Event Communication Technique for Soft Real-Time GPGPU Applications
CONTEXT. Interactive GPGPU applications require low-response-time feedback from events such as user input in order to provide a positive user experience. Communication of these events must be performed asynchronously so as not to cause significant performance penalties. OBJECTIVES. In this study the usage of CPU/GPU shared virtual memory to perform asynchronous communication is explored. […]
Aug, 31
A GPU-accelerated local search algorithm for the Correlation Clustering problem
The solution of the Correlation Clustering (CC) problem can be used as a criterion to measure the amount of balance in signed social networks, where positive (friendly) and negative (antagonistic) interactions take place. Metaheuristics have been used successfully for solving not only this problem but also other hard combinatorial optimization problems, since they can […]
Aug, 28
Boosting Java Performance using GPGPUs
Heterogeneous programming has started becoming the norm in order to achieve better performance by running portions of code on the most appropriate hardware resource. Currently, significant engineering efforts are undertaken in order to enable existing programming languages to perform heterogeneous execution mainly on GPUs. In this paper we describe Jacc, an experimental framework which allows […]
Aug, 28
VisPy: Harnessing The GPU For Fast, High-Level Visualization
The growing availability of large, multidimensional data sets has created demand for high-performance, interactive visualization tools. VisPy leverages the GPU to provide fast, interactive, and beautiful visualizations in a high-level API. Here we introduce the main features, architecture, and techniques used in VisPy.
Aug, 28
High-Speed Object Detection: Design, Study and Implementation of a Detection Framework using Channel Features and Boosting
In this thesis we design, implement and study a high-speed object detection framework. Our baseline detector uses integral channel features as the object representation and AdaBoost as the supervised learning algorithm. We suggest the implementation of two approximation techniques for speeding up the baseline detector and show their effectiveness by performing experiments on both detection quality and […]
Aug, 28
Deep Convolutional Neural Networks for Smile Recognition
This thesis describes the design and implementation of a smile detector based on deep convolutional neural networks. It starts with a summary of neural networks, the difficulties of training them and new training methods, such as Restricted Boltzmann Machines or autoencoders. It then provides a literature review of convolutional neural networks and recurrent neural networks. […]
Aug, 28
A Parallel Algorithm to Test Chordality of Graphs
We present a simple parallel algorithm to test chordality of graphs which is based on the parallel Lexicographical Breadth-First Search algorithm. In total, the algorithm takes O(N) time on an N-thread machine and performs O(N^2) work, where N is the number of vertices in the graph. Our implementation of the algorithm uses a GPU environment […]
Aug, 27
CudaChain: A Practical GPU-accelerated 2D Convex Hull Algorithm
This paper presents a practical GPU-accelerated convex hull algorithm and a novel Sorting-based Preprocessing Approach (SPA) for planar point sets. The proposed algorithm consists of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of calculating the expected convex hull on the CPU. We first discard the interior points […]
Aug, 27
gScan: Accelerating Graham Scan on the GPU
This paper presents a fast implementation of the Graham scan on the GPU. The proposed algorithm is composed of two stages: (1) two rounds of preprocessing performed on the GPU and (2) the finalization of finding the convex hull on the CPU. We first discard the interior points that lie inside a quadrilateral formed by […]
Aug, 27
Adaptive Multi-GPU Exchange Monte Carlo for the 3D Random Field Ising Model
The study of disordered spin systems through Monte Carlo simulations has proven to be a hard task due to the adverse energy landscape present in the low-temperature regime, making it difficult for the simulation to escape from a local minimum. Replica-based algorithms such as the Exchange Monte Carlo (also known as parallel tempering) […]
Aug, 27
Accelerated Deep Learning using Intel Xeon Phi
Deep learning, a sub-topic of machine learning inspired by biology, has attracted wide attention in industry and the research community recently. State-of-the-art applications in the area of computer vision and speech recognition (among others) are built using deep learning algorithms. In contrast to traditional algorithms, where the developer fully instructs the application what to do, […]

