Posts
Oct, 14
Parallel Algorithms for the Summed Area Table on the Asynchronous Hierarchical Memory Machine, with GPU implementations
The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of computing on CUDA-enabled GPUs. The summed area table (SAT) of a matrix is a data structure frequently used in computer vision; it can be obtained by computing the column-wise prefix-sums and then the row-wise prefix-sums. The […]
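The two-pass construction mentioned in the excerpt can be illustrated with a minimal sequential sketch. It is not the paper's parallel HMM or GPU algorithm; the function names `summed_area_table` and `region_sum` are hypothetical, and the matrix is assumed to be a rectangular list of lists.

```python
def summed_area_table(matrix):
    """Return the SAT of a rectangular list-of-lists matrix (sequential sketch)."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    sat = [list(row) for row in matrix]  # copy so the input is left untouched
    # Pass 1: column-wise prefix sums (accumulate down each column).
    for r in range(1, rows):
        for c in range(cols):
            sat[r][c] += sat[r - 1][c]
    # Pass 2: row-wise prefix sums (accumulate along each row).
    for r in range(rows):
        for c in range(1, cols):
            sat[r][c] += sat[r][c - 1]
    return sat


def region_sum(sat, top, left, bottom, right):
    """Sum of matrix[top..bottom][left..right] (inclusive) from at most four SAT lookups."""
    total = sat[bottom][right]
    if top > 0:
        total -= sat[top - 1][right]
    if left > 0:
        total -= sat[bottom][left - 1]
    if top > 0 and left > 0:
        total += sat[top - 1][left - 1]
    return total
```

Once the table is built, the sum over any axis-aligned rectangle needs only a constant number of lookups, which is why the structure is popular in computer vision.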
Oct, 13
Scalable approximate k-NN in multidimensional big data
This thesis studies the scalability of the similarity search problem in large-scale multidimensional data. Similarity search, translating into the neighbour search problem, finds many applications for information retrieval, visualization, machine learning and data mining. The current exponential growth of data motivates the need for approximate and scalable algorithms. In most existing algorithms and data structures, […]
Oct, 13
A Parallel Algorithm for Enumerating Joint Weight of a Binary Linear Code in Network Coding
In this paper, we present a parallel algorithm for enumerating the joint weight of a binary linear (n, k) code, with the aim of accelerating the assessment of its decoding error probability for network coding. To reduce the number of pairs of codewords to be investigated, our parallel algorithm reduces dimension k by focusing on the all-one vector included […]
Oct, 13
GAIN: GPU-based Constraint Checking for Context Consistency
Applications in pervasive computing are often context-aware. However, due to uncontrollable environmental noise, contexts collected by applications can be distorted or can even conflict with each other. This is known as the context inconsistency problem. To provide reliable services, applications need to validate contexts before using them. One promising approach is to check contexts against consistency […]
Oct, 13
HACC: Simulating Sky Surveys on State-of-the-Art Supercomputing Architectures
Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the ‘Dark Universe’, dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the […]
Oct, 13
Towards Efficient Indexing of Spatiotemporal Trajectories on the GPU for Distance Threshold Similarity Searches
Applications in many domains require processing moving object trajectories. In this work, we focus on a trajectory similarity search that finds all trajectories within a given distance of a query trajectory over a time interval, which we call the distance threshold similarity search. We develop three indexing strategies with spatial, temporal and spatiotemporal selectivity for […]
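A brute-force version of the search described in the excerpt can be sketched as follows. It does not reproduce the paper's GPU indexing strategies, and it rests on assumptions the excerpt does not state: trajectories are sampled at shared timestamps, positions are 2D points, and a trajectory matches if it comes within the distance threshold at any sampled time in the interval. The function name `distance_threshold_search` is hypothetical.

```python
import math


def distance_threshold_search(query, trajectories, d, t_start, t_end):
    """Return ids of trajectories within distance d of the query in [t_start, t_end].

    query:        dict mapping timestamp -> (x, y)
    trajectories: dict mapping id -> dict mapping timestamp -> (x, y)
    Assumption: a single sampled timestamp within the threshold counts as a match.
    """
    hits = []
    for traj_id, samples in trajectories.items():
        for t, (qx, qy) in query.items():
            # Only compare positions observed at the same time inside the interval.
            if t_start <= t <= t_end and t in samples:
                x, y = samples[t]
                if math.hypot(x - qx, y - qy) <= d:
                    hits.append(traj_id)
                    break  # one match is enough for this trajectory
    return hits
```

This baseline compares the query against every trajectory, which is the cost that spatial, temporal, or spatiotemporal indexes are meant to reduce.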
Oct, 11
Mapping dynamic programming algorithms on graphics processing units
Alignment is the fundamental operation used to compare biological sequences. It also serves to identify regions of similarity that may be consequences of structural, functional, or evolutionary relationships. Today, the processing of sequences from large DNA or protein databases is a big challenge. Graphics Processing Units (GPUs) are based on a highly parallel, many-core streaming […]
Oct, 11
Interactive Simulations with Navier-Stokes Equations on many-core Architectures
The Navier-Stokes Equations are a mathematical model that describes the behaviour of fluids. They have proven to represent real fluid flows quite well and are the basis for many fluid simulations. In order to exploit the performance provided by modern many-core systems, fluid simulation algorithms must be able to efficiently solve the Navier-Stokes Equations in parallel. The […]
Oct, 11
Monte Carlo Path Tracing with OpenCL
We introduce an interactive Monte Carlo path tracer that uses the OpenCL framework. A path tracer draws a photo-realistic image of a 3D scene by simulating physical effects of light. Interactivity enables the user to move around the scene in real time, while OpenCL makes it possible to run the same code on either CPU […]
Oct, 11
Performance Improvement of Multichannel Audio by Graphics Processing Units
Multichannel acoustic signal processing has undergone major development in recent years due to the increased complexity of current audio processing applications. People want to collaborate through communication with the feeling of being together and sharing the same environment, which is known as Immersive Audio Schemes. In this phenomenon, several acoustic effects are involved: 3D spatial […]
Oct, 11
Leo: A Profile-Driven Dynamic Optimization Framework for GPU Applications
Parallel architectures like GPUs are a tantalizing compute fabric for performance-hungry developers. While GPUs enable order-of-magnitude performance increases in many data-parallel application domains, writing efficient codes that can actually manifest those increases is a non-trivial endeavor, typically requiring developers to exercise specialized architectural features exposed directly in the programming model. Achieving good performance on GPUs […]
Oct, 10
Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM
Real-time dense computer vision and SLAM offer great potential for a new level of scene modelling, tracking and real environmental interaction for many types of robot, but their high computational requirements mean that use on mass market embedded platforms is challenging. Meanwhile, trends in low-cost, low-power processing are towards massive parallelism and heterogeneity, making it […]