Posts
Dec, 17
Parallel Mining of Neuronal Spike Streams on Graphics Processing Units
Multi-electrode arrays (MEAs) provide dynamic and spatial perspectives into brain function by capturing the temporal behavior of spikes recorded from cultures and living tissue. Understanding the firing patterns of neurons implicit in these spike trains is crucial to gaining insight into cellular activity. We present a solution involving a massively parallel graphics processing unit (GPU) […]
Dec, 17
GPU implementation of JPEG2000 for hyperspectral image compression
Hyperspectral image compression has received considerable interest in recent years due to the enormous data volumes collected by imaging spectrometers for Earth Observation. JPEG2000 is an important technique for data compression which has been successfully used in the context of hyperspectral image compression, in both lossless and lossy fashion. Due to the increasing spatial, spectral […]
Dec, 17
Code Optimization Techniques for Graphics Processing Units
Books on parallel programming theory often talk about such weird beasts as the PRAM model, hypothetical hardware that would provide the programmer with a number of processors proportional to the input size of the problem at hand. Modern general-purpose computers afford only a few processing units; four is currently a reasonable […]
Dec, 17
Customizable Memory Schemes for Data Parallel Accelerators
Memory system efficiency is crucial for any processor to achieve high performance, especially in the case of data parallel machines. The processing capabilities of parallel lanes are wasted when data requests are not served in a sustained and timely manner. Irregular vector memory accesses can lead to inefficient use of the parallel banks/modules/channels and significantly […]
Dec, 17
Parallel mesh adaptation and graph analysis using graphics processing units
In the field of Computational Fluid Dynamics, several types of mesh adaptation strategies are used to enhance a mesh’s quality, thereby improving simulation speed and accuracy. Mesh smoothing (r-refinement) is a simple and effective technique, where nodes are repositioned to increase or decrease local mesh resolution. Mesh partitioning divides a mesh into sections, for use […]
Dec, 17
A Comparative Analysis of GPU Implementations of Spectral Unmixing Algorithms
Spectral unmixing is a very important task for remotely sensed hyperspectral data exploitation. It involves the separation of a mixed pixel spectrum into its pure component spectra (called endmembers) and the estimation of the proportion (abundance) of each endmember in the pixel. In recent years, several algorithms have been proposed for: i) automatic extraction […]
Dec, 17
A Comparison of Modern GPU and CPU Architectures: And the Common Convergence of Both
In the past few decades, processor technology specifically designed for the processing and output of graphical data has become a major market. With the rise of parallelism as an important method of improving processor throughput, Graphics Processing Units (GPUs) have come to drive architecture demands in many ways. In this work, we plan to explore […]
Dec, 16
A Parallel GPU Version of the Traveling Salesman Problem
This paper describes and evaluates an implementation of iterative hill climbing with random restart for determining high-quality solutions to the traveling salesman problem. With 100,000 restarts, this algorithm finds the optimal solution for four out of five 100-city TSPLIB inputs and yields a tour that is only 0.07% longer than the optimum on the fifth […]
Dec, 16
Reducing Thread Divergence in GPU-based B&B Applied to the Flow-shop problem
In this paper, we present pioneering work on designing and programming B&B algorithms on GPUs. To the best of our knowledge, no prior contribution has taken up this challenge. We focus on the parallel evaluation of the bounds for the Flow-shop scheduling problem. To deal with thread divergence caused by the bounding operation, […]
Dec, 16
Algorithms acceleration of pattern-matching in multi-core architectures
The aim of this thesis is to create or adapt a programming model in order to make multi-core processors accessible to almost every programmer. This objective includes reuse of existing code and algorithms, debuggability, and the capacity to introduce changes incrementally. We consider multi-core processors across many architectures, including homogeneous versus heterogeneous and shared-memory versus distributed-memory designs. We […]
Dec, 16
High-performance polynomial GCD computations on graphics processors
We propose an algorithm to compute a greatest common divisor (GCD) of univariate polynomials with large integer coefficients on Graphics Processing Units (GPUs). At the highest level, our algorithm relies on modular techniques to decompose the problem into subproblems that can be solved separately. Next, we employ resultant-based or matrix algebra methods to compute a […]