Posts
Aug, 10
GPGPU Based Aeroacoustic Optimization of a Contra-Rotating Fan
Contra-rotating fans have several advantages over single-stage axial fans. If they are well designed, the exit flow field is almost irrotational. This helps to increase the aerodynamic efficiency by up to 16% compared to single-stage fans. However, since the second stage interacts with the flow disturbances from the first stage, the associated […]
Aug, 10
An Improved Image Segmentation Algorithm Based on GPU Parallel Computing
In image segmentation, the classic Fuzzy C-Means (FCM) algorithm is time-consuming and depends heavily on the initial cluster centers. Based on the Graphics Processing Unit (GPU), this paper proposes a novel FCM algorithm by improving the computational formulas of membership degree and the update criterion of cluster centers. Our algorithm can initialize cluster centers purposefully […]
Aug, 10
Faster sequence alignment through GPU-accelerated restriction of the seed-and-extend search space
MOTIVATION: In computing pairwise alignments of biological sequences, software implementations employ a variety of heuristics that decrease the computational effort involved in computing potential alignments. A key element in achieving high processing throughput is to identify and prioritize potential alignments where high-scoring mappings can be expected. These tasks involve list-processing operations that can be efficiently […]
Aug, 9
Real-Time Automatic Object Classification and Tracking using Genetic Programming and NVIDIA CUDA
Genetic Programming (GP) is a widely used methodology for solving various computational problems. GP’s problem-solving ability is usually hindered by its long execution times. In this thesis, GP is applied to real-time computer vision. In particular, object classification and tracking using a parallel GP system is discussed. First, a study of suitable GP languages […]
Aug, 9
Vivaldi: A Domain-Specific Language for Volume Processing and Visualization on Distributed Heterogeneous Systems
As the size of image data from microscopes and telescopes increases, the need for high-throughput processing and visualization of large volumetric data has become more pressing. At the same time, many-core processors and GPU accelerators are commonplace, making high-performance distributed heterogeneous computing systems affordable. However, effectively utilizing GPU clusters is difficult for novice programmers, and […]
Aug, 9
Fast Semantic Segmentation of RGB-D Scenes with GPU-Accelerated Deep Neural Networks
In semantic scene segmentation, every pixel of an image is assigned a category label. This task can be made easier by incorporating depth information, which structured light sensors provide. Depth, however, has very different properties from RGB image channels. In this paper, we present a novel method to provide depth information to convolutional neural networks. […]
Aug, 9
Parallel Distributed Breadth First Search on the Kepler Architecture
We present the results obtained by using an evolution of our CUDA-based solution for the exploration, via a Breadth First Search, of large graphs. This latest version makes the most of the features of the Kepler architecture and relies on a 2D decomposition of the adjacency matrix to reduce the number of communications among the […]
Aug, 9
GPU Parallel Implementation of the Approximate K-SVD Algorithm Using OpenCL
Training dictionaries for sparse representations is a time-consuming task due to the large size of the data involved and the complexity of the training algorithms. We investigate a parallel version of the approximate K-SVD algorithm, where multiple atoms are updated simultaneously, and implement it using OpenCL for execution on graphics processing units (GPUs). […]
Aug, 7
Optimizing memory management on heterogeneous systems using polyhedral, compile-time techniques
The goal of this thesis is to optimize memory management on heterogeneous systems. Our approach involves performing memory access pattern analysis on kernels in order to produce an accurate estimate of the memory usage. This information is produced in the form of array ranges describing which elements are accessed as well as whether they are […]
Aug, 7
On the Fly Porn Video Blocking Using Distributed Multi-GPU and Data Mining Approach
Preventing users from accessing adult videos while still allowing them to access good educational videos and other materials through a campus-wide network is a big challenge for organizations. Most existing web filtering systems are based on textual content or link analysis. As a result, potential users cannot access qualitative and informative video content […]
Aug, 7
Dense Arithmetic over Finite Fields with the CUMODP Library
CUMODP is a CUDA library for exact computations with dense polynomials over finite fields. It provides a variety of operations, such as multiplication, division, computation of subresultants, multi-point evaluation, interpolation and many others. These routines are primarily designed to offer GPU support to polynomial system solvers, and a bivariate system solver is part of the library. […]
Aug, 7
Multi-Agent Systems and General-Purpose Computing on Graphics Processing Units: A Survey
In some application domains, using a Multi-Agent Systems (MAS) modeling approach may require handling a large number of agents (crowds, traffic, animal societies, ecosystems, etc.). Today, as this number is constantly growing, the computational resources needed can no longer be provided by the CPU of a single Personal Computer (PC). Considering this issue, […]