Posts
Nov, 16
Computer Vision and Image Segmentation Implemented on GPU Using Compute Unified Device Architecture as Applied on Quality Inspection of Pre-etched Printed Circuit Board
Computer vision and image processing continue to expand their areas of application. Traditionally, this technology was hosted by the sequential processing paradigm of a Central Processing Unit (CPU). For several years, this implementation has limited the usefulness of devices that are capable of parallel processing. At the same time, it has been […]
Nov, 15
Device specialization in heterogeneous multi-GPU environments
In the last few years there has been much activity towards coupling CPUs and GPUs in order to get the most from CPU-GPU heterogeneous systems. One of the main problems that prevents these systems from being exploited in a device-aware manner is the CPU-GPU communication bottleneck, which often makes it impossible to produce code more efficient […]
Nov, 15
Resolving the conflict between generality and plausibility in verified computation
The area of proof-based verified computation (outsourced computation built atop probabilistically checkable proofs and cryptographic machinery) has lately seen renewed interest. Although recent work has made great strides in reducing the overhead of naive applications of the theory, these schemes still cannot be considered practical. The core issue is that the work for the prover […]
Nov, 15
Fast 3D Structure Localization in Medical Volumes using CUDA-enabled GPUs
Effective and fast localization of anatomical structures is a crucial first step towards automated analysis of medical volumes. In this paper, we propose an iterative approach for structure localization in medical volumes based on the adaptive bandwidth mean-shift algorithm for object detection (ABMSOD). We extend and tune the ABMSOD algorithm, originally used to detect 2D […]
Nov, 15
Accelerating the Gillespie Exact Stochastic Simulation Algorithm Using Hybrid Parallel Execution on Graphics Processing Units
The Gillespie Stochastic Simulation Algorithm (GSSA) and its variants are cornerstone techniques to simulate reaction kinetics in situations where the concentration of the reactant is too low to allow deterministic techniques such as differential equations. The inherent limitations of the GSSA include the time required for executing a single run and the need for multiple […]
Nov, 15
High Dimensional Spaces and Modelling in the task of Speaker Recognition
Automatic speaker recognition has made significant progress in the last two decades. Huge speech corpora containing thousands of speakers recorded on several channels are at hand, and methods utilizing as much information as possible have been developed. Nowadays, state-of-the-art methods are based on Gaussian mixture models used to estimate relevant statistics from feature vectors extracted […]
Nov, 14
Load Balanced Parallel GPU Out-of-Core for Continuous LOD Model Visualization
Rendering massive 3D models has been recognized as a challenging task. Due to the limited size of GPU memory, a massive model containing hundreds of millions of primitives cannot fit into most modern GPUs. By applying parallel level-of-detail (LOD), as proposed in [1], only a portion of the primitives, rather than the whole model, is necessary […]
Nov, 14
G-SNPM – A GPU-based SNP mapping tool
MOTIVATION AND OBJECTIVES: In genotyping analysis, researchers often need to merge genetic datasets coming from different genotyping platforms that use different sets of Single Nucleotide Polymorphisms (SNPs) to represent genetic polymorphisms. In order to do this, it is necessary to know the exact position of a SNP in a chromosome and update this information […]
Nov, 14
Performance modeling of atomic additions on GPU scratchpad memory
GPU application implementations using scatter approaches will fall into write contention due to atomic updates of output elements, if these result from more than one input element. Colliding threads will be serialized, seriously harming performance. Dealing with these issues requires a proper understanding of the behavior of the scratchpad or shared memory under conflicting accesses […]
Nov, 14
A simple method to accelerate fringe analysis algorithms based on graphics processing unit and MATLAB
With its fast development over the past few years, multicore has become a revolutionary technique for improving the performance of computing devices, ranging from supercomputers to cell phones. Among multicore processors, the graphics processing unit (GPU) stands out because of its huge computational performance and comparatively low cost. It can be used as a coprocessor […]
Nov, 14
Correctly rounding elementary functions on GPU
The IEEE 754-2008 standard recommends the correct rounding of elementary functions. This requires solving the Table Maker’s Dilemma, which implies a huge amount of CPU computation time. In this paper, we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPU), which are massively parallel architectures with a partial SIMD execution (Single […]
Nov, 14
Efficient similarity search on multimedia databases
Manipulating and retrieving multimedia data has received increasing attention with the advent of cloud storage facilities. The ability to query by similarity over large data collections is mandatory to improve storage and user interfaces. However, these are all expensive operations to perform on the CPU alone; thus, it is convenient to take into account High […]