
Posts

Nov, 15

High Dimensional Spaces and Modelling in the task of Speaker Recognition

Automatic speaker recognition has made significant progress in the last two decades. Huge speech corpora containing thousands of speakers recorded on several channels are at hand, and methods utilizing as much information as possible have been developed. Nowadays, state-of-the-art methods are based on Gaussian mixture models used to estimate relevant statistics from feature vectors extracted […]
Nov, 14

Load Balanced Parallel GPU Out-of-Core for Continuous LOD Model Visualization

Rendering massive 3D models has been recognized as a challenging task. Due to the limited size of GPU memory, a massive model containing hundreds of millions of primitives cannot fit into most modern GPUs. By applying parallel level-of-detail (LOD), as proposed in [1], only a portion of the primitives instead of the whole are necessary […]
Nov, 14

G-SNPM – A GPU-based SNP mapping tool

MOTIVATION AND OBJECTIVES: In genotyping analysis, researchers often need to merge genetic datasets coming from different genotyping platforms that use different sets of Single Nucleotide Polymorphisms (SNPs) to represent genetic polymorphisms. In order to do this, it is necessary to know the exact position of a SNP in a chromosome and update this information […]
Nov, 14

Performance modeling of atomic additions on GPU scratchpad memory

GPU application implementations using scatter approaches will fall into write contention due to atomic updates of output elements, if these result from more than one input element. Colliding threads will be serialized, seriously harming performance. Dealing with these issues requires a proper understanding of the behavior of the scratchpad or shared memory under conflicting accesses […]
Nov, 14

A simple method to accelerate fringe analysis algorithms based on graphics processing unit and MATLAB

Having developed rapidly over the past few years, multicore has become a revolutionary technique for improving the performance of computing devices, ranging from supercomputers to cell phones. Among multicore processors, the graphics processing unit (GPU) stands out because of its huge computational performance and comparatively low cost. It can be used as a coprocessor […]
Nov, 14

Correctly rounding elementary functions on GPU

The IEEE 754-2008 standard recommends the correct rounding of elementary functions. This requires solving the Table Maker’s Dilemma, which entails a huge amount of CPU computation time. In this paper we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with a partial SIMD execution (Single […]
Nov, 14

Efficient similarity search on multimedia databases

Manipulating and retrieving multimedia data has received increasing attention with the advent of cloud storage facilities. The ability to query by similarity over large data collections is mandatory to improve storage and user interfaces. However, these are expensive operations to solve on the CPU alone; thus, it is convenient to take into account High […]
Nov, 14

An architecture for real time fluid simulation using multiple GPUs

Simulation of natural phenomena, such as water and smoke, is a very important topic for increasing scene realism in video games and real-time simulations in general. However, this kind of simulation requires numerically solving the Navier-Stokes equations, which is a computationally expensive task. Additionally, to achieve a more immersive simulation, interaction between the flow […]
Nov, 14

Real-Time Scheduling Using GPUs – Advanced and More Accurate Proof of Feasibility

This paper reports our evaluation of OpenCL as a platform for hard real-time scheduling. In particular, we have evaluated which types of tasks are faster on GPGPU than on CPU. We have investigated computational tasks, memory-intensive tasks (especially tasks using low-latency GDDR memory) and disk-intensive tasks. This study is the part […]
Nov, 14

Kernel Weaver: Automatically Fusing Database Primitives for Efficient GPU Computation

Data warehousing applications represent an emerging application arena that requires the processing of relational queries and computations over massive amounts of data. Modern general purpose GPUs are high bandwidth architectures that potentially offer substantial improvements in throughput for these applications. However, there are significant challenges that arise due to the overheads of data movement through […]
Nov, 14

Real-Time Surface Extraction and Visualization of Medical Images using OpenCL and GPUs

Marching Cubes (MC) is an algorithm that extracts surfaces from volumetric scalar data. It is used extensively in visualization and analysis of medical data from modalities like CT and MR, usually after a 3D segmentation of the structures of interest has been performed. Implementations of MC on CPUs are slow, taking several seconds (even minutes) […]
Nov, 11

A parallel method for tuning Fuzzy TSK Systems with CUDA

This paper studies an option for offloading some types of AI processing to the Graphics Processing Unit (GPU), by proposing the parallelization of the Batch Least Squares (BLS) method for tuning consequent parameters and the gradient method for tuning input fuzzy sets in a Takagi-Sugeno-Kang Fuzzy Inference System using the Compute Unified Device Architecture (CUDA). […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
