Posts

Apr, 30

Mr. Scan: Extreme Scale Density-Based Clustering using a Tree-Based Network of GPGPU Nodes

Density-based clustering algorithms are a widely-used class of data mining techniques that can find irregularly shaped clusters and cluster data without prior knowledge of the number of clusters it contains. DBSCAN is the most well-known density-based clustering algorithm. We introduce our version of DBSCAN, called Mr. Scan, which uses a hybrid parallel implementation that combines […]
Apr, 29

RealTime GPU-Based Motion Planning for Task Executions

We present a realtime GPU-based motion planning algorithm for robot task executions. Many task execution strategies break down a high-level task planning problem into multiple low-level motion planning problems, and it is essential to solve those problems at interactive rates. In order to achieve high performance for the planning, our method exploits a high number […]
Apr, 29

Multigrid Optimization Methods for High Performance Computing

The aim of this work was to investigate the implementability and efficiency of an algorithm for solving optimal control problems on a new hardware architecture. For an academic test problem the collective smoothing multigrid method (CSMG) was realized on a commodity graphics card (GPU) and the performance in terms of elapsed time compared to those […]
Apr, 29

Analysis of Multicore CPU and GPU Toward Parallelization of Total Focusing Method Ultrasound Reconstruction

Ultrasonic imaging and reconstruction tools are commonly used to detect, identify and measure defects in different mechanical parts. Due to the complexity of the underlying physics, and due to the ever-growing quantity of acquired data, computation time is becoming a limitation to the optimal inspection of a mechanical part. This article presents the performances of […]
Apr, 29

Local Histogram Modification Based Contrast Enhancement with GPU Acceleration

This paper presents a novel local contrast enhancement algorithm based on local histogram modification. The computation of local contrast enhancement operators is usually slow though they produce better local contrast and details. We have addressed this issue by subtly designing a highly parallel algorithm, which could be easily implemented on Graphics Processing Units (GPU) to […]
Apr, 29

Split tiling for GPUs: automatic parallelization using trapezoidal tiles

Tiling is a key technique to enhance data reuse. For computations structured as one sequential outer "time" loop enclosing a set of parallel inner loops, tiling only the parallel inner loops may not enable enough data reuse in the cache. Tiling the inner loops along with the outer time loop enhances data locality but may […]
Apr, 27

CUDA Based CAMshift Algorithm for Object Tracking Systems

In this paper, we present an image object tracking system built on a GPGPU-based CAMshift algorithm. For image object tracking, we use the parallel CAMshift tracking algorithm based on the HSV color image distribution of detected moving objects. In this system, RGB-to-HSV color conversion, image masking such as morphological open and close operations, and computing […]
Apr, 27

Efficient Computation of the Kleene Star in Max-Plus Algebra using a CUDA GPU

This research aims to accelerate the computation of the Kleene star in max-plus algebra using CUDA technology on graphics processing units (GPUs). The target module is the Kleene star of a weighted adjacency matrix for directed acyclic graphs (DAGs), which plays an essential role in calculating the earliest and/or latest schedule for a class of […]
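
For readers unfamiliar with the operation named in this abstract, here is a minimal CPU sketch in Python, not the paper's CUDA method. It assumes the standard definition of the Kleene star, A* = I ⊕ A ⊕ A² ⊕ … ⊕ Aⁿ⁻¹, where ⊕ is element-wise max and matrix "multiplication" replaces (+, ×) with (max, +). The function names and the convention that `a[i][j]` holds the weight of edge i→j are assumptions made for illustration. For a DAG, entry (i, j) of A* is then the longest path weight from i to j.

```python
NEG_INF = float("-inf")  # max-plus "zero": no edge / no path


def maxplus_matmul(a, b):
    """Max-plus matrix product: c[i][j] = max over k of a[i][k] + b[k][j]."""
    n = len(a)
    return [[max(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def maxplus_add(a, b):
    """Max-plus matrix sum: element-wise maximum."""
    return [[max(x, y) for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def kleene_star(a):
    """Return I (+) A (+) A^2 (+) ... (+) A^(n-1) for an n x n matrix.

    The sum is finite for a DAG because no path has more than n-1 edges.
    """
    n = len(a)
    # Max-plus identity: 0 on the diagonal, -inf elsewhere.
    identity = [[0.0 if i == j else NEG_INF for j in range(n)]
                for i in range(n)]
    result = identity
    power = identity
    for _ in range(n - 1):
        power = maxplus_matmul(power, a)    # next power of A
        result = maxplus_add(result, power)  # accumulate with max
    return result


# Hypothetical 3-node DAG: 0->1 (weight 2), 1->2 (weight 3), 0->2 (weight 4).
adjacency = [
    [NEG_INF, 2, 4],
    [NEG_INF, NEG_INF, 3],
    [NEG_INF, NEG_INF, NEG_INF],
]
star = kleene_star(adjacency)
```

In this example the longest path from node 0 to node 2 goes through node 1, so `star[0][2]` is 5 rather than the direct edge weight 4. The triple loop in the product is the part a GPU implementation would parallelize.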
Apr, 27

Modeling and Optimization of Parallel Matrix-based Computations on GPU

As graphics processing units (GPUs) are continually being utilized as coprocessors, the demand for optimally utilizing them for various applications continues to grow. This work narrows the gap between programmers and minimum execution time for matrix-based computations on a GPU. To minimize execution time, computation and communication time must be considered. For computation, the placement […]
Apr, 27

Orchestrated Scheduling and Prefetching for GPGPUs

In this paper, we present techniques that coordinate the thread scheduling and prefetching decisions in a General Purpose Graphics Processing Unit (GPGPU) architecture to better tolerate long memory latencies. We demonstrate that existing warp scheduling policies in GPGPU architectures are unable to effectively incorporate data prefetching. The main reason is that they schedule consecutive warps, […]
Apr, 27

H.264 Parallel Optimization on Graphics Processors

Multimedia applications are present in most mobile hand-held devices. The H.264 standard currently dominates the video compression world. H.264 has high computational complexity, requiring a large amount of processing resources. Many techniques have emerged that optimize H.264 using parallelization on multicore systems, ranging from groups of pictures down to the smallest block of pixels. We propose a […]
Apr, 26

Using High Performance Computing for Optimizing Credit Risk Calculation

The volume of bank data calculations is increasing each year at an extraordinary scale, and with that, new forms of computation are needed. High performance computing is a very attractive field for optimizing such bank calculations, and it can give promising results. This paper shows an implementation of a known model for assessing the credit risk of a […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
