Posts

Aug, 17

Near real-time Fast Bilateral Stereo on the GPU

State-of-the-art local stereo correspondence algorithms that adapt their supports to image content can infer very accurate disparity maps, often comparable to those of algorithms based on global disparity optimization. However, despite their effectiveness, accurate local approaches based on this methodology are also computationally expensive, and several simplifications aimed at reducing their computational […]
Aug, 17

Fast boosting trees for classification, pose detection, and boundary detection on a GPU

Discriminative classifiers are often the computational bottleneck in medical imaging applications such as foreground/background classification, 3D pose detection, and boundary delineation. To overcome this bottleneck, we propose a fast technique based on boosting tree classifiers adapted for GPU computation. Unlike standard tree-based algorithms, our method does not have any recursive calls, which makes it GPU-friendly. […]
Aug, 17

GPU-based reconstruction and display for 4D ultrasound data

Due to the required computational effort of 4D ultrasound imaging, such systems depend on low-complexity techniques such as nearest-neighbor interpolation, which affects volume quality. Moreover, more accurate techniques such as normalized convolution, backward trilinear interpolation, and forward spherical and ellipsoidal Gaussian kernels are avoided in real-time imaging because of the tight reconstruction time. The goal […]
Aug, 17

nGFSIM: A GPU-based fault simulator for 1-to-n detection and its applications

We present nGFSIM, a GPU-based fault simulator for stuck-at faults which can report the fault coverage of 1-to-n detection for any specified integer n using only a single run of fault simulation. nGFSIM, which exploits the massive parallelism in the GPU architecture and optimizes memory access and usage, enables accelerated fault simulation without the […]
Aug, 17

GPU accelerated FDTD solver and its application in MRI

The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 […]
Aug, 17

Realtime Ray Tracing on GPU with BVH-based Packet Traversal

Recent GPU ray tracers can already achieve performance competitive with that of their CPU counterparts. Nevertheless, these systems cannot yet fully exploit the capabilities of modern GPUs and can only handle medium-sized, static scenes. In this paper we present a BVH-based GPU ray tracer with a parallel packet traversal algorithm using a shared stack. […]
Aug, 17

CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. Due to the GPU's tremendous computing capability, it has emerged as a co-processor to the CPU for achieving high overall throughput. The CUDA programming model provides programmers with adequate C-like APIs to better exploit the parallel power of […]
Aug, 17

High Throughput Variable Size Non-square Gabor Engine with Feature Pooling Based on GPU

The increasing application of the Gabor feature space in various computer vision tasks, together with its high computational demand, encourages the use of parallel computing technologies. In this work we have designed a high-throughput GPU-based Gabor kernel that mimics the function of the initial layers of the biological visual cortex, namely ‘Simple’ and ‘Complex’ cells. The kernel is basically a Gabor […]
Aug, 17

Robotic approach to multi-beam optical tweezers with Computer Generated Hologram

Multi-beam optical tweezers are an important technique for manipulating multiple small objects. Computer Generated Hologram (CGH) is one such technique, and it can trap more than 200 objects in three dimensions. For dexterous micromanipulation, it is useful to apply robotics to optical tweezers. In this research, we designed the optical system and control system of […]
Aug, 17

Regular Expression Matching and Operational Semantics

Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it […]
Aug, 16

Training Logistic Regression and SVM on 200GB Data Using b-Bit Minwise Hashing and Comparisons with Vowpal Wabbit (VW)

We generated a dataset of 200 GB with 10^9 features to test our recent b-bit minwise hashing algorithms for training very large-scale logistic regression and SVM. The results confirm our prior work that, compared with the VW hashing algorithm (which has the same variance as random projections), b-bit minwise hashing is substantially more accurate at […]
Aug, 16

Topical perspective on massive threading and parallelism

Unquestionably, computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
