Posts

Aug, 17

CUKNN: A parallel implementation of K-nearest neighbor on CUDA-enabled GPU

Recent development in Graphics Processing Units (GPUs) has enabled inexpensive high-performance computing for general-purpose applications. Due to the GPU’s tremendous computing capability, it has emerged as a co-processor of the CPU to achieve a high overall throughput. The CUDA programming model provides programmers with adequate C-like APIs to better exploit the parallel power of […]
Aug, 17

High Throughput Variable Size Non-square Gabor Engine with Feature Pooling Based on GPU

The increasing application of the Gabor feature space in various computer vision tasks, together with its high computational demand, encourages the use of parallel computing technologies. In this work we have designed a high-throughput GPU-based Gabor kernel that mimics the function of the initial biological visual cortex layers, namely ‘Simple’ and ‘Complex’ cells. The kernel is basically a Gabor […]
Aug, 17

Robotic approach to multi-beam optical tweezers with Computer Generated Hologram

Multi-beam optical tweezers are an important technique for manipulating multiple small objects. Computer Generated Hologram (CGH) is one such technique, and it can trap more than 200 objects in three dimensions. For dexterous micromanipulation, it is useful to apply robotics to optical tweezers. In this research, we designed the optical system and control system of […]
Aug, 17

Regular Expression Matching and Operational Semantics

Many programming languages and tools, ranging from grep to the Java String library, contain regular expression matchers. Rather than first translating a regular expression into a deterministic finite automaton, such implementations typically match the regular expression on the fly. Thus they can be seen as virtual machines interpreting the regular expression much as if it […]
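As a rough illustration of matching "on the fly" without first building a deterministic finite automaton, the following is a minimal sketch of a backtracking interpreter over a regular expression tree. It is not the paper's machine: the tuple encoding (`"chr"`, `"seq"`, `"alt"`, `"star"`) and the function name `match` are assumptions made for this example only.

```python
def match(regex, text):
    """Return True if `regex` matches all of `text`.

    `regex` is a hypothetical tuple-encoded tree:
      ("chr", c)      a single literal character c
      ("seq", r1, r2) r1 followed by r2
      ("alt", r1, r2) r1 or r2
      ("star", r)     zero or more repetitions of r
    """
    def run(node, i, k):
        # `i` is the current input position; `k` is the continuation that
        # decides whether the rest of the match succeeds from a position.
        kind = node[0]
        if kind == "chr":
            return i < len(text) and text[i] == node[1] and k(i + 1)
        if kind == "seq":
            return run(node[1], i, lambda j: run(node[2], j, k))
        if kind == "alt":
            return run(node[1], i, k) or run(node[2], i, k)
        if kind == "star":
            # Either stop repeating, or match the body once and loop.
            # Requiring j > i prevents infinite loops on empty matches.
            return k(i) or run(node[1], i, lambda j: j > i and run(node, j, k))
        raise ValueError("unknown node kind: %r" % (kind,))

    return bool(run(regex, 0, lambda j: j == len(text)))


# (a|b)*c, written in the assumed tuple encoding
A_OR_B_STAR_C = ("seq", ("star", ("alt", ("chr", "a"), ("chr", "b"))), ("chr", "c"))
```

Calling `match(A_OR_B_STAR_C, "abac")` walks the expression tree directly against the input, trying alternatives and backtracking on failure, which is the interpreter-like behaviour the abstract alludes to.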
Aug, 16

Training Logistic Regression and SVM on 200GB Data Using b-Bit Minwise Hashing and Comparisons with Vowpal Wabbit (VW)

We generated a dataset of 200 GB with 10^9 features, to test our recent b-bit minwise hashing algorithms for training very large-scale logistic regression and SVM. The results confirm our prior work that, compared with the VW hashing algorithm (which has the same variance as random projections), b-bit minwise hashing is substantially more accurate at […]
Aug, 16

Topical perspective on massive threading and parallelism

Unquestionably, computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified […]
Aug, 16

Tileable BTF

This paper presents a modular framework to efficiently apply bidirectional texture functions (BTFs) to object surfaces. The basic building blocks are the BTF tiles. By constructing one set of BTF tiles, a wide variety of objects can be textured seamlessly without resynthesizing the BTF. The proposed framework nicely decouples the surface appearance from the […]
Aug, 16

Browsing Large Image Datasets through Voronoi Diagrams

Conventional browsing of image collections uses mechanisms such as thumbnails arranged on a regular grid or on a line, often mounted over a scrollable panel. However, this approach does not scale well with the size of the datasets (number of images). In this paper, we propose a new thumbnail-based interface to browse large collections of […]
Aug, 16

An algorithm-architecture co-design framework for gridding reconstruction using FPGAs

Gridding is a method of interpolating irregularly sampled data onto a uniform grid and is a critical image reconstruction step in several applications that operate on non-Cartesian sampled data. In this paper, we present an algorithm-architecture co-design framework for accelerating gridding using FPGAs. We present a parameterized hardware library for accelerating gridding to support […]
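To make the idea of gridding concrete, here is a minimal one-dimensional sketch in plain Python. It assumes a linear (triangular) interpolation kernel and unit grid spacing, chosen only for brevity; the paper's own kernel and hardware mapping are not given in this excerpt, and the function name `grid_1d` is hypothetical.

```python
import math


def grid_1d(positions, values, n_grid):
    """Spread irregular samples onto grid points 0 .. n_grid-1.

    Each sample contributes to its two neighbouring grid points with
    linear weights (an assumed kernel); each grid point is then
    normalised by the total weight it received.
    """
    acc = [0.0] * n_grid   # weighted sum of sample values per grid point
    wsum = [0.0] * n_grid  # sum of weights per grid point
    for x, v in zip(positions, values):
        left = math.floor(x)
        frac = x - left
        for idx, w in ((left, 1.0 - frac), (left + 1, frac)):
            if 0 <= idx < n_grid and w > 0.0:
                acc[idx] += w * v
                wsum[idx] += w
    # Grid points that received no samples are left at 0.0.
    return [a / w if w > 0.0 else 0.0 for a, w in zip(acc, wsum)]
```

For instance, `grid_1d([0.0, 2.0], [1.0, 3.0], 3)` places each sample exactly on a grid point and leaves the untouched middle point at zero. The inner accumulation loop is the compute-intensive part that accelerators typically target.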
Aug, 16

Accelerating the Nonuniform Fast Fourier Transform Using FPGAs

We present an FPGA accelerator for the Non-uniform Fast Fourier Transform, which is a technique to reconstruct images from arbitrarily sampled data. We accelerate the compute-intensive interpolation step of the NuFFT Gridding algorithm by implementing it on an FPGA. In order to ensure efficient memory performance, we present a novel FPGA implementation for Geometric Tiling […]
Aug, 16

SHARC: A streaming model for FPGA accelerators and its application to Saliency

Reconfigurable hardware such as FPGAs is being increasingly employed for accelerating compute-intensive applications. While recent advances in technology have increased the capacity of FPGAs, the lack of standard models for developing custom accelerators creates issues with scalability and compatibility. We present SHARC (Streaming Hardware Accelerator with Run-time Configurability) for an FPGA-based accelerator. This model is […]
Aug, 16

Automatic Point Target Detection for Interactive Visual Analysis of SAR Images

Point target analysis is an important tool to analyze the quality of SAR images. To permit interactive visual analysis, visualization applications need to automatically detect point targets in a SAR image and estimate associated quality measurements such as the peak sidelobe ratio (PSLR). This task is computationally expensive. In this paper, we propose methods for […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org