Posts
Nov, 17
FAST: fast architecture sensitive tree search on modern CPUs and GPUs
In-memory tree-structured index search is a fundamental database operation. Modern processors provide tremendous computing power by integrating multiple cores, each with wide vector units. There has been much work to exploit modern processor architectures for database primitives like scan, sort, join and aggregation. However, unlike other primitives, tree search presents significant challenges due to […]
Nov, 17
Scalable parallel programming with CUDA
Is CUDA the parallel programming model that application developers have been waiting for?
Nov, 17
Fast free-form deformation using graphics processing units
Many algorithms have been developed to perform non-rigid registration, a tool commonly used in medical image analysis. The free-form deformation algorithm is a well-established technique, but is extremely time-consuming. In this paper we present a parallel-friendly formulation of the algorithm suitable for graphics processing unit execution. Using our […]
Nov, 17
A Graphics Parallel Memory Organization Exploiting Request Correlations
Real-time graphics applications require memory organizations featuring parallel pixel access and low-cost implementation. This work is based on a nonlinear skew mapping scheme and exploits the correlation between consecutive requests for pixels to design an efficient parallel memory organization. The mapping achieves parallel access, of mn pixels in various shapes, to the memory organized with mn […]
Nov, 17
permGPU: Using graphics processing units in RNA microarray association studies
BACKGROUND: Many analyses of microarray association studies involve permutation and bootstrap resampling, and cross-validation, that are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. RESULTS: We have developed a CUDA-based implementation, permGPU, that employs graphics processing […]
Nov, 17
Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation with Nvidia CUDA Compatible Devices
In this paper, we propose an acceleration of collapsed variational Bayesian (CVB) inference for latent Dirichlet allocation (LDA) by using Nvidia CUDA compatible devices. While LDA is an efficient Bayesian multi-topic document model, it requires complicated computations for parameter estimation in comparison with other simpler document models, e.g. probabilistic latent semantic indexing. Therefore, we […]
Nov, 17
Accelerating simultaneous algebraic reconstruction technique with motion compensation using CUDA-enabled GPU
PURPOSE: To accelerate the simultaneous algebraic reconstruction technique (SART) with motion compensation for fast, high-quality computed tomography reconstruction by exploiting a CUDA-enabled GPU. METHODS: Two core techniques are proposed to fit SART into the CUDA architecture: (1) a ray-driven projection along with hardware trilinear interpolation, and (2) a voxel-driven back-projection that can avoid redundant computation […]
Nov, 17
Eye-Full Tower: A GPU-based variable multibaseline omnidirectional stereovision system with automatic baseline selection for outdoor mobile robot navigation
In recent years, there has been a gradual increase in the number of researchers and projects involved in the development of omnidirectional vision systems for various applications. The primary factors contributing to this positive trend are the availability of inexpensive and high-resolution vision sensors, robust and fast computers and […]
Nov, 17
SHEsisEpi, a GPU-enhanced genome-wide SNP-SNP interaction scanning algorithm, efficiently reveals the risk of genetic epistasis in bipolar disorder
We developed a GPU-based analytical method, named SHEsisEpi, which focuses purely on risk epistasis in a genome-wide association study (GWAS) of complex traits, excluding the contamination of marginal effects caused by single-locus association. We analyzed the Wellcome Trust Case Control Consortium’s (WTCCC) GWAS data of bipolar disorder (BPD) with 500K SNPs.
Nov, 17
Alignator: A GPU powered software package for robust fiducial-less alignment of cryo tilt-series
The robust alignment of tilt-series collected for cryo-electron tomography in the absence of fiducial markers is a problem that, especially for tilt-series of vitreous sections, still represents a significant challenge. Here we present a complete software package that implements a cross-correlation-based procedure that tracks similar image features that are present in several micrographs and […]
Nov, 17
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
We present a Hessenberg reduction (HR) algorithm for hybrid systems of homogeneous multicore with GPU accelerators that can exceed 25× the performance of the corresponding LAPACK algorithm running on current homogeneous multicores. This enormous acceleration is due to proper matching of algorithmic requirements to architectural strengths of the system
Nov, 17
Direct numerical simulation of sub-grid structures in gas-solid flow — GPU implementation of macro-scale pseudo-particle modeling
Due to significant multi-scale heterogeneity, understanding sub-grid structures is critical to effective continuum-based description of gas-solid flow. However, it is challenging for both physical measurements and numerical simulations. In this article, with the macro-scale pseudo-particle method (MaPPM) implemented on a GPU-based HPC system, up to 30,000 fluidized solids are simulated using the N-S equation directly. […]

