Posts

Jun, 12

Optimization Techniques on GPU: A Survey

In this paper, we present a comprehensive survey on parallelizing the computations involved in optimization problems on GPUs using CUDA. Many researchers have reported significant speedups using CUDA on GPUs. Stochastic, metaheuristic and heuristic algorithms, e.g., Mixed Integer Non-linear Programming (MINLP), Central Force Optimization (CFO), Genetic Algorithms (GA), Particle Swarm Optimization (PSO), etc. are […]
Jun, 12

A GPU-accelerated immersive audio-visual framework for interaction with molecular dynamics using consumer depth sensors

With advances in computational power, the rapidly growing role of computational/simulation methodologies in the physical sciences, and the development of new human–computer interaction technologies, the field of interactive molecular dynamics seems destined to expand. In this paper, we describe and benchmark the software algorithms and hardware setup for carrying out interactive molecular dynamics utilizing an […]
Jun, 11

Parallel Prefix Scan with Compute Unified Device Architecture (CUDA)

Parallel prefix scan, also known as parallel prefix sum, is a building block for many parallel algorithms including polynomial evaluation, sorting and building data structures. This paper introduces prefix scan and also describes a step-by-step procedure to implement prefix scan efficiently with Compute Unified Device Architecture (CUDA). This paper starts with a basic naive algorithm […]
Jun, 11

Intersecting two families of sets on the GPU

The problem of intersecting two families of sets F and F’ is to find the family I of all the sets which are the intersection of some set in F and some other set in F’. In this paper we present an efficient parallel GPU-based approach, designed under CUDA architecture, to solve the problem. The […]
Jun, 11

GPU Implementation of Gaussian Processes

Gaussian process models (henceforth Gaussian Processes) provide a probabilistic, non-parametric framework for inferring posterior distributions over functions from general prior information and observed noisy function values. This, however, comes with a computational burden of O(N^3) for training and O(N^2) for prediction, where N is the size of the training set [1]. Therefore, this method does […]
Jun, 11

Enabling High Performance Computing in Cloud Infrastructure using rCUDA

With the dawn of virtualization and Infrastructure-as-a-Service (IaaS), the comprehensive technical computing community is in view of the use of clouds for their technical computing needs. This is due to the relative scalability, ease of use, advanced user milieu customization abilities clouds provide, as well as many novel computing archetypes available for data-intensive applications. However, […]
Jun, 11

Canadian Hydrogen Intensity Mapping Experiment (CHIME) Pathfinder

A pathfinder version of CHIME (the Canadian Hydrogen Intensity Mapping Experiment) is currently being commissioned at the Dominion Radio Astrophysical Observatory (DRAO) in Penticton, BC. The instrument is a hybrid cylindrical interferometer designed to measure the large scale neutral hydrogen power spectrum across the redshift range 0.8 to 2.5. The power spectrum will be used […]
Jun, 9

Improvement Study of EEMD Decomposition Efficiency Based on CUDA Architecture

EEMD can inhibit the mode mixing that may occur in EMD. EEMD adds many groups of white noise to the original signal to assist analysis on the basis of EMD; however, this greatly reduces the decomposition efficiency of the signal. In order to eliminate the effects of mode mixing, and improve […]
Jun, 9

Efficient all-against-all protein similarity matrix computation using OpenCL

In this report we introduced CLSW, a fast GPU-based Smith-Waterman score-only-alignment calculator. While generally applicable to any protein alignment problem, it was designed specifically as a proof-of-concept application for SIMAP. Even though we had only two weeks to develop a fully functional, validated and optimized implementation and all related concepts, our results show that in […]
Jun, 9

GPU-Accelerated Dynamic Functional Connectivity Analysis for Functional MRI Data Using OpenCL

Intense computations in engineering and science, especially bioinformatics, have been made practical by the recent advances in Graphical Processing Unit (GPU) computing technology. In this study, the implementation and performance evaluation of a GPU-accelerated dynamic functional connectivity (DFC) analysis, which is an analysis method for investigating dynamic interactions among different brain networks, are presented. Open Computing […]
Jun, 9

3D Skeleton Extraction Method using Potential Field on OpenCL

For 3D skeleton extraction, the algorithm based on generalized potential fields, known as an outstandingly flexible and robust method, suffers from a heavy computational burden. In this paper, we put forward a parallel algorithm based on the OpenCL heterogeneous parallel framework, which can make full use of the great computing power provided by heterogeneous model […]
Jun, 9

Multi-level parallelization for hybrid ACO

The Graphics-Processing-Unit (GPU) has become one of the main platforms for designing massively parallel metaheuristics. This advance is due to the highly parallel architecture of the GPU and especially to the publication of languages like CUDA. In this paper, we deal with a multilevel parallel hybrid Ant System (AS) to solve the Travelling Salesman Problem (TSP). […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
