Posts

Jan, 26

Parallel particle swarm optimization using GPGPU

This work presents a parallelization method for the Particle Swarm Optimization algorithm using a low-cost architecture: a General Purpose Graphics Processing Unit (GPGPU). The strategies to better suit the architecture's main characteristics are addressed along with success rates and convergence times for the optimization of Rastrigin’s and Ackley’s functions on a 30-dimensional search space, and compared […]
Jan, 26

Defocus Magnification with CUDA

In photography, depth-of-field can be used to make the main subject more prominent. Photographers can modify the depth-of-field range by adjusting the aperture size. Unfortunately, due to the limitation caused by the physical diameter of the lens aperture and the area of the photodiode, the compact camera cannot control the depth-of-field […]
Jan, 26

GPU Nonlinear Fixed Points, with an application to GPU IFS Rendering

Nonlinear functions, including nonlinear iterated function systems, have interesting fixed points. We present a non-Lipschitz theoretical approach to nonlinear function system fixed points which generalizes to non-contractive functions, compare several methods for evaluating such fixed points on modern graphics hardware, and present a nonlinear generalization of Barnsley’s Deterministic Iteration Algorithm. Unlike the many existing randomized […]
Jan, 26

Accounting for Uncertainty in Medical Data: A CUDA Implementation of Normalized Convolution

The domain of medical imaging is naturally moving towards methods that can represent, and account for, local uncertainties in the image data. Even so, fast and efficient solutions that take uncertainty into account are not readily available even for common problems such as gradient estimation. In this work we present a CUDA implementation of Normalized […]
Jan, 26

CUDA raytracing algorithm for visualizing discrete element model output

A raytracing algorithm is constructed using the CUDA API for visualizing output from a CUDA discrete element model, which outputs spatial information in dynamic particle systems. The raytracing algorithm is optimized with constant memory and compilation flags, and performance is measured as a function of the number of particles and the number of pixels. The […]
Jan, 26

CUDA Fortran for Scientists and Engineers

This document is intended for scientists and engineers who develop or maintain computer simulations and applications in Fortran, and who would like to harness the parallel processing power of graphics processing units (GPUs) to accelerate their code. The goal here is to provide the reader with the fundamentals of GPU programming using CUDA Fortran as well […]
Jan, 26

A Parallel Ant Colony Optimization Algorithm for the Travelling Salesman Problem: Improving Performance Using CUDA

The ant colony optimization (ACO) algorithm is a metaheuristic algorithm used for combinatorial optimization problems. It is a good choice for many hard combinatorial problems because it is more efficient than brute-force methods and produces better solutions than greedy algorithms. However, ACO is computationally expensive, and it can still take a long time to […]
Jan, 26

Using a GPU, Online Diarization – Offline Diarization

This article presents a low-latency, online speaker diarization system ("who is speaking now?") based on the repeated execution of a GPU-optimized, highly efficient offline diarization system ("who spoke when"). The system fulfills all requirements of the diarization task, i.e., it does not require any a priori information about the input, including specific speaker models. In […]
Jan, 26

Parallel Symbolic Analysis of Large Analog Circuits on GPU Platforms

The graph-based symbolic technique is a viable tool for calculating the behavior or the characteristics of an analog circuit. Traditional symbolic analysis tools are typically used to calculate the behavior or the characteristics of a circuit in terms of symbolic parameters (Gielen et al., 1994). The introduction of the determinant decision diagram based symbolic analysis technique allows […]
Jan, 26

Visual Data Mining Using the Point Distribution Tensor

We explore a novel algorithm to analyze arbitrary distributions of 3D points. Using a direct tensor field visualization technique makes it easy to identify regions of linear, planar or isotropic structure. This approach is well suited to visual data mining and is exemplified on geoscience applications. It makes it possible to distinguish, for example, power lines and flat terrains in […]
Jan, 25

Scalable Parallel Minimum Spanning Forest Computation

The proliferation of data in graph form calls for the development of scalable graph algorithms that exploit parallel processing environments. One such problem is the computation of a graph’s minimum spanning forest (MSF). Past research has proposed several parallel algorithms for this problem, yet none of them scales to large, high-density graphs. In this paper […]
Jan, 25

Parallel LDPC Decoder Implementation on GPU Based on Unbalanced Memory Coalescing

We consider flexible decoder implementation of low density parity check (LDPC) codes via compute-unified-device-architecture (CUDA) programming on a graphics processing unit (GPU), a research subject of considerable recent interest. An important issue in LDPC decoder design based on CUDA-GPU is realizing coalesced memory access, a technique that reduces memory transaction time considerably. In previous works along […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
