
Posts

Nov, 28

Parallel Pseudo-Random Number Generation

This is a preliminary report on parallel pseudo-random number generation. It was written under tight time constraints, so it makes no claim to being an exhaustive survey of the field, which is already extensive and in a state of flux as new computer architectures are introduced.

Nov, 28

Domain-Specific Optimizations Supporting Real-Time Image Compression

The work focuses on the utilization of massively parallel processors for image compression acceleration. The text of the work studies GPU architecture, common GPU programming frameworks, and domain-specific languages providing higher-level programming abstraction. The aim of the PhD thesis is to contribute to the effective software development for massively parallel processors through a domain specific […]
Nov, 27

Assembly of finite element methods on graphics processors

Recently, graphics processing units (GPUs) have had great success in accelerating many numerical computations. We present their application to computations on unstructured meshes such as those in finite element methods. Multiple approaches in assembling and solving sparse linear systems with NVIDIA GPUs and the Compute Unified Device Architecture (CUDA) are created and analyzed. Multiple strategies […]
Nov, 27

Computing room acoustics with CUDA-3D FDTD schemes with boundary losses and viscosity

In seeking to model realistic room acoustics, direct numerical simulation can be employed. This paper presents 3D Finite Difference Time Domain schemes that incorporate losses at boundaries and due to the viscosity of air. These models operate within a virtual room designed on a detailed floor plan. The schemes are computed at 44.1 kHz, using large-scale […]
Nov, 27

A GPU-based Simulation for Stochastic Computing

Stochastic computing performs operations using streams of bits that represent probability values instead of deterministic values. An important benefit of stochastic computing is that it can tolerate a large number of failures in a noisy system. Additionally, for the VLSI implementation of a sophisticated algorithm, a stochastic implementation can consume much less hardware with lower […]
Nov, 27

Scalable Multi-Cache Simulation Using GPUs

Software simulation is the primary tool used for evaluation of processor design. Simulation offers better accuracy than analytical models and is an important evaluation step before actually fabricating a chip. Unfortunately, simulator speeds are slow — a conventional cycle-accurate simulator will be unable to keep up with increasing core counts in modern processor design. Parallel […]
Nov, 27

Towards paradisEO-MO-GPU: a framework for GPU-based local search metaheuristics

This paper is a major step towards a pioneering software framework for the reusable design and implementation of parallel metaheuristics on Graphics Processing Units (GPU). The objective is to revisit the ParadisEO framework to allow its utilization on GPU accelerators. The focus is on local search metaheuristics and the parallel exploration of their neighborhood. The […]
Nov, 27

Overlapping Computation and Communication for Advection on Hybrid Parallel Computers

We describe computational experiments exploring the performance improvements from overlapping computation and communication on hybrid parallel computers. Our test case is explicit time integration of linear advection with constant uniform velocity in a three-dimensional periodic domain. The test systems include a Cray XT5, a Cray XE6, and two multicore Infiniband clusters with different generations of […]
Nov, 27

Numerical Precision and Benchmarking Very-High-Order Integration of Particle Dynamics on GPU Accelerators

GPUs offer a powerful acceleration platform for many scientific applications. Numerical integration of classical Newtonian dynamical particles often requires very high-order numerical accuracy. We assess the floating-point precision and performance of various GPUs for applications involving high-order time-step integration methods for particle model simulations using N-squared interactions. We demonstrate how high-order algorithms can be expressed […]
Nov, 27

GPU Implementation of Spiking Neural Networks for Color Image Segmentation

Spiking neural networks (SNN) are a powerful computational model inspired by the human neural system for engineers and neuroscientists to simulate intelligent computation of the brain. Inspired by the visual system, various spiking neural network models have been used to process visual images. However, it is time-consuming to simulate a large scale of spiking neurons in […]
Nov, 27

A low-cost 3D human interface device using GPU-based optical flow algorithms

With few exceptions, it is now very common to find a camera embedded in a consumer-grade laptop, notebook, mobile internet device (MID), mobile phone or handheld game console. Some of them also have a Graphics Processing Unit (GPU) to handle 3D graphics and other related tasks. This trend will probably continue in […]
Nov, 27

A GPU based Parallel Hierarchical Fuzzy ART Clustering

Hierarchical clustering is an important and powerful but computationally extensive operation. Its complexity motivates the exploration of highly parallel approaches such as Adaptive Resonance Theory (ART). Although ART has been implemented on GPU processors, this paper presents the first hierarchical ART GPU implementation we are aware of. Each ART layer is distributed in the GPU’s […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
