
Posts

Nov, 16

Simulations of Large Particle Systems in Real Time

Simulation of interacting particle systems has been a well-established method for many years. Such systems can span different scales, from microscopic (where particles represent atoms, as in Molecular Dynamics simulations) to macroscopic. In the latter case, there is growing interest in the Smoothed Particle Hydrodynamics approach. Traditionally, over many years, simulation of […]
Nov, 16

Object Space Based Collision Detection for Cloth Simulation on the GPU

This paper presents an approach for cloth-body collision detection in computer graphics simulations of clothing. It is an object-space based algorithm implemented in OpenCL on the GPU. The underlying idea behind this work is to speed up the solution of the collision detection problem by utilizing the excessive computational capacity of contemporary GPUs. Results of […]
Nov, 16

Parallel Approach for Longest Common Subsequence problem on GPU

Recent developments in genomic and molecular technologies have produced a tremendous amount of information related to molecular biology. The management and analysis of these biological data require intensive computing power. Sequence alignment is one of the algorithmic tools in bioinformatics used to look for resemblance among sequences of amino acids. The longest common subsequence (LCS) of biological […]
Nov, 16

Scope for performance enhancement of CMU Sphinx by parallelising with OpenCL

An Automatic Speech Recognition (ASR) system that utilises a many-core Graphics Processing Unit (GPU) architecture enables a myriad of emerging applications such as mobile-based speech recognition, multimedia content transcription, and voice-based language translation. This article discusses the feasibility and challenges of enhancing the performance of CMU Sphinx-3.08 by parallelising the data-parallel parts using OpenCL that can utilise the […]
Nov, 16

CUDA and OpenCL-based asynchronous PSO

In "synchronous" PSO, positions and velocities of all particles are updated in turn in each "generation", after which each particle’s new fitness is evaluated. The value of the social attractor is only updated at the end of each generation, when the fitness values of all particles are known. The "asynchronous" version of PSO, instead, allows […]
Nov, 16

Grids, Clouds and Virtualization

Provides a thorough introduction and overview of existing technologies in grids, clouds and virtualization, including a brief history of the field. Examines both business and scientific applications of grids and clouds. Presents contributions from an international selection of experts in the field. Research into grid computing has been driven by the need to solve large-scale, […]
Nov, 16

Lattice-Boltzmann simulation of the shallow-water equations with fluid-structure interaction on multi- and manycore processors

We present an efficient method for the simulation of laminar fluid flows with free surfaces including their interaction with moving rigid bodies, based on the two-dimensional shallow water equations and the Lattice-Boltzmann method. Our implementation targets multiple fundamentally different architectures such as commodity multicore CPUs with SSE, GPUs, the Cell BE and clusters. We show […]
Nov, 16

Hybrid Parallelism for Volume Rendering on Large, Multi- and Many-core Systems

With the computing industry trending towards multi- and many-core processors, we study how a standard visualization algorithm, ray-casting volume rendering, can benefit from a hybrid parallelism approach. Hybrid parallelism provides the best of both worlds: using distributed-memory parallelism across a large number of nodes increases available FLOPs and memory, while exploiting shared-memory parallelism among the […]
Nov, 16

A Full-Depth Amalgamated Parallel 3D Geometric Multigrid Solver for GPU Clusters

Numerical computations of incompressible flow equations with pressure-based algorithms necessitate the solution of an elliptic Poisson equation, for which multigrid methods are known to be very efficient. In our previous work we presented a dual-level (MPI-CUDA) parallel implementation of the Navier-Stokes equations to simulate buoyancy-driven incompressible fluid flows on GPU clusters with simple iterative methods […]
Nov, 16

Prospects for scalable 3D FFTs on heterogeneous exascale systems

We consider the problem of implementing scalable three-dimensional fast Fourier transforms with an eye toward future exascale systems comprised of graphics co-processors (GPUs) or other similarly high-density compute units. We describe a new software implementation; derive and calibrate a suitable analytical performance model; and use this model to make predictions about potential outcomes at exascale, […]
Nov, 15

High Performance Data Mining Using R on Heterogeneous Platforms

The exponential increase in the generation and collection of data has led us into a new era of data analysis and information extraction. Conventional systems based on general-purpose processors are unable to keep pace with the heavy computational requirements of data mining techniques. High performance co-processors like GPUs and FPGAs have the potential to handle […]
Nov, 15

Variable selection in a GPU cluster using delta test

The work presented in this paper consists of an adaptation of a Genetic Algorithm (GA) to perform variable selection in a heterogeneous cluster where the nodes are themselves clusters of GPUs. Due to this heterogeneity, several mechanisms to perform load balancing will be discussed as well as the optimization of the fitness function to […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors