Posts

Sep, 19

Computing without processors

Heterogeneous systems allow us to target our programming to the appropriate environment. From the programmer’s perspective the distinction between hardware and software is being blurred. As programmers struggle to meet the performance requirements of today’s systems, they will face an ever increasing need to exploit alternative computing elements such as GPUs (graphics processing units), which […]
Sep, 19

Real-time ray casting of algebraic B-spline surfaces

Piecewise algebraic B-spline surfaces (ABS surfaces) are capable of modeling globally smooth shapes of arbitrary topology. These can be potentially applied in geometric modeling, scientific visualization, computer animation and mathematical illustration. However, real-time ray casting the surface is still an obstacle for interactive applications, due to the large amount of numerical root findings of nonlinear […]
Sep, 19

The TheLMA project: Multi-GPU implementation of the lattice Boltzmann method

In this paper, we describe the implementation of a multi-graphical processing unit (GPU) fluid flow solver based on the lattice Boltzmann method (LBM). The LBM is a novel approach in computational fluid dynamics, with numerous interesting features from a computational, numerical, and physical standpoint. Our program is based on CUDA and uses POSIX threads to […]
Sep, 19

Improving SIMD efficiency for parallel Monte Carlo light transport on the GPU

Monte Carlo Light Transport algorithms such as Path Tracing (PT), Bi-Directional Path Tracing (BDPT) and Metropolis Light Transport (MLT) make use of random walks to sample light transport paths. When parallelizing these algorithms on the GPU the stochastic termination of random walks results in an uneven workload between samples, which reduces SIMD efficiency. In this […]
Sep, 19

Randomized selection on the GPU

We implement here a fast and memory-sparing probabilistic top k selection algorithm on the GPU. The algorithm proceeds via an iterative probabilistic guess-and-check process on pivots for a three-way partition. When the guess is correct, the problem is reduced to selection on a much smaller set. This probabilistic algorithm always gives a correct result and […]
Sep, 19

Hybrid smoothed particle hydrodynamics

We present a new algorithm for enforcing incompressibility for Smoothed Particle Hydrodynamics (SPH) by preserving uniform density across the domain. We propose a hybrid method that uses a Poisson solve on a coarse grid to enforce a divergence free velocity field, followed by a local density correction of the particles. This avoids typical grid artifacts […]
Sep, 19

Simpler and faster HLBVH with work queues

A recently developed algorithm called Hierarchical Linear Bounding Volume Hierarchies (HLBVH) has demonstrated the feasibility of reconstructing the spatial index needed for ray tracing in real time, even in the presence of millions of fully dynamic triangles. In this work we present a simpler and faster variant of HLBVH, where all the complex book-keeping of prefix […]
Sep, 19

A GPU-tailored approach for training kernelized SVMs

We present a method for efficiently training binary and multiclass kernelized SVMs on a Graphics Processing Unit (GPU). Our methods apply to a broad range of kernels, including the popular Gaussian kernel, on datasets as large as the amount of available memory on the graphics card. Our approach is distinguished from earlier work in […]
Sep, 19

Stream computing on graphics hardware

The raw compute performance of today’s graphics processor is truly amazing. With peak performance of over 60 GFLOPS, the compute power of the graphics processor (GPU) dwarfs that of today’s commodity CPU at a price of only a few hundred dollars. As the programmability and performance of modern graphics hardware continue to increase, many researchers […]
Sep, 19

A small-world network model for distributed storage of semantic metadata

The growing uptake of semantic web and grid ideas is raising the importance of optimising distribution algorithms for semantic metadata. While it is not yet clear how real-world metadata distribution patterns ought to evolve, practical experience of social and technical networks suggests that a small-world pattern is desirable and practical. We explore simulated small-world networks […]
Sep, 19

Using many-core hardware to correlate radio astronomy signals

A recent development in radio astronomy is to replace traditional dishes with many small antennas. The signals are combined to form one large, virtual telescope. The enormous data streams are cross-correlated to filter out noise. This is especially challenging, since the computational demands grow quadratically with the number of data streams. Moreover, the correlator is […]
Sep, 19

An adaptative game loop architecture with automatic distribution of tasks between CPU and GPU

This article presents a new architecture to implement all game loop models for games and real-time applications that use the GPU as a mathematics and physics coprocessor, working in parallel processing mode with the CPU. The presented model applies automatic task distribution concepts. The architecture can apply a set of heuristics defined in Lua scripts […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
