Posts

Feb, 9

Accelerating urban fast response Lagrangian dispersion simulations using inexpensive graphics processor parallelism

Owing to the potential consequences associated with accidental or deliberate releases of chemical or biological agents in urban areas, fast response urban dispersion models must rapidly provide solutions that can be easily analyzed by researchers and emergency responders. In this paper, we describe a novel application of an existing Lagrangian dispersion modeling system to achieve […]
Feb, 9

Next-generation acceleration and code optimization for light transport in turbid media using GPUs

A highly optimized Monte Carlo (MC) code package for simulating light transport is developed on the latest graphics processing unit (GPU) built for general-purpose computing from NVIDIA – the Fermi GPU. In biomedical optics, the MC method is the gold standard approach for simulating light transport in biological tissue, both due to its accuracy and […]
Feb, 9

Deep Big Simple Neural Nets Excel on Handwritten Digit Recognition

Good old on-line back-propagation for plain multi-layer perceptrons yields a very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images, and graphics cards to greatly speed up learning.
Feb, 9

Deep, Big, Simple Neural Nets for Handwritten Digit Recognition

Good old online backpropagation for plain multilayer perceptrons yields a very low 0.35% error rate on the MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images to avoid overfitting, and graphics cards to greatly speed up learning. Good […]
Feb, 9

Real-time GPU color-based segmentation of football players

In this paper, we propose a multi-camera application capable of processing high-resolution images and extracting features based on color patterns using graphics processing units (GPUs). The goal is to work in real time under the uncontrolled environment of a sporting event like a football match. Since football players are composed for diverse and complex […]
Feb, 9

Scalable SMT-based verification of GPU kernel functions

Interest in Graphical Processing Units (GPUs) is skyrocketing due to their potential to yield spectacular performance on many important computing applications. Unfortunately, writing such efficient GPU kernels requires painstaking manual optimization effort which is very error prone. We contribute the first comprehensive symbolic verifier for kernels written in CUDA C. Called the ‘Prover of User […]
Feb, 8

Mobile 3D Graphics

While advanced 3D graphics techniques are well understood and supported by industry standards, this is less true in the emerging mobile applications and games market. This book redresses this imbalance, providing an in-depth look at the new OpenGL ES (The Standard for Embedded Accelerated 3D Graphics), and shows what these new embedded systems graphics libraries […]
Feb, 8

Real-time relief mapping on arbitrary polygonal surfaces

This paper presents a technique for mapping relief textures onto arbitrary polygonal models in real time. In this approach, the mapping of the relief data is done in tangent space. As a result, it can be applied to polygonal representations of curved surfaces producing correct self-occlusions, interpenetrations, shadows and per-pixel lighting effects. The approach can […]
Feb, 8

Adaptive sampling of intersectable models exploiting image and object-space coherence

We present a sampling strategy and rendering framework for intersectable models, whose surface is implicitly defined by a black box intersection test that provides the location and normal of the closest intersection of a ray with the surface. To speed up image generation despite potentially slow intersection tests, our method exploits spatial coherence by adjusting […]
Feb, 8

VolQD: Direct Volume Rendering of Multi-million Atom Quantum Dot Simulations

In this work we present a hardware-accelerated direct volume rendering system for visualizing multivariate wave functions in semiconducting quantum dot (QD) simulations. The simulation data contains the probability density values of multiple electron orbitals for up to tens of millions of atoms, computed by the NEMO3-D quantum device simulator software run on large-scale cluster architectures. […]
Feb, 8

Automatic C-to-CUDA Code Generation for Affine Programs

Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded parallel programming model, facilitating high performance implementations of general-purpose computations. However, the explicitly managed memory hierarchy and multi-level parallel view make manual development of high-performance CUDA code rather complicated. Hence the automatic transformation of sequential input programs into efficient […]
Feb, 8

Automatic program parallelization for multicore processors

With the advent of multi-core processors, the problem of designing applications that can efficiently utilize their performance becomes more and more important. Moreover, developing programs for these processors requires some additional, specific knowledge of the processor architecture from programmers. In multi-core systems, efficient program execution is the main issue. It can even happen that […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
