1862

Posts

Nov, 28

Time-varying clustering for local lighting and material design

This paper presents an interactive graphics processing unit (GPU)-based relighting system in which local lighting condition, surface materials and viewing direction can all be changed on the fly. To support these changes, we simulate the lighting transportation process at run time, which is normally impractical for interactive use due to its huge computational burden. […]
Nov, 28

Shader-based tessellation to save memory bandwidth in a mobile multimedia processor

In this paper, we propose an architecture of tessellation hardware to save memory bandwidth in a mobile multimedia processor. To reduce the implementation overhead, floating-point computations of tessellation are accelerated by the conventional GPU pipeline, and only tessellation-specific control logic is handled by an additional hardware unit. Tightly coupled with a vertex shader, the additional […]
Nov, 28

Complexity effective memory access scheduling for many-core accelerator architectures

Modern DRAM systems rely on memory controllers that employ out-of-order scheduling to maximize row access locality and bank-level parallelism, which in turn maximizes DRAM bandwidth. This is especially important in graphics processing unit (GPU) architectures, where the large quantity of parallelism places a heavy demand on the memory system. The logic needed for out-of-order scheduling […]
Nov, 27

2011 Symposium on Application Accelerators in High Performance Computing (SAAHPC’11)

What do GPUs, FPGAs, vector processors and other exotic special-purpose chips have in common? They are advanced processor architectures that the scientific community is using to accelerate computationally demanding applications. While high-performance computing systems that use application accelerators are still rare, they will be the norm rather than the exception in the near future. The […]
Nov, 27

Aurally and visually enhanced audio search with soundtorch

Finding a specific or an artistically appropriate sound in a vast collection comprising thousands of audio files containing recordings of, say, footsteps, gunshots, and thunderclaps easily becomes a chore. To improve on this, we have developed an enhanced auditory and graphical zoomable user interface that leverages the human brain’s capability to single out sounds from […]
Nov, 27

Interactive Pixel-Accurate Free Viewpoint Rendering from Images with Silhouette Aware Sampling

We present an integrated, fully GPU-based processing pipeline to interactively render new views of arbitrary scenes from calibrated but otherwise unstructured input views. In a two-step procedure, our method first generates for each input view a dense proxy of the scene using a new multi-view stereo formulation. Each scene proxy consists of a structured cloud […]
Nov, 27

Evenly Spaced Streamlines for Surfaces: An Image-Based Approach

We introduce a novel, automatic streamline seeding algorithm for vector fields defined on surfaces in 3D space. The algorithm generates evenly spaced streamlines fast, simply and efficiently for any general surface-based vector field. It is general because it handles large, complex, unstructured, adaptive resolution grids with holes and discontinuities, does not require a parametrization, […]
Nov, 27

A Massively Parallel Architecture for Bioinformatics

Today’s general purpose computers lack in meeting the requirements on computing performance for standard applications in bioinformatics like DNA sequence alignment, error correction for assembly, or TFBS finding. The size of DNA sequence databases doubles twice a year. On the other hand the advance in computing performance per unit cost only doubles every 2 years. […]
Nov, 27

Accelerating error correction in high-throughput short-read DNA sequencing data with CUDA

Emerging DNA sequencing technologies open up exciting new opportunities for genome sequencing by generating read data with a massive throughput. However, produced reads are significantly shorter and more error-prone compared to the traditional Sanger shotgun sequencing method. This poses challenges for de-novo DNA fragment assembly algorithms in terms of both accuracy (to deal with short, […]
Nov, 27

Parallel reconstruction of neighbor-joining trees for large multiple sequence alignments using CUDA

Computing large multiple protein sequence alignments using progressive alignment tools such as ClustalW requires several hours on state-of-the-art workstations. ClustalW uses a three-stage processing pipeline: (i) pairwise distance computation; (ii) phylogenetic tree reconstruction; and (iii) progressive multiple alignment computation. Previous work on accelerating ClustalW was mainly focused on parallelizing the first stage and achieved good […]
Nov, 27

Towards Accelerated Computation of Atmospheric Equations Using CUDA

Main objective of this paper is to outline possible ways how to achieve a substantial acceleration in case of advection-diffusion equation (A-DE) calculation, which is commonly used for a description of the pollutant behavior in atmosphere. A-DE is a kind of partial differential equation (PDE) and in general case it is usually solved by numerical integration due to its high complexity. These […]
Nov, 27

Boids that see: Using self-occlusion for simulating large groups on GPUs

Behavioral models have been used in the entertainment industry to increase the realism in the simulation of large groups of individuals. Unfortunately, the classical models can be very compute-intensive when very large groups are considered, reducing their applicability in games and other interactive systems. In this article we explore both search space reduction and parallelism […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
