2917

Posts

Feb, 8

Adaptive sampling of intersectable models exploiting image and object-space coherence

We present a sampling strategy and rendering framework for intersectable models, whose surface is implicitly defined by a black box intersection test that provides the location and normal of the closest intersection of a ray with the surface. To speed up image generation despite potentially slow intersection tests, our method exploits spatial coherence by adjusting […]
Feb, 8

VolQD: Direct Volume Rendering of Multi-million Atom Quantum Dot Simulations

In this work we present a hardware-accelerated direct volume rendering system for visualizing multivariate wave functions in semiconducting quantum dot (QD) simulations. The simulation data contains the probability density values of multiple electron orbitals for up to tens of millions of atoms, computed by the NEMO3-D quantum device simulator software run on large-scale cluster architectures. […]
Feb, 8

Automatic C-to-CUDA Code Generation for Affine Programs

Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded parallel programming model, facilitating high performance implementations of general-purpose computations. However, the explicitly managed memory hierarchy and multi-level parallel view make manual development of high-performance CUDA code rather complicated. Hence the automatic transformation of sequential input programs into efficient […]
Feb, 8

Automatic program parallelization for multicore processors

With the advent of multi-core processors, the problem of designing applications that can efficiently utilize their performance becomes more and more important. Moreover, developing programs for these processors requires some additional, specific knowledge of the processor architecture from programmers. In multi-core systems efficient program execution is the main issue. It can even happen that […]
Feb, 8

Partitioning streaming parallelism for multi-cores: a machine learning based approach

Stream based languages are a popular approach to expressing parallelism in modern applications. The efficient mapping of streaming parallelism to multi-core processors is, however, highly dependent on the program and underlying architecture. We address this by developing a portable and automatic compiler-based approach to partitioning streaming programs using machine learning. Our technique predicts the ideal […]
Feb, 8

Real-Time Simulation and Visualization of Subject-Specific 3D Lung Dynamics

In this paper we discuss a framework for modeling the 3D lung dynamics of normal and diseased human subjects and visualizing them using an Augmented Reality (AR) based environment. The framework is based on the results obtained from pulmonary function tests and lung image-data of human subjects obtained from 4D High-Resolution Computed Tomography (HRCT). The […]
Feb, 8

Towards On-Line Digital Doubles

We present a modular system for real-time 3D-scanning of human bodies under motion. The high-resolution shape and colour appearance is captured by several scanning units positioned around the object of interest. Each of these units performs a foreground-background segmentation and computes a valid depth-range for the spatially neighbouring units. Multiple depth-ranges are combined in a […]
Feb, 8

Programmable shaders for deformation rendering

In this paper, we present a method for rendering deformations as part of the programmable shader pipeline of contemporary Graphical Processing Units. In our method, we allow general deformations including cuts. Previous approaches to deformation place the role of the GPU as a general purpose processor for computing vertex displacement. With the advent of vertex […]
Feb, 7

Data parallel acceleration of decision support queries using Cell/BE and GPUs

Decision Support System (DSS) workloads are known to be among the most time-consuming database workloads that process large data sets. Traditionally, DSS queries have been accelerated using large-scale multiprocessors. The topic addressed in this work is to analyze the benefits of using high-performance/low-cost processors such as the GPUs and the Cell/BE to accelerate DSS […]
Feb, 7

Fitting multi-planet transit models to photometric time-data series by evolution strategies

In this paper we present the application of an evolution strategy to the problem of detecting multi-planet transit events in photometric time-data series. Planetary transits occur when a planet regularly eclipses its host star, reducing stellar luminosity. The transit method is amongst the most successful detection methods for exoplanets and is presently performed by space […]
Feb, 7

Learning to Detect Roads in High-Resolution Aerial Images

Reliably extracting information from aerial imagery is a difficult problem with many practical applications. One specific case of this problem is the task of automatically detecting roads. This task is a difficult vision problem because of occlusions, shadows, and a wide variety of non-road objects. Despite 30 years of work on automatic road detection, no […]
Feb, 7

Axel: a heterogeneous cluster with FPGAs and GPUs

This paper describes a heterogeneous computer cluster called Axel. Axel contains a collection of nodes; each node can include multiple types of accelerators such as FPGAs (Field Programmable Gate Arrays) and GPUs (Graphics Processing Units). A Map-Reduce framework for the Axel cluster is presented which exploits spatial and temporal locality through different types of processing […]

* * *


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
