Posts
Dec, 14
GPU Rendering of Relief Mapped Conical Frusta
This paper proposes to use relief-mapped conical frusta (cones cut by planes) to skin skeletal objects. Based on this representation, current programmable graphics hardware can perform the rendering with only minimal communication between the CPU and GPU. A consistent definition of conical frusta including texture parametrization and a continuous surface normal is provided. Rendering is […]
Dec, 14
GPU-Assisted High Quality Particle Rendering
Visualizing dynamic participating media in particle form by fully solving the equations of light transport theory is computationally very expensive. In this paper, we present a computational pipeline for particle volume rendering that is easily accelerated by current GPUs. To fully harness their massively parallel computing power, we transform input particles into […]
Dec, 14
A Flexible Kernel for Adaptive Mesh Refinement on GPU
We present a flexible GPU kernel for adaptive on-the-fly refinement of meshes with arbitrary topology. By simply reserving a small amount of GPU memory to store a set of adaptive refinement patterns, on-the-fly refinement is performed by the GPU, without any preprocessing or additional topology data structure. The level of adaptive refinement can be controlled […]
Dec, 14
A GPU based real-time GPS software receiver
Off-the-shelf graphics processing units provide low-cost, massively parallel computing performance, which can be utilized for the implementation of a GPS software receiver. In order to realize a real-time capable system, the crucial stages of the receiver should be optimized to suit the requirements of a parallel processor. Moreover, the receiver should be capable of providing […]
Dec, 14
Scalable Programming Models for Massively Multicore Processors
Including multiple cores on a single chip has become the dominant mechanism for scaling processor performance. Exponential growth in the number of cores on a single processor is expected to lead in a short time to mainstream computers with hundreds of cores. Scalable implementations of parallel algorithms will be necessary in order to achieve improved […]
Dec, 14
Specular Effects on the GPU: State of the Art
This survey reviews algorithms that can render specular effects, i.e., mirror reflections, refractions, and caustics, on the GPU. We establish a taxonomy of methods based on the three main ways of representing the scene and computing ray intersections with the aid of the GPU, including ray tracing in the original geometry, ray tracing in the […]
Dec, 14
OpenMP in Multicore Architectures (tech. report)
OpenMP is an API (application program interface) used to explicitly direct multi-threaded, shared-memory parallelism. With the advent of multi-core processors, there has been renewed interest in parallelizing programs. Multi-core processors can execute threads in parallel while keeping the cost of communication very low. This opens up new domains to […]
Dec, 14
OpenMP on Multicore Architectures
Dual-core processors are already ubiquitous, quad-core processors will become widespread this year, systems with a larger number of cores exist, and more are planned. Some cores even execute multiple threads. Are these processors just SMP systems on a chip? Is OpenMP ready to be used for these architectures? We take a look at the cache […]
Dec, 14
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures
Understanding the most efficient design and utilization of emerging multicore systems is one of the most challenging questions faced by the mainstream and scientific computing industries in several decades. Our work explores multicore stencil (nearest-neighbor) computations, a class of algorithms at the heart of many structured grid codes, including PDE solvers. We develop a […]
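To make the term concrete, the following is a minimal sketch of a stencil (nearest-neighbor) computation: one Jacobi sweep of a 5-point stencil over a 2D grid. It is not the optimized or auto-tuned code the abstract refers to; the function name `jacobi_sweep`, the equal averaging weights, and the fixed-boundary treatment are assumptions chosen purely for illustration.

```python
def jacobi_sweep(grid):
    """Return a new grid after one 5-point Jacobi sweep.

    Each interior cell becomes the average of its four nearest
    neighbors (up, down, left, right). Boundary cells are copied
    unchanged (an assumed fixed-boundary condition).
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
    return new


# Hypothetical 3x3 example: a cold center cell surrounded by a hot boundary.
grid = [
    [4.0, 4.0, 4.0],
    [4.0, 0.0, 4.0],
    [4.0, 4.0, 4.0],
]
result = jacobi_sweep(grid)
```

Every interior update reads only the previous grid, so the updates within one sweep are independent of each other, which is what makes this class of algorithm a natural target for multicore parallelization.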
Dec, 14
High-performance and Hardware-aware Computing: Proceedings of the First International Workshop on New Frontiers in High-performance and Hardware-aware Computing (HipHaC’08)
The HipHaC workshop aims at combining new aspects of parallel, heterogeneous, and reconfigurable microprocessor technologies with concepts of high-performance computing and, particularly, numerical solution methods. Compute- and memory-intensive applications can only benefit from the full hardware potential if all features on all levels are taken into account in a holistic approach.
Dec, 13
Embracing Heterogeneity: Parallel Programming for Changing Hardware
Computer systems are undergoing significant change: to improve performance and efficiency, architects are exposing more microarchitectural details directly to programmers. Software that exploits specialized accelerators, such as GPUs, and specialized processor features, such as software-controlled memory, exposes limitations in existing compiler and OS infrastructure. In this paper we propose a pragmatic approach, motivated by our […]
Dec, 13
Pseudo-random number generators for Monte Carlo simulations on ATI Graphics Processing Units
Basic uniform pseudo-random number generators are implemented on ATI Graphics Processing Units (GPUs). The performance results of the realized generators (multiplicative linear congruential (GGL), XOR-shift (XOR128), RANECU, RANMAR, RANLUX and Mersenne Twister (MT19937)) on CPU and GPU are discussed. The obtained speedup over the CPU is a factor of hundreds. The RANLUX generator is […]
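For readers unfamiliar with the XOR-shift family, here is a minimal CPU-side sketch of a 128-bit xorshift generator. It assumes that XOR128 refers to Marsaglia's xor128 variant, with shift constants 11, 19 and 8 and the default seeds from his "Xorshift RNGs" paper. It is not the GPU implementation described in the abstract, and the class and method names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # keep all state words within 32 bits


class Xor128:
    """Sketch of a 128-bit xorshift generator (assumed Marsaglia xor128 variant)."""

    def __init__(self, x=123456789, y=362436069, z=521288629, w=88675123):
        # Four 32-bit state words; the defaults are Marsaglia's published seeds.
        self.x, self.y, self.z, self.w = (
            x & MASK32, y & MASK32, z & MASK32, w & MASK32
        )

    def next_u32(self):
        """Advance the state and return the next 32-bit unsigned integer."""
        # The left shift can overflow 32 bits in Python, so mask it back.
        t = (self.x ^ (self.x << 11)) & MASK32
        self.x, self.y, self.z = self.y, self.z, self.w
        self.w = (self.w ^ (self.w >> 19)) ^ (t ^ (t >> 8))
        return self.w

    def next_float(self):
        """Return a uniform float in [0, 1)."""
        return self.next_u32() / 2**32


rng = Xor128()
samples = [rng.next_float() for _ in range(5)]
```

The update uses only shifts and XORs on a small fixed-size state, which is one reason generators of this kind are attractive for Monte Carlo work on parallel hardware.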