Posts
Dec, 11
Value Prediction and Speculative Execution on GPU
GPUs and CPUs have fundamentally different architectures. It is conventional wisdom that GPUs can accelerate only those applications that exhibit very high parallelism, especially vector parallelism such as image processing. In this paper, we explore the possibility of using GPUs for value prediction and speculative execution: we implement software value prediction techniques to accelerate programs […]
Dec, 11
Volume and Isosurface Rendering with GPU-Accelerated Cell Projection
We present an efficient GPU-based implementation of the Projected Tetrahedra (PT) algorithm. By reducing most of the CPU-GPU data transfer, the algorithm achieves interactive frame rates (up to 2.0 M Tets/s) on current graphics hardware. Since no topology information is stored, it requires substantially less memory than recent interactive ray casting approaches. The method uses […]
Dec, 11
Distributed Texture Memory in a Multi-GPU Environment
In this paper we present a consistent, distributed, shared memory system for GPU texture memory. This model enables the virtualization of texture memory and the transparent, scalable sharing of texture data across multiple GPUs. Textures are stored as pages, and as textures are read or written, our system satisfies requests for pages on demand while […]
Dec, 11
Zippy: A Framework for Computation and Visualization on a GPU Cluster
Due to its high performance/cost ratio, a GPU cluster is an attractive platform for large scale general-purpose computation and visualization applications. However, the programming model for high performance general-purpose computation on GPU clusters remains a complex problem. In this paper, we introduce the Zippy framework, a general and scalable solution to this problem. It abstracts […]
Dec, 11
GPU accelerated radio astronomy signal convolution
The increasing array size of radio astronomy interferometers is causing the associated computation to scale quadratically with the number of array signals. Consequently, efficient usage of alternate processing architectures should be explored in order to meet this computational challenge. Affordable parallel processors have been made available to the general scientific community in the form of […]
Dec, 11
Program optimization carving for GPU computing
Contemporary many-core processors such as the GeForce 8800 GTX enable application developers to utilize various levels of parallelism to enhance the performance of their applications. However, iterative optimization for such a system may lead to a local performance maximum, due to the complexity of the system. We propose program optimization carving, a technique that begins […]
Dec, 11
Accelerating molecular dynamics simulations using Graphics Processing Units with CUDA
Molecular dynamics is an important computational tool to simulate and understand biochemical processes at the atomic level. However, accurate simulation of processes such as protein folding requires a large number of both atoms and time steps. This in turn leads to huge runtime requirements. Hence, finding fast solutions is of highest importance to research. In […]
Dec, 11
Displacement Mapping on the GPU – State of the Art
This paper reviews the latest developments of displacement mapping algorithms implemented on the vertex, geometry, and fragment shaders of graphics cards. Displacement mapping algorithms are classified as per-vertex and per-pixel methods. Per-pixel approaches are further categorized, from safe algorithms that aim at correct solutions in all cases to unsafe techniques that may fail in extreme […]
Dec, 11
GPU-boosted online image matching
Matching feature points between images is a key point in many computer vision tasks. As the number of images increases, this rapidly becomes a bottleneck. We here present how to use the power of GPUs to obtain image matching in typically 20 ms for one thousand points. This speedup makes applications like interactive image matching […]
Dec, 11
Future graphics architectures
Graphics architectures are in the midst of a major transition. In the past, these were specialized architectures designed to support a single rendering algorithm: the standard Z buffer. Realtime 3D graphics has now advanced to the point where the Z-buffer algorithm has serious shortcomings for generating the next generation of higher-quality visual effects demanded by […]
Dec, 11
The impact of accelerator processors for high-throughput molecular modeling and simulation
The recent introduction of cost-effective accelerator processors (APs), such as the IBM Cell processor and Nvidia’s graphics processing units (GPUs), represents an important technological innovation which promises to unleash the full potential of atomistic molecular modeling and simulation for the biotechnology industry. Present APs can deliver over an order of magnitude more floating-point operations per […]
Dec, 10
The 20th International ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC 2011
HPDC is the premier computer science conference for presenting new results relating to large scale high performance and distributed systems used in science and industry. For twenty years, HPDC has been at the center of new discoveries in clusters, grids, clouds, and parallel and multicore computers.