Posts
Aug, 10
A Flexible Multi-Volume Shader Framework for Arbitrarily Intersecting Multi-Resolution Datasets
We present a powerful framework for 3D-texture-based rendering of multiple arbitrarily intersecting volumetric datasets. Each volume is represented by a multi-resolution octree-based structure and we use out-of-core techniques to support extremely large volumes. Users define a set of convex polyhedral volume lenses, which may be associated with one or more volumetric datasets. The volumes or […]
Aug, 10
Real-time continuum grass
Simulating grass fields in real time has many applications, such as in virtual reality and games. Accurately modeling grass-grass, grass-object and grass-wind interactions is computationally expensive. In this paper, we present a method to simulate a grass field in real time by treating it as a two-dimensional grid-based continuum and shifting the complex interactions […]
Aug, 10
Performance evaluation and optimization of random memory access on multicores with high productivity
The slow progress in memory access latencies in comparison to CPU speeds has resulted in memory accesses dominating code performance. While architectural enhancements have benefited applications with data locality and sequential access, random memory access still remains a cause for concern. Several benchmarks have been proposed to evaluate the random memory access performance on multicore […]
Aug, 10
A parallel mapping of optical flow to Compute Unified Device Architecture for motion-based image segmentation
This paper describes a correlation-based optical flow algorithm that uses compute unified device architecture (CUDA) technology to achieve fast motion-based image segmentation. Using CUDA, a 240-processor GPU implementation of an optimized correlation-based optical flow algorithm allows segmentation to be achieved at high frame rates on high-resolution video sequences. Details of the mapping of the optical flow […]
Aug, 10
Approaches for parallelizing reductions on modern GPUs
GPU hardware and software have been evolving rapidly. CUDA devices of compute capability 1.1 and higher support atomic operations on device memory, and those of compute capability 1.2 and higher support atomic operations on shared memory. This paper focuses on parallelizing applications involving reductions on GPUs. Prior to the availability of support for locking, these applications could only […]
Aug, 9
G-NetMon: A GPU-accelerated Network Performance Monitoring System
At Fermilab, we have prototyped a GPU-accelerated network performance monitoring system, called G-NetMon, to support large-scale scientific collaborations. In this work, we explore new opportunities in network traffic monitoring and analysis with GPUs. Our system exploits the data parallelism that exists within network flow data to provide fast analysis of bulk data movement between Fermilab […]
Aug, 9
G-NetMon: A GPU-accelerated Network Performance Monitoring System for Large Scale Scientific Collaborations
Network traffic is difficult to monitor and analyze, especially in high-bandwidth networks. Performance analysis, in particular, presents extreme complexity and scalability challenges. GPU (Graphics Processing Unit) technology has been utilized recently to accelerate general purpose scientific and engineering computing. GPUs offer extreme thread-level parallelism with hundreds of simple cores. Their data-parallel execution model can rapidly […]
Aug, 9
Real-Time All-in-Focus Video-Based Rendering Using A Network Camera Array
We present a real-time video-based rendering system using a network camera array. Our system consists of 64 commodity network cameras that are connected to a single PC through Gigabit Ethernet. To render a high-quality novel view, we estimate a view-dependent per-pixel depth map in real time by using a layered representation. The rendering algorithm is […]
Aug, 9
Graphics Processing Units for Handhelds
During the past few years, mobile phones and other handheld devices have gone from only handling dull text-based menu systems to, on an increasing number of models, being able to render high-quality three-dimensional graphics at high frame rates. This paper is a survey of the special considerations that must be taken when designing graphics processing […]
Aug, 9
Geospatial visualization using hardware accelerated real-time volume rendering
We present a visualization framework using direct volume rendering techniques that achieves real-time performance and high image quality. The visualization program runs on a desktop as well as in an immersive environment. The application is named HurricaneVis, and it uses OpenGL, GLSL and VTK. For immersive visualization, VRJuggler is added. To achieve real-time rendering rates […]
Aug, 9
Performance Evaluation of Feature Extraction Algorithm on GPGPU
NVIDIA’s GPGPU-based Compute Unified Device Architecture (CUDA) is a software platform for massively parallel high-performance computing on GPUs. It provides several key abstractions: a hierarchy of thread blocks, shared memory, and barrier synchronization. This model has proven quite successful at programming multithreaded many-core GPUs and scales transparently to hundreds of cores: many industry […]
Aug, 9
Cache Miss Analysis for GPU Programs Based on Stack Distance Profile
Using the graphics processing unit (GPU) to accelerate general-purpose computation has attracted much attention from both academia and industry due to the GPU’s powerful computing capacity. Thus, optimization of GPU programs has become a popular research direction. In order to support general-purpose computing more efficiently, the GPU has integrated the general data […]

