Posts
Feb, 1
Uncontracted Rys Quadrature Implementation of up to G Functions on Graphical Processing Units
An implementation is presented of an uncontracted Rys quadrature algorithm for electron repulsion integrals, including up to g functions on graphical processing units (GPUs). The general GPU programming model, the challenges associated with implementing the Rys quadrature on these highly parallel emerging architectures, and a new approach to implementing the quadrature are outlined. The performance […]
Jan, 31
Towards Automated Learning of Object Detectors
Recognizing arbitrary objects in images or video sequences is a difficult task for a computer vision system. We work towards automated learning of object detectors from video sequences (without user interaction). Our system uses object motion as an important cue to detect independently moving objects in the input sequence. The largest object is always taken […]
Jan, 31
Batched Multi Triangulation
The multi triangulation framework (MT) is a very general approach for managing adaptive resolution in triangle meshes. The key idea is to arrange mesh fragments at different resolutions in a directed acyclic graph (DAG) that encodes the dependencies between fragments, thereby encompassing a wide class of multiresolution approaches that use hierarchies or DAGs with predefined topology. […]
Jan, 31
Advanced Multi-Frame Rate Rendering Techniques
Multi-frame rate rendering is a parallel rendering technique that renders interactive parts of the scene on one graphics card while the rest of the scene is rendered asynchronously on a second graphics card. The resulting color and depth images of both render processes are composited and displayed. This paper presents advanced multi-frame rate rendering techniques, […]
Jan, 31
Isosurface Extraction and View-Dependent Filtering from Time-Varying Fields Using Persistent Time-Octree (PTOT)
We develop a new algorithm for isosurface extraction and view-dependent filtering from large time-varying fields, by using a novel persistent time-octree (PTOT) indexing structure. Previously, the persistent octree (POT) was proposed to perform isosurface extraction and view-dependent filtering, which combines the advantages of the interval tree (for optimal searches of active cells) and of the […]
Jan, 31
Scientific Computing on Heterogeneous Architectures
The CPU has traditionally been the computational workhorse in scientific computing, but we have seen a tremendous increase in the use of accelerators, such as Graphics Processing Units (GPUs), in the last decade. These architectures are used because they consume less power and offer higher performance than equivalent CPU solutions. They are typically also […]
Jan, 31
OpenRCL: Low-Power High-Performance Computing with Reconfigurable Devices
This work presents the Open Reconfigurable Computing Language (OpenRCL) system designed to enable low-power high-performance reconfigurable computing with imperative programming languages such as C/C++. The key idea is to expose the FPGA platform as a compiler target for applications expressed in the OpenCL paradigm. To this end, we present a combination of low-level virtual machine […]
Jan, 31
Simulation and visualization of the Saint-Venant system using GPUs
We consider three high-resolution schemes for computing shallow-water waves as described by the Saint-Venant system and discuss how to develop highly efficient implementations using graphical processing units (GPUs). The schemes are well-balanced for lake-at-rest problems, handle dry states, and support linear friction models. The first two schemes handle dry states by switching variables in the […]
Jan, 31
Highly interactive computational steering for coupled 3D flow problems utilizing multiple GPUs
Most computational fluid dynamics (CFD) simulations require massive computational power, which is usually provided by traditional High Performance Computing (HPC) environments. Although interactivity of the simulation process is highly appreciated by scientists and engineers, due to limitations of typical HPC environments, present CFD simulations are usually executed non-interactively. A recent trend is to harness […]
Jan, 31
Top-Performance Tokenization and Small-Ruleset Regular Expression Matching: A Quantitative Performance Analysis and Optimization Study on the Cell/B.E. Processor
In the last decade, the volume of unstructured data that Internet and enterprise applications create and consume has been growing at impressive rates. The tools we use to process these data are search engines, business analytics suites, natural-language processors and XML processors. These tools rely on tokenization, a form of regular expression matching aimed at […]
Jan, 31
Small-ruleset regular expression matching on GPGPUs: quantitative performance analysis and optimization
We explore the intersection between an emerging class of architectures and a prominent workload: GPGPUs (General-Purpose Graphics Processing Units) and regular expression matching, respectively. It is a challenging task because this workload — with its irregular, non-coalesceable memory access patterns — is very different from the regular, numerical workloads that run efficiently on GPGPUs. Small-ruleset […]
Jan, 30
A Novel Monte Carlo Noise Reduction Operator
We propose a novel Monte Carlo noise reduction operator in this article. We apply and extend the standard bilateral filtering method and build a new local adaptive noise reduction kernel. It first computes an initial estimate for the value of each pixel, and then applies bilateral filtering using this initial estimate in its range filter […]