Posts

Jan, 20

Contouring for Power Systems Using Graphical Processing Units

To improve situational awareness in power systems, one useful tool used in control centers is bus (or substation) data contouring. Traditionally, the methods developed have used CPU processing, leading to long contour rendering times that reduce interactivity with the visualization. To improve interactivity and increase the data rate which can be handled, contouring methods utilizing […]
Jan, 20

SPRAT: Runtime processor selection for energy-aware computing

A commodity personal computer (PC) can be seen as a hybrid computing system equipped with two different kinds of processors, i.e., a CPU and a graphics processing unit (GPU). Since the superiority of GPUs in performance and power efficiency depends strongly on the system configuration and the data size determined at runtime, a […]
Jan, 20

Evolution of image filters on graphics processor units using Cartesian Genetic Programming

Graphics processor units are fast, inexpensive parallel computing devices. Recently there has been great interest in harnessing this power for various types of scientific computation, including genetic programming. In previous work, we have shown that using the graphics processor provides dramatic speed improvements over a standard CPU in the context of fitness evaluation. In this […]
Jan, 20

Real-Time GPU-Based Voxel Carving with Systematic Occlusion Handling

We present an approach to compute the visual hulls of multiple people in real time in the presence of occlusions. We prove that the resulting visual hulls are correct and minimal under occlusions. Our proposed algorithm runs completely on the GPU with frame rates up to 50 fps for multiple people using only one computer equipped with off-the-shelf […]
Jan, 20

Fast development of dense linear algebra codes on graphics processors

We present an application programming interface (API) for the C programming language that facilitates the development of dense linear algebra algorithms on graphics processors applying the FLAME methodology. The interface, built on top of the NVIDIA CUBLAS library, implements all the computational functionality of the FLAME/C interface. In addition, the API includes data transference routines […]
Jan, 20

Direct N-body Kernels for Multicore Platforms

We present an inter-architectural comparison of single- and double-precision direct n-body implementations on modern multicore platforms, including those based on the Intel Nehalem and AMD Barcelona systems, the Sony-Toshiba-IBM PowerXCell/8i processor, and NVIDIA Tesla C870 and C1060 GPU systems. We compare our implementations across platforms on a variety of proxy measures, including performance, coding complexity, and […]
Jan, 20

Motion Estimation with Non-Local Total Variation Regularization

State-of-the-art motion estimation algorithms suffer from three major problems: poorly textured regions, occlusions, and small-scale image structures. Based on the Gestalt principles of grouping, we propose to incorporate a low-level image segmentation process in order to tackle these problems. Our new motion estimation algorithm is based on non-local total variation regularization which allows […]
Jan, 20

Graph Analysis with High-Performance Computing

Large, complex graphs arise in many settings including the Internet, social networks, and communication networks. To study such data sets, the authors explored the use of high-performance computing (HPC) for graph algorithms. They found that the challenges in these applications are quite different from those arising in traditional HPC applications and that massively multithreaded machines […]
Jan, 20

TEDI: efficient shortest path query answering on graphs

Efficient shortest path query answering in large graphs is enjoying a growing number of applications, such as ranked keyword search in databases, social networks, ontology reasoning and bioinformatics. A shortest path query on a graph finds the shortest path for the given source and target vertices in the graph. Current techniques for efficient evaluation of […]
Jan, 20

Practical and Robust Stenciled Shadow Volumes for Hardware-Accelerated Rendering

Twenty-five years ago, Crow published the shadow volume approach for determining shadowed regions in a scene. A decade ago, Heidmann described a hardware-accelerated stencil buffer-based shadow volume algorithm. Unfortunately, hardware-accelerated stenciled shadow volume techniques have not been widely adopted by 3D games and applications due in large part to the lack of robustness of described […]
Jan, 19

Accelerating Quadrature Methods for Option Valuation

This paper presents an architecture for FPGA acceleration of quadrature methods used for pricing complex options, such as discrete barrier, Bermudan, and American options. The architecture can be optimized for speed and power consumption by exploiting pipelining and parallelism to produce efficient implementations in reconfigurable logic. An optimised implementation using Graphics Processing Units (GPUs) is […]
Jan, 19

Implicit Parallel Time Integrators

In this work, we discuss a family of parallel implicit time integrators for multi-core and potentially multi-node or multi-GPGPU systems. The method is an extension of Revisionist Integral Deferred Correction (RIDC) by Christlieb, Macdonald and Ong (SISC-2010), which constructed parallel explicit time integrators. The key idea is to rewrite the defect correction framework so that, […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors