
Posts

Feb, 2

Efficient Virtual Shadow Maps for Many Lights

Recently, several algorithms have been introduced that enable real-time performance for many lights in applications such as games. In this paper, we explore the use of hardware-supported virtual cube-map shadows to efficiently implement high-quality shadows from hundreds of light sources in real time and within a bounded memory footprint. In addition, we explore the utility […]
Feb, 2

Task migration of DSP application specified with a DFG and implemented with the BSP computing model on a CPU-GPU cluster

Nowadays computer applications are becoming heavier and require, at the same time, real-time results. Heterogeneous clusters, with their computing power, represent a good solution to this demand. However, it is possible that during execution a computing element of the cluster becomes faulty, needs maintenance, or that the load needs to be re-balanced. In […]
Feb, 2

Optimized Deep Learning Architectures with Fast Matrix Operation Kernels on Parallel Platform

In this paper, we introduce an optimized deep learning architecture with flexible layer structures and fast matrix operation kernels on a parallel computing platform (e.g. NVIDIA’s GPU). Carefully designed layer-wise strategies are used to integrate different kinds of deep architectures into a uniform neural training-testing system. Our fast matrix operation kernels are implemented in deep architectures’ […]
Feb, 2

High energy electromagnetic particle transportation on the GPU

We present massively parallel high energy electromagnetic particle transportation through a finely segmented detector on a Graphics Processing Unit (GPU). Simulating events of energetic particle decay in a general-purpose high energy physics (HEP) detector requires intensive computing resources, due to the complexity of the geometry as well as physics processes applied to particles copiously produced […]
Feb, 1

A TBB-CUDA Implementation for Background Removal in a video-based Fire Detection System

This paper presents a parallel TBB-CUDA implementation for accelerating the single-Gaussian distribution model, which is effective for background removal in a video-based Fire Detection System. In this framework, TBB mainly handles the initialization of the estimated Gaussian model on the CPU, and CUDA performs background removal and adaptation of the model on the GPU. […]
Feb, 1

Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs

We present a new approach for combining k-d trees and graphics processing units for nearest neighbor search. It is well known that a direct combination of these tools leads to unsatisfactory performance due to conditional computations and suboptimal memory accesses. To alleviate these problems, we propose a variant of the classical k-d tree data […]
Feb, 1

Speeding Up Object Detection: Fast Resizing in the Integral Image Domain

In this paper, we present an approach to resize integral images directly in the integral image domain. For the special case of resizing by a power of two, we propose a highly parallelizable variant of our approach, which is identical to bilinear resizing in the image domain in terms of results, but requires fewer operations […]
Feb, 1

High Performance Computing of Dynamic Structural Response Analysis for the Integrated Earthquake Simulation

This paper proposes an application of high performance computing (HPC) to dynamic structural response analysis (DSRA) in order to enhance the capability and increase the efficiency of integrated earthquake simulation (IES). Object Based Structural Analysis (OBASAN) is a candidate DSRA program for IES. With OBASAN, the reliability of structural damage prediction can be increased by […]
Feb, 1

Survey on Efficient Linear Solvers for Porous Media Flow Models on Recent Hardware Architectures

In the past few years, High Performance Computing (HPC) technologies have led to General Purpose Processing on Graphics Processing Units (GPGPU) and many-core architectures. These emerging technologies offer massive numbers of processing units and are of interest for porous media flow simulators, which may be used for CO2 geological sequestration or Enhanced Oil Recovery (EOR) simulation. However, the crucial point is "are […]
Jan, 30

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

The ongoing hardware evolution exhibits an escalation in the number, as well as in the heterogeneity, of computing resources. The pressure to maintain reasonable levels of performance and portability forces application developers to leave the traditional programming paradigms and explore alternative solutions. PaStiX is a parallel sparse direct solver, based on a dynamic […]
Jan, 30

Towards Efficient Risk Quantification-Using GPUs and Variance Reduction Technique

Value-at-Risk (VaR) provides information about global risk in trading. Demand for high-speed VaR calculation is rising because financial institutions need to measure risk in real time. Researchers in HPC have also recently turned their attention to this kind of demanding application. In this master's thesis, we introduce two complementary and different strategies […]
Jan, 30

A Novel Graphical Processing Unit Method for Power Systems Security Analysis

There is an increasing need for computational power to drive software tools used in power systems planning and operations, since the emergence of modern energy markets and recent renewable generation technology fundamentally alter how energy flows through the existing power grid. While special-purpose hardware, including supercomputers, has been explored for this purpose, inexpensive commodity hardware […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
