Posts

Jan, 6

The Parallel Bayesian Toolbox for High-performance Bayesian Filtering in Metrology

Bayes' theorem is the most widely used instrument for stochastic inference in nonlinear dynamic systems and also the foundation of measurement uncertainty evaluation in the GUM. Many powerful algorithms have been derived and applied to numerous problems. The most widely used algorithms are the broad family of Kalman filters (KFs), the grid-based filters and the […]
Jan, 6

Parallel numerical simulation of two-phase flow model in porous media using distributed and shared memory architectures

A two-phase (water and oil) flow model in a homogeneous porous medium is studied, considering immiscible and incompressible displacement. This model is solved numerically using the Finite Volume Method (FVM), and we compare four numerical schemes for the approximation of fluxes on the faces of the discrete volumes. We describe briefly how to obtain the […]
Jan, 6

PySPH: A Python framework for SPH

We present an open-source, object-oriented framework for Smoothed Particle Hydrodynamics called PySPH. The framework is written in the high-level Python programming language and is designed to be user-friendly, flexible and application-agnostic. PySPH supports distributed-memory computing using the message-passing paradigm and (limited) shared-memory-like parallel processing on hybrid […]
Jan, 5

GLSL Essentials

This book is a practical guide to the OpenGL Shading Language. It contains several real-world examples that will allow you to grasp the core concepts easily and to use GLSL in graphics rendering applications. If you want to upgrade your skills, or are new to shader programming and want to learn about graphics programming, […]
Jan, 5

Real-Time Radio Wave Propagation for Mobile Ad-Hoc Network Emulation using GPGPUs

The accurate simulation and emulation of mobile radios requires the computation of RF propagation path loss in order to accurately predict connectivity and signal interference. There are many algorithms available for computing the RF propagation path loss between wireless devices including the Longley-Rice model, the transmission line matrix (TLM), ray-tracing, and the parabolic equation method. […]
Jan, 5

Large scale 3D shape retrieval by exploiting multi-core and GPU

This paper addresses the problem of 3D shape retrieval in large databases of 3D objects (large retrieval). While this problem is emerging and interesting as the size of 3D object databases grows rapidly, the two main issues the community has to focus on are: computational efficiency of 3D object retrieval and the quality of retrieved […]
Jan, 5

Applying the Parallel GPU Model to Radiation Therapy Treatment

With current advances in high performance computing, particularly the applications of GPUs, it is easy to see the need for a model for GPU algorithm development. We developed a model which offers a multi-grained approach intended to accommodate nearly any GPU. Radiation therapy is one of the most effective forms of cancer treatment available. In […]
Jan, 5

DEF-G: Declarative Framework for GPU Environment

DEF-G is a declarative language and framework for the efficient generation of OpenCL GPU applications. Using our proof-of-concept DEF-G implementation, run-time and lines-of-code comparisons are provided for three well-known algorithms (Sobel image filtering, breadth-first search and all-pairs shortest path), each evaluated on three different platforms. The DEF-G declarative language and corresponding OpenCL kernels generated complete […]
Jan, 5

Opportunities for Parallelism in Matrix Multiplication

BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM). While GEMM was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so […]
Jan, 5

Crack-free rendering of dynamically tesselated B-Rep models

We propose a versatile pipeline to render B-Rep models interactively, precisely and without rendering-related artifacts such as cracks. Our rendering method is based on dynamic surface evaluation using both tesselation and ray-casting, and direct GPU surface trimming. An initial rendering of the scene is performed using dynamic tesselation. The algorithm we propose reliably detects then […]
Jan, 5

A Push-Relabel-Based Maximum Cardinality Bipartite Matching Algorithm on GPUs

We design, develop, and evaluate an atomic- and lock-free GPU implementation of the push-relabel algorithm in the context of finding maximum cardinality matchings in bipartite graphs. The problem has applications in computer science, scientific computing, bioinformatics, and other areas. Although the GPU parallelization of the push-relabel technique has been investigated in the context of flow […]
Jan, 5

Parallel Irradiance Caching on the GPU

While ray tracing is highly parallelizable in concept, the Radiance suite of programs for architectural global illumination simulation was written for serial execution and makes use of certain heuristic techniques that are not easily performed in parallel environments. It uses irradiance caching to store and reuse the results of expensive indirect irradiation computations. The irradiance […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors