Posts

Aug, 18

Permutation Index and GPU to Solve efficiently Many Queries

Similarity search is a fundamental operation for applications that deal with multimedia data. For a query in a multimedia database, it is meaningless to look for elements exactly equal to the query object. Instead, we need to measure the similarity (or dissimilarity) between the query object and each object of the database. The […]
Aug, 18

Encrypting video streams using OpenCL code on-demand

The amount of multimedia information transmitted through the web is very high and increasing. Generally, this kind of data is not correctly protected, since users do not appreciate the information that images and videos may contain. In this work, we present an architecture for safely managing multimedia transmission channels. The idea is to encrypt and encode […]
Aug, 18

Fast and Flexible: Parallel Packet Processing with GPUs and Click

We introduce Snap, a framework for packet processing that outperforms traditional software routers by exploiting the parallelism available on modern GPUs. While obtaining high performance, it remains extremely flexible, with packet-processing tasks implemented as simple modular elements that are composed to build fully functional routers and switches. Snap is based on the Click modular router, […]
Aug, 17

Solving 3D viscous incompressible Navier-Stokes equations using CUDA

A CUDA implementation of the 3D viscous incompressible Navier-Stokes equations is proposed, using the BFECC (Back and Forth Error Compensation and Correction) scheme as the advection operator. The Poisson problem for pressure is solved with a CG (Conjugate Gradient) method, preconditioning the system with FFTs (Fast Fourier Transforms). Study cases such as Lid-Driven Cavity and Flow Past […]
Aug, 17

Performance Analysis of a Symmetric Cryptography Algorithm on GPU and GPU Cluster

This article presents a performance analysis of the symmetric encryption algorithm AES (Advanced Encryption Standard) on a machine with one GPU and on a cluster of GPUs, for cases in which the memory required by the algorithm exceeds that of a single GPU. Two implementations were carried out, based on the C language, that use the […]
Aug, 17

Formal specification and verification of OpenCL Kernel optimization

Computing general problems using the graphical processing unit (GPU) of a device is an emerging field. The parallel structure of the GPU allows for massive concurrency when executing a program. Therefore, by executing (a part of) the code on the GPU, a previously unused resource can be used to achieve a speed-up of an application. […]
Aug, 17

Acceleration of Feynman loop integrals in high-energy physics on many core GPUs

Current and future colliders in high-energy physics require theorists to carry out large-scale computations for a precise comparison between experimental and theoretical results. In a perturbative approach, several methods to evaluate Feynman loop integrals, which appear in the theoretical calculation of cross-sections, are well established at the one-loop level; however, more […]
Aug, 17

Studying the core-cusp problem in cold dark matter halos using N-body simulations on GPU clusters

The discrepancy in the mass-density profile of dark matter halos between simulations and observations, the core-cusp problem, is a long-standing open question in the standard paradigm of cold dark matter cosmology. Here, we study the dynamical response of dark matter halos to oscillations of the galactic potential which are induced by a cycle of gas […]
Aug, 16

Accelerating Random Forests on CPUs and GPUs for Object-Class Image Segmentation

Random forests are a machine learning method that has recently become popular in the computer vision community to solve image segmentation and object detection tasks. Existing random forest implementations are either general purpose and not efficiently applicable for image segmentation or focus only on the speed of prediction. The implementation for the Microsoft Kinect gaming […]
Aug, 16

GPU-Accelerated Scalable Solver for Banded Linear Systems

Solving a banded linear system efficiently is important to many scientific and engineering applications. Current solvers achieve good scalability only on linear systems that can be partitioned into independent subsystems. In this paper, we present a GPU-based, scalable Bi-Conjugate Gradient Stabilized solver that can be used to solve a wide range of banded […]
Aug, 16

Lossless LZW Data Compression Algorithm on CUDA

Data compression is an important area of information and communication technologies; it seeks to reduce the number of bits used to store or transmit information. It makes efficient use of memory space and allows data to be transmitted within a limited bandwidth. Most compression is achieved by removing data redundancy while preserving information content. Data […]
Aug, 16

Towards Path Tracing in Games

We investigate GPU path tracing performance in the context of real-time rendering for games. We propose a reformulation of Russian roulette, as well as an efficient implementation of the path regeneration algorithm by Novak et al. [Novak et al. 2010]. We show that a combination of these algorithms provides high performance for a variety of […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
