10396

Posts

Aug, 20

Using Modularity Metrics to assist Move Method Refactoring of Large System

For large software systems, refactoring activities can be a challenging task, since for keeping component complexity under control the overall architecture as well as many details of each component have to be considered. Product metrics are therefore often used to quantify several parameters related to the modularity of a software system. This paper devises an […]
Aug, 20

Advanced CFD Modeling Using GeForce GPUs

Advanced applications of CFD for multiphysics modelling of electrokinetic, capillary, turbulent and rarefied hypersonic flows is discussed in this paper. Due the complexity of the geometry involved and the underlying physics associated with the phenomena to be studied, multiphysics study requires enormous computational resources. The CFD computations are performed within a parallel environment for accelerating […]
Aug, 20

Implementing Molecular Dynamics on Hybrid High Performance Computers – Three-Body Potentials

The use of coprocessors or accelerators such as graphics processing units (GPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high-performance computers, defined as machines with nodes containing more than one type of floating-point processor (e.g. CPU and GPU), […]
Aug, 19

Performance Drawbacks for Matrix Multiplication using Set Associative Cache in GPU devices

Performance of shared memory processors show negative performance impulses (drawbacks) in certain regions for execution of the basic matrix multiplication algorithm. In this paper we continue with analysis of GPU memory hierarchy and corresponding cache memory organization. We give a theoretical analysis why a negative performance impulse appears for specifics problem sizes. The main reason […]
Aug, 19

A Domain-Specific Language and Compiler for Stencil Computations on Short-Vector SIMD and GPU Architectures

Stencil computations are an integral part of applications in a number of scientific computing domains, such as image processing and partial differential equations. We describe a domain-specific language for regular stencil computations, that allows specification of the computations in a concise manner. We describe a multi-target compiler for this DSL, that generates optimized code for […]
Aug, 19

PARIS: A Parallel RSA-Prime Inspection Tool

Modern-day computer security relies heavily on cryptography as a means to protect the data that we have become increasingly reliant on. As the Internet becomes more ubiquitous, methods of security must be better than ever. Validation tools can be leveraged to help increase our confidence and accountability for methods we employ to secure our systems. […]
Aug, 19

Algorithms for Compression on GPUs

This project seeks to produce an algorithm for fast lossless compression of data. This is attempted by utilisation of the highly parallel graphic processor units (GPU), which has been made easier to use in the last decade through simpler access. Especially nVidia has accomplished to provide simpler programming of GPUs with their CUDA architecture. I […]
Aug, 19

Transfer Time Reduction of Data Transfers between CPU and GPU

In real-time video processing data transfer between CPU and GPU is a time critical action; time spent transferring data is processing time lost. Several variants of standard transfer methods were developed and evaluated on nine computers and two smart decision algorithms was designed to help choose the fastest method for each occasion. Results showed that […]
Aug, 18

Towards a Distributed GPU-Accelerated Matrix Inversion

We present an extension of a GPU-based matrix inversion algorithm for distributed memory contexts. Specifically, we implement and evaluate a message-passing variant of the Gauss-Jordan method (GJE) for matrix inversion on a cluster of nodes equipped with GPU hardware accelerators. The experimental evaluation of the proposal shows a significant runtime reduction when compared with both […]
Aug, 18

A GPU implementation for improved granular simulations with LAMMPS

Granular mechanics plays an important role in many branches of science and engineering, from astrophysics applications in planetary and interstellar dust clouds, to processing of industrial mixtures and powders. In this context, a granular simulation model with improved adhesion and friction, is implemented within the open source code LAMMPS (lammps.sandia.gov). The performance of this model […]
Aug, 18

Permutation Index and GPU to Solve efficiently Many Queries

Similarity search is a fundamental operation for applications that deal with multimedia data. For a query in a multimedia database it is meaningless to look for elements exactly equal to a given one as query. Instead, we need to measure the similarity (or dissimilarity) between the query object and each object of the database. The […]
Aug, 18

Encrypting video streams using OpenCL code on-demand

The amount of multimedia information transmitted through the web is very high and increasing. Generally, this kind data is not correctly protected, since users do not appreciate the information that images and videos may contain. In this work, we present an architecture for managing safely multimedia transmission channels. The idea is to encrypt and encode […]

* * *

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us: