Posts

Apr, 1

Parallelization of the Cuckoo Search using CUDA Architecture

Cuckoo Search is one of the recent swarm intelligence metaheuristics. It has been successfully applied to a number of optimization problems but is still not very well researched. In this paper we present a parallelized version of the Cuckoo Search algorithm. The parallelization is implemented using CUDA architecture. The algorithm is significantly changed compared to […]
Apr, 1

OpenCL parallel Processing using General Purpose Graphical Processing units – TiViPE software development

The aim of this report is to elaborate on the TiViPE modules that make use of Open Computing Language (OpenCL) programming. OpenCL is available in TiViPE from version 2.1.0. The aim of TiViPE is to integrate different technologies in a seamless way using graphical icons [1]. Due to these icons, the user does not need to have in […]
Apr, 1

Geometric Algebra Computing Technology for Accelerated Processing Units

Development on embedded devices, even on today’s hardware, limits us to a minimum of third-party library dependencies due to hardware memory and power restrictions. In setups requiring intense geometric operations on limited hardware, such as in robotics, this problem can often lead to a tedious reimplementation of matrix, vector, and quaternion operations. Furthermore, certain unnecessary […]
Apr, 1

A Discussion of Selected Vienna-Libraries for Computational Science

We address the low popularity of C++ in computational science by introducing a set of orthogonal libraries: The CUDA-, OpenCL-, and OpenMP-enabled linear algebra library ViennaCL, the mesh datastructure library ViennaGrid, a data storage facility named ViennaData, and the symbolic math kernel ViennaMath. Finally, we discuss how these orthogonal components interact within the finite element […]
Mar, 31

Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding

GPU based computing has made significant strides in recent years. Unfortunately, GPU program optimizations can introduce subtle concurrency errors, and so incisive formal bug-hunting methods are essential. This paper presents a new formal bug-hunting method for GPU programs that combine barriers and atomics. We present an algorithm called conflict-directed delay-bounded scheduling algorithm (CD) that exploits […]
Mar, 31

Specification and Verification of GPGPU Programs using Permission-Based Separation Logic

Graphics Processing Units (GPUs) are increasingly used for general-purpose applications because of their low price, energy efficiency and enormous computing power. Considering the importance of GPU applications, it is vital that the behaviour of GPU programs can be specified and proven correct formally. This paper presents our ideas on how to verify GPU programs written in […]
Mar, 31

A journey from single-GPU to optimized multi-GPU SPH with CUDA

We present an optimized multi-GPU version of GPUSPH, a CUDA implementation of fluid-dynamics models based on the Smoothed Particle Hydrodynamics (SPH) numerical method. SPH is a well-known Lagrangian model for the simulation of free-surface fluid flows; it exposes a high degree of parallelism and has already been successfully ported to GPU. We extend the GPU-based […]
Mar, 31

A Massively Parallel Associative Memory Based on Sparse Neural Networks

Associative memories store content in such a way that the content can be later retrieved by presenting the memory with a small portion of the content, rather than presenting the memory with an address as in more traditional memories. Associative memories are used as building blocks for algorithms within database engines, anomaly detection systems, compression […]
Mar, 31

CMCpy: Genetic Code-Message Coevolution Models in Python

Code-message coevolution (CMC) models represent coevolution of a genetic code and a population of protein-coding genes ("messages"). Formally, CMC models are sets of quasispecies coupled together for fitness through a shared genetic code. Although CMC models display plausible explanations for the origin of multiple genetic code traits by natural selection, useful modern implementations of CMC […]
Mar, 29

High Performance Computing using GPGPU’s

Computer-based simulation software grounded in numerical methods plays a major role in research in the natural and physical sciences. These tools allow scientists to attempt problems that are too large to solve using analytical methods. But even these tools can fail to give solutions due to computational or storage limits. […]
Mar, 29

Warp Size Impact in GPUs: Large or Small?

There are a number of design decisions that impact a GPU’s performance. Among such decisions, choosing the right warp size can deeply influence the rest of the design. Small warps reduce the performance penalty associated with branch divergence at the expense of a reduction in memory coalescing. Large warps enhance memory coalescing significantly but also […]
Mar, 29

Graphics Processing Unit Acceleration of the Explicit Solution of the Time Domain Volume Integral Equation Using OpenACC

A graphics processing unit (GPU) accelerated implementation of the explicit solution of the time domain volume integral equation (TD-VIE) using the OpenACC application program interface (API) is presented. The use of the OpenACC API, which is based on a collection of compiler directives, allows for ease of porting as well as the efficient […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org