Posts

Sep, 28

Parallel Execution of Constraint Handling Rules on a Graphical Processing Unit

Graphical Processing Units (GPUs) consist of hundreds of small cores, collectively operating to provide massive computation capabilities. The aim of this work is to utilize this technology to execute Constraint Handling Rules (CHR) which are inherently parallel. A translation scheme is defined to transform a subset of CHR rules to C/C++, then to use a […]
Sep, 27

Increasing the performance of AllToAll variant of self-organizing migration algorithm using CUDA

Modern graphics processing units offer general purpose parallel computing capabilities. Thus they have become a relatively low cost alternative for applications requiring extensive parallel computations. Evolutionary algorithms are especially well suited for parallel SIMD architectures. This paper deals with the modification of the AllToAll variant of the self-organizing migration algorithm, which has high computational demand for one […]
Sep, 27

Deterministic Parallelism

A program is deterministic if it always produces the same output for a given input. Although sequential programs are often deterministic by default, parallel programs are more susceptible to behaving nondeterministically because instructions from different threads can be interleaved unpredictably. Non-determinism complicates the task of developing and maintaining software because it makes reasoning about program […]
Sep, 27

GPU-based tuning of quantum-inspired genetic algorithm for a combinatorial optimization problem

This paper concerns efficient parameter tuning (meta-optimization) of a state-of-the-art metaheuristic, Quantum-Inspired Genetic Algorithm (QIGA), in a GPU-based massively parallel computing environment (NVidia CUDA technology). A novel approach to parallel implementation of the algorithm is presented. In a block of threads, each thread transforms a separate quantum individual or a different quantum gene; in each […]
Sep, 27

Lattice QCD based on OpenCL

We present an OpenCL-based Lattice QCD application using a heatbath algorithm for the pure gauge case and Wilson fermions in the twisted mass formulation. The implementation is platform independent and can be used on AMD or NVIDIA GPUs, as well as on classical CPUs. On the AMD Radeon HD 5870 our double precision dslash implementation […]
Sep, 27

GPU Acceleration of Image Convolution using Spatially-varying Kernel

Image subtraction in astronomy is a tool for discovering transient objects such as asteroids, extra-solar planets and supernovae. To match point spread functions (PSFs) between images of the same field taken at different times, a convolution technique is used. Particularly suitable for large-scale images is a computationally intensive spatially-varying kernel. The underlying algorithm is inherently […]
Sep, 26

Improved Row-Grouped CSR Format for Storing of Sparse Matrices on GPU

We present a new format for storing sparse matrices on the GPU. We compare it with several other formats, including CUSPARSE, which is today probably the best choice for processing sparse matrices on the GPU in CUDA. Contrary to CUSPARSE, which works with the common CSR format, our new format requires conversion. However, multiplication of sparse-matrix and vector […]
Sep, 26

GPU Shape Grammars

GPU Shape Grammars provide a solution for interactive procedural generation, tuning and visualization of massive environment elements for both video games and production rendering. Our technique generates detailed models without explicit geometry storage. To this end we reformulate the grammar expansion for generation of detailed models at the tessellation control and geometry shader stages. Using […]
Sep, 26

Enabling Development of OpenCL Applications on FPGA platforms

FPGAs can potentially deliver tremendous acceleration in high-performance server and embedded computing applications. Whether used to augment a processor or as a stand-alone device, these reconfigurable architectures are being deployed in a large number of implementations owing to the massive amounts of parallelism offered. At the same time, a significant challenge encountered in their widespread […]
Sep, 26

A Parallel Auxiliary Grid AMG Method for GPU

In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphics processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control […]
Sep, 26

Accelerating Iterative SpMV for Discrete Logarithm Problem using GPUs

In the cryptanalytic context, computing discrete logarithms in large cyclic groups using index-calculus-based methods, such as the number field sieve or the function field sieve, requires solving large sparse systems of linear equations modulo the group order. Most of the fast algorithms used to solve such systems — e.g., the conjugate gradient or the Lanczos […]
Sep, 25

GPF: a framework for general packet classification on GPU co-processors

This thesis explores the design and experimental implementation of GPF, a novel protocol-independent, multi-match packet classification framework. This framework is targeted and optimised for flexible, efficient execution on NVIDIA GPU platforms through the CUDA API, but should not be difficult to port to other platforms, such as OpenCL, in the future. GPF was conceived and […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors