Posts

Sep, 27

Lattice QCD based on OpenCL

We present an OpenCL-based Lattice QCD application using a heatbath algorithm for the pure gauge case and Wilson fermions in the twisted mass formulation. The implementation is platform independent and can be used on AMD or NVIDIA GPUs, as well as on classical CPUs. On the AMD Radeon HD 5870 our double precision dslash implementation […]
Sep, 27

GPU Acceleration of Image Convolution using Spatially-varying Kernel

Image subtraction in astronomy is a tool for discovering transient objects such as asteroids, extra-solar planets and supernovae. To match point spread functions (PSFs) between images of the same field taken at different times, a convolution technique is used. Particularly suitable for large-scale images is a computationally intensive spatially-varying kernel. The underlying algorithm is inherently […]
Sep, 26

Improved Row-Grouped CSR Format for Storing of Sparse Matrices on GPU

We present a new format for storing sparse matrices on GPU. We compare it with several other formats, including CUSPARSE, which is today probably the best choice for processing sparse matrices on GPU in CUDA. Contrary to CUSPARSE, which works with the common CSR format, our new format requires conversion. However, multiplication of sparse-matrix and vector […]
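For readers unfamiliar with the common CSR (compressed sparse row) layout mentioned in this abstract, here is a minimal pure-Python sketch of CSR storage and a sparse matrix-vector product. It illustrates plain CSR only, not the row-grouped format proposed in the paper, and it is not GPU code. The function names `csr_from_dense` and `csr_spmv` are hypothetical and chosen for illustration.

```python
def csr_from_dense(dense):
    """Convert a dense matrix (list of rows) to plain CSR arrays.

    Returns (values, col_idx, row_ptr): the nonzero values, the column
    index of each nonzero, and the offset at which each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i ends in values/col_idx.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in plain CSR."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        # Only the stored nonzeros of row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    dense = [[1, 0, 2],
             [0, 0, 3],
             [4, 5, 0]]
    values, col_idx, row_ptr = csr_from_dense(dense)
    print(values, col_idx, row_ptr)
    print(csr_spmv(values, col_idx, row_ptr, [1, 2, 3]))
```

Each outer-loop iteration is independent of the others, which is why a straightforward GPU mapping assigns one thread to one row. Rows of very different lengths then lead to uneven work per thread, which is one common motivation for alternative sparse formats.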
Sep, 26

GPU Shape Grammars

GPU Shape Grammars provide a solution for interactive procedural generation, tuning and visualization of massive environment elements for both video games and production rendering. Our technique generates detailed models without explicit geometry storage. To this end we reformulate the grammar expansion for generation of detailed models at the tessellation control and geometry shader stages. Using […]
Sep, 26

Enabling Development of OpenCL Applications on FPGA platforms

FPGAs can potentially deliver tremendous acceleration in high-performance server and embedded computing applications. Whether used to augment a processor or as a stand-alone device, these reconfigurable architectures are being deployed in a large number of implementations owing to the massive amounts of parallelism offered. At the same time, a significant challenge encountered in their wide-spread […]
Sep, 26

A Parallel Auxiliary Grid AMG Method for GPU

In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphic processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control […]
Sep, 26

Accelerating Iterative SpMV for Discrete Logarithm Problem using GPUs

In the cryptanalytic context, computing discrete logarithms in large cyclic groups using index-calculus-based methods, such as the number field sieve or the function field sieve, requires solving large sparse systems of linear equations modulo the group order. Most of the fast algorithms used to solve such systems — e.g., the conjugate gradient or the Lanczos […]
Sep, 25

GPF: a framework for general packet classification on GPU co-processors

This thesis explores the design and experimental implementation of GPF, a novel protocol-independent, multi-match packet classification framework. This framework is targeted and optimised for flexible, efficient execution on NVIDIA GPU platforms through the CUDA API, but should not be difficult to port to other platforms, such as OpenCL, in the future. GPF was conceived and […]
Sep, 25

Implementation and Analysis of AES Encryption on GPU

The GPU is continuing its trend of vastly outperforming the CPU while becoming more general purpose. In order to improve the efficiency of the AES algorithm, this paper proposes a CUDA implementation of the Electronic Codebook (ECB) mode encoding process and the Cipher Feedback (CBC) mode decoding process on GPU. In our implementation, the frequently accessed T-boxes were allocated on […]
Sep, 25

Speeding up Scoring Module of Mass Spectrometry Based Protein Identification by GPU

Database searching is a main method for protein identification in shotgun proteomics, and many research efforts are dedicated to improving its effectiveness. However, the efficiency of database searching faces a serious challenge due to the ever-faster growth of protein and peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications (PTMs). On […]
Sep, 25

GPU Accelerated Lambert Solution Methods for the Orbital Targeting Problem

Lambert's problem is concerned with the determination of an orbit that connects two position vectors within a specified time of flight. It must often be solved millions of times, especially when one is conducting global searches for possible gravity assist missions, which requires fast, efficient solutions. The orbital targeting problem lends itself well to parallel […]
Sep, 25

GPU in Physics Computation: Case Geant4 Navigation

General purpose computing on graphic processing units (GPU) is a potential method of speeding up scientific computation with low cost and high energy efficiency. We experimented with the particle physics simulation toolkit Geant4 used at CERN to benchmark its geometry navigation functionality on a GPU. The goal was to find out whether Geant4 physics simulations […]

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors

Contact us:

contact@hpgu.org