Posts
Sep, 26
A Parallel Auxiliary Grid AMG Method for GPU
In this paper, we develop a new parallel auxiliary grid algebraic multigrid (AMG) method to leverage the power of graphics processing units (GPUs). In the construction of the hierarchical coarse grid, we use a simple and fixed coarsening procedure based on a region quadtree generated from an auxiliary grid. This allows us to explicitly control […]
Sep, 26
Accelerating Iterative SpMV for Discrete Logarithm Problem using GPUs
In the cryptanalytic context, computing discrete logarithms in large cyclic groups using index-calculus-based methods, such as the number field sieve or the function field sieve, requires solving large sparse systems of linear equations modulo the group order. Most of the fast algorithms used to solve such systems — e.g., the conjugate gradient or the Lanczos […]
Sep, 25
GPF: a framework for general packet classification on GPU co-processors
This thesis explores the design and experimental implementation of GPF, a novel protocol-independent, multi-match packet classification framework. This framework is targeted and optimised for flexible, efficient execution on NVIDIA GPU platforms through the CUDA API, but should not be difficult to port to other platforms, such as OpenCL, in the future. GPF was conceived and […]
Sep, 25
Implementation and Analysis of AES Encryption on GPU
The GPU is continuing its trend of vastly outperforming the CPU while becoming more general purpose. In order to improve the efficiency of the AES algorithm, this paper proposes a CUDA implementation of the Electronic Codebook (ECB) mode encoding process and the Cipher Block Chaining (CBC) mode decoding process on GPU. In our implementation, the frequently accessed T-boxes were allocated on […]
Sep, 25
Speeding up Scoring Module of Mass Spectrometry Based Protein Identification by GPU
Database searching is a main method for protein identification in shotgun proteomics, and many research efforts are dedicated to improving its effectiveness. However, the efficiency of database searching faces a serious challenge, due to the ever-faster growth of protein and peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications (PTMs). On […]
Sep, 25
GPU Accelerated Lambert Solution Methods for the Orbital Targeting Problem
Lambert's problem is concerned with the determination of an orbit that connects two position vectors within a specified time of flight. It must often be solved millions of times, especially when one is conducting global searches for possible gravity assist missions, which requires fast, efficient solutions. The orbital targeting problem lends itself well to parallel […]
Sep, 25
GPU in Physics Computation: Case Geant4 Navigation
General purpose computing on graphics processing units (GPUs) is a potential method of speeding up scientific computation with low cost and high energy efficiency. We experimented with the particle physics simulation toolkit Geant4 used at CERN to benchmark its geometry navigation functionality on a GPU. The goal was to find out whether Geant4 physics simulations […]
Sep, 24
Mobile Computational Photography, IS&T/SPIE Electronic Imaging 2013, EI 2013
Conference EI20D This conference is intended to bring together world class researchers and practitioners that develop and deploy imaging technologies to enable novel solutions for mobile photography. Submissions are accepted on theory, application, and experience. The scope of the conference includes: Computation * computational image enhancement (e.g., noise reduction, super resolution, image stabilization) * computational […]
Sep, 24
Sound Speed Optimization Using Image Texture on CUDA
The Compute Unified Device Architecture (CUDA) is a brand new parallel processing platform making use of the unified shader design of the most current Graphics Processing Units (GPUs) from NVIDIA. In this paper, we apply this revolutionary new technology to implement the sound speed optimization (SSO) with image texture analysis for medical ultrasound imaging. The […]
Sep, 24
Resolution of the Vlasov-Maxwell system by PIC Discontinuous Galerkin method on GPU with OpenCL
We present an implementation of a Vlasov-Maxwell solver for multicore processors. The Vlasov equation describes the evolution of charged particles in an electromagnetic field that is itself a solution of the Maxwell equations. The Vlasov equation is solved by a Particle-In-Cell (PIC) method, while the Maxwell system is computed by a Discontinuous Galerkin method. We use the OpenCL framework, […]
Sep, 24
A Hardware-Accelerated Parallel Implementation of a Two-Dimensional Scheme for Free Surface Flows
This contribution concerns the verification and performance assessment of a hardware-accelerated parallel implementation of an algorithm for the semi-implicit finite difference method for solving the vertically integrated shallow water equations including a non-linear treatment of wetting and drying and conservative advection schemes. Instead of adapting an existing serial, OpenMP-, or MPI-parallelised code with all necessary […]
Sep, 24
ACO on Multiple GPUs with CUDA for Faster Solution of QAPs
In this paper, we implement ACO algorithms on a PC which has 4 GTX 480 GPUs. We implement two types of ACO models: the island model and the master/slave model. When we compare the two, the island model shows promising speedup values on class (iv) QAP instances. On the other […]