Posts
Jan, 10
Graphics Processor Unit (GPU) Acceleration of Finite-Difference Frequency-Domain (FDFD) Method
Recently, many numerical methods developed for solving electromagnetic problems have benefited greatly from the hardware-accelerated scientific computing capability of graphics processing units (GPUs), and speed-up factors of orders of magnitude have been reported. Among these methods, the finite-difference frequency-domain (FDFD) method can also be accelerated substantially by utilizing an […]
Jan, 10
A Hybrid Circular Queue Method for Iterative Stencil Computations on GPUs
In this paper, we present a hybrid circular queue method that can significantly boost the performance of stencil computations on GPUs by carefully balancing the usage of registers and shared memory. Unlike earlier methods that rely on circular queues predominantly implemented using indirectly addressable shared memory, our hybrid method exploits a new reuse pattern spanning across the […]
Jan, 10
Parallelizing Kernel Polynomial Method Applying Graphics Processing Units
The Kernel Polynomial Method (KPM) is one of the fast diagonalization methods used for simulations of quantum systems in condensed matter physics and chemistry. The algorithm is difficult to parallelize on a cluster computer or a supercomputer because of its fine-grained recursive calculations. This paper proposes an implementation of the […]
Jan, 10
Acceleration of AES encryption on CUDA GPU
GPUs are well suited to applications with a high level of parallelism, despite their low cost. The support for integer and logical instructions in the latest generation of GPUs makes it easier to implement cipher algorithms. However, decisions such as parallel processing granularity and memory allocation impose a heavy burden on programmers. Therefore, this […]
Jan, 9
Providing Source Code Level Portability Between CPU and GPU with MapCG
Graphics processing units (GPUs) have taken on an important role in the general-purpose computing market in recent years. At present, the common approach to programming GPUs is to write GPU-specific code with low-level GPU APIs such as CUDA. Although this approach can achieve good performance, it creates serious portability issues as programmers […]
Jan, 9
Gauge Fixing in Lattice QCD on GPUs
Quantum Chromodynamics (QCD) [1, 2] is the theory of the strong interaction which is responsible for the hadron spectrum and therefore for all matter in our everyday life. QCD, being a quantum field theory and part of the standard model of elementary particles, describes the interactions between color-charged quarks and gluons. Hadrons, e.g., protons, neutrons […]
Jan, 9
A new parallel tool for classification of remotely sensed imagery
In this paper, we describe a new tool for classification of remotely sensed images. Our processing chain is based on three main parts: (1) pre-processing, performed using morphological profiles which model both the spatial (high resolution) and the spectral (color) information available from the scenes; (2) classification, which can be performed in an unsupervised fashion using […]
Jan, 9
Top-k Queries Processing With Uncertain Data on Graphics Processing Units
Top-k query processing in complex uncertain databases is semantically and computationally different from classical top-k processing. Score is not the only factor to consider: the interplay between score and membership uncertainty makes the computation complex. The powerful computing capability of graphics processing units (GPUs) is needed in the processing of this kind of […]
Jan, 9
Designing Numerical Solvers for Next Generation High Performance Computing
High Performance Computing (HPC) is moving towards massive scales of parallelism. The shift in hardware towards large-scale on-chip parallelism requires rewriting existing solvers for various Computational Fluid Dynamics (CFD) problems. The aim of the project is to write and optimise novel solvers for various common CFD numerical problems that can take […]
Jan, 9
LU Factorization for Accelerator-based Systems
Multicore architectures enhanced with multiple GPUs are likely to become mainstream High Performance Computing (HPC) platforms in the near future. In this paper, we present the design and implementation of an LU factorization using a tile algorithm that can fully exploit the potential of such platforms in spite of their complexity. We use a methodology derived […]
Jan, 9
Neural Network Simulation: The recognition application
This paper presents the GPU mapping of the recognition algorithm of a Convolutional Neural Network (CNN). This work is based on a C implementation of the application. The mapping to the GPU was performed through different approaches, which are explained in detail. The improvements achieved by each approach are presented, as well as the overall speedup […]
Jan, 9
Spatial Sorting Algorithms for Parallel Computing in Networks
Many basic techniques in computer science have been founded on the assumption that physical computing resources are scarce but orderly, and that the cost of effective direct communication between physically distant parts of a computer system is affordable. In large scale cluster computing installations, fine-grained parallel computing hardware, or wireless mesh networks, these familiar assumptions […]