Posts
Aug, 30
Path Integral Approaches and Graphics Processing Unit Tools for Quantum Molecular Dynamics Simulations
This thesis details both the technical and theoretical aspects of performing path integrals through classical Molecular Dynamics (MD) simulations. In particular, Graphics Processing Unit (GPU) computing is used to augment the Path Integral Molecular Dynamics (PIMD) portion of the widely available Molecular Modelling Tool Kit (MMTK) library. This same PIMD code is also extended in […]
Aug, 30
Case Studies in Acceleration of Heston’s Stochastic Volatility Financial Engineering Model: GPU, Cloud and FPGA Implementations
Here we present a comparative insight into the performance of the Heston stochastic volatility model on different acceleration platforms. The model was tested on a MacBook’s CPU, a Techila grid server hosted on Microsoft’s Azure cloud, a GPU node hosted by Boston Ltd, and an FPGA node hosted by Maxeler Technologies Ltd. Temporal data was […]
Aug, 30
Parallel Data List Processing on Multicore-GPU Platforms
Multicore-GPU platforms are now common and affordable, yet capitalising on their parallel processing capability is not straightforward. Existing sequential and parallel software must be tuned, or designed anew, to exploit these platforms efficiently. This paper presents the design of parallel data list processing on multicore-GPU platforms, wherein application data is organised into various lists, […]
Aug, 28
GPUVerify: A Verifier for GPU Kernels
We present a technique for verifying race- and divergence-freedom of GPU kernels that are written in mainstream kernel programming languages such as OpenCL and CUDA. Our approach is founded on a novel formal operational semantics for GPU programming termed synchronous, delayed visibility (SDV) semantics. The SDV semantics provides a precise definition of barrier divergence in […]
Aug, 28
Intelligent Edge Detection using a CUDA Simulator of Multilayer Neural Network Based on Multi-Valued Neurons
In this paper, we consider the edge detection problem using an intelligent approach. We use a multilayer neural network based on multi-valued neurons (MLMVN) as an intelligent edge enhancer. MLMVN is a complex-valued neural network and it has many advantages over classical neural networks. It significantly outperforms a classical multilayer feedforward neural network in terms […]
Aug, 28
Performance Comparison Between Cg-based and CUDA-based Matrix Multiplications
In this paper, we compare the performance of Cg-based and CUDA-based GPU programming APIs. In particular, their performance on square matrix multiplication is considered. We also discuss other aspects of these widely used GPU programming APIs. This work can help gain insight into various applications that involve matrix multiplication that are better suited for a specific […]
Aug, 28
Optimization Techniques for CUDA Application
In this paper, we summarize our experimental results from applying various optimization techniques to CUDA applications running on NVIDIA Fermi GPUs. Our experiments on matrix multiplication and breadth-first search algorithms show that optimization techniques such as coalesced global memory access, conflict-free shared memory access and data pre-fetching improve the performance of applications running on […]
Aug, 28
A Research of MapReduce with GPU Acceleration
MapReduce is an efficient distributed computing model for large data sets. The data processing is fully distributed across a huge number of nodes, and a MapReduce cluster is highly scalable. However, single-node performance is gradually becoming a bottleneck in compute-intensive jobs, which makes it difficult to extend the MapReduce model to wider application fields […]
Aug, 27
A Unified Optimizing Compiler Framework for Different GPGPU Architectures
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performance GPGPU programs: effective utilization of GPU memory hierarchy and judicious management of parallelism. The input to our compiler is a naive GPU kernel function, which is functionally correct but without […]
Aug, 27
Low-Latency Elliptic Curve Scalar Multiplication
This paper presents a low-latency algorithm designed for parallel computer architectures to compute the scalar multiplication of elliptic curve points based on approaches from cryptographic side-channel analysis. A graphics processing unit implementation using a standardized elliptic curve over a 224-bit prime field, complying with the new 112-bit security level, computes the scalar multiplication in 1.9 […]
Aug, 27
An Implementation of Coincidence Algorithm on Graphic Processing Units
Genetic Algorithms (GAs) are powerful search techniques. However, when they are applied to complex problems, they consume considerable computational power. One way to make them faster is to use a parallel implementation. This paper presents a parallel implementation of Combinatorial Optimisation with Coincidence Algorithm (COIN) on Graphics Processing Units. COIN is a modern […]
Aug, 27
Perceptually Optimized Real-Time Computer Graphics
Perceptual optimization, the application of human visual perception models to remove imperceptible components in a graphics system, has been proven effective in achieving significant computational speedup. Previous implementations of this technique have focused on spatial level of detail reduction, which typically results in noticeable degradation of image quality. This thesis introduces refresh rate modulation (RRM), […]