Posts
Jan, 19
Accelerating Quadrature Methods for Option Valuation
This paper presents an architecture for FPGA acceleration of quadrature methods used for pricing complex options, such as discrete barrier, Bermudan, and American options. The architecture can be optimized for speed and power consumption by exploiting pipelining and parallelism to produce efficient implementations in reconfigurable logic. An optimized implementation using Graphics Processing Units (GPUs) is […]
Jan, 19
Implicit Parallel Time Integrators
In this work, we discuss a family of parallel implicit time integrators for multi-core and potentially multi-node or multi-GPGPU systems. The method is an extension of Revisionist Integral Deferred Correction (RIDC) by Christlieb, Macdonald and Ong (SISC-2010), which constructed parallel explicit time integrators. The key idea is to rewrite the defect correction framework so that, […]
Jan, 19
Energy efficient biomolecular simulations with FPGA-based reconfigurable computing
Reconfigurable computing (RC) is being investigated as a hardware solution for improving time-to-solution for biomolecular simulations. A number of popular molecular dynamics (MD) codes are used to study various aspects of biomolecules. These codes are now capable of simulating nanosecond time-scale trajectories per day on conventional microprocessor-based hardware, but biomolecular processes often occur at the […]
Jan, 19
Towards microsecond biological molecular dynamics simulations on hybrid processors
Biomolecular simulations are becoming an increasingly important component of molecular biochemistry and biophysics investigations. Performance improvements in the simulations based on molecular dynamics (MD) codes are widely desired. This is particularly driven by the rapid growth of biological data due to improvements in experimental techniques. Unfortunately, the factors that allowed past performance improvements of […]
Jan, 19
Optimal Utilization of Heterogeneous Resources for Biomolecular Simulations
Biomolecular simulations have traditionally benefited from increases in processor clock speed and coarse-grain inter-node parallelism on large-scale clusters. With stagnating clock frequencies, the evolutionary path for microprocessor performance is maintained by virtue of core multiplication. Graphics processing units (GPUs) offer revolutionary performance potential at the cost of increased programming complexity. Furthermore, it has […]
Jan, 19
Accelerating 3D Fourier migration with graphics processing units
Computational cost is a major factor that inhibits the practical application of 3D depth migration. We have developed a fast parallel scheme to speed up 3D wave-equation depth migration on a parallel computing device, i.e., on graphics processing units (GPUs). The third-order optimized generalized-screen propagator is used to take advantage of the built-in software implementation […]
Jan, 19
Introduction to the Report “Interlanguages and Synchronic Models of Computation.”
A novel language system has given rise to promising alternatives to standard formal and processor network models of computation. An interstring linked with an abstract machine environment shares sub-expressions, transfers data, and spatially allocates resources for the parallel evaluation of dataflow. Formal models called the a-Ram family are introduced, designed to support interstring programming languages […]
Jan, 19
Space and the Synchronic A-Ram
Space is a circuit-oriented, spatial programming language designed to exploit the massive parallelism available in a novel formal model of computation called the Synchronic A-Ram, and physically related FPGA and reconfigurable architectures. Space expresses variable-grained MIMD parallelism, is modular, strictly typed, and deterministic. Barring operations associated with memory allocation and compilation, modules cannot […]
Jan, 19
Interlanguages and synchronic models of computation
A novel language system has given rise to promising alternatives to standard formal and processor network models of computation. An interstring linked with an abstract machine environment shares sub-expressions, transfers data, and spatially allocates resources for the parallel evaluation of dataflow. Formal models called the a-Ram family are introduced, designed to support interstring programming languages […]
Jan, 19
Relational query coprocessing on graphics processors
Graphics processors (GPUs) have recently emerged as powerful coprocessors for general-purpose computation. Compared with commodity CPUs, GPUs have an order of magnitude higher computation power as well as memory bandwidth. Moreover, new-generation GPUs allow writes to random memory locations, provide efficient interprocessor communication through on-chip local memory, and support a general-purpose parallel programming […]
Jan, 18
Secure 3D graphics for virtual machines
This paper describes a new approach to API remoting for GPU virtualisation, which aims to reduce the amount of trusted code involved in 3D rendering for guest VMs. To achieve this, it uses a modular driver framework to export large proportions of complex 3D graphics drivers into the guest’s domain. It further provides […]
Jan, 18
Message passing for GPGPU clusters: cudaMPI
We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface to communicate data stored on the graphics cards of a distributed-memory parallel computer. These libraries can help applications that perform general-purpose computations on these networked GPU clusters. We explore how to efficiently support both point-to-point and […]