Posts
Jan, 19
Optimal Utilization of Heterogeneous Resources for Biomolecular Simulations
Biomolecular simulations have traditionally benefited from increases in processor clock speed and from coarse-grain inter-node parallelism on large-scale clusters. With clock frequencies stagnating, the evolutionary path for microprocessor performance is maintained by multiplying cores. Graphical processing units (GPUs) offer revolutionary performance potential at the cost of increased programming complexity. Furthermore, it has […]
Jan, 19
Accelerating 3D Fourier migration with graphics processing units
Computational cost is a major factor that inhibits the practical application of 3D depth migration. We have developed a fast parallel scheme to speed up 3D wave-equation depth migration on a parallel computing device, namely graphics processing units (GPUs). The third-order optimized generalized-screen propagator is used to take advantage of the built-in software implementation […]
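The excerpt stops before the implementation details, but the heart of such a wave-equation propagator is a per-depth-step cycle of forward FFT, wavenumber-domain multiplication, and inverse FFT. Below is a minimal cuFFT sketch of that cycle; the `applyPropagator` kernel and the precomputed `d_prop` table are illustrative assumptions standing in for the paper's third-order generalized-screen operator.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>

// Multiply the wavefield by a precomputed propagator, sample by sample
// (complex multiplication). d_prop is a hypothetical lookup table.
__global__ void applyPropagator(cufftComplex *w, const cufftComplex *p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float re = w[i].x * p[i].x - w[i].y * p[i].y;
        float im = w[i].x * p[i].y + w[i].y * p[i].x;
        w[i].x = re;
        w[i].y = im;
    }
}

// One depth-extrapolation step: forward FFT, phase multiply, inverse FFT.
// cuFFT's inverse transform is unnormalized; the 1/n scaling is omitted.
void extrapolateStep(cufftHandle plan, cufftComplex *d_wave,
                     const cufftComplex *d_prop, int n) {
    cufftExecC2C(plan, d_wave, d_wave, CUFFT_FORWARD);
    applyPropagator<<<(n + 255) / 256, 256>>>(d_wave, d_prop, n);
    cufftExecC2C(plan, d_wave, d_wave, CUFFT_INVERSE);
}
```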
Jan, 19
Introduction to the Report “Interlanguages and Synchronic Models of Computation.”
A novel language system has given rise to promising alternatives to standard formal and processor network models of computation. An interstring, linked with an abstract machine environment, shares sub-expressions, transfers data, and spatially allocates resources for the parallel evaluation of dataflow. Formal models called the a-Ram family are introduced, designed to support interstring programming languages […]
Jan, 19
Space and the Synchronic A-Ram
Space is a circuit-oriented, spatial programming language designed to exploit the massive parallelism available in a novel formal model of computation called the Synchronic A-Ram, and in physically related FPGA and reconfigurable architectures. Space expresses variable-grained MIMD parallelism, is modular, strictly typed, and deterministic. Barring operations associated with memory allocation and compilation, modules cannot […]
Jan, 19
Interlanguages and synchronic models of computation
A novel language system has given rise to promising alternatives to standard formal and processor network models of computation. An interstring, linked with an abstract machine environment, shares sub-expressions, transfers data, and spatially allocates resources for the parallel evaluation of dataflow. Formal models called the a-Ram family are introduced, designed to support interstring programming languages […]
Jan, 19
Relational query coprocessing on graphics processors
Graphics processors (GPUs) have recently emerged as powerful coprocessors for general purpose computation. Compared with commodity CPUs, GPUs have an order of magnitude higher computation power as well as memory bandwidth. Moreover, new-generation GPUs allow writes to random memory locations, provide efficient interprocessor communication through on-chip local memory, and support a general purpose parallel programming […]
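As a concrete illustration of the on-chip local memory the excerpt mentions, here is a standard CUDA block-level sum reduction; it is a generic example of the primitive that relational aggregates such as SUM and COUNT are typically built from, not code from the paper.

```cuda
#include <cuda_runtime.h>

// Block-wide sum reduction through on-chip shared memory. Each block
// loads a tile of the input, then halves the active threads each round
// of the in-block tree reduction. Assumes blockDim.x is a power of two.
__global__ void blockSum(const int *in, int *out, int n) {
    extern __shared__ int buf[];
    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;
    buf[tid] = (idx < n) ? in[idx] : 0;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] += buf[tid + s];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = buf[0];  // one partial sum per block
}

// Launch with the shared-memory size passed explicitly:
//   blockSum<<<numBlocks, 256, 256 * sizeof(int)>>>(d_in, d_out, n);
```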
Jan, 18
Secure 3D graphics for virtual machines
In this paper, a new approach to API remoting for GPU virtualisation is described that aims to reduce the amount of trusted code involved in 3D rendering for guest VMs. To achieve this, it uses a modular driver framework to export large proportions of complex 3D graphics drivers into the guest’s domain. It further provides […]
Jan, 18
Message passing for GPGPU clusters: CudaMPI
We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface to communicate data stored on the graphics cards of a distributed-memory parallel computer. These libraries can help applications that perform general purpose computations on these networked GPU clusters. We explore how to efficiently support both point-to-point and […]
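The cudaMPI and glMPI interfaces themselves are not shown in the excerpt, so the sketch below illustrates only the underlying staging pattern that such libraries wrap: copying GPU data through host memory around ordinary MPI calls. The helper names `sendDeviceBuffer` and `recvDeviceBuffer` are hypothetical.

```cuda
#include <cstdlib>
#include <mpi.h>
#include <cuda_runtime.h>

// Send a device-resident buffer to another rank by staging through
// host memory. (Illustrative pattern only, not cudaMPI's actual API.)
void sendDeviceBuffer(const float *d_buf, int count, int dest, MPI_Comm comm) {
    float *h_buf = (float *)malloc(count * sizeof(float));
    cudaMemcpy(h_buf, d_buf, count * sizeof(float), cudaMemcpyDeviceToHost);
    MPI_Send(h_buf, count, MPI_FLOAT, dest, /*tag=*/0, comm);
    free(h_buf);
}

// Receive into a device-resident buffer, again staging through the host.
void recvDeviceBuffer(float *d_buf, int count, int src, MPI_Comm comm) {
    float *h_buf = (float *)malloc(count * sizeof(float));
    MPI_Recv(h_buf, count, MPI_FLOAT, src, /*tag=*/0, comm, MPI_STATUS_IGNORE);
    cudaMemcpy(d_buf, h_buf, count * sizeof(float), cudaMemcpyHostToDevice);
    free(h_buf);
}
```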
Jan, 18
Blasting through lattice calculations using CUDA
Modern graphics hardware is designed for highly parallel numerical tasks and provides significant cost and performance benefits. Graphics hardware vendors are now making available development tools to support general purpose high performance computing. Nvidia’s CUDA platform, in particular, offers direct access to graphics hardware through a programming language similar to C. Using the CUDA platform […]
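As a taste of that C-like style, here is a minimal, self-contained CUDA kernel; this is the textbook SAXPY example, not code from the lattice paper.

```cuda
#include <cuda_runtime.h>

// y = a * x + y, one element per thread: the canonical first CUDA kernel.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

// Launch with enough 256-thread blocks to cover all n elements:
//   saxpy<<<(n + 255) / 256, 256>>>(n, a, d_x, d_y);
```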
Jan, 18
An exploration of CUDA and CBEA for a gravitational wave data-analysis application (Einstein@Home)
We present a detailed approach for making use of two new computer hardware architectures — CBEA and CUDA — for accelerating a scientific data-analysis application (Einstein@Home). Our results suggest that both architectures suit the application quite well, and that the achievable performance within the same software development time-frame is nearly identical.
Jan, 18
A novel approach for implementing Steganography with computing power obtained by combining Cuda and Matlab
With the current development of multiprocessor systems, the demand for computing data on such processors has also increased exponentially. If multi-core processors are not fully utilized, then even though we have the computing power, that speed is not available to end users for their respective applications. In accordance with this, the users or […]
Jan, 18
A Multi-Stage CUDA Kernel for Floyd-Warshall
We present a new implementation of the Floyd-Warshall All-Pairs Shortest Paths algorithm on CUDA. Our algorithm runs approximately 5 times faster than the best previously reported algorithm. In order to achieve this speedup, we applied a new technique to reduce usage of on-chip shared memory and allow the CUDA scheduler to more effectively hide instruction […]
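The paper's multi-stage kernel is not shown in the excerpt. For orientation, here is the standard baseline formulation it improves on, launching one kernel per pivot k; this is a generic sketch, not the authors' implementation.

```cuda
#include <cuda_runtime.h>

// One Floyd-Warshall relaxation step for a fixed pivot k: each thread
// owns one (i, j) entry of the n x n distance matrix and checks whether
// routing through k shortens the path.
__global__ void fwStep(float *dist, int n, int k) {
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < n && j < n) {
        float viaK = dist[i * n + k] + dist[k * n + j];
        if (viaK < dist[i * n + j]) dist[i * n + j] = viaK;
    }
}

// Host side: n sequential launches, one per pivot.
//   dim3 block(16, 16);
//   dim3 grid((n + 15) / 16, (n + 15) / 16);
//   for (int k = 0; k < n; ++k)
//       fwStep<<<grid, block>>>(d_dist, n, k);
```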