Posts
Feb, 14
Hypercubic Storage Layout and Transforms in Arbitrary Dimensions using GPUs and CUDA
Many simulations in the physical sciences are expressed in terms of rectilinear arrays of variables. It is attractive to develop such simulations for use in 1-, 2-, 3- or arbitrary physical dimensions and also in a manner that supports exploitation of data-parallelism on fast modern processing devices. We report on data layouts and transformation algorithms […]
Feb, 14
Comparing Intra- and Inter-Processor Parallelism on Multi-Core Cell Processors for Scientific Simulations
The Cell Broadband Engine (Cell BE) multi-core processor from the STI consortium of Sony, Toshiba and IBM is a powerful but complex processing device that has attracted much attention since its inclusion in Sony PlayStation 3 (PS3) gaming consoles. We report on some performance experiments using the multicore Synergistic Processing Elements (SPE) concurrency capabilities of this […]
Feb, 14
Biomolecular electrostatics using a fast multipole BEM on up to 512 GPUs and a billion unknowns
We present teraflop-scale calculations of biomolecular electrostatics enabled by the combination of algorithmic and hardware acceleration. The algorithmic acceleration is achieved with the fast multipole method (FMM) in conjunction with a boundary element method (BEM) formulation of the continuum electrostatic model, as well as the BIBEE approximation to BEM. The hardware acceleration is achieved through […]
Feb, 14
An FPGA-based Torus Communication Network
We describe the design and FPGA implementation of a 3D torus network (TNW) to provide nearest-neighbor communications between commodity multi-core processors. The aim of this project is to build up tightly interconnected and scalable parallel systems for scientific computing. The design includes the VHDL code to implement, on the latest FPGA devices, a network processor, which […]
Feb, 14
16th International Workshop on High-Level Parallel Programming Models and Supportive Environments, HIPS 2011
The 16th HIPS workshop, to be held as a full-day meeting at the IPDPS 2011 conference in Anchorage, focuses on high-level programming of multiprocessors, compute clusters, and massively parallel machines. Like previous workshops in the series, which was established in 1996, this event serves as a forum for research in the areas of parallel applications, […]
Feb, 13
Speed and Portability issues for Random Number Generation on Graphical Processing Units with CUDA and other Processing Accelerators
Generating quality random numbers is a performance-critical application for many scientific simulations. Modern processing acceleration techniques such as graphical co-processing units (GPUs); multi-core conventional CPUs; special-purpose multicore CPUs; and parallel computing approaches such as multi-threading on shared memory or message passing on clusters all offer ways to speed up random number generation (RNG). Providing fast […]
Feb, 13
Cluster and Fast-Update Simulations of Regular and Rewired Lattice Ising Models Using CUDA and Graphical Processing Units
Models such as the Ising system in computational physics are still important tools for analysing phase transitions and universal behaviours for new irregular and distorted lattice networks. Data-parallelism can be exploited to speed up such simulations as well as their analysis using general purpose graphical processing units (GPU) and other accelerating devices. We report on […]
Feb, 13
Automated and parallel code generation for finite-differencing stencils with arbitrary data types
Finite-differencing and other regular and direct approaches to solving partial differential equations (PDEs) are methods that fit well on data-parallel computer systems. These problems continue to arise in many application areas of computational science and engineering but still offer some programming challenges as they are not readily incorporated into a general standard software library that […]
Feb, 13
Visualising spins and clusters in regular and small-world Ising models with GPUs
Visualising computational simulation models of solid state physical systems is a hard problem for dense lattice models. Fly-throughs and cutaways can aid viewer understanding of a simulated system. Interactive time model parameter updates and overlaying of measurements and graticules, cluster colour labelling and other visual highlighting cues can also enhance user intuition of the model’s […]
Feb, 13
Data-Parallelism and GPUs for Lattice Gas Fluid Simulations
Lattice gas cellular automata (LGCA) models provide a relatively fast means of simulating fluid flow and can give both quantitative and qualitative insights into flow patterns around complex obstacles. Symmetry requirements inherent in the Navier-Stokes equation mandate that lattice-gas approximations to the full field equations be run on triangular lattices in two dimensions and on […]
Feb, 13
GPU-based Multi-Volume Rendering of Complex Data in Neuroscience and Neurosurgery
Recent advances in image acquisition technology and its availability in the medical and bio-medical fields have led to an unprecedented amount of high-resolution imaging data. However, the inherent complexity of this data, caused by its tremendous size, complex structure or multi-modality, poses several challenges for current visualization tools. Recent developments in graphics hardware architecture have […]
Feb, 13
Comparison of GPU Architectures for Asynchronous Communication with Finite-Differencing Applications
Graphical Processing Units (GPUs) are good data-parallel performance accelerators for solving regular mesh partial differential equations (PDEs) whereby low-latency communications and high compute-to-communications ratios can yield very high levels of computational efficiency. Finite-difference time-domain methods still play an important role for many PDE applications. Iterative multi-grid and multilevel algorithms can converge faster than […]