Posts
Jan, 6
An asymmetric distributed shared memory model for heterogeneous parallel systems
Heterogeneous computing combines general purpose CPUs with accelerators to efficiently execute both sequential control-intensive and data-parallel phases of applications. Existing programming models for heterogeneous computing rely on programmers to explicitly manage data transfers between the CPU system memory and accelerator memory.
Jan, 6
Efficient gather and scatter operations on graphics processors
Gather and scatter are two fundamental data-parallel operations, where a large number of data items are read (gathered) from or are written (scattered) to given locations. In this paper, we study these two operations on graphics processing units (GPUs).
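As a rough illustration of the two operations described above, here is a minimal sequential sketch in Python. It is not the paper's GPU implementation; the function names `gather` and `scatter` and the sample data are assumptions made for this example. On a GPU, each loop iteration would typically map to one thread.

```python
def gather(src, indices):
    """Read src at the given locations: out[i] = src[indices[i]]."""
    return [src[i] for i in indices]


def scatter(src, indices, size, fill=None):
    """Write src to the given locations: out[indices[i]] = src[i].

    Assumes the indices are unique; with duplicates, the last write wins
    in this sequential version.
    """
    out = [fill] * size
    for value, i in zip(src, indices):
        out[i] = value
    return out


data = [10, 20, 30, 40]
perm = [2, 0, 3, 1]  # hypothetical index list (a permutation)

gathered = gather(data, perm)                   # reads data[2], data[0], ...
restored = scatter(gathered, perm, len(data))   # writes back to the same slots
```

With a permutation as the index list, scattering the gathered values through the same indices restores the original order, which is a convenient sanity check.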
Jan, 6
q-state Potts model metastability study using optimized GPU-based Monte Carlo algorithms
We implemented a GPU-based parallel code to perform Monte Carlo simulations of the two-dimensional q-state Potts model. The algorithm is based on a checkerboard update scheme and assigns an independent random number generator to each thread (one thread per spin). The implementation allows simulating systems of up to ~10^9 spins with an average time […]
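The checkerboard idea can be sketched sequentially as follows. This is only an illustrative Metropolis version under stated assumptions, not the authors' GPU code: it uses a single shared random number generator instead of one per thread, sets the coupling to J = 1, assumes periodic boundaries and an even lattice size, and the name `checkerboard_sweep` is hypothetical. Sites of one color have only neighbors of the other color, so all sites of a color could be updated in parallel.

```python
import math
import random


def checkerboard_sweep(spins, q, beta, rng):
    """One Metropolis sweep over an L x L periodic lattice (L even).

    Energy convention (assumed): E = -sum over neighbor pairs of
    delta(s_i, s_j), i.e. coupling J = 1.
    """
    size = len(spins)
    for color in (0, 1):  # update all "black" sites, then all "white" sites
        for i in range(size):
            for j in range(size):
                if (i + j) % 2 != color:
                    continue
                old = spins[i][j]
                new = rng.randrange(q)  # propose a new state uniformly
                nbrs = (
                    spins[(i - 1) % size][j],
                    spins[(i + 1) % size][j],
                    spins[i][(j - 1) % size],
                    spins[i][(j + 1) % size],
                )
                # Energy change: aligned bonds lost minus aligned bonds gained.
                delta_e = nbrs.count(old) - nbrs.count(new)
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = new


rng = random.Random(0)
lattice = [[rng.randrange(3) for _ in range(8)] for _ in range(8)]
for _ in range(10):
    checkerboard_sweep(lattice, q=3, beta=1.0, rng=rng)

# At zero temperature (beta = infinity) a fully ordered lattice never changes.
frozen = [[0] * 4 for _ in range(4)]
checkerboard_sweep(frozen, q=3, beta=float("inf"), rng=rng)
```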
Jan, 6
Fully-3D GPU PET reconstruction
Fully-3D iterative tomographic image reconstruction is computationally very demanding. Graphics Processing Units (GPUs) have been proposed for many years as potential accelerators for complex scientific problems, but it has not been until the recent advances in the programmability of GPUs that the best available reconstruction codes have started to be implemented to be run on […]
Jan, 5
Speeding Up Homomorphic Hashing Using GPUs
Homomorphic hash functions (HHFs) have been applied to peer-to-peer networks with erasure coding or network coding to defend against pollution attacks. Unfortunately, HHFs are computationally expensive for contemporary CPUs. This paper proposes to exploit the computing power of graphics processing units (GPUs) for homomorphic hashing. Specifically, we demonstrate how to use NVIDIA GPUs and the computer […]
Jan, 5
CULLIDE: interactive collision detection between complex models in large environments using graphics hardware
We present a novel approach for fast collision detection between multiple deformable and breakable objects in a large environment using graphics hardware. Our algorithm takes into account low bandwidth to and from the graphics cards and computes a potentially colliding set (PCS) using visibility queries. It involves no precomputation and proceeds in multiple stages: PCS […]
Jan, 5
Parallel evolutionary algorithms on graphics processing unit
Evolutionary algorithms (EAs) are effective and robust methods for solving many practical problems such as feature selection, electrical circuit synthesis, and data mining. However, they may execute for a long time for some difficult problems, because several fitness evaluations must be performed. A promising approach to overcome this limitation is to parallelize these algorithms. In […]
Jan, 5
Data-parallel computing
Data parallelism is a key concept in leveraging the power of today’s manycore GPUs.
Jan, 5
Significantly Improved Performances Of The Cryptographically Generated Addresses Thanks To ECC And GPGPU
Cryptographically Generated Addresses (CGA) are today mainly used with the Secure Neighbor Discovery Protocol (SEND). Despite CGA generalization, current standards only show how to construct CGA with the RSA algorithm and SHA-1 hash function. This limitation may prevent new usages of CGA and SEND in mobile environments where nodes are energy and storage limited. In […]
Jan, 5
A design case study: CPU vs. GPGPU vs. FPGA
This paper describes our winning submission for the Absolute Performance category of the MEMOCODE 2009 Design Contest. We show that our GPGPU-based design achieves performance within a factor of four of theoretical maximum performance for the implemented algorithm. This result was reached after a short design-cycle of 2 man-days, which indicates that the NVIDIA CUDA […]
Jan, 5
Revisiting sorting for GPGPU stream architectures
This poster presents efficient strategies for sorting large sequences of fixed-length keys (and values) using GPGPU stream processors. Compared to the state-of-the-art, our radix sorting methods exhibit speedup of at least 2x for all generations of NVIDIA GPGPUs, and up to 3.7x for current GT200-based models. Our implementations demonstrate sorting rates of 482 million key-value […]
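For readers unfamiliar with radix sorting of key-value pairs, a minimal sequential sketch in Python is given below. It shows a generic least-significant-digit radix sort, not the poster's GPU strategy; the function name `radix_sort_pairs`, the 4-bit digit width, and the assumption of non-negative keys that fit in `key_bits` bits are choices made for this example. Each pass has the histogram, prefix-sum, and scatter structure that GPU implementations typically parallelize.

```python
def radix_sort_pairs(keys, values, key_bits=32, radix_bits=4):
    """Stable LSD radix sort of (key, value) pairs by key.

    Assumes non-negative integer keys smaller than 2**key_bits.
    """
    buckets = 1 << radix_bits
    mask = buckets - 1
    for shift in range(0, key_bits, radix_bits):
        # 1. Histogram: count how many keys have each digit value.
        counts = [0] * buckets
        for k in keys:
            counts[(k >> shift) & mask] += 1
        # 2. Exclusive prefix sum: first output slot for each digit value.
        offsets = [0] * buckets
        total = 0
        for d in range(buckets):
            offsets[d] = total
            total += counts[d]
        # 3. Scatter: move each pair to its slot, preserving input order.
        out_keys = [0] * len(keys)
        out_values = [None] * len(keys)
        for k, v in zip(keys, values):
            d = (k >> shift) & mask
            out_keys[offsets[d]] = k
            out_values[offsets[d]] = v
            offsets[d] += 1
        keys, values = out_keys, out_values
    return keys, values


sorted_keys, sorted_values = radix_sort_pairs(
    [170, 45, 75, 90, 2, 802, 24, 66],
    ["a", "b", "c", "d", "e", "f", "g", "h"],
)
```

Because every pass is stable, pairs with equal keys keep their original relative order.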