Posts

Jul, 22

Parallel implementation of Multi-dimensional Ensemble Empirical Mode Decomposition

In this paper, we propose and evaluate two parallel implementations of Multi-dimensional Ensemble Empirical Mode Decomposition (MEEMD) for multi-core (CPU) and many-core (GPU) architectures. Relative to a sequential C implementation, our double precision GPU implementation, using the CUDA programming model, achieves up to 48.6x speedup on NVIDIA Tesla C2050. Our multi-core CPU implementation, using the […]
Jul, 22

GPGPU-Accelerated Parallel and Fast Simulation of Thousand-Core Platforms

The multicore revolution and the ever-increasing complexity of computing systems are dramatically changing system design, analysis and programming of computing platforms. Future architectures will feature hundreds to thousands of simple processors and on-chip memories connected through a network-on-chip. Architectural simulators will remain primary tools for design space exploration, software development and performance evaluation of these […]
Jul, 22

A characterization of the Rodinia benchmark suite with comparison to contemporary CMP workloads

The recently released Rodinia benchmark suite enables users to evaluate heterogeneous systems including both accelerators, such as GPUs, and multicore CPUs. As Rodinia sees higher levels of acceptance, it becomes important that researchers understand this new set of benchmarks, especially in how they differ from previous work. In this paper, we present recent extensions to […]
Jul, 22

Real time image reconstruction using GPUs for a surgical PET imaging probe system

We present an on-line list-mode image reconstruction system using GPUs for a surgical PET imaging probe system. We used the nVidia GeForce 9800GTX+ and CUDA to reconstruct images. The proposed system can generate a three-dimensional image from simulated data in 70 msec. We also compared the processing time with respect to the number of LORs […]
Jul, 22

Vendors Draw up a New Graphics-Hardware Approach

For the past five years, there have been two major approaches to providing graphics hardware in PCs, notebooks, game consoles, and workstations. The older technique has been to put graphics processing units (GPUs) on video cards. For example, Nvidia places its G92 GPU on its GeForce GTS 250 card, and AMD places its RV770 GPU […]
Jul, 22

Real-time 3D video synthesis from binocular capture system based on commodity graphic hardware

In this paper, a real-time 3D video synthesis method suitable for implementation on commodity graphic hardware is presented. The system consists of pre-calibrated binocular stereo cameras and an NVIDIA GeForce 8 Series graphic card. Recently, most research has focused on improving the quality of depth maps, which is usually time-consuming and unsuitable for real-time reconstruction. […]
Jul, 22

AES Encryption Implementation on CUDA GPU and Its Analysis

GPUs offer a good performance ratio and, despite their low price, are well suited to applications with a high level of parallelism. Support for integer and logical instructions on the latest generation of GPUs makes it easier to implement cipher algorithms with the same instructions. However, decisions such as parallel processing granularity or memory […]
Jul, 22

Large Scale Simulations of the Euler Equations on GPU Clusters

The paper investigates the scalability of a parallel Euler solver, using the Vijayasundaram method, on a GPU cluster with 32 Nvidia Geforce GTX 295 boards. The aim of this research is to enable large scale fluid dynamics simulations with up to one billion elements. We investigate communication protocols for the GPU cluster to compensate for […]
Jul, 22

Mapping the Arnold web with a GPU-supercomputer

The Arnold diffusion constitutes a dynamical phenomenon which may occur in the phase space of a non-integrable Hamiltonian system whenever the number of the system degrees of freedom is $M \geq 3$. The diffusion is mediated by a web-like structure of resonance channels, which penetrates the phase space and allows the system to explore the […]
Jul, 22

Accelerating Radio Astronomy Cross-Correlation with Graphics Processing Units

We present a highly parallel implementation of the cross-correlation of time-series data using graphics processing units (GPUs), which is scalable to hundreds of independent inputs and suitable for the processing of signals from "Large-N" arrays of many radio antennas. The computational part of the algorithm, the X-engine, is implemented efficiently on Nvidia’s Fermi architecture, sustaining […]
Jul, 22

A Comparison of xPU Platforms Exemplified with Ray Tracing Algorithms

Over the years, faster hardware – with higher clock rates – has been the usual way to improve computing times in computer graphics. Aside from highly costly parallel solutions affordable only by big industries, like the movie industry, there was no alternative available to desktop users. Nevertheless, this scenario is dramatically changing with […]
Jul, 22

A History-Based Performance Prediction Model with Profile Data Classification for Automatic Task Allocation in Heterogeneous Computing Systems

In this paper, we propose a runtime performance prediction model for automatic selection of accelerators to execute kernels in OpenCL. The proposed method is a history-based approach that uses profile data for performance prediction. The profile data are classified into several groups, and a separate performance model is derived from each group. As the execution […]

* * *

HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
