Quentin Avril, Valerie Gouranton, Bruno Arnaldi
We have presented several contributions to collision detection optimization centered on hardware performance. We focus on the first step (broad phase) and propose three new ways to parallelize the well-known Sweep and Prune algorithm. We first developed a multi-core model that takes into account the number of available cores. Multi-core architecture enables us to distribute […]
Pritam Prakash Shete, Venkat P. P. K., Dinesh M. Sarode, Mohini Laghate, S. K. Bose
In this paper, we present a pyramidal image blending algorithm based on the Compute Unified Device Architecture (CUDA), built using object-oriented design patterns. This algorithm is an essential part of an image stitching process for a seamless panoramic mosaic. The CUDA framework is a novel GPU programming framework from NVIDIA. We introduce an object-oriented framework […]
Cedric Augonnet
Multicore machines equipped with accelerators are becoming increasingly popular in the High Performance Computing ecosystem. Hybrid architectures provide significantly improved energy efficiency, so they are likely to become widespread in the manycore era. However, the complexity introduced by these architectures has a direct impact on programmability, so it is crucial to provide portable abstractions […]
Pritam Prakash Shete, Venkat P. P. K., Dinesh M. Sarode, Mohini Laghate, S. K. Bose, A. G. Apte
In this paper, we propose and implement an object-oriented framework for CUDA-based pyramidal image blending. This algorithm is an essential part of an image stitching process for a seamless panoramic mosaic. The CUDA framework is a novel GPU programming framework from NVIDIA. It offers a complex integration framework and requires more than […]
Pritam Prakash Shete, Venkat P. P. K., S. K. Bose
We propose and implement a pyramidal image blending algorithm using modern programmable graphics processing units. This algorithm is an essential part of an image stitching process for a seamless panoramic mosaic. The CUDA framework is a novel GPU programming framework from NVIDIA. We realize significant acceleration in the computations of the pyramidal image blending algorithm by […]
Vaibhav Saxena, Yogish Sabharwal, Pramod Bhatotia
The slow progress in memory access latencies compared with CPU speeds has resulted in memory accesses dominating code performance. While architectural enhancements have benefited applications with data locality and sequential access, random memory access remains a cause for concern. Several benchmarks have been proposed to evaluate random memory access performance on multicore […]
Quentin Avril, Valerie Gouranton, Bruno Arnaldi
In this paper, we present a new technique to dynamically adapt the first step (broad phase) of the collision detection process to the hardware architecture during simulation. Our approach makes it possible to cope with the unpredictable evolution of the simulation scenario (this includes addition of complex objects, deletion, split into several objects, …). Our technique of dynamic adaptation […]
Ke-yan Liu, Tong Zhang, Lei Wang
In this paper, a hybrid parallel computing framework is proposed for video understanding and retrieval. It is a unified computing architecture based on the Map-Reduce programming model, which supports multi-core and GPU architectures. A key task scheduler is designed for the parallelization of computation tasks. The SVM method is used to train models for video […]
W.C. Barker, S. Thada
The Siemens ECAT HRRT PET scanner has the potential to produce images of the human brain with spatial resolution better than 3 mm. MOLAR (a motion-compensation OSEM List-mode Algorithm for Resolution-recovery) was developed to provide reconstructions of HRRT data with the best possible accuracy and precision. However, a computer cluster is required to generate reconstructions […]
Naoyuki Ichimura
Local invariant features have been widely used as fundamental elements for image matching and object recognition. Although dense sampling of local features is useful in achieving an improved performance in image matching and object recognition, it results in increased computational costs for feature extraction. The purpose of this paper is to develop fast computational techniques […]
Cedric Augonnet, Samuel Thibault, Raymond Namyst
Multicore machines equipped with accelerators are becoming increasingly popular. The TOP500-leading RoadRunner machine is probably the most famous example of a parallel computer mixing IBM Cell Broadband Engines and AMD Opteron processors. Other architectures, featuring GPU accelerators, are expected to appear in the near future. To fully tap into the potential of these hybrid machines, […]
Cedric Augonnet, Raymond Namyst
Approaching the theoretical performance of heterogeneous multicore architectures, equipped with specialized accelerators, is a challenging issue. Unlike regular CPUs that can transparently access the whole global memory address range, accelerators usually embed local memory on which they perform all their computations using a specific instruction set. While many research efforts have been devoted to offloading […]

HGPU group © 2010-2016 hgpu.org

All rights belong to the respective authors