Posts
Mar, 26
Experimental Fault-Tolerant Synchronization for Reliable Computation on Graphics Processors
Graphics processors (GPUs) are emerging as a promising platform for highly parallel, compute-intensive, general-purpose computations, which usually need support for inter-process synchronization. Using traditional lock-based synchronization (e.g. mutual exclusion) makes the computation vulnerable to faults caused by both scientists’ inexperience and transient hardware errors. It is notoriously difficult for scientists to deal with deadlocks […]
Mar, 26
Hyperspectral Unmixing on GPUs and Multi-Core Processors: A Comparison
One of the main problems in the analysis of remotely sensed hyperspectral data cubes is the presence of mixed pixels, which arise when the spatial resolution of the sensor is not able to separate spectrally distinct materials. For this reason, spectral unmixing has become one of the most important tasks for hyperspectral data exploitation. […]
Mar, 24
Parallel Interpretation of L-system Based on CUDA
When L-systems are applied to large and detailed 3D objects, the inherently serial geometric interpretation limits the speed of image generation. To accelerate the interpreting procedure, a Graphics Processing Unit (GPU)-based method utilizing Compute Unified Device Architecture (CUDA) is proposed in this paper. The proposed approach involves two phases: the first is a sequential scan […]
Mar, 24
A Dynamic IP Lookup Architecture using Parallel Multiple Hash in GPU-based Software Router
As the fiber propagation velocity grows and the routing scale expands, IP lookup speed becomes the major bottleneck of high-performance networks. Its efficiency directly determines the throughput of the entire routing channel. Recently, Graphics Processing Units (GPUs), which are highly parallel, flexible to program, and inexpensive, have been widely adopted in different areas, including software routers. In […]
Mar, 24
Recurrence quantification analysis in images with CUDA
The primary goal of this work is to develop a parallel algorithm considerably faster than its serial counterpart. The algorithm is based on the idea of recurrence plots: a recurrence plot is a square matrix in which many complex behaviours of dynamical systems can be detected (e.g. with the recurrence quantification analysis technique). In the results and conclusion part, we will […]
Mar, 24
Efficiently Mapping the AES Encryption Algorithm on GPUs
Warped AES is a high-performance heterogeneous GPU/CPU-SSE parallel method for encryption using GPUs. Considering the performance of encryption in GPU memory alone, our algorithm outperforms currently published implementations on comparable hardware. In our ongoing research, we have also devised a speculative method for high-throughput encryption on GPUs, while preserving low latency to client […]
Mar, 24
How to Benefit from AMD, Intel and Nvidia Accelerator Technologies in Scilab
This paper presents how to use, in Scilab, accelerator technologies from AMD, Intel and Nvidia in a flexible and portable manner. The proposed approach aims to simplify speeding up Scilab programs in an incremental process, thanks to the directive-based parallel programming provided by CAPS OpenHMPP technology.
Mar, 23
A Novel Data Structure for Particle System Simulation based on GPU with the Use of Neighborhood Grids
Simulation and visualization of particles in real-time can be a computationally intensive task. This intensity comes from diverse factors, one of them being the O(n^2) complexity of the traversal algorithm necessary for the proximity queries over all pairs of particles that decide the need to compute collisions. Previous works reduced this complexity by considerably […]
Mar, 23
Depth-First Search versus Jurema Search on GPU Branch-and-Bound Algorithms: a case study
Branch-and-Bound (B&B) is a general problem-solving paradigm, and it has been successfully used to prove the optimality of solutions to combinatorial optimization problems. The development of GPU-based parallel Branch-and-Bound algorithms is a brand-new and challenging topic in high performance computing and combinatorial optimization, motivated by the GPU’s high performance and low cost. This work presents a strategy […]
Mar, 23
Real-Time Implementation of the Pixel Purity Index Algorithm for Endmember Identification on GPUs
Spectral unmixing amounts to automatically finding the signatures of pure spectral components (called endmembers in the hyperspectral imaging literature) and their associated abundance fractions in each pixel of the hyperspectral image. Many algorithms have been proposed to automatically find spectral endmembers in hyperspectral data sets. Perhaps one of the most popular ones is the pixel purity […]
Mar, 23
GRay: a Massively Parallel GPU-Based Code for Ray Tracing in Relativistic Spacetimes
We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This GPU-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single precision floating-point arithmetic on a single GPU exceeds 300 […]
Mar, 23
Kernelet: High-Throughput GPU Kernel Executions with Dynamic Slicing and Scheduling
Graphics processors, or GPUs, have recently been widely used as accelerators in shared environments such as clusters and clouds. In such shared environments, many kernels are submitted to GPUs from different users, and throughput is an important metric for performance and total cost of ownership. Despite the recently improved runtime support for concurrent GPU kernel […]