
Posts

Oct, 27

MPI Parallelization of GPU-based Lattice Boltzmann Simulations

In this thesis, an MPI-parallelized LBM code for a multi-GPU platform has been designed and implemented. The primary goal of the thesis is to research efficient and scalable multi-GPU LBM code that exploits advanced features of modern GPUs and adopts optimization techniques such as overlapping work and communication in heterogeneous CPU-GPU clusters. In […]
Oct, 27

Airborne Downward Looking Sparse Linear Array 3-D SAR Heterogeneous Parallel Simulation

The airborne downward-looking sparse linear array three-dimensional synthetic aperture radar (DLSLA 3-D SAR) performs nadir observation, with the along-track synthetic aperture formed by platform movement and the cross-track synthetic aperture formed by a physical sparse linear array. Considering the lack of DLSLA 3-D SAR data in the current preliminary study stage, it is very […]
Oct, 27

Real-Time Stereo Matching using Adaptive Window based Disparity Refinement

In this paper, we propose a real-time stereo matching method based on an adaptive window, aiming at the trade-off between accuracy and efficiency in current local stereo matching. Considering that the Census transform adapts well to image amplitude distortion but may introduce matching ambiguities in regions with noise or similar local structures, we combine the […]
Oct, 26

International Conference on Computational Science, ICCS 2014

The International Conference on Computational Science is an annual conference that brings together researchers and scientists from mathematics and computer science as basic computing disciplines, researchers from various application areas who are pioneering computational methods in sciences such as physics, chemistry, life sciences, and engineering, as well as in arts and humanitarian fields, to discuss […]
Oct, 26

22nd ACME Conference on Computational Mechanics

The purpose of this conference is to share state-of-the-art research findings and experience across the full range of Computational Mechanics. The conference organising committee is particularly keen to encourage the participation of young researchers, including PhD students and research assistants. The Conference will emphasize recent developments in the field of Computational Mechanics through a […]
Oct, 26

Scalable Simulation of Tsunamis Generated by Submarine Landslides on GPU clusters

In this work we describe a GPU implementation, using the CUDA framework over structured meshes, of a first-order two-layer Savage-Hutter type model introduced by E. D. Fernandez-Nieto et al. in 2008 to simulate tsunamis generated by underwater landslides. We also describe an extension of this implementation which exploits the parallel power of a GPU […]
Oct, 26

A New Approach of Performance Analysis of Certain Graph Algorithms

Computer network problems often require searching for one node from another and finding a path from one node to another. To solve these problems we use graph algorithms. Solving them manually takes a lot of time and knowledge. For this purpose graph algorithms were devised, and solving these problems became easier, but the […]
Oct, 26

A Parallel Depth-aided Exemplar-based Inpainting for Real-time View Synthesis on GPU

Synthesizing new images from a given image pair and their corresponding depth maps is an essential function for many 3D video applications. Exemplar-based inpainting methods have been proposed in recent years to restore newly synthesized images by strategically filling the missing pixels, which have no references due to occlusion. Due to the […]
Oct, 25

A Datalog Engine for GPUs

We present the design and evaluation of a Datalog engine for execution in Graphics Processing Units (GPUs). The engine evaluates recursive and non-recursive Datalog queries using a bottom-up approach based on typical relational operators. It includes a memory management scheme that automatically swaps data between memory in the host platform (a multicore) and memory in […]
Oct, 25

Online Performance Projection for Clusters with Heterogeneous GPUs

We present a fully automated approach to project the relative performance of an OpenCL program over different GPUs. Performance projections can be made within a small amount of time, and the projection overhead stays relatively constant with the input data size. As a result, the technique can help runtime tools make dynamic decisions about which […]
Oct, 25

An Empirical Study of Intel Xeon Phi

With at least 50 cores, Intel Xeon Phi is a true many-core architecture. Featuring fairly powerful cores, two cache levels, and very fast interconnections, the Xeon Phi can get a theoretical peak of 1000 GFLOPs and over 240 GB/s. These numbers, as well as its flexibility – it can be used both as a coprocessor […]
Oct, 25

GGAS: Global GPU Address Spaces for Efficient Communication in Heterogeneous Clusters

Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics applications, but are also employed to accelerate computationally intensive general-purpose tasks. For utmost performance, GPUs are distributed throughout the cluster to process parallel programs. In fact, many recent high-performance systems in the TOP500 list are heterogeneous architectures. Despite being highly effective […]

* * *


HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
