high performance computing on graphics processing units: hgpu.org

Posts

Oct, 26

International Conference on Computational Science, ICCS 2014

The International Conference on Computational Science is an annual conference that brings together researchers and scientists from mathematics and computer science as basic computing disciplines, researchers from various application areas who are pioneering computational methods in sciences such as physics, chemistry, life sciences, and engineering, as well as in arts and humanitarian fields, to discuss […]

Oct, 26

22nd ACME Conference on Computational Mechanics

The purpose of this conference is to share state-of-the-art research findings and experience across the full range of Computational Mechanics. The conference organising committee is particularly keen to encourage the participation of young researchers, including PhD students and research assistants. The conference will emphasize recent developments in the field of Computational Mechanics through a […]

Oct, 26

Scalable Simulation of Tsunamis Generated by Submarine Landslides on GPU clusters

In this work we describe a GPU implementation of a first-order two-layer Savage-Hutter type model introduced by E. D. Fernandez-Nieto et al. in 2008 to simulate tsunamis generated by underwater landslides using the CUDA framework over structured meshes. We also describe an extension of this implementation which exploits the parallel power of a GPU […]

CUDA

Oct, 26

A New Approach of Performance Analysis of Certain Graph Algorithms

Computer network problems often require searching for one node from another and finding a path from one node to another. To solve these problems we use graph algorithms. Solving them manually takes a lot of time and knowledge. For this purpose graph algorithms were devised, and solving these problems became easier, but the […]

CUDA

Oct, 26

A Parallel Depth-aided Exemplar-based Inpainting for Real-time View Synthesis on GPU

Synthesizing new images from a given image pair and their corresponding depth maps is an essential function for many 3D video applications. Exemplar-based inpainting methods have been proposed in recent years to restore newly synthesized images by strategically filling the missing pixels which don’t have any references due to occlusion. Due to the […]

CUDA

Oct, 25

A Datalog Engine for GPUs

We present the design and evaluation of a Datalog engine for execution in Graphics Processing Units (GPUs). The engine evaluates recursive and non-recursive Datalog queries using a bottom-up approach based on typical relational operators. It includes a memory management scheme that automatically swaps data between memory in the host platform (a multicore) and memory in […]

CUDA

Oct, 25

Online Performance Projection for Clusters with Heterogeneous GPUs

We present a fully automated approach to project the relative performance of an OpenCL program over different GPUs. Performance projections can be made within a small amount of time, and the projection overhead stays relatively constant with the input data size. As a result, the technique can help runtime tools make dynamic decisions about which […]

OpenCL

Oct, 25

An Empirical Study of Intel Xeon Phi

With at least 50 cores, Intel Xeon Phi is a true many-core architecture. Featuring fairly powerful cores, two cache levels, and very fast interconnections, the Xeon Phi can reach a theoretical peak of 1000 GFLOPS and over 240 GB/s. These numbers, as well as its flexibility – it can be used both as a coprocessor […]

Oct, 25

GGAS: Global GPU Address Spaces for Efficient Communication in Heterogeneous Clusters

Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics applications, but are also employed to accelerate computationally intensive general-purpose tasks. For utmost performance, GPUs are distributed throughout the cluster to process parallel programs. In fact, many recent high-performance systems in the TOP500 list are heterogeneous architectures. Despite being highly effective […]

CUDA

Oct, 25

Efficient SDS Simulations on Multi-GPU Nodes of XSEDE High-end Clusters

Efficiently studying Sodium Dodecyl Sulfate (SDS) molecules’ formations in the presence of different molar concentrations on high-end GPU clusters whose nodes share accelerators exposes us to several challenges, including the need to dynamically adapt the job lengths. Neither virtualization nor lightweight OS solutions can easily support generality, portability, and maintainability in concert. Our solution complements […]

CUDA

Oct, 24

Parallel GPU algorithms for alternate-triangular finite difference schemes

Parallel algorithms for modern high performance computing systems are required for fast modelling of high dimensional convection-diffusion processes in air. Such algorithms, designed for alternate-triangular finite difference splitting schemes applied to the convection-diffusion equation, have been considered. An algorithm for single-GPU systems and an algorithm for clusters with graphics processors have been described; the algorithms’ performance […]

OpenCL

Oct, 24

Modeling system for GPU parallel tasks performance simulation

A flexible and extensible simulation tool architecture, called gpusim, is proposed for heterogeneous grid systems with graphics accelerators. The tool is based on the open-source Java framework GridSim. The adequacy of the models has been checked, and an initial investigation of them performed, using known examples of parallel computation problems. The tool allows choosing the optimal setting parameters […]

CUDA