
Posts

Oct, 26

A Parallel Depth-aided Exemplar-based Inpainting for Real-time View Synthesis on GPU

Synthesizing new images from a given image pair and their corresponding depth maps is an essential function for many 3D video applications. In recent years, exemplar-based inpainting methods have been proposed to restore newly synthesized images by strategically filling the missing pixels that have no reference due to occlusion. Due to the […]
Oct, 25

A Datalog Engine for GPUs

We present the design and evaluation of a Datalog engine for execution in Graphics Processing Units (GPUs). The engine evaluates recursive and non-recursive Datalog queries using a bottom-up approach based on typical relational operators. It includes a memory management scheme that automatically swaps data between memory in the host platform (a multicore) and memory in […]
Oct, 25

Online Performance Projection for Clusters with Heterogeneous GPUs

We present a fully automated approach to project the relative performance of an OpenCL program over different GPUs. Performance projections can be made within a small amount of time, and the projection overhead stays relatively constant with the input data size. As a result, the technique can help runtime tools make dynamic decisions about which […]
Oct, 25

An Empirical Study of Intel Xeon Phi

With at least 50 cores, Intel Xeon Phi is a true many-core architecture. Featuring fairly powerful cores, two cache levels, and very fast interconnections, the Xeon Phi can get a theoretical peak of 1000 GFLOPs and over 240 GB/s. These numbers, as well as its flexibility – it can be used both as a coprocessor […]
Oct, 25

GGAS: Global GPU Address Spaces for Efficient Communication in Heterogeneous Clusters

Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics applications, but are also employed to accelerate computationally intensive general-purpose tasks. For utmost performance, GPUs are distributed throughout the cluster to process parallel programs. In fact, many recent high-performance systems in the TOP500 list are heterogeneous architectures. Despite being highly effective […]
Oct, 25

Efficient SDS Simulations on Multi-GPU Nodes of XSEDE High-end Clusters

Efficiently studying Sodium Dodecyl Sulfate (SDS) molecules’ formations in the presence of different molar concentrations on high-end GPU clusters whose nodes share accelerators exposes us to several challenges, including the need to dynamically adapt the job lengths. Neither virtualization nor lightweight OS solutions can easily support generality, portability, and maintainability in concert. Our solution complements […]
Oct, 24

Parallel GPU algorithms for alternate-triangular finite difference schemes

Parallel algorithms for modern high-performance computing systems are required for fast modelling of high-dimensional convection-diffusion processes in air. Such algorithms, designed for alternate-triangular finite difference splitting schemes applied to the convection-diffusion equation, have been considered. An algorithm for single-GPU systems and an algorithm for clusters with graphical processors have been described, algorithms’ performance […]
Oct, 24

Modeling system for GPU parallel tasks performance simulation

A flexible and extensible simulation tool architecture, called gpusim, is proposed for heterogeneous grid systems with graphics accelerators. The tool is based on the open-source Java framework GridSim. The adequacy of the models has been checked, and their initial investigation performed, using known examples of parallel computation problems. The tool allows choosing the optimal setting parameters […]
Oct, 24

A Framework for Management of Distributed Data Processing and Event Selection for the IceCube Neutrino Observatory

IceCube is a one-gigaton neutrino detector designed to detect high-energy cosmic neutrinos. It is located at the geographic South Pole and was completed at the end of 2010. Simulation and data processing for IceCube require a significant amount of computational power. We describe the design and functionality of IceProd, a management system based on Python, […]
Oct, 24

Towards a Unified Sentiment Lexicon (USL) based on Graphics Processing Units (GPUs)

This paper presents an approach to create what we have called a Unified Sentiment Lexicon (USL). This approach aims at aligning, unifying and expanding the set of sentiment lexicons which are available on the web in order to increase their robustness of coverage. A sentiment lexicon is a critical and essential resource for tagging subjective […]
Oct, 24

A multi-Teraflop Constituency Parser using GPUs

Constituency parsing with rich grammars remains a computational challenge. Graphics Processing Units (GPUs) have previously been used to accelerate CKY chart evaluation, but gains over CPU parsers were modest. In this paper, we describe a collection of new techniques that enable chart evaluation at close to the GPU’s practical maximum speed (a Teraflop), or around […]
Oct, 24

gEMpicker: A Highly Parallel GPU-Accelerated Particle Picking Tool for Cryo-Electron Microscopy

BACKGROUND: Picking images of particles in cryo-electron micrographs is an important step in solving the 3D structures of large macromolecular assemblies. However, in order to achieve sub-nanometre resolution it is often necessary to capture and process many thousands or even several millions of 2D particle images. Thus, a computational bottleneck in reaching high resolution is […]


HGPU group © 2010-2025 hgpu.org

All rights belong to the respective authors
