
Posts

Apr, 4

C Language Extensions for Hybrid CPU/GPU Programming with StarPU

Modern platforms used for high-performance computing (HPC) include machines with both general-purpose CPUs and "accelerators", often in the form of graphics processing units (GPUs). StarPU is a C library that addresses this problem by providing users with ways to define "tasks" to be executed on CPUs or GPUs, along with the dependencies among them, and […]
Apr, 3

GPU Accelerated Automated Feature Extraction from Satellite Images

The availability of large volumes of remote sensing data demands a higher degree of automation in feature extraction, making it the need of the hour. Fusing data from multiple sources, such as panchromatic, hyperspectral and LiDAR sensors, enhances the probability of identifying and extracting features such as buildings, vegetation or bodies of water by […]
Apr, 3

Scaling up scientific computations by using map-reduce-like control flow on NUMA architectures

The clock speed of current CPUs and RAM has stopped scaling with Moore’s Law. Yet the scale of applications in science and engineering continues to increase. In order to address this scaling of applications, newer NUMA architectures are emerging. These include parallel disks, hybrid CPU-GPU, and many-core CPUs. Existing CPU-based algorithms, as well as legacy […]
Apr, 3

Astrophysical data mining with GPU. A case study: genetic classification of globular clusters

We present a multi-purpose genetic algorithm, designed and implemented with GPGPU / CUDA parallel computing technology. The model was derived from our CPU serial implementation, named GAME (Genetic Algorithm Model Experiment). It was successfully tested and validated on the detection of candidate Globular Clusters in deep, wide-field, single band HST images. The GPU version of […]
Apr, 3

The Stencil Processing Unit: GPGPU Done Right

As computing moves to exascale, it will be dominated by energy-efficiency. We propose a new GPU-like accelerator called the Stencil Processing Unit (SPU), for implementing dense stencil computations in an energy-efficient manner. We address all the levels of the programming stack, from architecture and programming API to runtime system and compilation. First, a simple architectural innovation to […]
Apr, 3

Synchronization and Ordering Semantics in Hybrid MPI+GPU Programming

Despite the vast interest in accelerator-based systems, programming large multinode GPUs is still a complex task, particularly with respect to optimal data movement across the host-GPU PCIe connection and then across the network. In order to address such issues, GPU-integrated MPI solutions have been developed that integrate GPU data movement into existing MPI implementations. Currently […]
Apr, 1

Solving RFIC Simulation Tasks Using GPU Computations

A new generation of General Purpose Graphics Processing Unit (GPGPU) cards, with their large computational power, makes it possible to approach difficult tasks in the area of Radio Frequency Integrated Circuit (RFIC) modeling. Using different electromagnetic modeling methods, the Finite Element Method (FEM) and the Finite Integration Technique (FIT), to model Radio Frequency Integrated Circuit (RFIC) devices, large linear equations […]
Apr, 1

Parallelization of the Cuckoo Search using CUDA Architecture

Cuckoo Search is one of the recent swarm intelligence metaheuristics. It has been successfully applied to a number of optimization problems but is still not very well researched. In this paper we present a parallelized version of the Cuckoo Search algorithm. The parallelization is implemented using CUDA architecture. The algorithm is significantly changed compared to […]
Apr, 1

OpenCL parallel Processing using General Purpose Graphical Processing units – TiViPE software development

The aim of this report is to elaborate on TiViPE modules that make use of Open Computing Language (OpenCL) programming. OpenCL is available in TiViPE from version 2.1.0. The aim of TiViPE is to integrate different technologies in a seamless way using graphical icons [1]. Due to these icons the user does not need to have in […]
Apr, 1

Geometric Algebra Computing Technology for Accelerated Processing Units

Development on embedded devices, even on today’s hardware, limits us to a minimum of third party-library dependencies due to hardware memory and power restrictions. In setups requiring intense geometric operations on limited hardware, such as in robotics, this problem can often lead to a tedious reimplementation of matrix, vector, and quaternion operations. Furthermore, certain unnecessary […]
Apr, 1

A Discussion of Selected Vienna-Libraries for Computational Science

We address the low popularity of C++ in computational science by introducing a set of orthogonal libraries: The CUDA-, OpenCL-, and OpenMP-enabled linear algebra library ViennaCL, the mesh datastructure library ViennaGrid, a data storage facility named ViennaData, and the symbolic math kernel ViennaMath. Finally, we discuss how these orthogonal components interact within the finite element […]
Mar, 31

Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding

GPU based computing has made significant strides in recent years. Unfortunately, GPU program optimizations can introduce subtle concurrency errors, and so incisive formal bug-hunting methods are essential. This paper presents a new formal bug-hunting method for GPU programs that combine barriers and atomics. We present an algorithm called conflict-directed delay-bounded scheduling algorithm (CD) that exploits […]

* * *

HGPU group © 2010-2024 hgpu.org

All rights belong to the respective authors
