high performance computing on graphics processing units: hgpu.org

Development of High-Performance Software Components for Emerging Architectures

Stefan Lemvig Glimberg, Allan Peter Engsig-Karup, Allan S. Nielsen, Bernd Dammann

Technical University of Denmark

November 18, 2014

@inbook{a73de37aefd2495da4d2ba98f54b1103,
  title={Development of software components for heterogeneous many-core architectures},
  publisher={Taylor & Francis},
  author={Glimberg, Stefan Lemvig and Engsig-Karup, Allan Peter and Nielsen, Allan S. and Dammann, Bernd},
  note={2013;6},
  year={2013},
  editor={Raphaël Couturier},
  isbn={978-1-4665-7162-4},
  pages={73–104},
  booktitle={Designing Scientific Applications on GPUs}
}

Source code package: GLAS library

Massively parallel processors, such as graphics processing units (GPUs), have in recent years proven effective for a vast number of scientific applications. Today, most desktop computers are equipped with one or more powerful GPUs, offering heterogeneous high-performance computing to a broad range of scientific researchers and software developers. Though GPUs are now programmable and can be highly effective compute units, they still pose challenges for software developers who want to fully utilize their efficiency. Sequential legacy codes are not always easily parallelized, and the time spent on conversion might not pay off in the end. This is particularly true for heterogeneous computers, where the architectural differences between the main processor and the co-processor can be so significant that they call for completely different optimization strategies; the cache hierarchy management of CPUs and GPUs is an evident example. In the past, industrial companies were able to boost application performance solely by upgrading their hardware systems, with a clear balance between investment and performance speedup. Today the picture is different: not only do they have to invest in new hardware, they must also account for the adaptation and training of their software developers. What traditionally used to be a hardware problem, addressed by the chip manufacturers, has now become a software problem for application developers.
Software libraries can be a tremendous help for developers, as they make it easier to implement an application without having to know about the complexity of the underlying computer hardware, a property known as opacity [1]. The ultimate goal of a successful library is to simplify the process of writing new software and thus to increase developer productivity. Since programmable heterogeneous CPU/GPU systems are a rather new phenomenon, there is as yet only a limited number of established software libraries that take full advantage of such heterogeneous high-performance systems, and there are no de facto design standards for such systems either. Some existing libraries for conventional homogeneous systems have already added support for offloading computationally intensive operations onto co-processing GPUs. However, this approach comes at the cost of frequent memory transfers across the low-bandwidth PCIe bus.
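The cost of per-call offloading mentioned above can be illustrated with a rough back-of-envelope model. The sketch below is not taken from the chapter: it assumes a purely bandwidth-bound vector update y = a*x + y, and the three bandwidth figures are illustrative placeholders, not measurements of any particular system. It compares running on the CPU, offloading every call with transfers over PCIe, and keeping the vectors resident in GPU memory across iterations.

```python
# Back-of-envelope model of a bandwidth-bound vector update y = a*x + y.
# All bandwidth figures below are assumed, illustrative values.
CPU_BW = 30e9    # assumed CPU memory bandwidth, bytes/s
GPU_BW = 200e9   # assumed GPU memory bandwidth, bytes/s
PCIE_BW = 8e9    # assumed PCIe bandwidth, bytes/s


def axpy_bytes(n, word=8):
    """Bytes moved by one update: read x, read y, write y."""
    return 3 * n * word


def time_cpu(n, iters, cpu_bw=CPU_BW):
    """All iterations run on the CPU, limited by CPU memory bandwidth."""
    return iters * axpy_bytes(n) / cpu_bw


def time_gpu_offload_each_call(n, iters, gpu_bw=GPU_BW, pcie_bw=PCIE_BW):
    """Every call sends x and y to the GPU and copies y back."""
    per_call = axpy_bytes(n) / pcie_bw + axpy_bytes(n) / gpu_bw
    return iters * per_call


def time_gpu_resident(n, iters, gpu_bw=GPU_BW, pcie_bw=PCIE_BW):
    """Vectors are sent once, stay on the GPU, and y is copied back once."""
    transfer = axpy_bytes(n) / pcie_bw
    return transfer + iters * axpy_bytes(n) / gpu_bw


if __name__ == "__main__":
    n, iters = 10**7, 100
    print(f"CPU only:          {time_cpu(n, iters):.3f} s")
    print(f"offload each call: {time_gpu_offload_each_call(n, iters):.3f} s")
    print(f"GPU resident:      {time_gpu_resident(n, iters):.3f} s")
```

Under these assumed numbers the per-call offload is slower than staying on the CPU, because each call is dominated by the PCIe transfer, while keeping the data on the GPU amortizes that transfer over all iterations. The model ignores latency, kernel launch overhead, and overlap of transfers with computation.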

Tags: Computer science, CUDA, GPGPU, iterative methods, Mathematics, multigrid, Package, parallel computing, Partial differential equations, Software development

March 12, 2014 by apek
