high performance computing on graphics processing units: hgpu.org


Exploring Scalability in C++ Parallel STL Implementations

Ruben Laso, Diego Krupitza, Sascha Hunold

Faculty of Informatics, TU Wien, Vienna, Austria

Proceedings of the 53rd International Conference on Parallel Processing (ICPP’24), 2024

DOI:10.1145/3673038.3673065

@inproceedings{laso2024exploring,
  title={Exploring Scalability in C++ Parallel STL Implementations},
  author={Laso, Ruben and Krupitza, Diego and Hunold, Sascha},
  booktitle={Proceedings of the 53rd International Conference on Parallel Processing},
  pages={284--293},
  year={2024}
}


Source codes

Package: pSTL-Bench, a micro-benchmark suite designed to evaluate the scalability and efficiency of parallel C++ STL implementations


Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the STL has become a viable framework for creating performance-portable applications. Given multiple existing implementations of the parallel algorithms, a systematic, quantitative performance comparison is essential for choosing the appropriate implementation for a particular hardware configuration. In this work, we introduce a specialized set of micro-benchmarks to assess the scalability of the parallel algorithms in the STL. By selecting different backends, our micro-benchmarks can be used on multi-core systems and GPUs. Using the suite, in a case study on AMD and Intel CPUs and NVIDIA GPUs, we were able to identify substantial performance disparities among different implementations, including GCC+TBB, GCC+HPX, Intel’s compiler with TBB, or NVIDIA’s compiler with OpenMP and CUDA.
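To make the idea concrete, the following is a minimal sketch of the kind of measurement such a micro-benchmark performs: the same standard algorithm is run under different C++17 execution policies and timed. It is not taken from pSTL-Bench; the helper names `sum_with`, `best_seconds`, and `par_speedup` are hypothetical, and the choice of `std::reduce` and of best-of-N timing is an assumption made for illustration. With GCC, the parallel policies are typically backed by TBB, so the program usually needs to be linked with `-ltbb`.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <execution>
#include <limits>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper, not part of pSTL-Bench: sum `data` under a given
// execution policy (std::execution::seq, par, par_unseq).
template <class Policy>
double sum_with(Policy&& policy, const std::vector<double>& data) {
    return std::reduce(std::forward<Policy>(policy), data.begin(), data.end(), 0.0);
}

// Hypothetical helper: run `fn` `reps` times and return the shortest
// wall-clock time in seconds.
template <class Fn>
double best_seconds(Fn&& fn, int reps = 5) {
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < reps; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        best = std::min(best, std::chrono::duration<double>(stop - start).count());
    }
    return best;
}

// Ratio of sequential to parallel time for summing n doubles.
// A value above 1 means the parallel policy was faster in this run.
double par_speedup(std::size_t n) {
    const std::vector<double> data(n, 1.0);
    volatile double sink = 0.0;  // keeps the compiler from discarding the sums
    const double t_seq = best_seconds([&] { sink = sum_with(std::execution::seq, data); });
    const double t_par = best_seconds([&] { sink = sum_with(std::execution::par, data); });
    (void)sink;
    return t_seq / t_par;
}
```

Calling `par_speedup` for a range of input sizes, and repeating the runs with different compilers and backends, gives a rough picture of how an implementation scales. A real benchmark suite would additionally control thread counts, memory placement, and warm-up, none of which this sketch attempts.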

Tags: Benchmarking, Computer science, CUDA, nVidia, nVidia Ampere A2, OpenMP, Package, Performance, Tesla T4

September 1, 2024 by hgpu

