high performance computing on graphics processing units: hgpu.org

C++ on GPUs Using OpenACC and the PGI Accelerator Compilers, webinar

Thursday, May 22, 2014, 2:00 PM ET / 11:00 AM PT / 18:00 GMT (Duration: 1 hour)

https://event.on24.com/eventRegistration/EventLobbyServlet?target=registration.jsp&eventid=780837&sessionid=1&key=6DF7F4C3EBDC377E23F84C7DF73C173A&sourcepage=register

The fastest supercomputers and clusters use a 64-bit host processor with one or more accelerators per node, most commonly GPUs. These compute accelerators exploit a high degree of parallelism to maximize performance and power efficiency. There are several challenges to effective and productive use of accelerators, the most important of which are managing data movement between host and device memories and exposing enough parallelism to benefit from the GPU or accelerator architecture.

The OpenACC API is a directive-based programming interface, similar to OpenMP, that allows a programmer to maintain a single source program for both host and host+accelerator systems. OpenACC allows the application to control data allocation and movement as well as computation placement.
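As an illustration of the directive-based style described above, here is a minimal sketch of a SAXPY loop annotated with OpenACC. The function name `saxpy` and the choice of clauses are assumptions made for this example and are not taken from the webinar. The `data` directive controls allocation and movement, and the `parallel loop` directive controls where the computation runs. A compiler without OpenACC support ignores the pragmas and builds the same source for the host alone.

```cpp
#include <vector>

// Hypothetical example: computes y = a * x + y.
// Assumes x and y have the same length.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const int n = static_cast<int>(x.size());
    const float* xp = x.data();
    float* yp = y.data();

    // Data movement: copy x to the device; copy y to the device and back.
    #pragma acc data copyin(xp[0:n]) copy(yp[0:n])
    {
        // Computation placement: run the loop iterations on the accelerator.
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i) {
            yp[i] = a * xp[i] + yp[i];
        }
    }
}
```

Raw pointers are taken from the vectors before the data region because OpenACC data clauses describe contiguous array sections, not container objects.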

Several OpenACC implementations are mature and stable for C and Fortran, but C++ support has lagged, in part because of the complexities introduced by C++ classes, member functions, the implicit ‘this’ pointer argument, private data, and templates. This webinar will present the latest support for C++ classes in the PGI Accelerator compilers. We will look at how to annotate classes and member functions with OpenACC directives to generate accelerated C++ programs. We will also discuss current limitations and the future directions under consideration for OpenACC.
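To make the difficulty with private data and the implicit ‘this’ pointer concrete, here is a minimal sketch of one commonly used workaround, not necessarily the approach presented in the webinar. The class `Field` and its members are hypothetical. The member function copies the private pointer and length into local variables, so the OpenACC data clause names a plain array section and never has to reach through ‘this’.

```cpp
#include <vector>

// Hypothetical class used only for illustration.
class Field {
public:
    Field(int n, float init) : values_(n, init) {}

    // Multiplies every element by factor, offloading the loop when the
    // compiler supports OpenACC and running on the host otherwise.
    void scale(float factor) {
        // Local aliases avoid referring to private members via 'this'
        // inside the directive.
        float* v = values_.data();
        const int n = static_cast<int>(values_.size());

        #pragma acc parallel loop copy(v[0:n])
        for (int i = 0; i < n; ++i) {
            v[i] *= factor;
        }
    }

    float at(int i) const { return values_[i]; }

private:
    std::vector<float> values_;
};
```

A call such as `Field f(4, 1.5f); f.scale(2.0f);` leaves every element equal to 3.0. The cost of this pattern is that the array is copied to and from the device on every call; keeping data resident across member function calls requires additional data directives.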


May 16, 2014 by hgpu
