high performance computing on graphics processing units: hgpu.org

Historic Learning Approach for Auto-tuning OpenACC Accelerated Scientific Applications

Shahzeb Siddiqui, Saber Feki

King Abdullah University of Science and Technology, Kingdom of Saudi Arabia

The Ninth International Workshop on Automatic Performance Tuning (iWAPT), 2014

Optimizing the performance of scientific applications usually requires in-depth knowledge of both the hardware and the software. We propose a performance tuning mechanism that automatically tunes OpenACC parameters to adapt to the execution environment on a given system. A methodology based on historic learning prunes the parameter search space to make the auto-tuning process more efficient. This approach is used to tune the OpenACC gang and vector clauses for a better mapping of the compute kernels onto the underlying architecture. Our experiments show a significant performance improvement over the default compiler parameters and a drastic reduction in tuning time compared to a brute-force search.
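
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a history-based pruning of the gang/vector search space could look. Everything in it is an assumption made for illustration: the candidate value lists, the `synthetic_runtime` cost function (a made-up stand-in for compiling and timing a real OpenACC kernel), and the "search near the best configuration of the closest past problem size" rule. It is not the authors' implementation.

```python
from itertools import product

# Hypothetical candidate values for the OpenACC gang and vector clauses.
GANGS = [32, 64, 128, 256, 512, 1024]
VECTORS = [32, 64, 128, 256, 512, 1024]


def synthetic_runtime(gang, vector, n):
    """Stand-in for compiling and timing a kernel; NOT a real GPU model."""
    # Made-up cost surface whose best gang count depends on problem size n.
    best_gang = min(GANGS, key=lambda g: abs(g * 128 - n))
    return 1.0 + abs(gang - best_gang) / 1024 + abs(vector - 128) / 1024


def brute_force(n, measure=synthetic_runtime):
    """Time every (gang, vector) pair; return (best config, evaluations)."""
    space = list(product(GANGS, VECTORS))
    best = min(space, key=lambda cfg: measure(cfg[0], cfg[1], n))
    return best, len(space)


def neighbours(values, centre, radius=1):
    """Values within `radius` index positions of `centre` in a sorted list."""
    i = values.index(centre)
    return values[max(0, i - radius): i + radius + 1]


def history_pruned(n, history, measure=synthetic_runtime):
    """Search only near the best config of the closest past problem size."""
    if not history:
        return brute_force(n, measure)
    nearest = min(history, key=lambda past_n: abs(past_n - n))
    g0, v0 = history[nearest]
    space = list(product(neighbours(GANGS, g0), neighbours(VECTORS, v0)))
    best = min(space, key=lambda cfg: measure(cfg[0], cfg[1], n))
    return best, len(space)


# Tune one problem size exhaustively, then reuse the result for a new size.
history = {}
history[16384], cost_full = brute_force(16384)
best_pruned, cost_pruned = history_pruned(20000, history)
```

In this toy setup the exhaustive search evaluates all 36 configurations, while the pruned search evaluates only the 9 configurations around the stored optimum. In a real tuner, `measure` would build and run the kernel with the chosen `num_gangs` and `vector_length` values, and the pruning only pays off if nearby problem sizes do share similar optima.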

Tags: Computer science, CUDA, nVidia, OpenACC, Performance, Tesla K20

July 3, 2014 by hgpu
