high performance computing on graphics processing units: hgpu.org

On Static Timing Analysis of GPU Kernels

Vesa Hirvisalo

Aalto University, Espoo, Finland

14th International Workshop on Worst-Case Execution Time Analysis (WCET), 2014

DOI:10.4230/OASIcs.WCET.2014.43

We study static timing analysis of programs running on GPU accelerators. Such programs follow a data-parallel programming model that allows massive parallelism on manycore processors. Data-parallel programming and the use of GPUs as accelerators have become widespread in recent years. Timing analysis of programs running on single-core machines is well understood and is also applied in practice. For multicore and manycore machines, however, timing analysis remains a significant problem that has not yet been properly solved. In this paper, we present a static timing analysis of GPU kernels based on a method that we call abstract CTA simulation. Cooperative Thread Arrays (CTAs) are the basic execution structure of GPU devices, whose operation proceeds in thread groups called warps. Abstract CTA simulation is based on static analysis of thread divergence in warps and their abstract scheduling.
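
To make the notion of thread divergence in warps concrete, the following is a minimal sketch of a toy cost model and not the paper's abstract CTA simulation. It assumes warps of 32 threads, a lockstep execution model in which a diverged warp runs both sides of a branch one after the other, and warps issued strictly in sequence. All function names and cycle costs are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

WARP_SIZE = 32  # assumption: warps of 32 threads


def split_into_warps(num_threads: int, warp_size: int = WARP_SIZE) -> List[range]:
    """Partition the thread ids of one CTA into consecutive warps."""
    return [range(start, min(start + warp_size, num_threads))
            for start in range(0, num_threads, warp_size)]


def warp_branch_cost(taken: List[bool], then_cost: int, else_cost: int) -> int:
    """Cycles one warp spends on an if/else under the lockstep assumption.

    If every thread agrees on the branch, only that side is executed.
    If the warp diverges, both sides are serialized and their costs add up.
    """
    cost = 0
    if any(taken):          # at least one thread takes the 'then' side
        cost += then_cost
    if not all(taken):      # at least one thread takes the 'else' side
        cost += else_cost
    return cost


def cta_branch_cost(num_threads: int,
                    predicate: Callable[[int], bool],
                    then_cost: int,
                    else_cost: int) -> int:
    """Total cycles for one branch in a CTA, assuming warps run back to back."""
    total = 0
    for warp in split_into_warps(num_threads):
        taken = [predicate(tid) for tid in warp]
        total += warp_branch_cost(taken, then_cost, else_cost)
    return total


# Hypothetical costs: 10 cycles for the 'then' side, 4 for the 'else' side.
uniform = cta_branch_cost(64, lambda tid: tid < 32, 10, 4)       # no warp diverges
diverged = cta_branch_cost(64, lambda tid: tid % 2 == 0, 10, 4)  # every warp diverges
```

In this toy model a branch condition that is uniform within each warp costs less than one that alternates between neighbouring threads, which is why a static analysis would need to determine which branches can diverge before it can bound execution time.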

Tags: Computer science, OpenCL, Performance, Timing analysis

July 4, 2014 by hgpu
