Function Call Re-Vectorization
UFMG, Brazil
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2017
@inproceedings{moreira:hal-01410186,
  title       = {Function Call Re-Vectorization},
  author      = {Moreira, Rubens E A and Collange, Sylvain and Pereira, Fernando Magno Quintao},
  url         = {https://hal.archives-ouvertes.fr/hal-01410186},
  booktitle   = {ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)},
  address     = {Austin, Texas, United States},
  year        = {2017},
  month       = {Feb},
  keywords    = {Compilers; SIMD; SIMT; Function; Programmability},
  pdf         = {https://hal.archives-ouvertes.fr/hal-01410186/file/Moreira_CallRevectorization_PPoPP17.pdf},
  hal_id      = {hal-01410186},
  hal_version = {v1}
}
Programming languages such as C for CUDA, OpenCL, and ISPC have contributed to increasing the programmability of SIMD accelerators and graphics processing units. However, these languages still lack the flexibility offered by low-level SIMD programming on explicit vectors. To close this expressiveness gap while preserving performance, this paper introduces the notion of Call Re-Vectorization (CREV). CREV allows the dimension of vectorization to change during the execution of a kernel, exposing this change as a nested parallel kernel call. CREV affords programmability close to that of dynamic parallelism, a feature that allows kernels to be invoked from inside other kernels, but at a much lower cost. In this paper, we present a formal semantics of CREV and an implementation of it in the ISPC compiler. We have used CREV to implement classic algorithms, including string matching, depth-first search, and Bellman-Ford, with minimal effort. These algorithms, once compiled by ISPC to Intel-based vector instructions, are as fast as state-of-the-art implementations, yet much simpler. Thus, CREV gives developers the elegance of dynamic parallelism and the performance of explicit SIMD programming.
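To make the contrast concrete, below is a minimal CUDA sketch of the dynamic-parallelism pattern that CREV is designed to replace with a cheaper nested, re-vectorized call. The kernel names, the string-matching setup, and the launch configuration are illustrative assumptions, not code from the paper; only the device-side kernel launch itself is a standard CUDA dynamic-parallelism feature (compile with relocatable device code, e.g. nvcc -arch=sm_35 -rdc=true).

// Hypothetical sketch: string matching via CUDA dynamic parallelism.
// Each parent thread that finds a candidate position launches a child grid,
// which is the costly step that a CREV-style nested call would avoid.
#include <cstdio>
#include <cstring>

// Child kernel: threads cooperatively compare the pattern against text[pos..].
__global__ void matchAt(const char *text, const char *pattern,
                        int patLen, int pos, int *flag) {
    int i = threadIdx.x;
    if (i < patLen && text[pos + i] != pattern[i])
        atomicExch(flag, 0);                 // any mismatch cancels the candidate
}

// Parent kernel: one candidate position per thread; a first-character hit
// triggers a nested kernel launch.
__global__ void search(const char *text, int textLen,
                       const char *pattern, int patLen, int *flags) {
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    if (pos + patLen > textLen) return;
    if (text[pos] == pattern[0]) {
        flags[pos] = 1;                      // assume a match until disproved
        matchAt<<<1, patLen>>>(text, pattern, patLen, pos, &flags[pos]);
    }
}

int main() {
    const char h_text[] = "abracadabra";
    const char h_pat[]  = "cad";
    int textLen = (int)strlen(h_text), patLen = (int)strlen(h_pat);

    char *d_text, *d_pat; int *d_flags;
    cudaMalloc(&d_text, textLen + 1);
    cudaMalloc(&d_pat, patLen + 1);
    cudaMalloc(&d_flags, textLen * sizeof(int));
    cudaMemcpy(d_text, h_text, textLen + 1, cudaMemcpyHostToDevice);
    cudaMemcpy(d_pat, h_pat, patLen + 1, cudaMemcpyHostToDevice);
    cudaMemset(d_flags, 0, textLen * sizeof(int));

    search<<<1, 32>>>(d_text, textLen, d_pat, patLen, d_flags);
    cudaDeviceSynchronize();                 // parent completion implies child completion

    int h_flags[32] = {0};
    cudaMemcpy(h_flags, d_flags, textLen * sizeof(int), cudaMemcpyDeviceToHost);
    for (int i = 0; i < textLen; ++i)
        if (h_flags[i]) printf("match at %d\n", i);
    return 0;
}

In the approach described by the abstract, the nested launch above would instead be expressed as a re-vectorized function call executed by the SIMD lanes that are already running, avoiding the overhead of spawning a new grid.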
December 26, 2016 by hgpu