WCCV: Improving the Vectorization of IF-statements with Warp-Coherent Conditions
University of Münster, Münster, Germany
ACM International Conference on Supercomputing, 2019
@inproceedings{sun2019wccv,
title={{WCCV}: improving the vectorization of {IF}-statements with warp-coherent conditions},
author={Sun, Huihui and Fey, Florian and Zhao, Jie and Gorlatch, Sergei},
booktitle={Proceedings of the ACM International Conference on Supercomputing},
pages={319--329},
year={2019},
organization={ACM}
}
When vectorizing programs for modern processors with SIMD extensions, IF-statements pose a challenge: existing vectorization approaches often introduce redundant computations or resort to inefficient masked instructions. In this paper, we introduce a new notion of warp-coherence for conditions that exhibit coherent run-time behavior on different lanes of a vector register. We demonstrate that warp-coherent conditions appear frequently in practice. We present Warp-Coherent Condition Vectorization (WCCV) – an approach to detecting and optimizing IF-statements with warp-coherent conditions – to efficiently vectorize programs with IF-statements while avoiding the overhead of existing methods. WCCV detects warp-coherent conditions via the affine analysis of conditional boolean expressions and branch predication of IF-statements; the runtime code generated by WCCV avoids redundant computations and masked instructions. We employ auto-tuning to find the optimal benefit-overhead ratio for WCCV. We implement WCCV on top of the Region Vectorizer (RV), an LLVM-based vectorizing compiler, and we conduct experiments on the Rodinia benchmark suite, achieving a mean speedup of 1.14x over the original vectorized and optimized code, and speedups between 0.98x and 7.02x over the scalar code on Skylake with AVX512.
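To make the idea of a warp-coherent condition concrete, here is a minimal scalar sketch in C. It is not the code WCCV generates and it does not reproduce the paper's affine analysis; it only illustrates why a condition that evaluates the same way on every lane lets a vectorized loop run a single unmasked path. All names (`LANES`, `classify`, `wccv_sketch_kernel`), the example kernel, and the per-group runtime check are assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed lane count: 16 x 32-bit floats, as in one AVX-512 register. */
#define LANES 16

/* Which code path a group of LANES consecutive iterations takes. */
enum path { PATH_ALL_TRUE = 0, PATH_ALL_FALSE = 1, PATH_MIXED = 2 };

/* Classify the condition a[i] > threshold over one group of lanes. */
static enum path classify(const float *a, float threshold) {
    int true_count = 0;
    for (int l = 0; l < LANES; ++l)
        true_count += (a[l] > threshold);
    if (true_count == LANES) return PATH_ALL_TRUE;
    if (true_count == 0)     return PATH_ALL_FALSE;
    return PATH_MIXED;
}

/*
 * Hypothetical kernel: out[i] = (a[i] > threshold) ? a[i] * 2 : a[i] + 1.
 * n is assumed to be a multiple of LANES. path_counts records how often
 * each path was taken, so the coherence of the input can be inspected.
 */
static void wccv_sketch_kernel(const float *a, float *out, size_t n,
                               float threshold, int path_counts[3]) {
    for (size_t base = 0; base < n; base += LANES) {
        enum path p = classify(a + base, threshold);
        path_counts[p]++;
        switch (p) {
        case PATH_ALL_TRUE:   /* coherent: only the then-branch runs, no mask */
            for (int l = 0; l < LANES; ++l)
                out[base + l] = a[base + l] * 2.0f;
            break;
        case PATH_ALL_FALSE:  /* coherent: only the else-branch runs, no mask */
            for (int l = 0; l < LANES; ++l)
                out[base + l] = a[base + l] + 1.0f;
            break;
        case PATH_MIXED:      /* divergent: compute both sides, then select */
            for (int l = 0; l < LANES; ++l) {
                float then_val = a[base + l] * 2.0f;
                float else_val = a[base + l] + 1.0f;
                out[base + l] = (a[base + l] > threshold) ? then_val : else_val;
            }
            break;
        }
    }
}
```

In the two coherent cases each inner loop is a straight-line body that a compiler can map to plain vector instructions; only the mixed case pays for computing both branches and selecting per lane, which corresponds to the redundant-computation and masking overhead the abstract refers to.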
June 30, 2019 by hgpu