Vector Predication Roadmap
This proposal defines a roadmap towards native vector predication in LLVM, specifically for vector instructions with a mask and/or an explicit vector length. LLVM currently has no target-independent means to model predicated vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V extension and NEC SX-Aurora. Only some predicated vector operations, such as masked loads and stores, are available through intrinsics [MaskedIR].
The Vector Predication (VP) extension is a concrete RFC and prototype implementation to achieve native vector predication in LLVM. The VP prototype and all related discussions can be found in the VP patch on Phabricator [VPRFC].
There is a consensus on the semantics/instruction set of VP.
VP intrinsics and attributes are available on IR level.
TTI has capability flags for VP (
Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer), potential integration in Clang with builtins.
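To make the two predication inputs concrete, the following is a minimal scalar reference model of a predicated vector add, written as a sketch rather than as LLVM code. The names Lane and predicatedAdd are hypothetical. The model assumes that a lane is active only when its index is below the explicit vector length and its mask bit is set, and that inactive lanes hold no usable value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane type: `defined == false` models a lane whose value
// must not be relied upon because the lane was disabled.
struct Lane {
    bool defined;
    int value;
};

// Scalar reference model of a predicated vector add (illustration only).
// Lane i is active iff i < evl AND mask[i] is true.
std::vector<Lane> predicatedAdd(const std::vector<int>& a,
                                const std::vector<int>& b,
                                const std::vector<bool>& mask,
                                std::size_t evl) {
    assert(a.size() == b.size() && a.size() == mask.size());
    assert(evl <= a.size());
    // Every lane starts out undefined; only active lanes are computed.
    std::vector<Lane> result(a.size(), Lane{false, 0});
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (i < evl && mask[i]) {
            result[i] = Lane{true, a[i] + b[i]};
        }
    }
    return result;
}
```

With four lanes, a mask of {true, false, true, true} and an explicit vector length of 3, only lanes 0 and 2 are computed: lane 1 is masked off and lane 3 lies beyond the vector length.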
VP intrinsics translate to first-class SDNodes (e.g. llvm.vp.fdiv.* -> vp_fdiv).
VP legalization: legalize the explicit vector length to a mask (AVX512), and legalize VP SDNodes to pre-existing ones (SSE, NEON).
Result: Backend development based on VP SDNodes.
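The two legalization strategies named above can be sketched on plain arrays. This is an illustration under assumptions, not the backend implementation: foldEvlIntoMask and divideWithSafeDivisor are hypothetical names, and substituting a divisor of 1 in disabled lanes is just one possible way to run a division without predication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fold an explicit vector length into the mask.
// Afterwards lane i is enabled iff mask[i] was set AND i < evl, so the
// operation can run at full vector width with the mask alone.
std::vector<bool> foldEvlIntoMask(const std::vector<bool>& mask,
                                  std::size_t evl) {
    assert(evl <= mask.size());
    std::vector<bool> folded(mask.size(), false);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        folded[i] = mask[i] && (i < evl);
    }
    return folded;
}

// Hypothetical helper: emulate a masked integer division on a target
// without predication. Disabled lanes divide by 1 so that they cannot
// trap; the caller must ignore their results. Enabled lanes are assumed
// to have a non-zero divisor.
std::vector<int> divideWithSafeDivisor(const std::vector<int>& a,
                                       const std::vector<int>& b,
                                       const std::vector<bool>& mask) {
    assert(a.size() == b.size() && a.size() == mask.size());
    std::vector<int> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        const int divisor = mask[i] ? b[i] : 1;
        out[i] = a[i] / divisor;
    }
    return out;
}
```

Chaining the two helpers models a target with neither feature: first fold the vector length into the mask, then execute the unpredicated operation with safe operands in the disabled lanes.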
Introduce helper classes such as PredicatedInstruction and PredicatedBinaryOperator that match both standard vector IR and VP intrinsics.
Add a matcher context to PatternMatch and context-aware IR Builder APIs.
Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular vector instructions.
Incrementally lift InstCombine/InstSimplify to operate on VP as well as regular IR instructions.
Result: Optimization of VP intrinsics on par with standard vector instructions.
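The idea behind the helper classes can be illustrated with a toy model in which one pattern is written once and matches both encodings of an operation. Everything below is hypothetical: ToyInst, PredicatedBinaryView, viewOf and isSubOfSelf are not LLVM types or functions, and instructions are reduced to strings for brevity.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an IR instruction. Not an LLVM class.
struct ToyInst {
    std::string opcode;  // "sub" for the plain form, "vp.sub" for the VP form
    std::string lhs;
    std::string rhs;
    std::string mask;    // empty for plain instructions
    std::string evl;     // empty for plain instructions
};

// Hypothetical unified view, in the spirit of a PredicatedBinaryOperator:
// it exposes the same fields regardless of the encoding.
struct PredicatedBinaryView {
    std::string baseOpcode;  // opcode with any "vp." prefix removed
    std::string lhs;
    std::string rhs;
    bool predicated;         // true when the instruction was in VP form
};

PredicatedBinaryView viewOf(const ToyInst& inst) {
    const std::string prefix = "vp.";
    const bool isVP = inst.opcode.compare(0, prefix.size(), prefix) == 0;
    // In this toy model a VP instruction must carry both predicate operands.
    assert(!isVP || (!inst.mask.empty() && !inst.evl.empty()));
    PredicatedBinaryView view;
    view.baseOpcode = isVP ? inst.opcode.substr(prefix.size()) : inst.opcode;
    view.lhs = inst.lhs;
    view.rhs = inst.rhs;
    view.predicated = isVP;
    return view;
}

// A pattern written once against the view: integer x - x.
bool isSubOfSelf(const ToyInst& inst) {
    const PredicatedBinaryView v = viewOf(inst);
    return v.baseOpcode == "sub" && v.lhs == v.rhs;
}
```

A real transformation would also have to carry the mask and vector length over to any replacement it builds, which is presumably the role of the matcher context and the context-aware builder APIs mentioned above.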
Modernize llvm.masked.* / llvm.experimental.reduce* by translating to VP.
Remove (DCE) the transitional APIs.
Result: VP has superseded earlier vector intrinsics.
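As a rough illustration of what such a translation could look like for a masked load, the sketch below expresses masked-load-with-passthru behaviour in terms of a VP-style load followed by a per-lane merge. It is a scalar model under assumptions: vpLoad and maskedLoadViaVp are hypothetical names, memory is modelled as a vector with one element per lane, and the explicit vector length is simply set to the full width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lane type: `defined == false` marks a disabled lane.
struct Lane {
    bool defined;
    int value;
};

// Model of a VP-style load: only active lanes (i < evl and mask[i])
// read memory; all other lanes stay undefined.
std::vector<Lane> vpLoad(const std::vector<int>& memory,
                         const std::vector<bool>& mask,
                         std::size_t evl) {
    assert(memory.size() == mask.size() && evl <= mask.size());
    std::vector<Lane> out(mask.size(), Lane{false, 0});
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (i < evl && mask[i]) {
            out[i] = Lane{true, memory[i]};
        }
    }
    return out;
}

// Masked-load behaviour rebuilt on top of vpLoad: load at full width,
// then take the passthru value in every disabled lane.
std::vector<int> maskedLoadViaVp(const std::vector<int>& memory,
                                 const std::vector<bool>& mask,
                                 const std::vector<int>& passthru) {
    assert(passthru.size() == mask.size());
    const std::vector<Lane> loaded = vpLoad(memory, mask, mask.size());
    std::vector<int> out(mask.size());
    for (std::size_t i = 0; i < mask.size(); ++i) {
        out[i] = loaded[i].defined ? loaded[i].value : passthru[i];
    }
    return out;
}
```

The merge step is what distinguishes the older intrinsic from the VP form in this model: the VP load leaves disabled lanes undefined, so the passthru operand has to be applied separately.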
Vector instructions have an optional mask and vector length parameter. These lower to VP SDNodes (from Stage 2).
Phase out VP intrinsics, only keeping those that are not equivalent to vectorized scalar instructions (reductions, shuffles, ...).
InstCombine/InstSimplify expect predication in regular Instructions (Stage 3 has laid the groundwork).
Result: Native vector predication in IR.
[MaskedIR] llvm.masked.* intrinsics, https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics