LLVM 20.0.0git
A helper class used for scoring candidates for two consecutive lanes.
Public Member Functions | |
LookAheadHeuristics (const TargetLibraryInfo &TLI, const DataLayout &DL, ScalarEvolution &SE, const BoUpSLP &R, int NumLanes, int MaxLevel) | |
int | getShallowScore (Value *V1, Value *V2, Instruction *U1, Instruction *U2, ArrayRef< Value * > MainAltOps) const |
int | getScoreAtLevelRec (Value *LHS, Value *RHS, Instruction *U1, Instruction *U2, int CurrLevel, ArrayRef< Value * > MainAltOps) const |
Go through the operands of LHS and RHS recursively until MaxLevel, and return the cumulative score. | |
Static Public Attributes | |
static const int | ScoreConsecutiveLoads = 4 |
Loads from consecutive memory addresses, e.g. load(A[i]), load(A[i+1]). | |
static const int | ScoreSplatLoads = 3 |
The same load multiple times. | |
static const int | ScoreReversedLoads = 3 |
Loads from reversed memory addresses, e.g. load(A[i+1]), load(A[i]). | |
static const int | ScoreMaskedGatherCandidate = 1 |
A load candidate for masked gather. | |
static const int | ScoreConsecutiveExtracts = 4 |
ExtractElementInst from same vector and consecutive indexes. | |
static const int | ScoreReversedExtracts = 3 |
ExtractElementInst from same vector and reversed indices. | |
static const int | ScoreConstants = 2 |
Constants. | |
static const int | ScoreSameOpcode = 2 |
Instructions with the same opcode. | |
static const int | ScoreAltOpcodes = 1 |
Instructions with alt opcodes (e.g., add + sub). | |
static const int | ScoreSplat = 1 |
Identical instructions (a.k.a. splat or broadcast). | |
static const int | ScoreUndef = 1 |
Matching with an undef is preferable to failing. | |
static const int | ScoreFail = 0 |
Score for failing to find a decent match. | |
static const int | ScoreAllUserVectorized = 1 |
Score if all users are vectorized. | |
A helper class used for scoring candidates for two consecutive lanes.
Definition at line 1511 of file SLPVectorizer.cpp.
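LookAheadHeuristics (const TargetLibraryInfo &TLI, const DataLayout &DL, ScalarEvolution &SE, const BoUpSLP &R, int NumLanes, int MaxLevel) [inline]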
Definition at line 1520 of file SLPVectorizer.cpp.
References DL.
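int getScoreAtLevelRec (Value *LHS, Value *RHS, Instruction *U1, Instruction *U2, int CurrLevel, ArrayRef< Value * > MainAltOps) const [inline]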
Go through the operands of LHS and RHS recursively until MaxLevel, and return the cumulative score.
U1 and U2 are the users of LHS and RHS (that is, LHS and RHS are operands of U1 and U2), except at the beginning of the recursion, where they are set to nullptr.
For example:
 A[0]  B[0]  A[1]  B[1]  C[0] D[0]  B[1] A[1]
    \  /        \  /        \  /       \  /
     +           +           +          +
     G1          G2          G3         G4
The getScoreAtLevelRec(G1, G2) function will try to match the nodes at each level recursively, accumulating the score. It starts by matching the additions at level 0, then moves on to the loads (level 1). The score of G1 and G2 is higher than that of G1 and G3, because {A[0],A[1]} and {B[0],B[1]} match with LookAheadHeuristics::ScoreConsecutiveLoads, while {A[0],C[0]} has a score of LookAheadHeuristics::ScoreFail.
Please note that the order of the operands does not matter, as we evaluate the score of all profitable combinations of operands. In other words, the score of G1 and G4 is the same as that of G1 and G2.
This heuristic is based on ideas described in: "Look-ahead SLP: Auto-vectorization in the presence of commutative operations", CGO 2018, by Vasileios Porpodas, Rodrigo C. O. Rocha, Luís F. W. Góes.
Definition at line 1739 of file SLPVectorizer.cpp.
References assert(), llvm::SmallSet< T, N, C >::count(), getScoreAtLevelRec(), getShallowScore(), llvm::SmallSet< T, N, C >::insert(), isCommutative(), LHS, RHS, and ScoreFail.
Referenced by llvm::slpvectorizer::BoUpSLP::findBestRootPair(), and getScoreAtLevelRec().
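The recursion described above can be illustrated with a toy model that does not depend on LLVM. This is only a sketch under stated assumptions, not the code in SLPVectorizer.cpp: ToyNode, makeLoad, makeAdd, toyShallowScore and toyScoreAtLevelRec are hypothetical names, only loads and commutative two-operand additions are modeled, and the operand pairing simply tries both orders of the two operands. The score values are the ones listed under Static Public Attributes.

```cpp
#include <algorithm>
#include <cassert>

// Score values taken from the attribute list of this class.
constexpr int ScoreConsecutiveLoads = 4;
constexpr int ScoreSameOpcode = 2;
constexpr int ScoreFail = 0;

// Hypothetical stand-in for an IR value: a load of Base[Index] or an add.
struct ToyNode {
  enum Kind { Load, Add };
  Kind K;
  int Base;           // array id, used by loads only
  int Index;          // element index, used by loads only
  const ToyNode *Op0; // operands, used by adds only
  const ToyNode *Op1;
};

ToyNode makeLoad(int Base, int Index) {
  return ToyNode{ToyNode::Load, Base, Index, nullptr, nullptr};
}

// The operands must outlive the returned node.
ToyNode makeAdd(const ToyNode &A, const ToyNode &B) {
  return ToyNode{ToyNode::Add, 0, 0, &A, &B};
}

// Toy one-level score: same opcode, or loads of Base[i] and Base[i+1].
int toyShallowScore(const ToyNode &L, const ToyNode &R) {
  if (L.K != R.K)
    return ScoreFail;
  if (L.K == ToyNode::Add)
    return ScoreSameOpcode;
  if (L.Base == R.Base && R.Index == L.Index + 1)
    return ScoreConsecutiveLoads;
  return ScoreFail;
}

// Toy recursive score: score this level, then add the best operand pairing.
int toyScoreAtLevelRec(const ToyNode &L, const ToyNode &R, int CurrLevel,
                       int MaxLevel) {
  assert(CurrLevel <= MaxLevel && "recursion went past MaxLevel");
  int Score = toyShallowScore(L, R);
  // Stop on a failed match, at the depth limit, or at a leaf (load).
  if (Score == ScoreFail || CurrLevel == MaxLevel || L.K != ToyNode::Add)
    return Score;
  int Next = CurrLevel + 1;
  // Addition is commutative, so try both operand orders and keep the best.
  int Straight = toyScoreAtLevelRec(*L.Op0, *R.Op0, Next, MaxLevel) +
                 toyScoreAtLevelRec(*L.Op1, *R.Op1, Next, MaxLevel);
  int Crossed = toyScoreAtLevelRec(*L.Op0, *R.Op1, Next, MaxLevel) +
                toyScoreAtLevelRec(*L.Op1, *R.Op0, Next, MaxLevel);
  return Score + std::max(Straight, Crossed);
}
```

With G1 to G4 built as in the diagram and a call such as toyScoreAtLevelRec(G1, G2, 0, 1), this toy model gives 10 for (G1, G2) and for (G1, G4), and 2 for (G1, G3), which mirrors the ordering described above. The absolute numbers belong to the toy model only.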
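int getShallowScore (Value *V1, Value *V2, Instruction *U1, Instruction *U2, ArrayRef< Value * > MainAltOps) const [inline]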
Returns the score of placing V1 and V2 in consecutive lanes. U1 and U2 are the users of V1 and V2. Also checks whether V1 and V2 are compatible with the instructions in MainAltOps.
Definition at line 1571 of file SLPVectorizer.cpp.
References llvm::all_of(), DL, llvm::ArrayRef< T >::empty(), llvm::ElementCount::getFixed(), llvm::getPointersDiff(), getSameOpcode(), llvm::Value::getType(), llvm::getUnderlyingObject(), getWidenedType(), llvm::ConstantInt::getZExtValue(), isUndefVector(), isValidElementType(), llvm::PatternMatch::m_CombineOr(), llvm::PatternMatch::m_ConstantInt(), llvm::PatternMatch::m_ExtractElt(), llvm::PatternMatch::m_Undef(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::match(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), ScoreAltOpcodes, ScoreConsecutiveExtracts, ScoreConsecutiveLoads, ScoreConstants, ScoreFail, ScoreMaskedGatherCandidate, ScoreReversedExtracts, ScoreReversedLoads, ScoreSameOpcode, ScoreSplat, ScoreSplatLoads, ScoreUndef, and UsesLimit.
Referenced by getScoreAtLevelRec().
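To make the load-related scores concrete, here is a small standalone sketch of how a pair of loads might be classified. It is a toy model and not the logic of getShallowScore: ToyLoad and toyLoadPairScore are hypothetical names, addresses are reduced to an array id plus an element index, and the rule that any other same-array distance counts as a masked-gather candidate is an assumption made purely for illustration. The score values are copied from the attribute list of this class.

```cpp
#include <cassert>
#include <cstdint>

// Score values taken from the attribute list of this class.
constexpr int ScoreConsecutiveLoads = 4;
constexpr int ScoreSplatLoads = 3;
constexpr int ScoreReversedLoads = 3;
constexpr int ScoreMaskedGatherCandidate = 1;
constexpr int ScoreFail = 0;

// Hypothetical stand-in for a load: array id plus element index.
struct ToyLoad {
  int Base;
  std::int64_t Index;
};

// Toy score for placing L1 in lane i and L2 in lane i+1.
int toyLoadPairScore(const ToyLoad &L1, const ToyLoad &L2) {
  assert(L1.Base >= 0 && L2.Base >= 0 && "toy array ids are non-negative");
  if (L1.Base != L2.Base)
    return ScoreFail; // unrelated arrays: no match in this toy model
  std::int64_t Diff = L2.Index - L1.Index;
  if (Diff == 1)
    return ScoreConsecutiveLoads; // load(A[i]), load(A[i+1])
  if (Diff == 0)
    return ScoreSplatLoads; // the same load twice
  if (Diff == -1)
    return ScoreReversedLoads; // load(A[i+1]), load(A[i])
  return ScoreMaskedGatherCandidate; // assumption: same array, other distance
}
```

For example, toyLoadPairScore(ToyLoad{0, 0}, ToyLoad{0, 1}) yields ScoreConsecutiveLoads, while swapping the two arguments yields ScoreReversedLoads.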
static const int ScoreAllUserVectorized = 1
Score if all users are vectorized.
Definition at line 1565 of file SLPVectorizer.cpp.
static const int ScoreAltOpcodes = 1
Instructions with alt opcodes (e.g., add + sub).
Definition at line 1557 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreConsecutiveExtracts = 4
ExtractElementInst from same vector and consecutive indexes.
Definition at line 1549 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreConsecutiveLoads = 4
Loads from consecutive memory addresses, e.g. load(A[i]), load(A[i+1]).
Definition at line 1538 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreConstants = 2
Constants.
static const int ScoreFail = 0
Score for failing to find a decent match.
Definition at line 1563 of file SLPVectorizer.cpp.
Referenced by getScoreAtLevelRec(), and getShallowScore().
static const int ScoreMaskedGatherCandidate = 1
A load candidate for masked gather.
Definition at line 1547 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreReversedExtracts = 3
ExtractElementInst from same vector and reversed indices.
Definition at line 1551 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreReversedLoads = 3
Loads from reversed memory addresses, e.g. load(A[i+1]), load(A[i]).
Definition at line 1545 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreSameOpcode = 2
Instructions with the same opcode.
Definition at line 1555 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreSplat = 1
Identical instructions (a.k.a. splat or broadcast).
Definition at line 1559 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreSplatLoads = 3
The same load multiple times.
This should have a better score than ScoreSplat because on x86, for a 2-lane vector, it can be represented with movddup (reg), xmm0, which has a throughput of 0.5, versus 0.5 for a vector load and 1.0 for a broadcast.
Definition at line 1543 of file SLPVectorizer.cpp.
Referenced by getShallowScore().
static const int ScoreUndef = 1
Matching with an undef is preferable to failing.
Definition at line 1561 of file SLPVectorizer.cpp.
Referenced by getShallowScore().