LLVM 17.0.0git
LoopVectorizationCostModel - estimates the expected speedups due to vectorization.
Classes | |
struct | RegisterUsage |
A struct that represents some properties of the register usage of a loop. | |
Public Types | |
enum | InstWidening { CM_Unknown , CM_Widen , CM_Widen_Reverse , CM_Interleave , CM_GatherScatter , CM_Scalarize } |
Decision that was taken during cost calculation for a memory instruction. | |
using | ReductionChainMap = SmallMapVector< PHINode *, SmallVector< Instruction *, 4 >, 4 > |
A SmallMapVector to store the InLoop reduction op chains, mapping phi nodes to the chain of instructions representing the reductions. | |
Public Member Functions | |
LoopVectorizationCostModel (ScalarEpilogueLowering SEL, Loop *L, PredicatedScalarEvolution &PSE, LoopInfo *LI, LoopVectorizationLegality *Legal, const TargetTransformInfo &TTI, const TargetLibraryInfo *TLI, DemandedBits *DB, AssumptionCache *AC, OptimizationRemarkEmitter *ORE, const Function *F, const LoopVectorizeHints *Hints, InterleavedAccessInfo &IAI) | |
FixedScalableVFPair | computeMaxVF (ElementCount UserVF, unsigned UserIC) |
bool | runtimeChecksRequired () |
VectorizationFactor | selectVectorizationFactor (const ElementCountSet &CandidateVFs) |
VectorizationFactor | selectEpilogueVectorizationFactor (const ElementCount MaxVF, const LoopVectorizationPlanner &LVP) |
bool | selectUserVectorizationFactor (ElementCount UserVF) |
Set up cost-based decisions for the user vectorization factor. | |
std::pair< unsigned, unsigned > | getSmallestAndWidestTypes () |
unsigned | selectInterleaveCount (ElementCount VF, InstructionCost LoopCost) |
void | setCostBasedWideningDecision (ElementCount VF) |
A memory access instruction may be vectorized in more than one way. | |
SmallVector< RegisterUsage, 8 > | calculateRegisterUsage (ArrayRef< ElementCount > VFs) |
void | collectValuesToIgnore () |
Collect values we want to ignore in the cost model. | |
void | collectElementTypesForWidening () |
Collect all element types in the loop for which widening is needed. | |
void | collectInLoopReductions () |
Split reductions into those that happen in the loop, and those that happen outside. | |
bool | useOrderedReductions (const RecurrenceDescriptor &RdxDesc) const |
Returns true if we should use strict in-order reductions for the given RdxDesc. | |
const MapVector< Instruction *, uint64_t > & | getMinimalBitwidths () const |
bool | isProfitableToScalarize (Instruction *I, ElementCount VF) const |
bool | isUniformAfterVectorization (Instruction *I, ElementCount VF) const |
Returns true if I is known to be uniform after vectorization. | |
bool | isScalarAfterVectorization (Instruction *I, ElementCount VF) const |
Returns true if I is known to be scalar after vectorization. | |
bool | canTruncateToMinimalBitwidth (Instruction *I, ElementCount VF) const |
void | setWideningDecision (Instruction *I, ElementCount VF, InstWidening W, InstructionCost Cost) |
Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF . | |
void | setWideningDecision (const InterleaveGroup< Instruction > *Grp, ElementCount VF, InstWidening W, InstructionCost Cost) |
Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF . | |
InstWidening | getWideningDecision (Instruction *I, ElementCount VF) const |
Return the cost model decision for the given instruction I and vector width VF . | |
InstructionCost | getWideningCost (Instruction *I, ElementCount VF) |
Return the vectorization cost for the given instruction I and vector width VF . | |
bool | isOptimizableIVTruncate (Instruction *I, ElementCount VF) |
Returns true if instruction I is an optimizable truncate whose operand is an induction variable. | |
void | collectInstsToScalarize (ElementCount VF) |
Collects the instructions to scalarize for each predicated instruction in the loop. | |
void | collectUniformsAndScalars (ElementCount VF) |
Collect Uniform and Scalar values for the given VF . | |
bool | isLegalMaskedStore (Type *DataType, Value *Ptr, Align Alignment) const |
Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr . | |
bool | isLegalMaskedLoad (Type *DataType, Value *Ptr, Align Alignment) const |
Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr . | |
bool | isLegalGatherOrScatter (Value *V, ElementCount VF=ElementCount::getFixed(1)) |
Returns true if the target machine can represent V as a masked gather or scatter operation. | |
bool | canVectorizeReductions (ElementCount VF) const |
Returns true if the target machine supports all of the reduction variables found for the given VF. | |
bool | isDivRemScalarWithPredication (InstructionCost ScalarCost, InstructionCost SafeDivisorCost) const |
Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem. | |
bool | isScalarWithPredication (Instruction *I, ElementCount VF) const |
Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available). | |
bool | isPredicatedInst (Instruction *I) const |
Returns true if I is an instruction that needs to be predicated at runtime. | |
std::pair< InstructionCost, InstructionCost > | getDivRemSpeculationCost (Instruction *I, ElementCount VF) const |
Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane. | |
bool | memoryInstructionCanBeWidened (Instruction *I, ElementCount VF) |
Returns true if I is a memory instruction with consecutive memory access that can be widened. | |
bool | interleavedAccessCanBeWidened (Instruction *I, ElementCount VF) |
Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles. | |
bool | isAccessInterleaved (Instruction *Instr) |
Check if Instr belongs to any interleaved access group. | |
const InterleaveGroup< Instruction > * | getInterleavedAccessGroup (Instruction *Instr) |
Get the interleaved access group that Instr belongs to. | |
bool | requiresScalarEpilogue (ElementCount VF) const |
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop. | |
bool | isScalarEpilogueAllowed () const |
Returns true if a scalar epilogue is allowed, i.e. it is not prohibited by optsize or a loop hint annotation. | |
TailFoldingStyle | getTailFoldingStyle (bool IVUpdateMayOverflow=true) const |
Returns the TailFoldingStyle that is best for the current loop. | |
bool | foldTailByMasking () const |
Returns true if all loop blocks should be masked to fold the tail loop. | |
bool | blockNeedsPredicationForAnyReason (BasicBlock *BB) const |
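Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated. | |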
const ReductionChainMap & | getInLoopReductionChains () const |
Return the chain of instructions representing an inloop reduction. | |
bool | isInLoopReduction (PHINode *Phi) const |
Returns true if the Phi is part of an inloop reduction. | |
InstructionCost | getVectorIntrinsicCost (CallInst *CI, ElementCount VF) const |
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF. | |
InstructionCost | getVectorCallCost (CallInst *CI, ElementCount VF, Function **Variant, bool *NeedsMask=nullptr) const |
Estimate cost of a call instruction CI if it were vectorized with factor VF. | |
bool | isMoreProfitable (const VectorizationFactor &A, const VectorizationFactor &B) const |
Returns true if the per-lane cost of VectorizationFactor A is lower than that of B. | |
void | invalidateCostModelingDecisions () |
Invalidates decisions already taken by the cost model. | |
std::optional< unsigned > | getVScaleForTuning () const |
Convenience function that returns the value of vscale_range iff vscale_range.min == vscale_range.max or otherwise returns the value returned by the corresponding TTI method. | |
Public Attributes | |
Loop * | TheLoop |
The loop that we evaluate. | |
PredicatedScalarEvolution & | PSE |
Predicated scalar evolution analysis. | |
LoopInfo * | LI |
Loop Info analysis. | |
LoopVectorizationLegality * | Legal |
Vectorization legality. | |
const TargetTransformInfo & | TTI |
Vector target information. | |
const TargetLibraryInfo * | TLI |
Target Library Info. | |
DemandedBits * | DB |
Demanded bits analysis. | |
AssumptionCache * | AC |
Assumption cache. | |
OptimizationRemarkEmitter * | ORE |
Interface to emit optimization remarks. | |
const Function * | TheFunction |
const LoopVectorizeHints * | Hints |
Loop Vectorize Hint. | |
InterleavedAccessInfo & | InterleaveInfo |
The interleave access information contains groups of interleaved accesses with the same stride and close to each other. | |
SmallPtrSet< const Value *, 16 > | ValuesToIgnore |
Values to ignore in the cost model. | |
SmallPtrSet< const Value *, 16 > | VecValuesToIgnore |
Values to ignore in the cost model when VF > 1. | |
SmallPtrSet< Type *, 16 > | ElementTypesInLoop |
All element types found in the loop. | |
SmallVector< VectorizationFactor, 8 > | ProfitableVFs |
Profitable vector factors. | |
LoopVectorizationCostModel - estimates the expected speedups due to vectorization.
In many cases vectorization is not profitable, for a number of reasons. This class mainly attempts to predict the expected speedup or slowdown due to the supported instruction set. It uses TargetTransformInfo to query the different backends for the cost of different operations.
Definition at line 1185 of file LoopVectorize.cpp.
using llvm::LoopVectorizationCostModel::ReductionChainMap = SmallMapVector<PHINode *, SmallVector<Instruction *, 4>, 4> |
A SmallMapVector to store the InLoop reduction op chains, mapping phi nodes to the chain of instructions representing the reductions.
Uses a MapVector to ensure deterministic iteration order.
Definition at line 1596 of file LoopVectorize.cpp.
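The point of the deterministic iteration order can be illustrated with a small standalone sketch. This is not LLVM's MapVector: OrderedMap is a hypothetical class, written only to show the idea of pairing an index map with a vector of entries so that iteration follows insertion order instead of hash or pointer order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insertion-ordered map (illustration only, not llvm::MapVector).
template <typename K, typename V> class OrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  // Returns the value for Key, appending a default-constructed one if absent.
  V &operator[](const K &Key) {
    auto Res = Index.insert({Key, Entries.size()});
    if (Res.second)
      Entries.emplace_back(Key, V());
    assert(Res.first->second < Entries.size() && "index out of sync");
    return Entries[Res.first->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  typename std::vector<std::pair<K, V>>::const_iterator begin() const {
    return Entries.begin();
  }
  typename std::vector<std::pair<K, V>>::const_iterator end() const {
    return Entries.end();
  }
  std::size_t size() const { return Entries.size(); }
};
```

With pointer keys such as PHINode *, a plain hash map could iterate in an order that changes between runs, which is the kind of nondeterminism an insertion-ordered container avoids.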
enum llvm::LoopVectorizationCostModel::InstWidening |
Decision that was taken during cost calculation for a memory instruction.
Enumerator | |
---|---|
CM_Unknown | |
CM_Widen | |
CM_Widen_Reverse | |
CM_Interleave | |
CM_GatherScatter | |
CM_Scalarize |
Definition at line 1353 of file LoopVectorize.cpp.
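LoopVectorizationCostModel::LoopVectorizationCostModel | ( | ScalarEpilogueLowering | SEL, |
Loop * | L, | ||
PredicatedScalarEvolution & | PSE, | ||
LoopInfo * | LI, | ||
LoopVectorizationLegality * | Legal, | ||
const TargetTransformInfo & | TTI, | ||
const TargetLibraryInfo * | TLI, | ||
DemandedBits * | DB, | ||
AssumptionCache * | AC, | ||
OptimizationRemarkEmitter * | ORE, | ||
const Function * | F, | ||
const LoopVectorizeHints * | Hints, | ||
InterleavedAccessInfo & | IAI | ||
) |
inline |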
Definition at line 1187 of file LoopVectorize.cpp.
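bool LoopVectorizationCostModel::blockNeedsPredicationForAnyReason | ( | BasicBlock * | BB | ) | const |
inline |
Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.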
Definition at line 1589 of file LoopVectorize.cpp.
References llvm::IRSimilarity::Legal.
SmallVector< LoopVectorizationCostModel::RegisterUsage, 8 > LoopVectorizationCostModel::calculateRegisterUsage | ( | ArrayRef< ElementCount > | VFs | ) |
Definition at line 5974 of file LoopVectorize.cpp.
References llvm::all_of(), llvm::LoopBlocksDFS::beginRPO(), llvm::SmallPtrSetImpl< PtrType >::count(), llvm::dbgs(), llvm::LoopBlocksDFS::endRPO(), llvm::SmallPtrSetImpl< PtrType >::erase(), llvm::VectorType::get(), llvm::ElementCount::getFixed(), llvm::TargetTransformInfo::getRegisterClassForType(), llvm::TargetTransformInfo::getRegisterClassName(), llvm::TargetTransformInfo::getRegUsageForType(), I, llvm::SetVector< T, Vector, Set >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::Type::isTokenTy(), llvm::VectorType::isValidElementType(), llvm::ElementCount::isVector(), llvm::InnerLoopVectorizer::LI, llvm::List, LLVM_DEBUG, llvm::LoopVectorizationCostModel::RegisterUsage::LoopInvariantRegs, llvm::make_range(), llvm::LoopVectorizationCostModel::RegisterUsage::MaxLocalUsers, llvm::LoopBlocksDFS::perform(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::RegUsage, llvm::ArrayRef< T >::size(), llvm::MapVector< KeyT, ValueT, MapType, VectorType >::size(), llvm::SmallPtrSetImplBase::size(), llvm::SmallVectorBase< Size_T >::size(), ToRemove, and llvm::InnerLoopVectorizer::VF.
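bool LoopVectorizationCostModel::canTruncateToMinimalBitwidth | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
inline |
Returns true if instruction I can be truncated to a smaller bitwidth for vectorization factor VF.
Definition at line 1346 of file LoopVectorize.cpp.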
References I, and llvm::ElementCount::isVector().
bool LoopVectorizationCostModel::canVectorizeReductions | ( | ElementCount | VF | ) | const |
inline |
Returns true if the target machine supports all of the reduction variables found for the given VF.
Definition at line 1489 of file LoopVectorize.cpp.
References llvm::all_of(), llvm::IRSimilarity::Legal, and Reduction.
void LoopVectorizationCostModel::collectElementTypesForWidening | ( | ) |
Collect all element types in the loop for which widening is needed.
Definition at line 5683 of file LoopVectorize.cpp.
References assert(), llvm::RecurrenceDescriptor::getOpcode(), llvm::RecurrenceDescriptor::getRecurrenceType(), I, llvm::IRSimilarity::Legal, llvm::TargetTransformInfo::preferInLoopReduction(), PreferInLoopReductions, and llvm::InnerLoopVectorizer::useOrderedReductions().
Referenced by llvm::LoopVectorizePass::processLoop(), and processLoopInVPlanNativePath().
void LoopVectorizationCostModel::collectInLoopReductions | ( | ) |
Split reductions into those that happen in the loop, and those that happen outside.
In-loop reductions are collected into InLoopReductionChains.
Definition at line 7431 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::SmallVectorBase< Size_T >::empty(), llvm::RecurrenceDescriptor::getOpcode(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::RecurrenceDescriptor::getReductionOpChain(), llvm::Value::getType(), I, llvm::IRSimilarity::Legal, LLVM_DEBUG, llvm::TargetTransformInfo::preferInLoopReduction(), PreferInLoopReductions, Reduction, and llvm::InnerLoopVectorizer::useOrderedReductions().
void LoopVectorizationCostModel::collectInstsToScalarize | ( | ElementCount | VF | ) |
Collects the instructions to scalarize for each predicated instruction in the loop.
Definition at line 6192 of file LoopVectorize.cpp.
References llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::begin(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::clear(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::contains(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), I, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isZero(), and llvm::InnerLoopVectorizer::VF.
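void LoopVectorizationCostModel::collectUniformsAndScalars | ( | ElementCount | VF | ) |
inline |
Collect Uniform and Scalar values for the given VF.
The sets depend on the cost model's decisions for Load/Store instructions, which may be vectorized as interleaved, gather-scatter or scalarized accesses.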
Definition at line 1448 of file LoopVectorize.cpp.
References llvm::ElementCount::isScalar().
void LoopVectorizationCostModel::collectValuesToIgnore | ( | ) |
Collect values we want to ignore in the cost model.
Definition at line 7401 of file LoopVectorize.cpp.
References llvm::InnerLoopVectorizer::AC, llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::SmallPtrSetImpl< PtrType >::begin(), llvm::CodeMetrics::collectEphemeralValues(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::SmallPtrSetImpl< PtrType >::end(), llvm::RecurrenceDescriptor::getCastInsts(), llvm::InductionDescriptor::getCastInsts(), I, llvm::SmallVectorImpl< T >::insert(), llvm::IRSimilarity::Legal, Reduction, and SI.
Referenced by llvm::LoopVectorizePass::processLoop().
FixedScalableVFPair LoopVectorizationCostModel::computeMaxVF | ( | ElementCount | UserVF, |
unsigned | UserIC | ||
) |
Definition at line 5080 of file LoopVectorize.cpp.
References llvm::ScalarEvolution::applyLoopGuards(), assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotAllowedLowTripLoop, llvm::CM_ScalarEpilogueNotAllowedOptSize, llvm::CM_ScalarEpilogueNotAllowedUsePredicate, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::dbgs(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::empty(), llvm::FixedScalableVFPair::FixedVF, llvm::ScalarEvolution::getAddExpr(), llvm::PredicatedScalarEvolution::getBackedgeTakenCount(), llvm::ScalarEvolution::getConstant(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::FixedScalableVFPair::getNone(), llvm::ScalarEvolution::getOne(), llvm::PredicatedScalarEvolution::getSE(), llvm::ScalarEvolution::getSmallConstantTripCount(), llvm::SCEV::getType(), llvm::ScalarEvolution::getURemExpr(), llvm::TargetTransformInfo::hasBranchDivergence(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isNonZero(), llvm::isPowerOf2_32(), llvm::ElementCount::isVector(), llvm::SCEV::isZero(), llvm::IRSimilarity::Legal, LLVM_DEBUG, llvm::InnerLoopVectorizer::ORE, llvm::InnerLoopVectorizer::PSE, llvm::reportVectorizationFailure(), llvm::FixedScalableVFPair::ScalableVF, and useMaskedInterleavedAccesses().
bool LoopVectorizationCostModel::foldTailByMasking | ( | ) | const |
inline |
Returns true if all loop blocks should be masked to fold the tail loop.
Definition at line 1582 of file LoopVectorize.cpp.
Referenced by llvm::InnerLoopVectorizer::completeLoopSkeleton(), llvm::InnerLoopVectorizer::fixReduction(), and llvm::InnerLoopVectorizer::getOrCreateVectorTripCount().
std::pair< InstructionCost, InstructionCost > LoopVectorizationCostModel::getDivRemSpeculationCost | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane.
First result is for scalarization (will be invalid for scalable vectors); second is for the safe-divisor strategy.
Definition at line 4497 of file LoopVectorize.cpp.
References assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, CostKind, llvm::TargetTransformInfo::getArithmeticInstrCost(), llvm::TargetTransformInfo::getCFInstrCost(), llvm::TargetTransformInfo::getCmpSelInstrCost(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::TargetTransformInfo::getOperandInfo(), getReciprocalPredBlockProb(), I, llvm::isSafeToSpeculativelyExecute(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::TargetTransformInfo::OperandValueInfo::Kind, llvm::IRSimilarity::Legal, llvm::TargetTransformInfo::OK_AnyValue, llvm::TargetTransformInfo::OK_UniformValue, Operands, llvm::TargetTransformInfo::TCK_RecipThroughput, llvm::ToVectorTy(), and llvm::InnerLoopVectorizer::VF.
const ReductionChainMap & LoopVectorizationCostModel::getInLoopReductionChains | ( | ) | const |
inline |
Return the chain of instructions representing an inloop reduction.
Definition at line 1600 of file LoopVectorize.cpp.
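const InterleaveGroup< Instruction > * LoopVectorizationCostModel::getInterleavedAccessGroup | ( | Instruction * | Instr | ) |
inline |
Get the interleaved access group that Instr belongs to.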
Definition at line 1547 of file LoopVectorize.cpp.
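const MapVector< Instruction *, uint64_t > & LoopVectorizationCostModel::getMinimalBitwidths | ( | ) | const |
inline |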
Definition at line 1285 of file LoopVectorize.cpp.
Referenced by llvm::InnerLoopVectorizer::truncateToMinimalBitwidths().
std::pair< unsigned, unsigned > LoopVectorizationCostModel::getSmallestAndWidestTypes | ( | ) |
Definition at line 5652 of file LoopVectorize.cpp.
References DL, llvm::MachineFunction::getDataLayout(), llvm::RecurrenceDescriptor::getMinWidthCastToRecurrenceTypeInBits(), llvm::MachineBasicBlock::getParent(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::Type::getScalarSizeInBits(), and llvm::IRSimilarity::Legal.
Referenced by determineVPlanVF().
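TailFoldingStyle LoopVectorizationCostModel::getTailFoldingStyle | ( | bool | IVUpdateMayOverflow = true | ) | const |
inline |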
Returns the TailFoldingStyle that is best for the current loop.
Definition at line 1571 of file LoopVectorize.cpp.
References ForceTailFoldingStyle, and llvm::TargetTransformInfo::getPreferredTailFoldingStyle().
Referenced by llvm::InnerLoopVectorizer::emitIterationCountCheck().
InstructionCost LoopVectorizationCostModel::getVectorCallCost | ( | CallInst * | CI, |
ElementCount | VF, | ||
Function ** | Variant, | ||
bool * | NeedsMask = nullptr |
||
) | const |
Estimate cost of a call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed. The flag NeedToScalarize shows if the call needs to be scalarized - i.e. either vector version isn't available, or is too expensive.
Definition at line 3458 of file LoopVectorize.cpp.
References llvm::CallBase::args(), llvm::InnerLoopVectorizer::Cost, CostKind, F, llvm::VFShape::get(), llvm::VectorType::get(), llvm::CallBase::getCalledFunction(), llvm::TargetTransformInfo::getCallInstrCost(), llvm::Type::getContext(), llvm::Function::getFunctionType(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::TargetTransformInfo::getShuffleCost(), llvm::Value::getType(), llvm::VFDatabase::getVectorizedFunction(), llvm::CallBase::isNoBuiltin(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), llvm::IRSimilarity::Legal, llvm::SmallVectorTemplateBase< T, bool >::push_back(), RetTy, llvm::TargetTransformInfo::SK_Broadcast, llvm::TargetTransformInfo::TCK_RecipThroughput, llvm::InnerLoopVectorizer::TLI, llvm::ToVectorTy(), and llvm::InnerLoopVectorizer::VF.
InstructionCost LoopVectorizationCostModel::getVectorIntrinsicCost | ( | CallInst * | CI, |
ElementCount | VF | ||
) | const |
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed.
Definition at line 3534 of file LoopVectorize.cpp.
References llvm::CallBase::args(), Arguments, assert(), llvm::CallBase::getCalledFunction(), llvm::Function::getFunctionType(), llvm::TargetTransformInfo::getIntrinsicInstrCost(), llvm::Value::getType(), llvm::getVectorIntrinsicIDForCall(), MaybeVectorizeType(), llvm::FunctionType::param_begin(), llvm::FunctionType::param_end(), RetTy, llvm::TargetTransformInfo::TCK_RecipThroughput, llvm::InnerLoopVectorizer::TLI, and llvm::InnerLoopVectorizer::VF.
std::optional< unsigned > LoopVectorizationCostModel::getVScaleForTuning | ( | ) | const |
Convenience function that returns the value of vscale_range iff vscale_range.min == vscale_range.max or otherwise returns the value returned by the corresponding TTI method.
Definition at line 5326 of file LoopVectorize.cpp.
References llvm::TargetTransformInfo::getVScaleForTuning().
Referenced by llvm::LoopVectorizePass::processLoop().
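The selection rule described for getVScaleForTuning can be sketched in standalone C++. This is a simplified model under stated assumptions, not the LLVM implementation: VScaleRange and pickVScaleForTuning are hypothetical names, the attribute is modelled as a minimum plus an optional maximum, and the target hook is reduced to an optional default value passed in by the caller.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model of a vscale_range attribute; Max is empty when unbounded.
struct VScaleRange {
  unsigned Min;
  std::optional<unsigned> Max;
};

// Returns the exact vscale when the range pins it to a single value,
// otherwise falls back to the target-provided tuning value (which may be
// absent as well).
std::optional<unsigned>
pickVScaleForTuning(const std::optional<VScaleRange> &Attr,
                    std::optional<unsigned> TargetDefault) {
  assert((!Attr || !Attr->Max || *Attr->Max >= Attr->Min) &&
         "range must be well-formed");
  if (Attr && Attr->Max && *Attr->Max == Attr->Min)
    return Attr->Min;
  return TargetDefault;
}
```

A range of (2, 2) yields 2 regardless of the target default, while a range of (1, 16) or a missing attribute yields whatever the target reports.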
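InstructionCost LoopVectorizationCostModel::getWideningCost | ( | Instruction * | I, |
ElementCount | VF | ||
) |
inline |
Return the vectorization cost for the given instruction I and vector width VF.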
Definition at line 1407 of file LoopVectorize.cpp.
References assert(), I, and llvm::ElementCount::isVector().
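InstWidening LoopVectorizationCostModel::getWideningDecision | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
inline |
Return the cost model decision for the given instruction I and vector width VF.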
Return CM_Unknown if this instruction did not pass through the cost modeling.
Definition at line 1391 of file LoopVectorize.cpp.
References assert(), llvm::EnableVPlanNativePath, I, and llvm::ElementCount::isVector().
bool LoopVectorizationCostModel::interleavedAccessCanBeWidened | ( | Instruction * | I, |
ElementCount | VF | ||
) |
Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.
Definition at line 4561 of file LoopVectorize.cpp.
References assert(), DL, llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), hasIrregularType(), I, llvm::TargetTransformInfo::isLegalMaskedLoad(), llvm::TargetTransformInfo::isLegalMaskedStore(), llvm::IRSimilarity::Legal, useMaskedInterleavedAccesses(), and llvm::InnerLoopVectorizer::VF.
void LoopVectorizationCostModel::invalidateCostModelingDecisions | ( | ) |
inline |
Invalidates decisions already taken by the cost model.
Definition at line 1629 of file LoopVectorize.cpp.
bool LoopVectorizationCostModel::isAccessInterleaved | ( | Instruction * | Instr | ) |
inline |
Check if Instr belongs to any interleaved access group.
Definition at line 1541 of file LoopVectorize.cpp.
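bool LoopVectorizationCostModel::isDivRemScalarWithPredication | ( | InstructionCost | ScalarCost, |
InstructionCost | SafeDivisorCost | ||
) | const |
inline |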
Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem.
This incorporates an override option so it is not simply a cost comparison.
Definition at line 1499 of file LoopVectorize.cpp.
References llvm::cl::BOU_FALSE, llvm::cl::BOU_TRUE, llvm::cl::BOU_UNSET, ForceSafeDivisor, and llvm_unreachable.
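How an override can sit in front of a plain cost comparison is sketched below. This is an illustration under assumptions, not the code at the cited line: Override and preferScalarPredication are hypothetical names, costs are reduced to unsigned integers, and the choice of a strict comparison when the override is unset is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical tri-state option, loosely modelled on a boolOrDefault flag.
enum class Override { Unset, ForceSafeDivisor, ForceScalar };

// Returns true if the scalar predication lowering should be used.
// An explicit override wins; only when it is unset are the costs compared.
bool preferScalarPredication(unsigned ScalarCost, unsigned SafeDivisorCost,
                             Override Opt) {
  switch (Opt) {
  case Override::ForceSafeDivisor:
    return false; // user asked for the safe-divisor strategy
  case Override::ForceScalar:
    return true; // user asked for scalarization
  case Override::Unset:
    break; // fall through to the cost comparison
  }
  // Assumption: ties go to the safe-divisor strategy.
  return ScalarCost < SafeDivisorCost;
}
```

The override makes the result independent of the two costs, which is why the function cannot be described as a simple comparison.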
bool LoopVectorizationCostModel::isInLoopReduction | ( | PHINode * | Phi | ) | const |
inline |
Returns true if the Phi is part of an inloop reduction.
Definition at line 1605 of file LoopVectorize.cpp.
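bool LoopVectorizationCostModel::isLegalGatherOrScatter | ( | Value * | V, |
ElementCount | VF = ElementCount::getFixed(1) | ||
) |
inline |
Returns true if the target machine can represent V as a masked gather or scatter operation.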
Definition at line 1473 of file LoopVectorize.cpp.
References llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), llvm::TargetTransformInfo::isLegalMaskedGather(), llvm::TargetTransformInfo::isLegalMaskedScatter(), and SI.
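bool LoopVectorizationCostModel::isLegalMaskedLoad | ( | Type * | DataType, |
Value * | Ptr, | ||
Align | Alignment | ||
) | const |
inline |
Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr.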
Definition at line 1466 of file LoopVectorize.cpp.
References llvm::TargetTransformInfo::isLegalMaskedLoad(), llvm::IRSimilarity::Legal, and Ptr.
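bool LoopVectorizationCostModel::isLegalMaskedStore | ( | Type * | DataType, |
Value * | Ptr, | ||
Align | Alignment | ||
) | const |
inline |
Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr.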
Definition at line 1459 of file LoopVectorize.cpp.
References llvm::TargetTransformInfo::isLegalMaskedStore(), llvm::IRSimilarity::Legal, and Ptr.
bool LoopVectorizationCostModel::isMoreProfitable | ( | const VectorizationFactor & | A, |
const VectorizationFactor & | B | ||
) | const |
Returns true if the per-lane cost of VectorizationFactor A is lower than that of B.
Definition at line 5338 of file LoopVectorize.cpp.
References A, B, llvm::divideCeil(), llvm::PredicatedScalarEvolution::getSE(), llvm::ScalarEvolution::getSmallConstantMaxTripCount(), and llvm::InnerLoopVectorizer::PSE.
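A per-lane cost comparison of this kind can be sketched in standalone C++. The sketch below is a simplification under assumptions: CandidateVF and isCheaperPerLane are hypothetical names, costs are plain integers, and the trip-count handling hinted at by the references to getSmallConstantMaxTripCount() and divideCeil() is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a vectorization factor: only the two fields
// needed for a per-lane comparison.
struct CandidateVF {
  std::uint64_t Width; // number of lanes (1 means scalar)
  std::uint64_t Cost;  // estimated cost of one loop iteration at this width
};

// Returns true if A's cost per lane is strictly lower than B's.
// Cost_A / Width_A < Cost_B / Width_B is rewritten as a cross-multiplication
// so that no precision is lost to integer division.
bool isCheaperPerLane(const CandidateVF &A, const CandidateVF &B) {
  assert(A.Width > 0 && B.Width > 0 && "widths must be non-zero");
  return A.Cost * B.Width < B.Cost * A.Width;
}
```

For example, a width-4 candidate costing 10 (2.5 per lane) beats a scalar candidate costing 4, while a width-2 candidate costing 8 only ties with it and is therefore not reported as more profitable.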
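bool LoopVectorizationCostModel::isOptimizableIVTruncate | ( | Instruction * | I, |
ElementCount | VF | ||
) |
inline |
Returns true if instruction I is an optimizable truncate whose operand is an induction variable.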
Such a truncate will be removed by adding a new induction variable with the destination type.
Definition at line 1418 of file LoopVectorize.cpp.
References I, llvm::TargetTransformInfo::isTruncateFree(), llvm::IRSimilarity::Legal, and llvm::ToVectorTy().
bool LoopVectorizationCostModel::isPredicatedInst | ( | Instruction * | I | ) | const |
Returns true if I is an instruction that needs to be predicated at runtime.
The result is independent of the predication mechanism. Superset of instructions that return true for isScalarWithPredication.
Definition at line 4454 of file LoopVectorize.cpp.
References I, llvm::isSafeToSpeculativelyExecute(), and llvm::IRSimilarity::Legal.
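bool LoopVectorizationCostModel::isProfitableToScalarize | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
inline |
Returns true if it is more profitable to scalarize instruction I for vectorization factor VF.
Definition at line 1291 of file LoopVectorize.cpp.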
References assert(), llvm::EnableVPlanNativePath, I, and llvm::ElementCount::isVector().
Referenced by createWidenInductionRecipes().
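bool LoopVectorizationCostModel::isScalarAfterVectorization | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
inline |
Returns true if I is known to be scalar after vectorization.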
Definition at line 1329 of file LoopVectorize.cpp.
References assert(), llvm::EnableVPlanNativePath, I, and llvm::ElementCount::isScalar().
Referenced by createWidenInductionRecipes().
bool LoopVectorizationCostModel::isScalarEpilogueAllowed | ( | ) | const |
inline |
Returns true if a scalar epilogue is allowed, i.e. it is not prohibited by optsize or a loop hint annotation.
Definition at line 1565 of file LoopVectorize.cpp.
References llvm::CM_ScalarEpilogueAllowed.
Referenced by llvm::InnerLoopVectorizer::vectorizeInterleaveGroup().
bool LoopVectorizationCostModel::isScalarWithPredication | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available). VF is the vectorization factor that will be used to vectorize I.
Definition at line 4418 of file LoopVectorize.cpp.
References llvm::VectorType::get(), llvm::getLoadStoreAlignment(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), I, llvm::TargetTransformInfo::isLegalMaskedGather(), llvm::TargetTransformInfo::isLegalMaskedScatter(), llvm::ElementCount::isVector(), Ptr, and llvm::InnerLoopVectorizer::VF.
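bool LoopVectorizationCostModel::isUniformAfterVectorization | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
inline |
Returns true if I is known to be uniform after vectorization.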
Definition at line 1307 of file LoopVectorize.cpp.
References assert(), llvm::EnableVPlanNativePath, I, and llvm::ElementCount::isScalar().
bool LoopVectorizationCostModel::memoryInstructionCanBeWidened | ( | Instruction * | I, |
ElementCount | VF | ||
) |
Returns true if I is a memory instruction with consecutive memory access that can be widened.
Definition at line 4630 of file LoopVectorize.cpp.
References assert(), DL, llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), hasIrregularType(), I, llvm::IRSimilarity::Legal, Ptr, and llvm::InnerLoopVectorizer::VF.
bool LoopVectorizationCostModel::requiresScalarEpilogue | ( | ElementCount | VF | ) | const |
inline |
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.
Definition at line 1553 of file LoopVectorize.cpp.
References llvm::ElementCount::isVector().
Referenced by llvm::InnerLoopVectorizer::completeLoopSkeleton(), llvm::EpilogueVectorizerEpilogueLoop::createEpilogueVectorizedLoopSkeleton(), llvm::InnerLoopVectorizer::createVectorLoopSkeleton(), llvm::InnerLoopVectorizer::emitIterationCountCheck(), llvm::EpilogueVectorizerMainLoop::emitIterationCountCheck(), llvm::EpilogueVectorizerEpilogueLoop::emitMinimumVectorEpilogueIterCountCheck(), llvm::InnerLoopVectorizer::emitSCEVChecks(), llvm::InnerLoopVectorizer::fixFixedOrderRecurrence(), llvm::InnerLoopVectorizer::fixReduction(), llvm::InnerLoopVectorizer::fixVectorizedLoop(), and llvm::InnerLoopVectorizer::getOrCreateVectorTripCount().
bool LoopVectorizationCostModel::runtimeChecksRequired | ( | ) |
Definition at line 4880 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::PredicatedScalarEvolution::getPredicate(), llvm::SCEVPredicate::isAlwaysTrue(), llvm::IRSimilarity::Legal, LLVM_DEBUG, llvm::InnerLoopVectorizer::ORE, llvm::InnerLoopVectorizer::PSE, and llvm::reportVectorizationFailure().
VectorizationFactor LoopVectorizationCostModel::selectEpilogueVectorizationFactor | ( | const ElementCount | MaxVF, |
const LoopVectorizationPlanner & | LVP | ||
) |
Definition at line 5576 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::VectorizationFactor::Disabled(), EnableEpilogueVectorization, EpilogueVectorizationForceVF, llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::LoopVectorizationPlanner::hasPlanWithVF(), llvm::details::FixedOrScalableQuantity< ElementCount, unsigned >::isKnownLT(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), and LLVM_DEBUG.
Referenced by llvm::LoopVectorizePass::processLoop().
unsigned LoopVectorizationCostModel::selectInterleaveCount | ( | ElementCount | VF, |
InstructionCost | LoopCost | ||
) |
Definition at line 5727 of file LoopVectorize.cpp.
References llvm::any_of(), assert(), llvm::bit_floor(), llvm::dbgs(), llvm::TargetTransformInfo::enableAggressiveInterleaving(), EnableIndVarRegisterHeur, EnableLoadStoreRuntimeInterleave, F, ForceTargetMaxScalarInterleaveFactor, ForceTargetMaxVectorInterleaveFactor, ForceTargetNumScalarRegs, ForceTargetNumVectorRegs, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::TargetTransformInfo::getMaxInterleaveFactor(), llvm::TargetTransformInfo::getNumberOfRegisters(), llvm::TargetTransformInfo::getRegisterClassName(), llvm::PredicatedScalarEvolution::getSE(), getSmallBestKnownTC(), llvm::InstructionCost::getValue(), InterleaveSmallLoopScalarReduction, llvm::ElementCount::isScalar(), llvm::InstructionCost::isValid(), llvm::ElementCount::isVector(), llvm::IRSimilarity::Legal, LLVM_DEBUG, MaxNestedScalarReductionIC, llvm::InnerLoopVectorizer::PSE, Reduction, SmallLoopCost, TinyTripCountInterleaveThreshold, and llvm::InnerLoopVectorizer::VF.
Referenced by llvm::LoopVectorizePass::processLoop().
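bool LoopVectorizationCostModel::selectUserVectorizationFactor | ( | ElementCount | UserVF | ) |
inline |
Set up cost-based decisions for the user vectorization factor.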
Definition at line 1222 of file LoopVectorize.cpp.
VectorizationFactor LoopVectorizationCostModel::selectVectorizationFactor | ( | const ElementCountSet & | CandidateVFs | ) |
Returns the most profitable vectorization factor, together with its cost, among the candidates in CandidateVFs. If UserVF is not zero then this vectorization factor will be selected if vectorization is possible.
Definition at line 5445 of file LoopVectorize.cpp.
References assert(), llvm::CallingConv::C, llvm::VectorizationFactor::Cost, llvm::SmallSet< T, N, C >::count(), llvm::dbgs(), emitInvalidCostRemarks(), EnableCondStoresVectorization, llvm::LoopVectorizeHints::FK_Enabled, llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::InstructionCost::getMax(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), llvm::InstructionCost::isValid(), LLVM_DEBUG, llvm::InnerLoopVectorizer::ORE, llvm::reportVectorizationFailure(), llvm::VectorizationFactor::ScalarCost, llvm::SmallSet< T, N, C >::size(), and llvm::VectorizationFactor::Width.
void LoopVectorizationCostModel::setCostBasedWideningDecision | ( | ElementCount | VF | ) |
Memory access instruction may be vectorized in more than one way.
The form of the instruction after vectorization depends on cost. This function makes cost-based decisions for Load/Store instructions and collects them in a map. The decision map is then used to build the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved together with the widening decision to avoid redundant calculations.
Definition at line 6880 of file LoopVectorize.cpp.
References llvm::append_range(), assert(), llvm::InnerLoopVectorizer::Cost, llvm::SmallVectorBase< Size_T >::empty(), llvm::ElementCount::getFixed(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), I, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), llvm::IRSimilarity::Legal, llvm::SmallVectorImpl< T >::pop_back_val(), llvm::TargetTransformInfo::prefersVectorizedAddressing(), Ptr, llvm::SmallVectorTemplateBase< T, bool >::push_back(), SI, and llvm::InnerLoopVectorizer::VF.
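The idea of storing the decision and its cost together can be sketched as a map keyed by (instruction, VF). This is a simplified model and not the LLVM data structure: `DecisionCache` is a hypothetical name, instructions are represented by integer ids, costs are plain integers, and the enum only mirrors the `InstWidening` values listed for this class.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Mirrors the InstWidening enumerators; the type itself is a stand-in.
enum class Widening {
  Unknown, Widen, WidenReverse, Interleave, GatherScatter, Scalarize
};

// Hypothetical cache: one (decision, cost) entry per (instruction id, VF).
struct DecisionCache {
  using Key = std::pair<int, unsigned>;
  std::map<Key, std::pair<Widening, unsigned>> Decisions;

  // Record the decision and the cost that was computed for it.
  void set(int Inst, unsigned VF, Widening W, unsigned Cost) {
    Decisions[Key{Inst, VF}] = std::make_pair(W, Cost);
  }

  // Look up a decision; Unknown means nothing was recorded for this pair.
  Widening decision(int Inst, unsigned VF) const {
    auto It = Decisions.find(Key{Inst, VF});
    return It == Decisions.end() ? Widening::Unknown : It->second.first;
  }

  // Return the saved cost instead of recomputing it.
  // Precondition: a decision was recorded for (Inst, VF).
  unsigned cost(int Inst, unsigned VF) const {
    return Decisions.at(Key{Inst, VF}).second;
  }
};
```

Keying on the VF as well as the instruction lets the same load or store carry different decisions at different vector widths.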
|
inline |
Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF.
Broadcast this decision to all instructions inside the group, but assign the cost to one instruction only.
Definition at line 1372 of file LoopVectorize.cpp.
References assert(), Cost, llvm::InterleaveGroup< InstTy >::getFactor(), llvm::InterleaveGroup< InstTy >::getInsertPos(), llvm::InterleaveGroup< InstTy >::getMember(), I, and llvm::ElementCount::isVector().
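A minimal sketch of the broadcast described above, under stated assumptions: `InterleaveGroupSketch`, `DecisionMap`, `setGroupDecision`, and `totalCost` are hypothetical names, members are integer ids, and giving the non-insert-position members a cost of zero is an assumption chosen so that the group cost is counted exactly once when costs are summed.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

enum class Widening { Unknown, Widen, Interleave };

// Hypothetical stand-in for an interleave group: member ids plus the id of
// the member at the insert position.
struct InterleaveGroupSketch {
  std::vector<int> Members;
  int InsertPos;
};

using DecisionKey = std::pair<int, unsigned>; // (instruction id, VF)
using DecisionMap = std::map<DecisionKey, std::pair<Widening, unsigned>>;

// Every member receives the same decision W; only the insert-position
// member carries the group's cost.
inline void setGroupDecision(DecisionMap &Map, const InterleaveGroupSketch &Grp,
                             unsigned VF, Widening W, unsigned Cost) {
  assert(VF > 1 && "expected a vector width");
  for (int Member : Grp.Members) {
    unsigned MemberCost = (Member == Grp.InsertPos) ? Cost : 0;
    Map[DecisionKey{Member, VF}] = std::make_pair(W, MemberCost);
  }
}

// Sums the recorded costs for one VF; the group contributes its cost once.
inline unsigned totalCost(const DecisionMap &Map, unsigned VF) {
  unsigned Sum = 0;
  for (const auto &Entry : Map)
    if (Entry.first.second == VF)
      Sum += Entry.second.second;
  return Sum;
}
```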
|
inline |
Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF.
Definition at line 1364 of file LoopVectorize.cpp.
References assert(), Cost, I, and llvm::ElementCount::isVector().
|
inline |
Returns true if we should use strict in-order reductions for the given RdxDesc.
This is true if the -enable-strict-reductions flag is passed, the IsOrdered flag of RdxDesc is set and we do not allow reordering of FP operations.
Definition at line 1278 of file LoopVectorize.cpp.
References llvm::RecurrenceDescriptor::isOrdered().
Referenced by llvm::InnerLoopVectorizer::useOrderedReductions().
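The three conditions in the description combine as a simple conjunction. The sketch below only restates that description: `StrictReductionOptions`, `ReductionSketch`, and `shouldUseOrderedReduction` are hypothetical names and do not correspond to LLVM types or to the actual member function.

```cpp
#include <cassert>

// Hypothetical stand-ins for the inputs named in the description.
struct StrictReductionOptions {
  bool EnableStrictReductions; // models the -enable-strict-reductions flag
  bool AllowFPReordering;      // models permission to reorder FP operations
};

struct ReductionSketch {
  bool IsOrdered; // models the IsOrdered flag of a reduction descriptor
};

// An in-order reduction is used only when the flag is set, the reduction is
// marked ordered, and FP reordering is not allowed.
inline bool shouldUseOrderedReduction(const StrictReductionOptions &Opts,
                                      const ReductionSketch &Rdx) {
  return Opts.EnableStrictReductions && Rdx.IsOrdered &&
         !Opts.AllowFPReordering;
}
```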
AssumptionCache* llvm::LoopVectorizationCostModel::AC |
Assumption cache.
Definition at line 1877 of file LoopVectorize.cpp.
DemandedBits* llvm::LoopVectorizationCostModel::DB |
Demanded bits analysis.
Definition at line 1874 of file LoopVectorize.cpp.
SmallPtrSet<Type *, 16> llvm::LoopVectorizationCostModel::ElementTypesInLoop |
All element types found in the loop.
Definition at line 1898 of file LoopVectorize.cpp.
const LoopVectorizeHints* llvm::LoopVectorizationCostModel::Hints |
Loop Vectorize Hint.
Definition at line 1885 of file LoopVectorize.cpp.
Referenced by llvm::InnerLoopVectorizer::emitMemRuntimeChecks(), and llvm::InnerLoopVectorizer::emitSCEVChecks().
InterleavedAccessInfo& llvm::LoopVectorizationCostModel::InterleaveInfo |
The interleave access information contains groups of interleaved accesses with the same stride and close to each other.
Definition at line 1889 of file LoopVectorize.cpp.
LoopVectorizationLegality* llvm::LoopVectorizationCostModel::Legal |
Vectorization legality.
Definition at line 1865 of file LoopVectorize.cpp.
Referenced by isIndvarOverflowCheckKnownFalse().
LoopInfo* llvm::LoopVectorizationCostModel::LI |
Loop Info analysis.
Definition at line 1862 of file LoopVectorize.cpp.
OptimizationRemarkEmitter* llvm::LoopVectorizationCostModel::ORE |
Interface to emit optimization remarks.
Definition at line 1880 of file LoopVectorize.cpp.
SmallVector<VectorizationFactor, 8> llvm::LoopVectorizationCostModel::ProfitableVFs |
Profitable vector factors.
Definition at line 1901 of file LoopVectorize.cpp.
PredicatedScalarEvolution& llvm::LoopVectorizationCostModel::PSE |
Predicated scalar evolution analysis.
Definition at line 1859 of file LoopVectorize.cpp.
Referenced by isIndvarOverflowCheckKnownFalse().
Definition at line 1882 of file LoopVectorize.cpp.
Referenced by isIndvarOverflowCheckKnownFalse().
Loop* llvm::LoopVectorizationCostModel::TheLoop |
The loop that we evaluate.
Definition at line 1856 of file LoopVectorize.cpp.
Referenced by isIndvarOverflowCheckKnownFalse().
const TargetLibraryInfo* llvm::LoopVectorizationCostModel::TLI |
Target Library Info.
Definition at line 1871 of file LoopVectorize.cpp.
const TargetTransformInfo& llvm::LoopVectorizationCostModel::TTI |
Vector target information.
Definition at line 1868 of file LoopVectorize.cpp.
Referenced by isIndvarOverflowCheckKnownFalse().
SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::ValuesToIgnore |
Values to ignore in the cost model.
Definition at line 1892 of file LoopVectorize.cpp.
SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::VecValuesToIgnore |
Values to ignore in the cost model when VF > 1.
Definition at line 1895 of file LoopVectorize.cpp.