#define DEBUG_TYPE "synthetic-counts-propagation"
// ...
    cl::desc("Initial value of synthetic entry count"));
// ...
    cl::desc("Initial synthetic entry count for inline functions."));
// ...
    cl::desc("Initial synthetic entry count for cold functions."));
// ...
  auto MayHaveIndirectCalls = [](Function &F) {
    for (auto *U : F.users()) {
      if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
// ...
    if (F.isDeclaration())
// ...
    if (F.hasFnAttribute(Attribute::AlwaysInline) ||
        F.hasFnAttribute(Attribute::InlineHint)) {
// ...
    } else if (F.hasLocalLinkage() && !MayHaveIndirectCalls(F)) {
// ...
               F.hasFnAttribute(Attribute::NoInline)) {
// ...
    SetCount(&F, InitialCount);
// ...
    CallBase &CB = *cast<CallBase>(*Edge.first);
// ...
    Scaled64 BBCount(BFI.getBlockFreq(CSBB).getFrequency(), 0);
    BBCount /= EntryFreq;
    BBCount *= Counts[Caller];
// ...
    auto F = N->getFunction();
    if (!F || F->isDeclaration())
// ...
  for (auto Entry : Counts) {
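The call-site scaling in the listing (`BBCount /= EntryFreq; BBCount *= Counts[Caller];`) can be illustrated with a small standalone sketch. It assumes the intent is "block frequency relative to the entry block, multiplied by the caller's entry count". The helper name `estimateCallSiteCount` is hypothetical and not part of LLVM. It uses `double` where the pass uses `Scaled64`, and the zero-frequency guard is an addition for the sketch.

```cpp
#include <cstdint>

// Hypothetical helper, not part of LLVM: estimate how often a call site
// executes from the frequency of its basic block, the frequency of the
// function's entry block, and the caller's entry count.
inline double estimateCallSiteCount(std::uint64_t blockFreq,
                                    std::uint64_t entryFreq,
                                    std::uint64_t callerCount) {
  if (entryFreq == 0)
    return 0.0; // Guard added for this sketch only.
  // Relative frequency of the call site's block with respect to the entry.
  double relFreq =
      static_cast<double>(blockFreq) / static_cast<double>(entryFreq);
  // Scale by how often the caller itself is entered.
  return relFreq * static_cast<double>(callerCount);
}
```

For example, a block that runs twice as often as the entry block, in a caller with an entry count of 10, yields an estimated call-site count of 20.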
A set of analyses that are preserved following a run of a transformation pass.
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
std::pair< Optional< WeakTrackingVH >, CallGraphNode * > CallRecord
A pair of the calling instruction (a call or invoke) and the call graph node being called.
The basic data container for the call graph of a Module of IR.
FunctionAnalysisManager FAM
PreservedAnalyses run(Module &M, ModuleAnalysisManager &MAM)
static void propagate(const CallGraphType &CG, GetProfCountTy GetProfCount, AddCountTy AddCount)
Propagate synthetic entry counts on a callgraph CG.
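As a rough illustration of what propagating entry counts over a call graph means, the following simplified sketch adds each caller's scaled count to its callees. It is not the LLVM implementation: it assumes an acyclic graph whose node indices are already in topological order (callers before callees), so it ignores recursion entirely. The names `CallEdge` and `propagateAcyclic` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical edge type: a call from `caller` to `callee`, executed
// `relFreq` times per entry into the caller.
struct CallEdge {
  std::size_t caller;
  std::size_t callee;
  double relFreq;
};

// Simplified sketch: node indices are assumed to be in topological order
// (every caller has a smaller index than its callees). Cycles are not
// handled.
inline std::vector<double> propagateAcyclic(std::vector<double> counts,
                                            const std::vector<CallEdge> &edges) {
  for (std::size_t node = 0; node < counts.size(); ++node) {
    // By the ordering assumption, counts[node] is final at this point.
    for (const CallEdge &e : edges) {
      if (e.caller == node && e.callee < counts.size())
        counts[e.callee] += counts[node] * e.relFreq;
    }
  }
  return counts;
}
```

The nested loop keeps the sketch short at the cost of efficiency. An adjacency list per caller would avoid rescanning all edges for each node.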
LLVM Basic Block Representation.
static cl::opt< int > ColdSyntheticCount("cold-synthetic-count", cl::Hidden, cl::init(5), cl::ZeroOrMore, cl::desc("Initial synthetic entry count for cold functions."))
Initial synthetic count assigned to cold functions.
ModuleAnalysisManager MAM
cl::opt< int > InitialSyntheticCount
A node in the call graph for a module.
Analysis pass which computes BlockFrequencyInfo.
Function * getCaller()
Helper to get the caller (the parent function).
An efficient, type-erasing, non-owning reference to a callable.
Function::ProfileCount ProfileCount
static cl::opt< int > InlineSyntheticCount("inline-synthetic-count", cl::Hidden, cl::init(15), cl::ZeroOrMore, cl::desc("Initial synthetic entry count for inline functions."))
Initial synthetic count assigned to inline functions.
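The way the attribute checks in the listing select a starting count can be sketched without LLVM as follows. This is a sketch under several assumptions, not the pass's code. The listing elides the branch bodies, so the sketch assumes that local functions without possible indirect calls start at zero and that cold or noinline functions receive the cold count. The default of 10 for the general initial count is also an assumption; only the defaults of 15 (inline) and 5 (cold) appear in the option declarations. `FunctionTraits`, `CountOptions`, and `pickInitialCount` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical summary of the function properties the listing inspects.
struct FunctionTraits {
  bool isDeclaration;
  bool alwaysInline;
  bool inlineHint;
  bool localLinkage;
  bool mayHaveIndirectCalls;
  bool cold;
  bool noInline;
};

// Hypothetical option bundle; 15 and 5 match the declared defaults,
// 10 is an assumed default for the general case.
struct CountOptions {
  std::uint64_t initialCount = 10;
  std::uint64_t inlineCount = 15;
  std::uint64_t coldCount = 5;
};

// Returns false for declarations (no count is set); otherwise writes the
// chosen starting count to `out` and returns true.
inline bool pickInitialCount(const FunctionTraits &f, const CountOptions &opts,
                             std::uint64_t &out) {
  if (f.isDeclaration)
    return false;
  if (f.alwaysInline || f.inlineHint) {
    out = opts.inlineCount;
  } else if (f.localLinkage && !f.mayHaveIndirectCalls) {
    out = 0; // Assumed: such functions get counts only via propagation.
  } else if (f.cold || f.noInline) {
    out = opts.coldCount; // Assumed pairing of cold and noinline.
  } else {
    out = opts.initialCount;
  }
  return true;
}
```

The branch order mirrors the order of the checks in the listing, so an inline hint takes precedence over local linkage, which in turn takes precedence over the noinline check.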
initializer< Ty > init(const Ty &Val)
A Module instance is used to store all the information related to an LLVM module.
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
ScaledNumber< uint64_t > Scaled64
static void initializeCounts(Module &M, function_ref< void(Function *, uint64_t)> SetCount)
const BasicBlock * getParent() const
Base class for all callable instructions (InvokeInst and CallInst) Holds everything related to callin...
A container for analyses that lazily runs them and caches their results.
An analysis over an "outer" IR unit that provides access to an analysis manager over an "inner" IR un...
Class to represent profile counts.