static cl::opt<bool> StressCalls(
    "amdgpu-stress-function-calls",
    cl::Hidden,
    cl::desc("Force all functions to be noinline"),
    cl::init(false));

class AMDGPUAlwaysInline : public ModulePass {
  bool GlobalOpt;

public:
  static char ID;
  AMDGPUAlwaysInline(bool GlobalOpt = false)
      : ModulePass(ID), GlobalOpt(GlobalOpt) {}
  bool runOnModule(Module &M) override;
};

INITIALIZE_PASS(AMDGPUAlwaysInline, "amdgpu-always-inline",
                "AMDGPU Inline All Functions", false, false)

char AMDGPUAlwaysInline::ID = 0;

static void recursivelyVisitUsers(GlobalValue &GV,
                                  SmallPtrSetImpl<Function *> &FuncsToAlwaysInline) {
  SmallVector<User *, 16> Stack(GV.users());
  SmallPtrSet<const Value *, 8> Visited;

  while (!Stack.empty()) {
    User *U = Stack.pop_back_val();
    if (!Visited.insert(U).second)
      continue;
    if (Instruction *I = dyn_cast<Instruction>(U)) {
      Function *F = I->getParent()->getParent();
      if (!AMDGPU::isEntryFunctionCC(F->getCallingConv())) {
        F->removeFnAttr(Attribute::NoInline);
        FuncsToAlwaysInline.insert(F);
        Stack.push_back(F);
      }
      continue;
    }
    append_range(Stack, U->users());
  }
}

static bool alwaysInlineImpl(Module &M, bool GlobalOpt) {
  std::vector<GlobalAlias *> AliasesToRemove;

  SmallPtrSet<Function *, 8> FuncsToAlwaysInline;
  SmallPtrSet<Function *, 8> FuncsToNoInline;

  for (GlobalAlias &A : M.aliases()) {
    if (Function *F = dyn_cast<Function>(A.getAliasee())) {
      A.replaceAllUsesWith(F);
      AliasesToRemove.push_back(&A);
    }
  }

  if (GlobalOpt) {
    for (GlobalAlias *A : AliasesToRemove)
      A->eraseFromParent();
  }

  // Force inlining of any non-entry function that uses an LDS (local) or
  // GDS (region) global, since those address spaces cannot cross a call.
  for (GlobalVariable &GV : M.globals()) {
    unsigned AS = GV.getAddressSpace();
    if (AS == AMDGPUAS::REGION_ADDRESS ||
        (AS == AMDGPUAS::LOCAL_ADDRESS &&
         !AMDGPUTargetMachine::EnableLowerModuleLDS))
      recursivelyVisitUsers(GV, FuncsToAlwaysInline);
  }

  if (!AMDGPUTargetMachine::EnableFunctionCalls || StressCalls) {
    auto IncompatAttr
        = StressCalls ? Attribute::AlwaysInline : Attribute::NoInline;
    for (Function &F : M) {
      if (!F.isDeclaration() && !F.use_empty() &&
          !F.hasFnAttribute(IncompatAttr)) {
        if (StressCalls) {
          if (!FuncsToAlwaysInline.count(&F))
            FuncsToNoInline.insert(&F);
        } else
          FuncsToAlwaysInline.insert(&F);
      }
    }
  }

  for (Function *F : FuncsToAlwaysInline)
    F->addFnAttr(Attribute::AlwaysInline);
  for (Function *F : FuncsToNoInline)
    F->addFnAttr(Attribute::NoInline);

  return !FuncsToAlwaysInline.empty() || !FuncsToNoInline.empty();
}

bool AMDGPUAlwaysInline::runOnModule(Module &M) {
  return alwaysInlineImpl(M, GlobalOpt);
}

ModulePass *createAMDGPUAlwaysInlinePass(bool GlobalOpt) {
  return new AMDGPUAlwaysInline(GlobalOpt);
}