LLVM 22.0.0git
llvm::OpenMPIRBuilder Class Reference

An interface to create LLVM-IR for OpenMP directives. More...

#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"

Classes

class  AtomicInfo
struct  AtomicOpValue
 a struct to pack relevant information while generating atomic Ops More...
struct  CopyOptionsTy
struct  DependData
 A struct to pack the relevant information for an OpenMP depend clause. More...
struct  FinalizationInfo
struct  LocationDescription
 Description of a LLVM-IR insertion point (IP) and a debug/source location (filename, line, column, ...). More...
struct  MapInfosTy
 This structure contains combined information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers, and non-contiguous information. More...
struct  MapperAllocas
struct  OutlineInfo
 Helper that contains information about regions we need to outline during finalization. More...
struct  ReductionInfo
 Information about an OpenMP reduction. More...
class  TargetDataInfo
 Struct that keeps the information that should be kept throughout a 'target data' region. More...
struct  TargetDataRTArgs
 Container for the arguments used to pass data to the runtime library. More...
struct  TargetKernelArgs
 Data structure that contains the needed information to construct the kernel args vector. More...
struct  TargetKernelDefaultAttrs
 Container to pass the default attributes with which a kernel must be launched, used to set kernel attributes and populate associated static structures. More...
struct  TargetKernelRuntimeAttrs
 Container to pass LLVM IR runtime values or constants related to the number of teams and threads with which the kernel must be launched, as well as the trip count of the loop, if it is an SPMD or Generic-SPMD kernel. More...

Public Types

enum class  ReductionGenCBKind { Clang , MLIR }
 Enum class for the RedctionGen CallBack type to be used. More...
enum class  EvalKind { Scalar , Complex , Aggregate }
 Enum class for reduction evaluation types scalar, complex and aggregate. More...
enum class  CopyAction : unsigned { RemoteLaneToThread , ThreadCopy }
enum class  DeviceInfoTy { None , Pointer , Address }
enum  EmitMetadataErrorKind { EMIT_MD_TARGET_REGION_ERROR , EMIT_MD_DECLARE_TARGET_ERROR , EMIT_MD_GLOBAL_VAR_LINK_ERROR , EMIT_MD_GLOBAL_VAR_INDIRECT_ERROR }
 The kind of errors that can occur when emitting the offload entries and metadata. More...
enum  BodyGenTy { Priv , DupNoPriv , NoPriv }
 Type of BodyGen to use for region codegen. More...
using InsertPointTy = IRBuilder<>::InsertPoint
 Type used throughout for insertion points.
using InsertPointOrErrorTy = Expected<InsertPointTy>
 Type used to represent an insertion point or an error value.
using FinalizeCallbackTy = std::function<Error(InsertPointTy CodeGenIP)>
 Callback type for variable finalization (think destructors).
using BodyGenCallbackTy
 Callback type for body (=inner region) code generation.
using TaskDupCallbackTy
 Callback type for task duplication function code generation.
using StorableBodyGenCallbackTy
using LoopBodyGenCallbackTy
 Callback type for loop body code generation.
using PrivatizeCallbackTy
 Callback type for variable privatization (think copy & default constructor).
using FileIdentifierInfoCallbackTy
using ReductionGenClangCBTy
 ReductionGen CallBack for Clang.
using ReductionGenCBTy
 ReductionGen CallBack for MLIR.
using ReductionGenAtomicCBTy
 Functions used to generate atomic reductions.
using ReductionGenDataPtrPtrCBTy
using MapValuesArrayTy = SmallVector<Value *, 4>
using MapDeviceInfoArrayTy = SmallVector<DeviceInfoTy, 4>
using MapFlagsArrayTy = SmallVector<omp::OpenMPOffloadMappingFlags, 4>
using MapNamesArrayTy = SmallVector<Constant *, 4>
using MapDimArrayTy = SmallVector<uint64_t, 4>
using MapNonContiguousArrayTy = SmallVector<MapValuesArrayTy, 4>
using MapInfosOrErrorTy = Expected<MapInfosTy &>
using EmitFallbackCallbackTy
 Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails.
using CustomMapperCallbackTy
using TargetTaskBodyCallbackTy
 Callback type for generating the bodies of device directives that require outer target tasks (e.g.
using EmitMetadataErrorReportFunctionTy
 Callback function type.
using FunctionGenCallback
 Functions used to generate a function with the given name.
using GenMapInfoCallbackTy
 Callback type for creating the map infos for the kernel parameters.
using TargetBodyGenCallbackTy
using TargetGenArgAccessorsCallbackTy

Public Member Functions

 OpenMPIRBuilder (Module &M)
 Create a new OpenMPIRBuilder operating on the given module M.
LLVM_ABI ~OpenMPIRBuilder ()
LLVM_ABI void initialize ()
 Initialize the internal state, this will put structures types and potentially other helpers into the underlying module.
void setConfig (OpenMPIRBuilderConfig C)
LLVM_ABI void finalize (Function *Fn=nullptr)
 Finalize the underlying module, e.g., by outlining regions.
LLVM_ABI bool isFinalized ()
 Check whether the finalize function has already run.
LLVM_ABI void addAttributes (omp::RuntimeFunction FnID, Function &Fn)
 Add attributes known for FnID to Fn.
LLVM_ABI std::string createPlatformSpecificName (ArrayRef< StringRef > Parts) const
 Get the create a name using the platform specific separators.
void pushFinalizationCB (const FinalizationInfo &FI)
 Push a finalization callback on the finalization stack.
void popFinalizationCB ()
 Pop the last finalization callback from the finalization stack.
LLVM_ABI InsertPointOrErrorTy createBarrier (const LocationDescription &Loc, omp::Directive Kind, bool ForceSimpleCall=false, bool CheckCancelFlag=true)
 Emitter methods for OpenMP directives.
LLVM_ABI InsertPointOrErrorTy createCancel (const LocationDescription &Loc, Value *IfCondition, omp::Directive CanceledDirective)
 Generator for 'omp cancel'.
LLVM_ABI InsertPointOrErrorTy createCancellationPoint (const LocationDescription &Loc, omp::Directive CanceledDirective)
 Generator for 'omp cancellation point'.
LLVM_ABI Expected< ScanInfo * > scanInfoInitialize ()
 Creates a ScanInfo object, allocates and returns the pointer.
LLVM_ABI InsertPointOrErrorTy createParallel (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, Value *IfCondition, Value *NumThreads, omp::ProcBindKind ProcBind, bool IsCancellable)
 Generator for 'omp parallel'.
LLVM_ABI Expected< CanonicalLoopInfo * > createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *TripCount, const Twine &Name="loop")
 Generator for the control flow structure of an OpenMP canonical loop.
LLVM_ABI Expected< SmallVector< llvm::CanonicalLoopInfo * > > createCanonicalScanLoops (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop, InsertPointTy ComputeIP, const Twine &Name, ScanInfo *ScanRedInfo)
 Generator for the control flow structure of an OpenMP canonical loops if the parent directive has an inscan modifier specified.
LLVM_ABI ValuecalculateCanonicalLoopTripCount (const LocationDescription &Loc, Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop, const Twine &Name="loop")
 Calculate the trip count of a canonical loop.
LLVM_ABI Expected< CanonicalLoopInfo * > createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop, InsertPointTy ComputeIP={}, const Twine &Name="loop", bool InScan=false, ScanInfo *ScanRedInfo=nullptr)
 Generator for the control flow structure of an OpenMP canonical loop.
LLVM_ABI CanonicalLoopInfocollapseLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, InsertPointTy ComputeIP)
 Collapse a loop nest into a single loop.
LLVM_ABI ConstantgetAddrOfDeclareTargetVar (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, Type *LlvmPtrTy, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage)
 Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables.
LLVM_ABI void registerTargetGlobalVariable (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage, Type *LlvmPtrTy, Constant *Addr)
 Registers a target variable for device or host.
LLVM_ABI unsigned getFlagMemberOffset ()
 Get the offset of the OMP_MAP_MEMBER_OF field.
LLVM_ABI omp::OpenMPOffloadMappingFlags getMemberOfFlag (unsigned Position)
 Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given.
LLVM_ABI void setCorrectMemberOfFlag (omp::OpenMPOffloadMappingFlags &Flags, omp::OpenMPOffloadMappingFlags MemberOfFlag)
 Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function.
LLVM_ABI InsertPointOrErrorTy applyWorkshareLoop (DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP, bool NeedsBarrier, llvm::omp::ScheduleKind SchedKind=llvm::omp::OMP_SCHEDULE_Default, Value *ChunkSize=nullptr, bool HasSimdModifier=false, bool HasMonotonicModifier=false, bool HasNonmonotonicModifier=false, bool HasOrderedClause=false, omp::WorksharingLoopType LoopType=omp::WorksharingLoopType::ForStaticLoop, bool NoLoop=false, bool HasDistSchedule=false, Value *DistScheduleChunkSize=nullptr)
 Modifies the canonical loop to be a workshare loop.
LLVM_ABI std::vector< CanonicalLoopInfo * > tileLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, ArrayRef< Value * > TileSizes)
 Tile a loop nest.
LLVM_ABI void unrollLoopFull (DebugLoc DL, CanonicalLoopInfo *Loop)
 Fully unroll a loop.
LLVM_ABI void unrollLoopHeuristic (DebugLoc DL, CanonicalLoopInfo *Loop)
 Fully or partially unroll a loop.
LLVM_ABI void unrollLoopPartial (DebugLoc DL, CanonicalLoopInfo *Loop, int32_t Factor, CanonicalLoopInfo **UnrolledCLI)
 Partially unroll a loop.
LLVM_ABI void applySimd (CanonicalLoopInfo *Loop, MapVector< Value *, Value * > AlignedVars, Value *IfCond, omp::OrderKind Order, ConstantInt *Simdlen, ConstantInt *Safelen)
 Add metadata to simd-ize a loop.
LLVM_ABI void createFlush (const LocationDescription &Loc)
 Generator for 'omp flush'.
LLVM_ABI void createTaskwait (const LocationDescription &Loc)
 Generator for 'omp taskwait'.
LLVM_ABI void createTaskyield (const LocationDescription &Loc)
 Generator for 'omp taskyield'.
LLVM_ABI InsertPointOrErrorTy createTask (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, bool Tied=true, Value *Final=nullptr, Value *IfCondition=nullptr, SmallVector< DependData > Dependencies={}, bool Mergeable=false, Value *EventHandle=nullptr, Value *Priority=nullptr)
 Generator for #omp taskloop
LLVM_ABI InsertPointOrErrorTy createTaskgroup (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB)
 Generator for the taskgroup construct.
LLVM_ABI InsertPointOrErrorTy createReductionsGPU (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, ArrayRef< ReductionInfo > ReductionInfos, ArrayRef< bool > IsByRef, bool IsNoWait=false, bool IsTeamsReduction=false, ReductionGenCBKind ReductionGenCBKind=ReductionGenCBKind::MLIR, std::optional< omp::GV > GridValue={}, unsigned ReductionBufNum=1024, Value *SrcLocInfo=nullptr)
 Design of OpenMP reductions on the GPU.
LLVM_ABI InsertPointOrErrorTy createReductions (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< ReductionInfo > ReductionInfos, ArrayRef< bool > IsByRef, bool IsNoWait=false, bool IsTeamsReduction=false)
 Generator for 'omp reduction'.
InsertPointTy getInsertionPoint ()
 }
bool updateToLocation (const LocationDescription &Loc)
 Update the internal location to Loc.
LLVM_ABI FunctionCallee getOrCreateRuntimeFunction (Module &M, omp::RuntimeFunction FnID)
 Return the function declaration for the runtime function with FnID.
LLVM_ABI FunctiongetOrCreateRuntimeFunctionPtr (omp::RuntimeFunction FnID)
CallInstcreateRuntimeFunctionCall (FunctionCallee Callee, ArrayRef< Value * > Args, StringRef Name="")
LLVM_ABI ConstantgetOrCreateSrcLocStr (StringRef LocStr, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location LocStr.
LLVM_ABI ConstantgetOrCreateDefaultSrcLocStr (uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the default source location.
LLVM_ABI ConstantgetOrCreateSrcLocStr (StringRef FunctionName, StringRef FileName, unsigned Line, unsigned Column, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location identified by the arguments.
LLVM_ABI ConstantgetOrCreateSrcLocStr (DebugLoc DL, uint32_t &SrcLocStrSize, Function *F=nullptr)
 Return the (LLVM-IR) string describing the DebugLoc DL.
LLVM_ABI ConstantgetOrCreateSrcLocStr (const LocationDescription &Loc, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location Loc.
LLVM_ABI ConstantgetOrCreateIdent (Constant *SrcLocStr, uint32_t SrcLocStrSize, omp::IdentFlag Flags=omp::IdentFlag(0), unsigned Reserve2Flags=0)
 Return an ident_t* encoding the source location SrcLocStr and Flags.
LLVM_ABI GlobalValuecreateGlobalFlag (unsigned Value, StringRef Name)
 Create a hidden global flag Name in the module with initial value Value.
LLVM_ABI void emitUsed (StringRef Name, ArrayRef< llvm::WeakTrackingVH > List)
 Emit the llvm.used metadata.
LLVM_ABI GlobalVariableemitKernelExecutionMode (StringRef KernelName, omp::OMPTgtExecModeFlags Mode)
 Emit the kernel execution mode.
LLVM_ABI Error emitCancelationCheckImpl (Value *CancelFlag, omp::Directive CanceledDirective)
 Generate control flow and cleanup for cancellation.
LLVM_ABI InsertPointTy emitTargetKernel (const LocationDescription &Loc, InsertPointTy AllocaIP, Value *&Return, Value *Ident, Value *DeviceID, Value *NumTeams, Value *NumThreads, Value *HostPtr, ArrayRef< Value * > KernelArgs)
 Generate a target region entry call.
LLVM_ABI void emitFlush (const LocationDescription &Loc)
 Generate a flush runtime call.
bool isLastFinalizationInfoCancellable (omp::Directive DK)
 Return true if the last entry in the finalization stack is of kind DK and cancellable.
LLVM_ABI void emitTaskwaitImpl (const LocationDescription &Loc)
 Generate a taskwait runtime call.
LLVM_ABI void emitTaskyieldImpl (const LocationDescription &Loc)
 Generate a taskyield runtime call.
LLVM_ABI ValuegetOrCreateThreadID (Value *Ident)
 Return the current thread ID.
void addOutlineInfo (OutlineInfo &&OI)
 Add a new region that will be outlined later.
LLVM_ABI ValuegetSizeInBytes (Value *BasePtr)
 Computes the size of type in bytes.
LLVM_ABI void emitBranch (BasicBlock *Target)
LLVM_ABI void emitBlock (BasicBlock *BB, Function *CurFn, bool IsFinished=false)
LLVM_ABI Error emitIfClause (Value *Cond, BodyGenCallbackTy ThenGen, BodyGenCallbackTy ElseGen, InsertPointTy AllocaIP={})
 Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }.
LLVM_ABI GlobalVariablecreateOffloadMaptypes (SmallVectorImpl< uint64_t > &Mappings, std::string VarName)
 Create the global variable holding the offload mappings information.
LLVM_ABI GlobalVariablecreateOffloadMapnames (SmallVectorImpl< llvm::Constant * > &Names, std::string VarName)
 Create the global variable holding the offload names information.
LLVM_ABI void createMapperAllocas (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumOperands, struct MapperAllocas &MapperAllocas)
 Create the allocas instruction used in call to mapper functions.
LLVM_ABI void emitMapperCall (const LocationDescription &Loc, Function *MapperFunc, Value *SrcLocInfo, Value *MaptypesArg, Value *MapnamesArg, struct MapperAllocas &MapperAllocas, int64_t DeviceID, unsigned NumOperands)
 Create the call for the target mapper function.
LLVM_ABI InsertPointOrErrorTy emitKernelLaunch (const LocationDescription &Loc, Value *OutlinedFnID, EmitFallbackCallbackTy EmitTargetCallFallbackCB, TargetKernelArgs &Args, Value *DeviceID, Value *RTLoc, InsertPointTy AllocaIP)
 Generate a target region entry call and host fallback call.
LLVM_ABI InsertPointOrErrorTy emitTargetTask (TargetTaskBodyCallbackTy TaskBodyCB, Value *DeviceID, Value *RTLoc, OpenMPIRBuilder::InsertPointTy AllocaIP, const SmallVector< llvm::OpenMPIRBuilder::DependData > &Dependencies, const TargetDataRTArgs &RTArgs, bool HasNoWait)
 Generate a target-task for the target construct.
LLVM_ABI void emitOffloadingArraysArgument (IRBuilderBase &Builder, OpenMPIRBuilder::TargetDataRTArgs &RTArgs, OpenMPIRBuilder::TargetDataInfo &Info, bool ForEndCall=false)
 Emit the arguments to be passed to the runtime library based on the arrays of base pointers, pointers, sizes, map types, and mappers.
LLVM_ABI void emitNonContiguousDescriptor (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info)
 Emit an array of struct descriptors to be assigned to the offload args.
LLVM_ABI Error emitOffloadingArrays (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info, CustomMapperCallbackTy CustomMapperCB, bool IsNonContiguous=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr)
 Emit the arrays used to pass the captures and map information to the offloading runtime library.
LLVM_ABI Error emitOffloadingArraysAndArgs (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, TargetDataInfo &Info, TargetDataRTArgs &RTArgs, MapInfosTy &CombinedInfo, CustomMapperCallbackTy CustomMapperCB, bool IsNonContiguous=false, bool ForEndCall=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr)
 Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}).
LLVM_ABI void createOffloadEntry (Constant *ID, Constant *Addr, uint64_t Size, int32_t Flags, GlobalValue::LinkageTypes, StringRef Name="")
 Creates offloading entry for the provided entry ID ID, address Addr, size Size, and flags Flags.
LLVM_ABI void createOffloadEntriesAndInfoMetadata (EmitMetadataErrorReportFunctionTy &ErrorReportFunction)
LLVM_ABI InsertPointTy createCopyPrivate (const LocationDescription &Loc, llvm::Value *BufSize, llvm::Value *CpyBuf, llvm::Value *CpyFn, llvm::Value *DidIt)
 Generator for __kmpc_copyprivate.
LLVM_ABI InsertPointOrErrorTy createSingle (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsNowait, ArrayRef< llvm::Value * > CPVars={}, ArrayRef< llvm::Function * > CPFuncs={})
 Generator for 'omp single'.
LLVM_ABI InsertPointOrErrorTy createMaster (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB)
 Generator for 'omp master'.
LLVM_ABI InsertPointOrErrorTy createMasked (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, Value *Filter)
 Generator for 'omp masked'.
LLVM_ABI InsertPointOrErrorTy emitScanReduction (const LocationDescription &Loc, ArrayRef< llvm::OpenMPIRBuilder::ReductionInfo > ReductionInfos, ScanInfo *ScanRedInfo)
 This function performs the scan reduction of the values updated in the input phase.
LLVM_ABI InsertPointOrErrorTy createScan (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< llvm::Value * > ScanVars, ArrayRef< llvm::Type * > ScanVarsType, bool IsInclusive, ScanInfo *ScanRedInfo)
 This directive split and directs the control flow to input phase blocks or scan phase blocks based on 1.
LLVM_ABI InsertPointOrErrorTy createCritical (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, StringRef CriticalName, Value *HintInst)
 Generator for 'omp critical'.
LLVM_ABI InsertPointTy createOrderedDepend (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumLoops, ArrayRef< llvm::Value * > StoreValues, const Twine &Name, bool IsDependSource)
 Generator for 'omp ordered depend (source | sink)'.
LLVM_ABI InsertPointOrErrorTy createOrderedThreadsSimd (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsThreads)
 Generator for 'omp ordered [threads | simd]'.
LLVM_ABI InsertPointOrErrorTy createSections (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< StorableBodyGenCallbackTy > SectionCBs, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, bool IsCancellable, bool IsNowait)
 Generator for 'omp sections'.
LLVM_ABI InsertPointOrErrorTy createSection (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB)
 Generator for 'omp section'.
LLVM_ABI InsertPointOrErrorTy createTeams (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, Value *NumTeamsLower=nullptr, Value *NumTeamsUpper=nullptr, Value *ThreadLimit=nullptr, Value *IfExpr=nullptr)
 Generator for #omp teams
LLVM_ABI InsertPointOrErrorTy createDistribute (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB)
 Generator for #omp distribute
LLVM_ABI InsertPointTy createCopyinClauseBlocks (InsertPointTy IP, Value *MasterAddr, Value *PrivateAddr, llvm::IntegerType *IntPtrTy, bool BranchtoEnd=true)
 Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies.
LLVM_ABI CallInstcreateOMPAlloc (const LocationDescription &Loc, Value *Size, Value *Allocator, std::string Name="")
 Create a runtime call for kmpc_Alloc.
LLVM_ABI CallInstcreateOMPFree (const LocationDescription &Loc, Value *Addr, Value *Allocator, std::string Name="")
 Create a runtime call for kmpc_free.
LLVM_ABI CallInstcreateCachedThreadPrivate (const LocationDescription &Loc, llvm::Value *Pointer, llvm::ConstantInt *Size, const llvm::Twine &Name=Twine(""))
 Create a runtime call for kmpc_threadprivate_cached.
LLVM_ABI CallInstcreateOMPInteropInit (const LocationDescription &Loc, Value *InteropVar, omp::OMPInteropType InteropType, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_init.
LLVM_ABI CallInstcreateOMPInteropDestroy (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_destroy.
LLVM_ABI CallInstcreateOMPInteropUse (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_use.
LLVM_ABI InsertPointTy createTargetInit (const LocationDescription &Loc, const llvm::OpenMPIRBuilder::TargetKernelDefaultAttrs &Attrs)
 The omp target interface.
LLVM_ABI void createTargetDeinit (const LocationDescription &Loc, int32_t TeamsReductionDataSize=0, int32_t TeamsReductionBufferLength=1024)
 Create a runtime call for kmpc_target_deinit.
LLVM_ABI Error emitTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, FunctionGenCallback &GenerateFunctionCallback, bool IsOffloadEntry, Function *&OutlinedFn, Constant *&OutlinedFnID)
 Create a unique name for the entry function using the source location information of the current target region.
LLVM_ABI ConstantregisterTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, Function *OutlinedFunction, StringRef EntryFnName, StringRef EntryFnIDName)
 Registers the given function and sets up the attribtues of the function Returns the FunctionID.
LLVM_ABI Expected< Function * > emitUserDefinedMapper (function_ref< MapInfosOrErrorTy(InsertPointTy CodeGenIP, llvm::Value *PtrPHI, llvm::Value *BeginArg)> PrivAndGenMapInfoCB, llvm::Type *ElemTy, StringRef FuncName, CustomMapperCallbackTy CustomMapperCB)
 Emit the user-defined mapper function.
LLVM_ABI InsertPointOrErrorTy createTargetData (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value *DeviceID, Value *IfCond, TargetDataInfo &Info, GenMapInfoCallbackTy GenMapInfoCB, CustomMapperCallbackTy CustomMapperCB, omp::RuntimeFunction *MapperFunc=nullptr, function_ref< InsertPointOrErrorTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)> BodyGenCB=nullptr, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, Value *SrcLocInfo=nullptr)
 Generator for 'omp target data'.
LLVM_ABI InsertPointOrErrorTy createTarget (const LocationDescription &Loc, bool IsOffloadEntry, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetDataInfo &Info, TargetRegionEntryInfo &EntryInfo, const TargetKernelDefaultAttrs &DefaultAttrs, const TargetKernelRuntimeAttrs &RuntimeAttrs, Value *IfCond, SmallVectorImpl< Value * > &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, CustomMapperCallbackTy CustomMapperCB, const SmallVector< DependData > &Dependencies, bool HasNowait=false, Value *DynCGroupMem=nullptr, omp::OMPDynGroupprivateFallbackType DynCGroupMemFallback=omp::OMPDynGroupprivateFallbackType::Abort)
 Generator for 'omp target'.
LLVM_ABI FunctionCallee createForStaticInitFunction (unsigned IVSize, bool IVSigned, bool IsGPUDistribute)
 Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned.
LLVM_ABI FunctionCallee createDispatchInitFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned.
LLVM_ABI FunctionCallee createDispatchNextFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned.
LLVM_ABI FunctionCallee createDispatchFiniFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned.
LLVM_ABI FunctionCallee createDispatchDeinitFunction ()
 Returns __kmpc_dispatch_deinit runtime function.
LLVM_ABI InsertPointTy createAtomicRead (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOrdering AO, InsertPointTy AllocaIP)
 Emit atomic Read for : V = X — Only Scalar data types.
LLVM_ABI InsertPointTy createAtomicWrite (const LocationDescription &Loc, AtomicOpValue &X, Value *Expr, AtomicOrdering AO, InsertPointTy AllocaIP)
 Emit atomic write for : X = Expr — Only Scalar data types.
LLVM_ABI InsertPointOrErrorTy createAtomicUpdate (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool IsXBinopExpr, bool IsIgnoreDenormalMode=false, bool IsFineGrainedMemory=false, bool IsRemoteMemory=false)
 Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types.
LLVM_ABI InsertPointOrErrorTy createAtomicCapture (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, AtomicOpValue &V, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool UpdateExpr, bool IsPostfixUpdate, bool IsXBinopExpr, bool IsIgnoreDenormalMode=false, bool IsFineGrainedMemory=false, bool IsRemoteMemory=false)
 Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,.
LLVM_ABI InsertPointTy createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly)
 Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ?
LLVM_ABI InsertPointTy createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly, AtomicOrdering Failure)
LLVM_ABI CanonicalLoopInfocreateLoopSkeleton (DebugLoc DL, Value *TripCount, Function *F, BasicBlock *PreInsertBefore, BasicBlock *PostInsertBefore, const Twine &Name={})
 Create the control flow structure of a canonical OpenMP loop.
LLVM_ABI void loadOffloadInfoMetadata (Module &M)
 Loads all the offload entries information from the host IR metadata.
LLVM_ABI void loadOffloadInfoMetadata (vfs::FileSystem &VFS, StringRef HostFilePath)
 Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument.
LLVM_ABI GlobalVariablegetOrCreateInternalVariable (Type *Ty, const StringRef &Name, std::optional< unsigned > AddressSpace={})
 Gets (if variable with the given name already exist) or creates internal global variable with the specified Name.

Static Public Member Functions

static LLVM_ABI unsigned getOpenMPDefaultSimdAlign (const Triple &TargetTriple, const StringMap< bool > &Features)
 Get the default alignment value for given target.
static LLVM_ABI TargetRegionEntryInfo getTargetEntryUniqueInfo (FileIdentifierInfoCallbackTy CallBack, vfs::FileSystem &VFS, StringRef ParentName="")
 Creates a unique info for a target entry when provided a filename and line number from.
static LLVM_ABI void getKernelArgsVector (TargetKernelArgs &KernelArgs, IRBuilderBase &Builder, SmallVector< Value * > &ArgsVector)
 Create the kernel args vector used by emitTargetKernel.
static LLVM_ABI std::pair< int32_t, int32_t > readThreadBoundsForKernel (const Triple &T, Function &Kernel)
 }
static LLVM_ABI void writeThreadBoundsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB)
static LLVM_ABI std::pair< int32_t, int32_t > readTeamBoundsForKernel (const Triple &T, Function &Kernel)
 Read/write a bounds on teams for Kernel.
static LLVM_ABI void writeTeamsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB)

Public Attributes

SmallVector< FinalizationInfo, 8 > FinalizationStack
 The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive.
OpenMPIRBuilderConfig Config
 The OpenMPIRBuilder Configuration.
ModuleM
 The underlying LLVM-IR module.
IRBuilder Builder
 The LLVM-IR Builder used to create IR.
StringMap< Constant * > SrcLocStrMap
 Map to remember source location strings.
DenseMap< std::pair< Constant *, uint64_t >, Constant * > IdentMap
 Map to remember existing ident_t*.
OffloadEntriesInfoManager OffloadInfoManager
 Info manager to keep track of target regions.
const Triple T
 The target triple of the underlying module.
SmallVector< OutlineInfo, 16 > OutlineInfos
 Collection of regions that need to be outlined during finalization.
SmallVector< llvm::Function *, 16 > ConstantAllocaRaiseCandidates
 A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed.
std::forward_list< CanonicalLoopInfoLoopInfos
 Collection of owned canonical loop objects that eventually need to be free'd.
std::forward_list< ScanInfoScanInfos
 Collection of owned ScanInfo objects that eventually need to be free'd.
StringMap< GlobalVariable *, BumpPtrAllocatorInternalVars
 An ordered map of auto-generated variables to their unique names.
const std::string ompOffloadInfoName = "omp_offload.info"
 OMP Offload Info Metadata name string.

Detailed Description

An interface to create LLVM-IR for OpenMP directives.

Each OpenMP directive has a corresponding public generator method.

Definition at line 504 of file OMPIRBuilder.h.

Member Typedef Documentation

◆ BodyGenCallbackTy

Initial value:
IRBuilder<>::InsertPoint InsertPointTy
Type used throughout for insertion points.
An efficient, type-erasing, non-owning reference to a callable.

Callback type for body (=inner region) code generation.

The callback takes code locations as arguments, each describing a location where additional instructions can be inserted.

The CodeGenIP may be in the middle of a basic block or point to the end of it. The basic block may have a terminator or be degenerate. The callback function may just insert instructions at that position, but also split the block (without the Before argument of BasicBlock::splitBasicBlock such that the identify of the split predecessor block is preserved) and insert additional control flow, including branches that do not lead back to what follows the CodeGenIP. Note that since the callback is allowed to split the block, callers must assume that InsertPoints to positions in the BasicBlock after CodeGenIP including CodeGenIP itself are invalidated. If such InsertPoints need to be preserved, it can split the block itself before calling the callback.

AllocaIP and CodeGenIP must not point to the same position.

Parameters
AllocaIPis the insertion point at which new alloca instructions should be placed. The BasicBlock it is pointing to must not be split.
CodeGenIPis the insertion point at which the body code should be placed.
Returns
an error, if any were triggered during execution.

Definition at line 649 of file OMPIRBuilder.h.

◆ CustomMapperCallbackTy

Initial value:

Definition at line 2766 of file OMPIRBuilder.h.

◆ EmitFallbackCallbackTy

Initial value:
Expected< InsertPointTy > InsertPointOrErrorTy
Type used to represent an insertion point or an error value.

Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails.

It takes an insertion point as parameter where the code should be emitted. It returns an insertion point that points right after after the emitted code.

Definition at line 2761 of file OMPIRBuilder.h.

◆ EmitMetadataErrorReportFunctionTy

Initial value:
EmitMetadataErrorKind
The kind of errors that can occur when emitting the offload entries and metadata.
Data structure to contain the information needed to uniquely identify a target entry.

Callback function type.

Definition at line 2868 of file OMPIRBuilder.h.

◆ FileIdentifierInfoCallbackTy

Initial value:
std::function<std::tuple<std::string, uint64_t>()>

Definition at line 1535 of file OMPIRBuilder.h.

◆ FinalizeCallbackTy

Callback type for variable finalization (think destructors).

Parameters
CodeGenIPis the insertion point at which the finalization code should be placed.

A finalize callback knows about all objects that need finalization, e.g. destruction, when the scope of the currently generated construct is left at the time, and location, the callback is invoked.

Definition at line 579 of file OMPIRBuilder.h.

◆ FunctionGenCallback

Initial value:
std::function<Expected<Function *>(StringRef FunctionName)>
StringRef - Represent a constant reference to a string, i.e.
Definition StringRef.h:55

Functions used to generate a function with the given name.

Definition at line 3246 of file OMPIRBuilder.h.

◆ GenMapInfoCallbackTy

Initial value:
This structure contains combined information generated for mappable clauses, including base pointers,...

Callback type for creating the map infos for the kernel parameters.

Parameters
CodeGenIPis the insertion point where code should be generated, if any.

Definition at line 3314 of file OMPIRBuilder.h.

◆ InsertPointOrErrorTy

Type used to represent an insertion point or an error value.

Definition at line 558 of file OMPIRBuilder.h.

◆ InsertPointTy

Type used throughout for insertion points.

Definition at line 555 of file OMPIRBuilder.h.

◆ LoopBodyGenCallbackTy

Initial value:
function_ref<Error(InsertPointTy CodeGenIP, Value *IndVar)>
LLVM Value Representation.
Definition Value.h:75

Callback type for loop body code generation.

Parameters
CodeGenIPis the insertion point where the loop's body code must be placed. This will be a dedicated BasicBlock with a conditional branch from the loop condition check and terminated with an unconditional branch to the loop latch.
IndVaris the induction variable usable at the insertion point.
Returns
an error, if any were triggered during execution.

Definition at line 703 of file OMPIRBuilder.h.

◆ MapDeviceInfoArrayTy

◆ MapDimArrayTy

◆ MapFlagsArrayTy

◆ MapInfosOrErrorTy

◆ MapNamesArrayTy

◆ MapNonContiguousArrayTy

◆ MapValuesArrayTy

◆ PrivatizeCallbackTy

Initial value:
InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value &Original,
Value &Inner, Value *&ReplVal)>

Callback type for variable privatization (think copy & default constructor).

Parameters
AllocaIPis the insertion point at which new alloca instructions should be placed.
CodeGenIPis the insertion point at which the privatization code should be placed.
OriginalThe value being copied/created, should not be used in the generated IR.
InnerThe equivalent of Original that should be used in the generated IR; this is equal to Original if the value is a pointer and can thus be passed directly, otherwise it is an equivalent but different value.
ReplValThe replacement value, thus a copy or new created version of Inner.
Returns
The new insertion point where code generation continues and ReplVal the replacement value.

Definition at line 724 of file OMPIRBuilder.h.

◆ ReductionGenAtomicCBTy

Initial value:
std::function<InsertPointOrErrorTy(
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45

Functions used to generate atomic reductions.

Such functions take two Values representing pointers to LHS and RHS of the reduction, as well as the element type of these pointers. They are expected to atomically update the LHS to the reduced value.

Definition at line 1578 of file OMPIRBuilder.h.

◆ ReductionGenCBTy

Initial value:
std::function<InsertPointOrErrorTy(
InsertPointTy CodeGenIP, Value *LHS, Value *RHS, Value *&Res)>
Value * RHS
Value * LHS

ReductionGen CallBack for MLIR.

Parameters
CodeGenIPInsertPoint for CodeGen.
LHSPass in the LHS Value to be used for CodeGen.
RHSPass in the RHS Value to be used for CodeGen.

Definition at line 1571 of file OMPIRBuilder.h.

◆ ReductionGenClangCBTy

Initial value:
std::function<InsertPointTy(InsertPointTy CodeGenIP, unsigned Index,
Value **LHS, Value **RHS, Function *CurFn)>

ReductionGen CallBack for Clang.

Parameters
CodeGenIPInsertPoint for CodeGen.
IndexIndex of the ReductionInfo to generate code for.
LHSPtrOptionally used by Clang to return the LHSPtr it used for codegen, used for fixup later.
RHSPtrOptionally used by Clang to return the RHSPtr it used for codegen, used for fixup later.
CurFnOptionally used by Clang to pass in the Current Function as Clang context may be old.

Definition at line 1562 of file OMPIRBuilder.h.

◆ ReductionGenDataPtrPtrCBTy

Initial value:
std::function<InsertPointOrErrorTy(
InsertPointTy, Value *ByRefVal, Value *&Res)>

Definition at line 1581 of file OMPIRBuilder.h.

◆ StorableBodyGenCallbackTy

Initial value:
std::function<Error(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)>
Returns
an error, if any were triggered during execution.

Definition at line 690 of file OMPIRBuilder.h.

◆ TargetBodyGenCallbackTy

◆ TargetGenArgAccessorsCallbackTy

Initial value:
Argument &Arg, Value *Input, Value *&RetVal, InsertPointTy AllocaIP,
InsertPointTy CodeGenIP)>
The Input class is used to parse a yaml document into in-memory structs and vectors.
This class represents an incoming formal argument to a Function.
Definition Argument.h:32

Definition at line 3410 of file OMPIRBuilder.h.

◆ TargetTaskBodyCallbackTy

Initial value:
function_ref<Error(Value *DeviceID, Value *RTLoc,
IRBuilderBase::InsertPoint TargetTaskAllocaIP)>
InsertPoint - A saved insertion point.
Definition IRBuilder.h:291

Callback type for generating the bodies of device directives that require outer target tasks (e.g.

in case of having nowait or depend clauses).

Parameters
DeviceIDThe ID of the device on which the target region will execute.
RTLocSource location identifier \Param TargetTaskAllocaIP Insertion point for the alloca block of the generated task.
Returns
an error, if any were triggered during execution.

Definition at line 2794 of file OMPIRBuilder.h.

◆ TaskDupCallbackTy

Initial value:
InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value *DestPtr,
Value *SrcPtr)>

Callback type for task duplication function code generation.

This is the task duplication function passed to __kmpc_taskloop. It is expected that this function will set up (first)private variables in the duplicated task which have non-trivial (copy-)constructors. Insertion points are handled the same way as for BodyGenCallbackTy.

createTaskloop lays out the task's auxiliary data structure as: { lower bound, upper bound, step, data... }. DestPtr and SrcPtr point to this data.

It is acceptable for the callback to be set to nullptr. In that case no function will be generated and nullptr will be passed as the task duplication function to __kmpc_taskloop.

Parameters
AllocaIPis the insertion point at which new alloca instructions should be placed. The BasicBlock it is pointing to must not be split.
CodeGenIPis the insertion point at which the body code should be placed.
DestPtrThis is a pointer to data inside the newly duplicated task's auxiliary data structure (allocated after the task descriptor.)
SrcPtrThis is a pointer to data inside the original task's auxiliary data structure (allocated after the task descriptor.)
Returns
The insertion point immediately after the generated code, or an error if any occured.

Definition at line 680 of file OMPIRBuilder.h.

Member Enumeration Documentation

◆ BodyGenTy

Type of BodyGen to use for region codegen.

Priv: If device pointer privatization is required, emit the body of the region here. It will have to be duplicated: with and without privatization. DupNoPriv: If we need device pointer privatization, we need to emit the body of the region with no privatization in the 'else' branch of the conditional. NoPriv: If we don't require privatization of device pointers, we emit the body in between the runtime calls. This avoids duplicating the body code.

Enumerator
Priv 
DupNoPriv 
NoPriv 

Definition at line 3309 of file OMPIRBuilder.h.

◆ CopyAction

Enumerator
RemoteLaneToThread 
ThreadCopy 

Definition at line 1654 of file OMPIRBuilder.h.

◆ DeviceInfoTy

Enumerator
None 
Pointer 
Address 

Definition at line 2708 of file OMPIRBuilder.h.

◆ EmitMetadataErrorKind

The kind of errors that can occur when emitting the offload entries and metadata.

Enumerator
EMIT_MD_TARGET_REGION_ERROR 
EMIT_MD_DECLARE_TARGET_ERROR 
EMIT_MD_GLOBAL_VAR_LINK_ERROR 
EMIT_MD_GLOBAL_VAR_INDIRECT_ERROR 

Definition at line 2860 of file OMPIRBuilder.h.

◆ EvalKind

Enum class for reduction evaluation types scalar, complex and aggregate.

Enumerator
Scalar 
Complex 
Aggregate 

Definition at line 1585 of file OMPIRBuilder.h.

◆ ReductionGenCBKind

Enum class for the RedctionGen CallBack type to be used.

Enumerator
Clang 
MLIR 

Definition at line 1550 of file OMPIRBuilder.h.

Constructor & Destructor Documentation

◆ OpenMPIRBuilder()

llvm::OpenMPIRBuilder::OpenMPIRBuilder ( Module & M)
inline

Create a new OpenMPIRBuilder operating on the given module M.

This will not have an effect on M (see initialize)

Definition at line 508 of file OMPIRBuilder.h.

References Builder, M, OffloadInfoManager, and T.

◆ ~OpenMPIRBuilder()

OpenMPIRBuilder::~OpenMPIRBuilder ( )

Definition at line 951 of file OMPIRBuilder.cpp.

References assert(), and OutlineInfos.

Member Function Documentation

◆ addAttributes()

◆ addOutlineInfo()

void llvm::OpenMPIRBuilder::addOutlineInfo ( OutlineInfo && OI)
inline

Add a new region that will be outlined later.

Definition at line 2475 of file OMPIRBuilder.h.

References OutlineInfos.

Referenced by createDistribute(), createParallel(), createTask(), createTeams(), and emitTargetTask().

◆ applySimd()

void OpenMPIRBuilder::applySimd ( CanonicalLoopInfo * Loop,
MapVector< Value *, Value * > AlignedVars,
Value * IfCond,
omp::OrderKind Order,
ConstantInt * Simdlen,
ConstantInt * Safelen )

Add metadata to simd-ize a loop.

If IfCond is not nullptr, the loop is cloned. The metadata which prevents vectorization is added to to the cloned loop. The cloned loop is executed when ifCond is evaluated to false.

Parameters
LoopThe loop to simd-ize.
AlignedVarsThe map which containts pairs of the pointer and its corresponding alignment.
IfCondThe value which corresponds to the if clause condition.
OrderThe enum to map order clause.
SimdlenThe Simdlen length to apply to the simd loop.
SafelenThe Safelen length to apply to the simd loop.

Definition at line 6678 of file OMPIRBuilder.cpp.

References addLoopMetadata(), applyParallelAccessesMetadata(), llvm::Block, Builder, llvm::dyn_cast(), F, FAM, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::CanonicalLoopInfo::getCond(), llvm::CanonicalLoopInfo::getFunction(), llvm::CanonicalLoopInfo::getHeader(), llvm::Type::getInt1Ty(), llvm::LoopInfoBase< BlockT, LoopT >::getLoopFor(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::ConstantInt::getTrue(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::LoopAnalysis::run(), and llvm::MapVector< KeyT, ValueT, MapType, VectorType >::size().

◆ applyWorkshareLoop()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyWorkshareLoop ( DebugLoc DL,
CanonicalLoopInfo * CLI,
InsertPointTy AllocaIP,
bool NeedsBarrier,
llvm::omp::ScheduleKind SchedKind = llvm::omp::OMP_SCHEDULE_Default,
Value * ChunkSize = nullptr,
bool HasSimdModifier = false,
bool HasMonotonicModifier = false,
bool HasNonmonotonicModifier = false,
bool HasOrderedClause = false,
omp::WorksharingLoopType LoopType = omp::WorksharingLoopType::ForStaticLoop,
bool NoLoop = false,
bool HasDistSchedule = false,
Value * DistScheduleChunkSize = nullptr )

Modifies the canonical loop to be a workshare loop.

This takes a LoopInfo representing a canonical loop, such as the one created by createCanonicalLoop and emits additional instructions to turn it into a workshare loop. In particular, it calls to an OpenMP runtime function in the preheader to obtain the loop bounds to be used in the current thread, updates the relevant instructions in the canonical loop and calls to an OpenMP runtime finalization function after the loop.

The concrete transformation is done by applyStaticWorkshareLoop, applyStaticChunkedWorkshareLoop, or applyDynamicWorkshareLoop, depending on the value of SchedKind and ChunkSize.

Parameters
DLDebug location for instructions added for the workshare-loop construct itself.
CLIA descriptor of the canonical loop to workshare.
AllocaIPAn insertion point for Alloca instructions usable in the preheader of the loop.
NeedsBarrierIndicates whether a barrier must be insterted after the loop.
SchedKindScheduling algorithm to use.
ChunkSizeThe chunk size for the inner loop.
HasSimdModifierWhether the simd modifier is present in the schedule clause.
HasMonotonicModifierWhether the monotonic modifier is present in the schedule clause.
HasNonmonotonicModifierWhether the nonmonotonic modifier is present in the schedule clause.
HasOrderedClauseWhether the (parameterless) ordered clause is present.
LoopTypeInformation about type of loop worksharing. It corresponds to type of loop workshare OpenMP pragma.
NoLoopIf true, no-loop code is generated.
HasDistScheduleDefines if the clause being lowered is dist_schedule as this is handled slightly differently
DistScheduleChunkSizeThe chunk size for dist_schedule loop
Returns
Point where to insert code after the workshare construct.

Definition at line 5907 of file OMPIRBuilder.cpp.

References assert(), computeOpenMPScheduleType(), Config, DL, and llvm_unreachable.

◆ calculateCanonicalLoopTripCount()

Value * OpenMPIRBuilder::calculateCanonicalLoopTripCount ( const LocationDescription & Loc,
Value * Start,
Value * Stop,
Value * Step,
bool IsSigned,
bool InclusiveStop,
const Twine & Name = "loop" )

Calculate the trip count of a canonical loop.

This allows specifying user-defined loop counter values using increment, upper- and lower bounds. To disambiguate the terminology when counting downwards, instead of lower bounds we use Start for the loop counter value in the first body iteration.

Consider the following limitations:

  • A loop counter space over all integer values of its bit-width cannot be represented. E.g using uint8_t, its loop trip count of 256 cannot be stored into an 8 bit integer):

    DO I = 0, 255, 1

  • Unsigned wrapping is only supported when wrapping only "once"; E.g. effectively counting downwards:

    for (uint8_t i = 100u; i > 0; i += 127u)

TODO: May need to add additional parameters to represent:

  • Allow representing downcounting with unsigned integers.
  • Sign of the step and the comparison operator might disagree:

    for (int i = 0; i < 42; i -= 1u)

Parameters
LocThe insert and source location description.
StartValue of the loop counter for the first iterations.
StopLoop counter values past this will stop the loop.
StepLoop counter increment after each iteration; negative means counting down.
IsSignedWhether Start, Stop and Step are signed integers.
InclusiveStopWhether Stop itself is a valid value for the loop counter.
NameBase name used to derive instruction names.
Returns
The value holding the calculated trip count.
  • A Step of INT_MIN cannot not be normalized to a positive direction:

Definition at line 5215 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::cast(), llvm::Value::getType(), llvm::CmpInst::ICMP_SLE, llvm::CmpInst::ICMP_SLT, llvm::CmpInst::ICMP_ULE, llvm::CmpInst::ICMP_ULT, and updateToLocation().

Referenced by createCanonicalLoop(), and createCanonicalScanLoops().

◆ collapseLoops()

CanonicalLoopInfo * OpenMPIRBuilder::collapseLoops ( DebugLoc DL,
ArrayRef< CanonicalLoopInfo * > Loops,
InsertPointTy ComputeIP )

Collapse a loop nest into a single loop.

Merges loops of a loop nest into a single CanonicalLoopNest representation that has the same number of innermost loop iterations as the origin loop nest. The induction variables of the input loops are derived from the collapsed loop's induction variable. This is intended to be used to implement OpenMP's collapse clause. Before applying a directive, collapseLoops normalizes a loop nest to contain only a single loop and the directive's implementation does not need to handle multiple loops itself. This does not remove the need to handle all loop nest handling by directives, such as the ordered(<n>) clause or the simd schedule-clause modifier of the worksharing-loop directive.

Example:

for (int i = 0; i < 7; ++i) // Canonical loop "i"
for (int j = 0; j < 9; ++j) // Canonical loop "j"
body(i, j);

After collapsing with Loops={i,j}, the loop is changed to

for (int ij = 0; ij < 63; ++ij) {
int i = ij / 9;
int j = ij % 9;
body(i, j);
}

In the current implementation, the following limitations apply:

  • All input loops have an induction variable of the same type.
  • The collapsed loop will have the same trip count integer type as the input loops. Therefore it is possible that the collapsed loop cannot represent all iterations of the input loops. For instance, assuming a 32 bit integer type, and two input loops both iterating 2^16 times, the theoretical trip count of the collapsed loop would be 2^32 iteration, which cannot be represented in an 32-bit integer. Behavior is undefined in this case.
  • The trip counts of every input loop must be available at ComputeIP. Non-rectangular loops are not yet supported.
  • At each nest level, code between a surrounding loop and its nested loop is hoisted into the loop body, and such code will be executed more often than before collapsing (or not at all if any inner loop iteration has a trip count of 0). This is permitted by the OpenMP specification.
Parameters
DLDebug location for instructions added for collapsing, such as instructions to compute/derive the input loop's induction variables.
LoopsLoops in the loop nest to collapse. Loops are specified from outermost-to-innermost and every control flow of a loop's body must pass through its directly nested loop.
ComputeIPWhere additional instruction that compute the collapsed trip count. If not set, defaults to before the generated loop.
Returns
The CanonicalLoopInfo object representing the collapsed loop.

Definition at line 6190 of file OMPIRBuilder.cpp.

References assert(), Builder, createLoopSkeleton(), DL, F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getLatch(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::IRBuilderBase::InsertPoint::isSet(), Loops, redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::SmallVectorImpl< T >::reserve(), and llvm::SmallVectorImpl< T >::resize().

◆ createAtomicCapture()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
AtomicOpValue & X,
AtomicOpValue & V,
Value * Expr,
AtomicOrdering AO,
AtomicRMWInst::BinOp RMWOp,
AtomicUpdateCallbackTy & UpdateOp,
bool UpdateExpr,
bool IsPostfixUpdate,
bool IsXBinopExpr,
bool IsIgnoreDenormalMode = false,
bool IsFineGrainedMemory = false,
bool IsRemoteMemory = false )

Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
XThe target atomic pointer to be updated
VMemory address where to store captured value
ExprThe value to update X with.
AOAtomic ordering of the generated atomic instructions
RMWOpThe binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated.
UpdateOpCode generator for complex expressions that cannot be expressed through atomicrmw instruction.
UpdateExprtrue if X is an in place update of the form X = X BinOp Expr or X = Expr BinOp X
IsXBinopExprtrue if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr)
IsPostfixUpdatetrue if original value of 'x' must be stored in 'v', not an updated one.
Returns
Insertion point after generated atomic capture IR.

Definition at line 10397 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::Expected< T >::takeError(), updateToLocation(), X, and llvm::AtomicRMWInst::Xchg.

◆ createAtomicCompare() [1/2]

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare ( const LocationDescription & Loc,
AtomicOpValue & X,
AtomicOpValue & V,
AtomicOpValue & R,
Value * E,
Value * D,
AtomicOrdering AO,
omp::OMPAtomicCompareOp Op,
bool IsXBinopExpr,
bool IsPostfixUpdate,
bool IsFailOnly )

Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ?

expr : x; x = expr ordop x ? expr : x; x = x == e ? d : x; x = e == x ? d : x; (this one is not in the spec) cond-update-stmt: if (x ordop expr) { x = expr; } if (expr ordop x) { x = expr; } if (x == e) { x = d; } if (e == x) { x = d; } (this one is not in the spec) conditional-update-capture-atomic: v = x; cond-update-stmt; (IsPostfixUpdate=true, IsFailOnly=false) cond-update-stmt; v = x; (IsPostfixUpdate=false, IsFailOnly=false) if (x == e) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) r = x == e; if (r) { x = d; } (IsPostfixUpdate=false, IsFailOnly=false) r = x == e; if (r) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true)

Parameters
LocThe insert and source location description.
XThe target atomic pointer to be updated.
VMemory address where to store captured value (for compare capture only).
RMemory address where to store comparison result (for compare capture with '==' only).
EThe expected value ('e') for forms that use an equality comparison or an expression ('expr') for forms that use 'ordop' (logically an atomic maximum or minimum).
DThe desired value for forms that use an equality comparison. If forms that use 'ordop', it should be nullptr.
AOAtomic ordering of the generated atomic instructions.
OpAtomic compare operation. It can only be ==, <, or >.
IsXBinopExprTrue if the conditional statement is in the form where x is on LHS. It only matters for < or >.
IsPostfixUpdateTrue if original value of 'x' must be stored in 'v', not an updated one (for compare capture only).
IsFailOnlyTrue if the original value of 'x' is stored to 'v' only when the comparison fails. This is only valid for the case the comparison is '=='.
Returns
Insertion point after generated atomic capture IR.

Definition at line 10434 of file OMPIRBuilder.cpp.

References createAtomicCompare(), D(), llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(), and X.

Referenced by createAtomicCompare().

◆ createAtomicCompare() [2/2]

◆ createAtomicRead()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicRead ( const LocationDescription & Loc,
AtomicOpValue & X,
AtomicOpValue & V,
AtomicOrdering AO,
InsertPointTy AllocaIP )

Emit atomic Read for : V = X — Only Scalar data types.

Parameters
LocThe insert and source location description.
XThe target pointer to be atomically read
VMemory address where to store atomically read value
AOAtomic ordering of the generated atomic instructions.
AllocaIPInsert point for allocas
Returns
Insertion point after generated atomic read IR.

Definition at line 10070 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::cast(), DL, llvm::AtomicInfo::EmitAtomicLoadLibcall(), llvm::Instruction::eraseFromParent(), llvm::IntegerType::get(), llvm::LoadInst::getAlign(), llvm::Module::getDataLayout(), llvm::Instruction::getModule(), llvm::Type::getScalarSizeInBits(), llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), llvm::Type::isStructTy(), M, llvm::LoadInst::setAtomic(), updateToLocation(), and X.

◆ createAtomicUpdate()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicUpdate ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
AtomicOpValue & X,
Value * Expr,
AtomicOrdering AO,
AtomicRMWInst::BinOp RMWOp,
AtomicUpdateCallbackTy & UpdateOp,
bool IsXBinopExpr,
bool IsIgnoreDenormalMode = false,
bool IsFineGrainedMemory = false,
bool IsRemoteMemory = false )

Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
XThe target atomic pointer to be updated
ExprThe value to update X with.
AOAtomic ordering of the generated atomic instructions.
RMWOpThe binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated.
UpdateOpCode generator for complex expressions that cannot be expressed through atomicrmw instruction.
IsXBinopExprtrue if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr)
Returns
Insertion point after generated atomic update IR.

Definition at line 10161 of file OMPIRBuilder.cpp.

References assert(), Builder, isConflictIP(), llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::AtomicRMWInst::UMax, llvm::AtomicRMWInst::UMin, updateToLocation(), and X.

◆ createAtomicWrite()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicWrite ( const LocationDescription & Loc,
AtomicOpValue & X,
Value * Expr,
AtomicOrdering AO,
InsertPointTy AllocaIP )

Emit atomic write for : X = Expr — Only Scalar data types.

Parameters
LocThe insert and source location description.
XThe target pointer to be atomically written to
ExprThe value to store.
AOAtomic ordering of the generated atomic instructions.
AllocaIPInsert point for allocas
Returns
Insertion point after generated atomic Write IR.

Definition at line 10122 of file OMPIRBuilder.cpp.

References assert(), Builder, DL, llvm::AtomicInfo::EmitAtomicStoreLibcall(), llvm::Instruction::eraseFromParent(), llvm::IntegerType::get(), llvm::LoadInst::getAlign(), llvm::Module::getDataLayout(), llvm::Instruction::getModule(), llvm::Type::getScalarSizeInBits(), llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), llvm::Type::isStructTy(), M, llvm::StoreInst::setAtomic(), updateToLocation(), and X.

◆ createBarrier()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createBarrier ( const LocationDescription & Loc,
omp::Directive Kind,
bool ForceSimpleCall = false,
bool CheckCancelFlag = true )

Emitter methods for OpenMP directives.

{ Generator for 'omp barrier'

Parameters
LocThe location where the barrier directive was encountered.
KindThe kind of directive that caused the barrier.
ForceSimpleCallFlag to force a simple (=non-cancellation) barrier.
CheckCancelFlagFlag to indicate a cancel barrier return value should be checked and acted upon.
ThreadIDOptional parameter to pass in any existing ThreadID value.
Returns
The insertion point after the barrier.

Definition at line 1121 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), isLastFinalizationInfoCancellable(), and updateToLocation().

Referenced by createSingle(), and emitScanReduction().

◆ createCachedThreadPrivate()

CallInst * OpenMPIRBuilder::createCachedThreadPrivate ( const LocationDescription & Loc,
llvm::Value * Pointer,
llvm::ConstantInt * Size,
const llvm::Twine & Name = Twine("") )

Create a runtime call for kmpc_threadprivate_cached.

Parameters
LocThe insert and source location description.
Pointerpointer to data to be cached
Sizesize of data to be cached
NameName of call Instruction for callinst
Returns
CallInst to the thread private cache call.

Definition at line 7489 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateInternalVariable(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Pointer, Size, and updateToLocation().

◆ createCancel()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createCancel ( const LocationDescription & Loc,
Value * IfCondition,
omp::Directive CanceledDirective )

Generator for 'omp cancel'.

Parameters
LocThe location where the directive was encountered.
IfConditionThe evaluated 'if' clause expression, if any.
CanceledDirectiveThe kind of directive that is cancled.
Returns
The insertion point after the barrier.

Definition at line 1174 of file OMPIRBuilder.cpp.

References Builder, createCancellationPoint(), createRuntimeFunctionCall(), emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm_unreachable, llvm::SplitBlockAndInsertIfThenElse(), and updateToLocation().

◆ createCancellationPoint()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createCancellationPoint ( const LocationDescription & Loc,
omp::Directive CanceledDirective )

Generator for 'omp cancellation point'.

Parameters
LocThe location where the directive was encountered.
CanceledDirectiveThe kind of directive that is cancled.
Returns
The insertion point after the barrier.

Definition at line 1230 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm_unreachable, and updateToLocation().

Referenced by createCancel().

◆ createCanonicalLoop() [1/2]

Expected< CanonicalLoopInfo * > OpenMPIRBuilder::createCanonicalLoop ( const LocationDescription & Loc,
LoopBodyGenCallbackTy BodyGenCB,
Value * Start,
Value * Stop,
Value * Step,
bool IsSigned,
bool InclusiveStop,
InsertPointTy ComputeIP = {},
const Twine & Name = "loop",
bool InScan = false,
ScanInfo * ScanRedInfo = nullptr )

Generator for the control flow structure of an OpenMP canonical loop.

Instead of a logical iteration space, this allows specifying user-defined loop counter values using increment, upper- and lower bounds. To disambiguate the terminology when counting downwards, instead of lower bounds we use Start for the loop counter value in the first body

It calls

See also
calculateCanonicalLoopTripCount for trip count calculations, so limitations of that method apply here as well.
Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the loop body code.
StartValue of the loop counter for the first iterations.
StopLoop counter values past this will stop the loop.
StepLoop counter increment after each iteration; negative means counting down.
IsSignedWhether Start, Stop and Step are signed integers.
InclusiveStopWhether Stop itself is a valid value for the loop counter.
ComputeIPInsertion point for instructions computing the trip count. Can be used to ensure the trip count is available at the outermost loop of a loop nest. If not set, defaults to the preheader of the generated loop.
NameBase name used to derive BB and instruction names.
InScanWhether loop has a scan reduction specified.
ScanRedInfoPointer to the ScanInfo objected created using ScanInfoInitialize.
Returns
An object representing the created control flow structure which can be used for loop-associated directives.

Definition at line 5275 of file OMPIRBuilder.cpp.

References Builder, calculateCanonicalLoopTripCount(), createCanonicalLoop(), llvm::IRBuilderBase::InsertPoint::isSet(), IV, and llvm::ScanInfo::IV.

◆ createCanonicalLoop() [2/2]

Expected< CanonicalLoopInfo * > OpenMPIRBuilder::createCanonicalLoop ( const LocationDescription & Loc,
LoopBodyGenCallbackTy BodyGenCB,
Value * TripCount,
const Twine & Name = "loop" )

Generator for the control flow structure of an OpenMP canonical loop.

This generator operates on the logical iteration space of the loop, i.e. the caller only has to provide a loop trip count of the loop as defined by base language semantics. The trip count is interpreted as an unsigned integer. The induction variable passed to BodyGenCB will be of the same type and run from 0 to TripCount - 1. It is up to the callback to convert the logical iteration variable to the loop counter variable in the loop body.

Parameters
LocThe insert and source location description. The insert location can be between two instructions or the end of a degenerate block (e.g. a BB under construction).
BodyGenCBCallback that will generate the loop body code.
TripCountNumber of iterations the loop body is executed.
NameBase name used to derive BB and instruction names.
Returns
An object representing the created control flow structure which can be used for loop-associated directives.

Definition at line 5115 of file OMPIRBuilder.cpp.

References llvm::CanonicalLoopInfo::assertOK(), Builder, createLoopSkeleton(), llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBodyIP(), llvm::CanonicalLoopInfo::getIndVar(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::spliceBB(), and updateToLocation().

Referenced by createCanonicalLoop(), createCanonicalScanLoops(), and createSections().

◆ createCanonicalScanLoops()

Expected< SmallVector< llvm::CanonicalLoopInfo * > > OpenMPIRBuilder::createCanonicalScanLoops ( const LocationDescription & Loc,
LoopBodyGenCallbackTy BodyGenCB,
Value * Start,
Value * Stop,
Value * Step,
bool IsSigned,
bool InclusiveStop,
InsertPointTy ComputeIP,
const Twine & Name,
ScanInfo * ScanRedInfo )

Generator for the control flow structure of an OpenMP canonical loops if the parent directive has an inscan modifier specified.

If the inscan modifier is specified, the region of the parent is expected to have a scan directive. Based on the clauses in scan directive, the body of the loop is split into two loops: Input loop and Scan Loop. Input loop contains the code generated for input phase of scan and Scan loop contains the code generated for scan phase of scan. From the bodyGen callback of these loops, createScan would be called when a scan directive is encountered from the loop body. createScan based on whether 1. inclusive or exclusive scan is specified and, 2. input loop or scan loop is generated, lowers the body of the for loop accordingly.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the loop body code.
StartValue of the loop counter for the first iterations.
StopLoop counter values past this will stop the loop.
StepLoop counter increment after each iteration; negative means counting down.
IsSignedWhether Start, Stop and Step are signed integers.
InclusiveStopWhether Stop itself is a valid value for the loop counter.
ComputeIPInsertion point for instructions computing the trip count. Can be used to ensure the trip count is available at the outermost loop of a loop nest. If not set, defaults to the preheader of the generated loop.
NameBase name used to derive BB and instruction names.
ScanRedInfoPointer to the ScanInfo objected created using ScanInfoInitialize.
Returns
A vector containing Loop Info of Input Loop and Scan Loop.

Definition at line 5152 of file OMPIRBuilder.cpp.

References assert(), Builder, calculateCanonicalLoopTripCount(), createCanonicalLoop(), emitBlock(), llvm::BasicBlock::getFirstInsertionPt(), llvm::BasicBlock::getTerminator(), llvm::IRBuilderBase::InsertPoint::isSet(), IV, llvm::ScanInfo::IV, llvm::ScanInfo::OMPBeforeScanBlock, llvm::ScanInfo::OMPScanDispatch, llvm::ScanInfo::OMPScanFinish, llvm::ScanInfo::OMPScanInit, llvm::ScanInfo::OMPScanLoopExit, llvm::ScanInfo::Span, llvm::splitBB(), llvm::Error::success(), and updateToLocation().

◆ createCopyinClauseBlocks()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyinClauseBlocks ( InsertPointTy IP,
Value * MasterAddr,
Value * PrivateAddr,
llvm::IntegerType * IntPtrTy,
bool BranchtoEnd = true )

Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies.

Parameters
IPinsertion block for copyin conditional
MasterVarPtra pointer to the master variable
PrivateVarPtra pointer to the threadprivate variable
IntPtrTyPointer size type
BranchtoEndCreate a branch between the copyin.not.master blocks
Returns
The insertion point where copying operation to be emitted.

Definition at line 7323 of file OMPIRBuilder.cpp.

References Builder, llvm::BasicBlock::Create(), llvm::Instruction::eraseFromParent(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), llvm::isa_and_nonnull(), llvm::IRBuilderBase::InsertPoint::isSet(), M, and llvm::BasicBlock::splitBasicBlock().

◆ createCopyPrivate()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyPrivate ( const LocationDescription & Loc,
llvm::Value * BufSize,
llvm::Value * CpyBuf,
llvm::Value * CpyFn,
llvm::Value * DidIt )

Generator for __kmpc_copyprivate.

Parameters
LocThe source location description.
BufSizeNumber of elements in the buffer.
CpyBufList of pointers to data to be copied.
CpyFnfunction to call for copying data.
DidItflag variable; 1 for 'single' thread, 0 otherwise.
Returns
The insertion position after the CopyPrivate call.

Definition at line 7004 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and updateToLocation().

Referenced by createSingle().

◆ createCritical()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createCritical ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB,
StringRef CriticalName,
Value * HintInst )

Generator for 'omp critical'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region body code.
FiniCBCallback to finalize variable copies.
CriticalNamename of the lock used by the critical directive
HintInstHint Instruction for hint clause associated with critical
Returns
The insertion position after the critical.

Definition at line 7100 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and updateToLocation().

◆ createDispatchDeinitFunction()

FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction ( )

Returns __kmpc_dispatch_deinit runtime function.

Definition at line 8098 of file OMPIRBuilder.cpp.

References getOrCreateRuntimeFunction(), and M.

◆ createDispatchFiniFunction()

FunctionCallee OpenMPIRBuilder::createDispatchFiniFunction ( unsigned IVSize,
bool IVSigned )

Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 8085 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), and M.

◆ createDispatchInitFunction()

FunctionCallee OpenMPIRBuilder::createDispatchInitFunction ( unsigned IVSize,
bool IVSigned )

Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 8059 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), and M.

◆ createDispatchNextFunction()

FunctionCallee OpenMPIRBuilder::createDispatchNextFunction ( unsigned IVSize,
bool IVSigned )

Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 8072 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), and M.

◆ createDistribute()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createDistribute ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
BodyGenCallbackTy BodyGenCB )

Generator for #omp distribute

Parameters
LocThe location where the distribute construct was encountered.
AllocaIPThe insertion points to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.

Definition at line 10769 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::BasicBlock::begin(), Builder, Config, llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::splitBB(), and updateToLocation().

◆ createFlush()

void OpenMPIRBuilder::createFlush ( const LocationDescription & Loc)

Generator for 'omp flush'.

Parameters
LocThe location where the flush directive was encountered

Definition at line 1918 of file OMPIRBuilder.cpp.

References emitFlush(), and updateToLocation().

◆ createForStaticInitFunction()

FunctionCallee OpenMPIRBuilder::createForStaticInitFunction ( unsigned IVSize,
bool IVSigned,
bool IsGPUDistribute )

Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned.

Will create a distribute call __kmpc_distribute_static_init* if IsGPUDistribute is set.

Definition at line 8039 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), and M.

◆ createGlobalFlag()

GlobalValue * OpenMPIRBuilder::createGlobalFlag ( unsigned Value,
StringRef Name )

Create a hidden global flag Name in the module with initial value Value.

Definition at line 955 of file OMPIRBuilder.cpp.

References llvm::Type::getInt32Ty(), llvm::GlobalValue::HiddenVisibility, M, and llvm::GlobalValue::WeakODRLinkage.

◆ createLoopSkeleton()

CanonicalLoopInfo * OpenMPIRBuilder::createLoopSkeleton ( DebugLoc DL,
Value * TripCount,
Function * F,
BasicBlock * PreInsertBefore,
BasicBlock * PostInsertBefore,
const Twine & Name = {} )

Create the control flow structure of a canonical OpenMP loop.

The emitted loop will be disconnected, i.e. no edge to the loop's preheader and no terminator in the AfterBB. The OpenMPIRBuilder's IRBuilder location is not preserved.

Parameters
DLDebugLoc used for the instructions in the skeleton.
TripCountValue to be used for the trip count.
FFunction in which to insert the BasicBlocks.
PreInsertBeforeWhere to insert BBs that execute before the body, typically the body itself.
PostInsertBeforeWhere to insert BBs that execute after the body.
NameBase name used to derive BB and instruction names.
Returns
The CanonicalLoopInfo that represents the emitted loop.

Definition at line 5048 of file OMPIRBuilder.cpp.

References llvm::PHINode::addIncoming(), llvm::CanonicalLoopInfo::assertOK(), Builder, Cond, llvm::BasicBlock::Create(), DL, F, llvm::Value::getType(), LoopInfos, M, and llvm::Next.

Referenced by collapseLoops(), createCanonicalLoop(), and tileLoops().

◆ createMapperAllocas()

void OpenMPIRBuilder::createMapperAllocas ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
unsigned NumOperands,
struct MapperAllocas & MapperAllocas )

◆ createMasked()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createMasked ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB,
Value * Filter )

Generator for 'omp masked'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finialize variable copies.
Returns
The insertion position after the masked.

Definition at line 4690 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), llvm::Filter, getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and updateToLocation().

Referenced by emitScanReduction().

◆ createMaster()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createMaster ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB )

Generator for 'omp master'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
Returns
The insertion position after the master.

Definition at line 4666 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and updateToLocation().

◆ createOffloadEntriesAndInfoMetadata()

◆ createOffloadEntry()

void OpenMPIRBuilder::createOffloadEntry ( Constant * ID,
Constant * Addr,
uint64_t Size,
int32_t Flags,
GlobalValue::LinkageTypes ,
StringRef Name = "" )

Creates offloading entry for the provided entry ID ID, address Addr, size Size, and flags Flags.

Definition at line 10864 of file OMPIRBuilder.cpp.

References llvm::Function::addFnAttr(), Config, llvm::dyn_cast(), llvm::offloading::emitOffloadingEntry(), llvm::Value::getName(), M, llvm::object::OFK_OpenMP, Size, and T.

Referenced by createOffloadEntriesAndInfoMetadata().

◆ createOffloadMapnames()

GlobalVariable * OpenMPIRBuilder::createOffloadMapnames ( SmallVectorImpl< llvm::Constant * > & Names,
std::string VarName )

◆ createOffloadMaptypes()

GlobalVariable * OpenMPIRBuilder::createOffloadMaptypes ( SmallVectorImpl< uint64_t > & Mappings,
std::string VarName )

Create the global variable holding the offload mappings information.

Definition at line 9261 of file OMPIRBuilder.cpp.

References llvm::ConstantDataArray::get(), llvm::Value::getType(), llvm::GlobalValue::Global, M, and llvm::GlobalValue::PrivateLinkage.

Referenced by emitOffloadingArrays().

◆ createOMPAlloc()

CallInst * OpenMPIRBuilder::createOMPAlloc ( const LocationDescription & Loc,
Value * Size,
Value * Allocator,
std::string Name = "" )

Create a runtime call for kmpc_Alloc.

Parameters
LocThe insert and source location description.
SizeSize of allocated memory space
AllocatorAllocator information instruction
NameName of call Instruction for OMP_alloc
Returns
CallInst to the OMP_Alloc call

Definition at line 7373 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Size, and updateToLocation().

◆ createOMPFree()

CallInst * OpenMPIRBuilder::createOMPFree ( const LocationDescription & Loc,
Value * Addr,
Value * Allocator,
std::string Name = "" )

Create a runtime call for kmpc_free.

Parameters
LocThe insert and source location description.
AddrAddress of memory space to be freed
AllocatorAllocator information instruction
NameName of call Instruction for OMP_Free
Returns
CallInst to the OMP_Free call

Definition at line 7390 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and updateToLocation().

◆ createOMPInteropDestroy()

CallInst * OpenMPIRBuilder::createOMPInteropDestroy ( const LocationDescription & Loc,
Value * InteropVar,
Value * Device,
Value * NumDependences,
Value * DependenceAddress,
bool HaveNowaitClause )

Create a runtime call for __tgt_interop_destroy.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_destroy call

Definition at line 7434 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), M, and updateToLocation().

◆ createOMPInteropInit()

CallInst * OpenMPIRBuilder::createOMPInteropInit ( const LocationDescription & Loc,
Value * InteropVar,
omp::OMPInteropType InteropType,
Value * Device,
Value * NumDependences,
Value * DependenceAddress,
bool HaveNowaitClause )

Create a runtime call for __tgt_interop_init.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
InteropTypetype of interop operation
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_init call

Definition at line 7405 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), M, and updateToLocation().

◆ createOMPInteropUse()

CallInst * OpenMPIRBuilder::createOMPInteropUse ( const LocationDescription & Loc,
Value * InteropVar,
Value * Device,
Value * NumDependences,
Value * DependenceAddress,
bool HaveNowaitClause )

Create a runtime call for __tgt_interop_use.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_use call

Definition at line 7461 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), M, and updateToLocation().

◆ createOrderedDepend()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createOrderedDepend ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
unsigned NumLoops,
ArrayRef< llvm::Value * > StoreValues,
const Twine & Name,
bool IsDependSource )

Generator for 'omp ordered depend (source | sink)'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
NumLoopsThe number of loops in depend clause.
StoreValuesThe value will be stored in vector address.
NameThe name of alloca instruction.
IsDependSourceIf true, depend source; otherwise, depend sink.
Returns
The insertion position after the ordered.

Definition at line 7135 of file OMPIRBuilder.cpp.

References llvm::all_of(), assert(), Builder, createRuntimeFunctionCall(), llvm::ArrayType::get(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getType(), I, llvm::Type::isIntegerTy(), llvm::AllocaInst::setAlignment(), llvm::StoreInst::setAlignment(), and updateToLocation().

◆ createOrderedThreadsSimd()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createOrderedThreadsSimd ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB,
bool IsThreads )

Generator for 'omp ordered [threads | simd]'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
IsThreadsIf true, with threads clause or without clause; otherwise, with simd clause;
Returns
The insertion position after the ordered.

Definition at line 7181 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and updateToLocation().

◆ createParallel()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createParallel ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
BodyGenCallbackTy BodyGenCB,
PrivatizeCallbackTy PrivCB,
FinalizeCallbackTy FiniCB,
Value * IfCondition,
Value * NumThreads,
omp::ProcBindKind ProcBind,
bool IsCancellable )

Generator for 'omp parallel'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion points to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.
PrivCBCallback to copy a given variable (think copy constructor).
FiniCBCallback to finalize variable copies.
IfConditionThe evaluated 'if' clause expression, if any.
NumThreadsThe evaluated 'num_threads' clause expression, if any.
ProcBindThe value of the 'proc_bind' clause (see ProcBindKind).
IsCancellableFlag to indicate a cancellable parallel region.
Returns
The insertion position after the parallel.

Definition at line 1574 of file OMPIRBuilder.cpp.

References addOutlineInfo(), assert(), llvm::BasicBlock::begin(), Builder, llvm::OpenMPIRBuilder::OutlineInfo::collectBlocks(), Config, llvm::SmallPtrSetImpl< PtrType >::count(), createRuntimeFunctionCall(), llvm::dbgs(), llvm::dyn_cast(), llvm::dyn_cast_if_present(), llvm::SetVector< T, Vector, Set, N >::empty(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, FinalizationStack, llvm::OpenMPIRBuilder::OutlineInfo::FixUpNonEntryAllocas, llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::CallBase::getCalledFunction(), llvm::FunctionCallee::getCallee(), llvm::BasicBlock::getFirstInsertionPt(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::BasicBlock::getTerminator(), hostParallelCallback(), I, llvm::Instruction::insertAfter(), isConflictIP(), LLVM_DEBUG, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::SetVector< T, Vector, Set, N >::remove_if(), llvm::BasicBlock::splitBasicBlock(), llvm::Error::success(), llvm::Expected< T >::takeError(), targetParallelCallback(), updateToLocation(), and Uses.

◆ createPlatformSpecificName()

std::string OpenMPIRBuilder::createPlatformSpecificName ( ArrayRef< StringRef > Parts) const

Get the create a name using the platform specific separators.

Parameters
Partsparts of the final name that needs separation The created name has a first separator between the first and second part and a second separator between all other parts. E.g. with FirstSeparator "$" and Separator "." and parts: "p1", "p2", "p3", "p4" The resulting name is "p1$p2.p3.p4" The separators are retrieved from the OpenMPIRBuilderConfig.

Definition at line 9202 of file OMPIRBuilder.cpp.

References Config.

Referenced by emitOffloadingArrays(), emitTargetRegionFunction(), and registerTargetGlobalVariable().

◆ createReductions()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createReductions ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
ArrayRef< ReductionInfo > ReductionInfos,
ArrayRef< bool > IsByRef,
bool IsNoWait = false,
bool IsTeamsReduction = false )

Generator for 'omp reduction'.

Emits the IR instructing the runtime to perform the specific kind of reductions. Expects reduction variables to have been privatized and initialized to reduction-neutral values separately. Emits the calls to runtime functions as well as the reduction function and the basic blocks performing the reduction atomically and non-atomically.

The code emitted for the following:

type var_1;
type var_2;
#pragma omp <directive> reduction(reduction-op:var_1,var_2)
/* body */;

corresponds to the following sketch.

void _outlined_par() {
// N is the number of different reductions.
void *red_array[] = {privatized_var_1, privatized_var_2, ...};
switch(__kmpc_reduce(..., N, /*size of data in red array*/, red_array,
_omp_reduction_func,
_gomp_critical_user.reduction.var)) {
case 1: {
var_1 = var_1 <reduction-op> privatized_var_1;
var_2 = var_2 <reduction-op> privatized_var_2;
// ...
__kmpc_end_reduce(...);
break;
}
case 2: {
_Atomic<ReductionOp>(var_1, privatized_var_1);
_Atomic<ReductionOp>(var_2, privatized_var_2);
// ...
break;
}
default: break;
}
}
void _omp_reduction_func(void **lhs, void **rhs) {
*(type *)lhs[0] = *(type *)lhs[0] <reduction-op> *(type *)rhs[0];
*(type *)lhs[1] = *(type *)lhs[1] <reduction-op> *(type *)rhs[1];
// ...
}
#define op(i)
Straight line strength reduction
#define N
volatile _Atomic(int32_t)
Atomic access abstraction (since MSVC does not do C11 yet)
Definition rpmalloc.c:333
Parameters
LocThe location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables.
AllocaIPAn insertion point suitable for allocas usable in reductions.
ReductionInfosA list of info on each reduction variable.
IsNoWaitA flag set if the reduction is marked as nowait.
IsByRefA flag set if the reduction is using reference or direct value.
IsTeamsReductionOptional flag set if it is a teams reduction.

Definition at line 4513 of file OMPIRBuilder.cpp.

References llvm::all_of(), assert(), llvm::OpenMPIRBuilder::ReductionInfo::AtomicReductionGen, Builder, checkReductionInfos(), Config, llvm::BasicBlock::Create(), createReductionsGPU(), createRuntimeFunctionCall(), DL, llvm::OpenMPIRBuilder::ReductionInfo::ElementType, llvm::BasicBlock::end(), llvm::enumerate(), llvm::Instruction::eraseFromParent(), llvm::ArrayType::get(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), getFreshReductionFunc(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getTerminator(), M, llvm::none_of(), P, populateReductionFunction(), llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGen, llvm::ArrayRef< T >::size(), llvm::BasicBlock::splitBasicBlock(), llvm::Expected< T >::takeError(), updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.

◆ createReductionsGPU()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createReductionsGPU ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
InsertPointTy CodeGenIP,
ArrayRef< ReductionInfo > ReductionInfos,
ArrayRef< bool > IsByRef,
bool IsNoWait = false,
bool IsTeamsReduction = false,
ReductionGenCBKind ReductionGenCBKind = ReductionGenCBKind::MLIR,
std::optional< omp::GV > GridValue = {},
unsigned ReductionBufNum = 1024,
Value * SrcLocInfo = nullptr )

Design of OpenMP reductions on the GPU.

Consider a typical OpenMP program with one or more reduction clauses:

float foo; double bar; #pragma omp target teams distribute parallel for \ reduction(+:foo) reduction(*:bar) for (int i = 0; i < N; i++) { foo += A[i]; bar *= B[i]; }

where 'foo' and 'bar' are reduced across all OpenMP threads in all teams. In our OpenMP implementation on the NVPTX device an OpenMP team is mapped to a CUDA threadblock and OpenMP threads within a team are mapped to CUDA threads within a threadblock. Our goal is to efficiently aggregate values across all OpenMP threads such that:

  • the compiler and runtime are logically concise, and
  • the reduction is performed efficiently in a hierarchical manner as follows: within OpenMP threads in the same warp, across warps in a threadblock, and finally across teams on the NVPTX device.

Introduction to Decoupling

We would like to decouple the compiler and the runtime so that the latter is ignorant of the reduction variables (number, data types) and the reduction operators. This allows a simpler interface and implementation while still attaining good performance.

Pseudocode for the aforementioned OpenMP program generated by the compiler is as follows:

  1. Create private copies of reduction variables on each OpenMP thread: 'foo_private', 'bar_private'
  2. Each OpenMP thread reduces the chunk of 'A' and 'B' assigned to it and writes the result in 'foo_private' and 'bar_private' respectively.
  3. Call the OpenMP runtime on the GPU to reduce within a team and store the result on the team master:

    __kmpc_nvptx_parallel_reduce_nowait_v2(..., reduceData, shuffleReduceFn, interWarpCpyFn)

    where: struct ReduceData { double *foo; double *bar; } reduceData reduceData.foo = &foo_private reduceData.bar = &bar_private

    'shuffleReduceFn' and 'interWarpCpyFn' are pointers to two auxiliary functions generated by the compiler that operate on variables of type 'ReduceData'. They aid the runtime perform algorithmic steps in a data agnostic manner.

    'shuffleReduceFn' is a pointer to a function that reduces data of type 'ReduceData' across two OpenMP threads (lanes) in the same warp. It takes the following arguments as input:

    a. variable of type 'ReduceData' on the calling lane, b. its lane_id, c. an offset relative to the current lane_id to generate a remote_lane_id. The remote lane contains the second variable of type 'ReduceData' that is to be reduced. d. an algorithm version parameter determining which reduction algorithm to use.

    'shuffleReduceFn' retrieves data from the remote lane using efficient GPU shuffle intrinsics and reduces, using the algorithm specified by the 4th parameter, the two operands element-wise. The result is written to the first operand.

    Different reduction algorithms are implemented in different runtime functions, all calling 'shuffleReduceFn' to perform the essential reduction step. Therefore, based on the 4th parameter, this function behaves slightly differently to cooperate with the runtime to ensure correctness under different circumstances.

    'InterWarpCpyFn' is a pointer to a function that transfers reduced variables across warps. It tunnels, through CUDA shared memory, the thread-private data of type 'ReduceData' from lane 0 of each warp to a lane in the first warp.

  4. Call the OpenMP runtime on the GPU to reduce across teams. The last team writes the global reduced value to memory.

    ret = __kmpc_nvptx_teams_reduce_nowait(..., reduceData, shuffleReduceFn, interWarpCpyFn, scratchpadCopyFn, loadAndReduceFn)

    'scratchpadCopyFn' is a helper that stores reduced data from the team master to a scratchpad array in global memory.

    'loadAndReduceFn' is a helper that loads data from the scratchpad array and reduces it with the input operand.

    These compiler generated functions hide address calculation and alignment information from the runtime.

  5. if ret == 1: The team master of the last team stores the reduced result to the globals in memory. foo += reduceData.foo; bar *= reduceData.bar

Warp Reduction Algorithms

On the warp level, we have three algorithms implemented in the OpenMP runtime depending on the number of active lanes:

Full Warp Reduction

The reduce algorithm within a warp where all lanes are active is implemented in the runtime as follows:

full_warp_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { for (int offset = WARPSIZE/2; offset > 0; offset /= 2) ShuffleReduceFn(reduce_data, 0, offset, 0); }

The algorithm completes in log(2, WARPSIZE) steps.

'ShuffleReduceFn' is used here with lane_id set to 0 because it is not used therefore we save instructions by not retrieving lane_id from the corresponding special registers. The 4th parameter, which represents the version of the algorithm being used, is set to 0 to signify full warp reduction.

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

#reduce_elem refers to an element in the local lane's data structure #remote_elem is retrieved from a remote lane remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); reduce_elem = reduce_elem REDUCE_OP remote_elem;

Contiguous Partial Warp Reduction

This reduce algorithm is used within a warp where only the first 'n' (n <= WARPSIZE) lanes are active. It is typically used when the number of OpenMP threads in a parallel region is not a multiple of WARPSIZE. The algorithm is implemented in the runtime as follows:

void contiguous_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn, int size, int lane_id) { int curr_size; int offset; curr_size = size; mask = curr_size/2; while (offset>0) { ShuffleReduceFn(reduce_data, lane_id, offset, 1); curr_size = (curr_size+1)/2; offset = curr_size/2; } }

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id < offset) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem

This algorithm assumes that the data to be reduced are located in a contiguous subset of lanes starting from the first. When there is an odd number of active lanes, the data in the last lane is not aggregated with any other lane's dat but is instead copied over.

Dispersed Partial Warp Reduction

This algorithm is used within a warp when any discontiguous subset of lanes are active. It is used to implement the reduction operation across lanes in an OpenMP simd region or in a nested parallel region.

void dispersed_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { int size, remote_id; int logical_lane_id = number_of_active_lanes_before_me() * 2; do { remote_id = next_active_lane_id_right_after_me();

the above function returns 0 of no active lane

is present right after the current lane.

size = number_of_active_lanes_in_this_warp(); logical_lane_id /= 2; ShuffleReduceFn(reduce_data, logical_lane_id, remote_id-1-threadIdx.x, 2); } while (logical_lane_id % 2 == 0 && size > 1); }

There is no assumption made about the initial state of the reduction. Any number of lanes (>=1) could be active at any position. The reduction result is returned in the first active lane.

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id % 2 == 0 && offset > 0) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem

Intra-Team Reduction

This function, as implemented in the runtime call '__kmpc_nvptx_parallel_reduce_nowait_v2', aggregates data across OpenMP threads in a team. It first reduces within a warp using the aforementioned algorithms. We then proceed to gather all such reduced values at the first warp.

The runtime makes use of the function 'InterWarpCpyFn', which copies data from each of the "warp master" (zeroth lane of each warp, where warp-reduced data is held) to the zeroth warp. This step reduces (in a mathematical sense) the problem of reduction across warp masters in a block to the problem of warp reduction.

Inter-Team Reduction

Once a team has reduced its data to a single value, it is stored in a global scratchpad array. Since each team has a distinct slot, this can be done without locking.

The last team to write to the scratchpad array proceeds to reduce the scratchpad array. One or more workers in the last team use the helper 'loadAndReduceDataFn' to load and reduce values from the array, i.e., the k'th worker reduces every k'th element.

Finally, a call is made to '__kmpc_nvptx_parallel_reduce_nowait_v2' to reduce across workers and compute a globally reduced value.

Parameters
LocThe location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables.
AllocaIPAn insertion point suitable for allocas usable in reductions.
CodeGenIPAn insertion point suitable for code generation.
ReductionInfosA list of info on each reduction variable.
IsNoWaitOptional flag set if the reduction is marked as nowait.
IsByRefFor each reduction clause, whether the reduction is by-ref.
IsTeamsReductionOptional flag set if it is a teams reduction.
GridValueOptional GPU grid value.
ReductionBufNumOptional OpenMPCUDAReductionBufNumValue to be used for teams reduction.
SrcLocInfoSource location information global.

Definition at line 4189 of file OMPIRBuilder.cpp.

References Builder, llvm::cast(), checkReductionInfos(), Clang, Cond, Config, llvm::BasicBlock::Create(), llvm::StructType::create(), createRuntimeFunctionCall(), llvm::OpenMPIRBuilder::ReductionInfo::ElementType, emitBlock(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::ArrayRef< T >::empty(), llvm::BasicBlock::end(), llvm::enumerate(), llvm::Instruction::eraseFromParent(), llvm::ArrayType::get(), llvm::PointerType::get(), llvm::Function::getAttributes(), getGridValue(), llvm::Value::getName(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getTerminator(), M, llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGen, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGenClang, llvm::Value::replaceUsesWithIf(), Size, llvm::ArrayRef< T >::size(), llvm::BasicBlock::splitBasicBlock(), T, llvm::Expected< T >::takeError(), updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.

Referenced by createReductions().

◆ createRuntimeFunctionCall()

◆ createScan()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createScan ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
ArrayRef< llvm::Value * > ScanVars,
ArrayRef< llvm::Type * > ScanVarsType,
bool IsInclusive,
ScanInfo * ScanRedInfo )

This directive split and directs the control flow to input phase blocks or scan phase blocks based on 1.

whether input loop or scan loop is executed, 2. whether exclusive or inclusive scan is used.

Parameters
LocThe insert and source location description.
AllocaIPThe IP where the temporary buffer for scan reduction
ScanVarsScan Variables.
IsInclusiveWhether it is an inclusive or exclusive scan.
ScanRedInfoPointer to the ScanInfo objected created using ScanInfoInitialize.
Returns
The insertion position after the scan.

Definition at line 4732 of file OMPIRBuilder.cpp.

References Builder, emitBlock(), IV, llvm::ScanInfo::IV, llvm::ScanInfo::OMPAfterScanBlock, llvm::ScanInfo::OMPBeforeScanBlock, llvm::ScanInfo::OMPFirstScanLoop, llvm::ScanInfo::OMPScanDispatch, llvm::ScanInfo::OMPScanLoopExit, llvm::ScanInfo::ScanBuffPtrs, llvm::ArrayRef< T >::size(), and updateToLocation().

◆ createSection()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSection ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB )

Generator for 'omp section'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region body code.
FiniCBCallback to finalize variable copies.
Returns
The insertion position after the section.

Definition at line 2782 of file OMPIRBuilder.cpp.

References Builder, I, and updateToLocation().

◆ createSections()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
ArrayRef< StorableBodyGenCallbackTy > SectionCBs,
PrivatizeCallbackTy PrivCB,
FinalizeCallbackTy FiniCB,
bool IsCancellable,
bool IsNowait )

Generator for 'omp sections'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion points to be used for alloca instructions.
SectionCBsCallbacks that will generate body of each section.
PrivCBCallback to copy a given variable (think copy constructor).
FiniCBCallback to finalize variable copies.
IsCancellableFlag to indicate a cancellable parallel region.
IsNowaitIf true, barrier - to ensure all sections are executed before moving forward will not be generated.
Returns
The insertion position after the sections.

Definition at line 2700 of file OMPIRBuilder.cpp.

References llvm::SwitchInst::addCase(), assert(), Builder, llvm::Continue, llvm::BasicBlock::Create(), createCanonicalLoop(), FinalizationStack, llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Type::getInt32Ty(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::BasicBlock::getSinglePredecessor(), isConflictIP(), M, llvm::ArrayRef< T >::size(), llvm::splitBBWithSuffix(), llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createSingle()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSingle ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
FinalizeCallbackTy FiniCB,
bool IsNowait,
ArrayRef< llvm::Value * > CPVars = {},
ArrayRef< llvm::Function * > CPFuncs = {} )

Generator for 'omp single'.

Parameters
LocThe source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
IsNowaitIf false, a barrier is emitted.
CPVarscopyprivate variables.
CPFuncscopy functions to use for each copyprivate variable.
Returns
The insertion position after the single call.

Definition at line 7025 of file OMPIRBuilder.cpp.

References Builder, createBarrier(), createCopyPrivate(), createRuntimeFunctionCall(), llvm::ArrayRef< T >::empty(), llvm::Type::getInt32Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), I, llvm::ArrayRef< T >::size(), llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createTarget()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTarget ( const LocationDescription & Loc,
bool IsOffloadEntry,
OpenMPIRBuilder::InsertPointTy AllocaIP,
OpenMPIRBuilder::InsertPointTy CodeGenIP,
TargetDataInfo & Info,
TargetRegionEntryInfo & EntryInfo,
const TargetKernelDefaultAttrs & DefaultAttrs,
const TargetKernelRuntimeAttrs & RuntimeAttrs,
Value * IfCond,
SmallVectorImpl< Value * > & Inputs,
GenMapInfoCallbackTy GenMapInfoCB,
OpenMPIRBuilder::TargetBodyGenCallbackTy CBFunc,
OpenMPIRBuilder::TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB,
CustomMapperCallbackTy CustomMapperCB,
const SmallVector< DependData > & Dependencies,
bool HasNowait = false,
Value * DynCGroupMem = nullptr,
omp::OMPDynGroupprivateFallbackType DynCGroupMemFallback = omp::OMPDynGroupprivateFallbackType::Abort )

Generator for 'omp target'.

Parameters
Locwhere the target data construct was encountered.
IsOffloadEntrywhether it is an offload entry.
CodeGenIPThe insertion point where the call to the outlined function should be emitted.
InfoStores all information realted to the Target directive.
EntryInfoThe entry information about the function.
DefaultAttrsStructure containing the default attributes, including numbers of threads and teams to launch the kernel with.
RuntimeAttrsStructure containing the runtime numbers of threads and teams to launch the kernel with.
IfCondvalue of the if clause.
InputsThe input values to the region that will be passed. as arguments to the outlined function.
BodyGenCBCallback that will generate the region code.
ArgAccessorFuncCBCallback that will generate accessors instructions for passed in target arguments where neccessary
CustomMapperCBCallback to generate code related to custom mappers.
DependenciesA vector of DependData objects that carry dependency information as passed in the depend clause
HasNowaitWhether the target construct has a nowait clause or not.
DynCGroupMemThe size of the dynamic groupprivate memory for each cgroup.
DynCGroupMemThe fallback mechanism to execute if the requested cgroup memory cannot be provided.

Definition at line 9149 of file OMPIRBuilder.cpp.

References Builder, Config, emitTargetCall(), emitTargetOutlinedFunction(), and updateToLocation().

◆ createTargetData()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTargetData ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
InsertPointTy CodeGenIP,
Value * DeviceID,
Value * IfCond,
TargetDataInfo & Info,
GenMapInfoCallbackTy GenMapInfoCB,
CustomMapperCallbackTy CustomMapperCB,
omp::RuntimeFunction * MapperFunc = nullptr,
function_ref< InsertPointOrErrorTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)> BodyGenCB = nullptr,
function_ref< void(unsigned int, Value *)> DeviceAddrCB = nullptr,
Value * SrcLocInfo = nullptr )

Generator for 'omp target data'.

Parameters
LocThe location where the target data construct was encountered.
AllocaIPThe insertion points to be used for alloca instructions.
CodeGenIPThe insertion point at which the target directive code should be placed.
IsBeginIf true then emits begin mapper call otherwise emits end mapper call.
DeviceIDStores the DeviceID from the device clause.
IfCondValue which corresponds to the if clause condition.
InfoStores all information realted to the Target Data directive.
GenMapInfoCBCallback that populates the MapInfos and returns.
CustomMapperCBCallback to generate code related to custom mappers.
BodyGenCBOptional Callback to generate the region code.
DeviceAddrCBOptional callback to generate code related to use_device_ptr and use_device_addr.

Definition at line 7841 of file OMPIRBuilder.cpp.

References llvm::SmallVectorImpl< T >::append(), assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, Builder, llvm::cantFail(), Config, llvm::BasicBlock::Create(), createRuntimeFunctionCall(), DupNoPriv, emitBlock(), emitIfClause(), emitOffloadingArrays(), emitOffloadingArraysArgument(), emitTargetTask(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::Constant::getNullValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::isa(), llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::MapInfosTy::Names, NoPriv, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, Priv, restoreIPandDebugLoc(), llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray, llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createTargetDeinit()

void OpenMPIRBuilder::createTargetDeinit ( const LocationDescription & Loc,
int32_t TeamsReductionDataSize = 0,
int32_t TeamsReductionBufferLength = 1024 )

Create a runtime call for kmpc_target_deinit.

Parameters
LocThe insert and source location description.
TeamsReductionDataSizeThe maximal size of all the reduction data for teams reduction.
TeamsReductionBufferLengthThe number of elements (each of up to TeamsReductionDataSize size), in the teams reduction buffer.

Definition at line 7656 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::ConstantFoldInsertValueInstruction(), createRuntimeFunctionCall(), llvm::StringRef::drop_back(), llvm::StringRef::ends_with(), getOrCreateRuntimeFunctionPtr(), M, and updateToLocation().

Referenced by createOutlinedFunction().

◆ createTargetInit()

◆ createTask()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTask ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
BodyGenCallbackTy BodyGenCB,
bool Tied = true,
Value * Final = nullptr,
Value * IfCondition = nullptr,
SmallVector< DependData > Dependencies = {},
bool Mergeable = false,
Value * EventHandle = nullptr,
Value * Priority = nullptr )

Generator for #omp taskloop

Parameters
LocThe location where the taskloop construct was encountered.
AllocaIPThe insertion point to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.
LoopInfoCallback that return the CLI
LBValLowerbound value of loop
UBValUpperbound value of loop
StepValStep value of loop
UntiedTrue if the task is untied, false if the task is tied.
IfCondi1 value. If it evaluates to false, an undeferred task is generated, and the encountering thread must suspend the current task region, for which execution cannot be resumed until execution of the structured block that is associated with the generated task is completed.
GrainSizeValue of the GrainSize/Num of Tasks if present
NoGroupFalse if NoGroup is defined, true if not
SchedIf Grainsize is defined, Sched is 1. Num Tasks, Shed is 2. Otherwise Sched is 0
Finali1 value which is true if the task is final, false if the task is not final.
MergeableIf the given task is mergeable
Priority‘priority-value’ specifies the execution order of the / tasks that is generated by the construct /
DupCBThe callback to generate the duplication code. See / documentation for TaskDupCallbackTy. This can be nullptr. /
TaskContextStructPtrValIf non-null, a pointer to to be placed / immediately after the {lower bound, upper / bound, step} values in the task data. LLVM_ABI InsertPointOrErrorTy createTaskloop( const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, llvm::function_ref<llvm::Expected<llvm::CanonicalLoopInfo *>()> LoopInfo, Value *LBVal, Value *UBVal, Value *StepVal, bool Untied = false, Value *IfCond = nullptr, Value *GrainSize = nullptr, bool NoGroup = false, int Sched = 0, Value *Final = nullptr, bool Mergeable = false, Value *Priority = nullptr, TaskDupCallbackTy DupCB = nullptr, Value *TaskContextStructPtrVal = nullptr);

/ Generator for #omp task / /

Parameters
LocThe location where the task construct was encountered. /
AllocaIPThe insertion point to be used for alloca instructions. /
BodyGenCBCallback that will generate the region code. /
TiedTrue if the task is tied, false if the task is untied. /
Finali1 value which is true if the task is final, false if the / task is not final. /
IfConditioni1 value. If it evaluates to false, an undeferred / task is generated, and the encountering thread must / suspend the current task region, for which execution / cannot be resumed until execution of the structured / block that is associated with the generated task is / completed. /
EventHandleIf present, signifies the event handle as part of / the detach clause /
MergeableIf the given task is mergeable /
priority‘priority-value’ specifies the execution order of the tasks that is generated by the construct

Definition at line 2406 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, llvm::cast(), createFakeIntVal(), createRuntimeFunctionCall(), llvm::divideCeil(), llvm::dyn_cast(), emitTaskDependencies(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::ConstantPointerNull::get(), llvm::StructType::get(), llvm::AllocaInst::getAllocatedType(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Instruction::getDebugLoc(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getPointerAlignment(), llvm::PointerType::getUnqual(), I, llvm::Int32Ty, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::reverse(), llvm::Instruction::setDebugLoc(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::splitBB(), llvm::SplitBlockAndInsertIfThenElse(), and updateToLocation().

◆ createTaskgroup()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTaskgroup ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
BodyGenCallbackTy BodyGenCB )

Generator for the taskgroup construct.

Parameters
LocThe location where the taskgroup construct was encountered.
AllocaIPThe insertion point to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.

Definition at line 2671 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::splitBB(), and updateToLocation().

◆ createTaskwait()

void OpenMPIRBuilder::createTaskwait ( const LocationDescription & Loc)

Generator for 'omp taskwait'.

Parameters
LocThe location where the taskwait directive was encountered.

Definition at line 1937 of file OMPIRBuilder.cpp.

References emitTaskwaitImpl(), and updateToLocation().

◆ createTaskyield()

void OpenMPIRBuilder::createTaskyield ( const LocationDescription & Loc)

Generator for 'omp taskyield'.

Parameters
LocThe location where the taskyield directive was encountered.

Definition at line 1955 of file OMPIRBuilder.cpp.

References emitTaskyieldImpl(), and updateToLocation().

◆ createTeams()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTeams ( const LocationDescription & Loc,
BodyGenCallbackTy BodyGenCB,
Value * NumTeamsLower = nullptr,
Value * NumTeamsUpper = nullptr,
Value * ThreadLimit = nullptr,
Value * IfExpr = nullptr )

Generator for #omp teams

Parameters
LocThe location where the teams construct was encountered.
BodyGenCBCallback that will generate the region code.
NumTeamsLowerLower bound on number of teams. If this is nullptr, it is as if lower bound is specified as equal to upperbound. If this is non-null, then upperbound must also be non-null.
NumTeamsUpperUpper bound on the number of teams.
ThreadLimiton the number of threads that may participate in a contention group created by each team.
IfExpris the integer argument value of the if condition on the teams clause.

Definition at line 10623 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::CallBase::arg_size(), llvm::Function::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, llvm::cast(), Config, createFakeIntVal(), createRuntimeFunctionCall(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::Function::getArg(), llvm::CallBase::getArgOperand(), llvm::Function::getEntryBlock(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getType(), llvm::Value::hasOneUse(), I, Int1, llvm::Type::isIntegerTy(), llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::reverse(), llvm::Value::setName(), llvm::splitBB(), updateToLocation(), and llvm::Value::user_back().

◆ emitBlock()

◆ emitBranch()

void OpenMPIRBuilder::emitBranch ( BasicBlock * Target)

Definition at line 9935 of file OMPIRBuilder.cpp.

References Builder, and llvm::BasicBlock::getTerminator().

Referenced by emitBlock(), emitIfClause(), and emitKernelLaunch().

◆ emitCancelationCheckImpl()

Error OpenMPIRBuilder::emitCancelationCheckImpl ( Value * CancelFlag,
omp::Directive CanceledDirective )

Generate control flow and cleanup for cancellation.

Parameters
CancelFlagFlag indicating if the cancellation is performed.
CanceledDirectiveThe kind of directive that is cancled.
ExitCBExtra code to be generated in the exit block.
Returns
an error, if any were triggered during execution.

Definition at line 1361 of file OMPIRBuilder.cpp.

References assert(), llvm::BasicBlock::begin(), Builder, llvm::BasicBlock::Create(), llvm::BasicBlock::end(), llvm::Instruction::eraseFromParent(), FinalizationStack, llvm::BasicBlock::getContext(), llvm::Value::getName(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), isLastFinalizationInfoCancellable(), llvm::SplitBlock(), llvm::Error::success(), and llvm::Expected< T >::takeError().

Referenced by createBarrier(), createCancel(), and createCancellationPoint().

◆ emitFlush()

void OpenMPIRBuilder::emitFlush ( const LocationDescription & Loc)

Generate a flush runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1908 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), and getOrCreateSrcLocStr().

Referenced by createFlush().

◆ emitIfClause()

Error OpenMPIRBuilder::emitIfClause ( Value * Cond,
BodyGenCallbackTy ThenGen,
BodyGenCallbackTy ElseGen,
InsertPointTy AllocaIP = {} )

Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }.

Returns
an error, if any were triggered during execution.

Definition at line 9970 of file OMPIRBuilder.cpp.

References Builder, Cond, llvm::BasicBlock::Create(), llvm::dyn_cast(), emitBlock(), emitBranch(), M, and llvm::Error::success().

Referenced by createTargetData(), and emitTargetCall().

◆ emitKernelExecutionMode()

GlobalVariable * OpenMPIRBuilder::emitKernelExecutionMode ( StringRef KernelName,
omp::OMPTgtExecModeFlags Mode )

Emit the kernel execution mode.

Definition at line 988 of file OMPIRBuilder.cpp.

References Builder, M, llvm::GlobalValue::ProtectedVisibility, and llvm::GlobalValue::WeakAnyLinkage.

Referenced by createOutlinedFunction().

◆ emitKernelLaunch()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::emitKernelLaunch ( const LocationDescription & Loc,
Value * OutlinedFnID,
EmitFallbackCallbackTy EmitTargetCallFallbackCB,
TargetKernelArgs & Args,
Value * DeviceID,
Value * RTLoc,
InsertPointTy AllocaIP )

Generate a target region entry call and host fallback call.

Parameters
LocThe location at which the request originated and is fulfilled.
OutlinedFnIDThe ooulined function ID.
EmitTargetCallFallbackCBCall back function to generate host fallback code.
ArgsData structure holding information about the kernel arguments.
DeviceIDIdentifier for the device via the 'device' clause.
RTLocSource location identifier
AllocaIPThe insertion point to be used for alloca instructions.

Definition at line 1298 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::BasicBlock::Create(), emitBlock(), emitBranch(), emitTargetKernel(), llvm::Failed(), getKernelArgsVector(), llvm::Expected< T >::takeError(), and updateToLocation().

Referenced by emitTargetCall().

◆ emitMapperCall()

void OpenMPIRBuilder::emitMapperCall ( const LocationDescription & Loc,
Function * MapperFunc,
Value * SrcLocInfo,
Value * MaptypesArg,
Value * MapnamesArg,
struct MapperAllocas & MapperAllocas,
int64_t DeviceID,
unsigned NumOperands )

Create the call for the target mapper function.

Parameters
LocThe source location description.
MapperFuncFunction to be called.
SrcLocInfoSource location information global.
MaptypesArgThe argument types.
MapnamesArgThe argument names.
MapperAllocasThe AllocaInst used for the call.
DeviceIDDevice ID for the call.
NumOperandsNumber of operands in the call.

Definition at line 9295 of file OMPIRBuilder.cpp.

References llvm::OpenMPIRBuilder::MapperAllocas::Args, llvm::OpenMPIRBuilder::MapperAllocas::ArgsBase, llvm::OpenMPIRBuilder::MapperAllocas::ArgSizes, Builder, createRuntimeFunctionCall(), llvm::ArrayType::get(), llvm::Constant::getNullValue(), llvm::PointerType::getUnqual(), and updateToLocation().

◆ emitNonContiguousDescriptor()

◆ emitOffloadingArrays()

Error OpenMPIRBuilder::emitOffloadingArrays ( InsertPointTy AllocaIP,
InsertPointTy CodeGenIP,
MapInfosTy & CombinedInfo,
TargetDataInfo & Info,
CustomMapperCallbackTy CustomMapperCB,
bool IsNonContiguous = false,
function_ref< void(unsigned int, Value *)> DeviceAddrCB = nullptr )

Emit the arrays used to pass the captures and map information to the offloading runtime library.

If there is no map or capture information, return nullptr by reference. Accepts a reference to a MapInfosTy object that contains information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers.

Definition at line 9740 of file OMPIRBuilder.cpp.

References Address, llvm::SmallBitVector::all(), llvm::SmallBitVector::any(), llvm::OpenMPIRBuilder::MapInfosTy::BasePointers, Builder, createOffloadMapnames(), createOffloadMaptypes(), createPlatformSpecificName(), llvm::OpenMPIRBuilder::MapInfosTy::DevicePointers, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Dims, llvm::dyn_cast(), emitNonContiguousDescriptor(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::ArrayType::get(), llvm::ConstantArray::get(), llvm::ConstantPointerNull::get(), llvm::AllocaInst::getAllocatedType(), llvm::AllocaInst::getAllocationSize(), llvm::Constant::getNullValue(), llvm::AllocaInst::getType(), llvm::Value::getType(), llvm::PointerType::getUnqual(), llvm::GlobalValue::Global, I, llvm::isa(), M, llvm::OpenMPIRBuilder::MapInfosTy::Names, llvm::OpenMPIRBuilder::MapInfosTy::NonContigInfo, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Offsets, P, Pointer, llvm::OpenMPIRBuilder::MapInfosTy::Pointers, llvm::GlobalValue::PrivateLinkage, llvm::SmallVectorTemplateBase< T, bool >::push_back(), restoreIPandDebugLoc(), llvm::SmallBitVector::set(), llvm::AllocaInst::setAlignment(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::OpenMPIRBuilder::MapInfosTy::Sizes, llvm::Error::success(), llvm::SmallBitVector::test(), and llvm::OpenMPIRBuilder::MapInfosTy::Types.

Referenced by createTargetData(), and emitOffloadingArraysAndArgs().

◆ emitOffloadingArraysAndArgs()

Error OpenMPIRBuilder::emitOffloadingArraysAndArgs ( InsertPointTy AllocaIP,
InsertPointTy CodeGenIP,
TargetDataInfo & Info,
TargetDataRTArgs & RTArgs,
MapInfosTy & CombinedInfo,
CustomMapperCallbackTy CustomMapperCB,
bool IsNonContiguous = false,
bool ForEndCall = false,
function_ref< void(unsigned int, Value *)> DeviceAddrCB = nullptr )

Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}).

Then, it emits their base addresses as arguments to be passed to the runtime library. In essence, this function is a combination of emitOffloadingArrays and emitOffloadingArraysArgument and should arguably be preferred by clients of OpenMPIRBuilder.

Definition at line 8945 of file OMPIRBuilder.cpp.

References Builder, emitOffloadingArrays(), emitOffloadingArraysArgument(), and llvm::Error::success().

Referenced by emitTargetCall().

◆ emitOffloadingArraysArgument()

void OpenMPIRBuilder::emitOffloadingArraysArgument ( IRBuilderBase & Builder,
OpenMPIRBuilder::TargetDataRTArgs & RTArgs,
OpenMPIRBuilder::TargetDataInfo & Info,
bool ForEndCall = false )

◆ emitScanReduction()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::emitScanReduction ( const LocationDescription & Loc,
ArrayRef< llvm::OpenMPIRBuilder::ReductionInfo > ReductionInfos,
ScanInfo * ScanRedInfo )

This function performs the scan reduction of the values updated in the input phase.

The reduction logic needs to be emitted between input and scan loop returned by CreateCanonicalScanLoops. The following is the code that is generated, buffer and span are expected to be populated before executing the generated code.

for (int k = 0; k != ceil(log2(span)); ++k) {
i=pow(2,k)
for (size cnt = last_iter; cnt >= i; --cnt)
buffer[cnt] op= buffer[cnt-i];
}
static double log2(double V)
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1667
Parameters
LocThe insert and source location description.
ReductionInfosArray type containing the ReductionOps.
ScanRedInfoPointer to the ScanInfo objected created using ScanInfoInitialize.
Returns
The insertion position after the masked.

Definition at line 4891 of file OMPIRBuilder.cpp.

References llvm::PHINode::addIncoming(), Builder, llvm::BasicBlock::Create(), createBarrier(), createMasked(), emitBlock(), emitNoUnwindRuntimeCall(), F, llvm::Function::getContext(), llvm::BasicBlock::getFirstInsertionPt(), llvm::Intrinsic::getOrInsertDeclaration(), llvm::Value::getType(), IV, llvm::Next, llvm::ScanInfo::ScanBuffPtrs, llvm::ScanInfo::Span, llvm::splitBB(), llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ emitTargetKernel()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitTargetKernel ( const LocationDescription & Loc,
InsertPointTy AllocaIP,
Value *& Return,
Value * Ident,
Value * DeviceID,
Value * NumTeams,
Value * NumThreads,
Value * HostPtr,
ArrayRef< Value * > KernelArgs )

Generate a target region entry call.

Parameters
LocThe location at which the request originated and is fulfilled.
AllocaIPThe insertion point to be used for alloca instructions.
ReturnReturn value of the created function returned by reference.
DeviceIDIdentifier for the device via the 'device' clause.
NumTeamsNumer of teams for the region via the 'num_teams' clause or 0 if unspecified and -1 if there is no 'teams' clause.
NumThreadsNumber of threads via the 'thread_limit' clause.
HostPtrPointer to the host-side pointer of the target kernel.
KernelArgsArray of arguments to the kernel.

Definition at line 1268 of file OMPIRBuilder.cpp.

References Builder, createRuntimeFunctionCall(), getOrCreateRuntimeFunction(), I, M, Size, llvm::ArrayRef< T >::size(), and updateToLocation().

Referenced by emitKernelLaunch().

◆ emitTargetRegionFunction()

Error OpenMPIRBuilder::emitTargetRegionFunction ( TargetRegionEntryInfo & EntryInfo,
FunctionGenCallback & GenerateFunctionCallback,
bool IsOffloadEntry,
Function *& OutlinedFn,
Constant *& OutlinedFnID )

Create a unique name for the entry function using the source location information of the current target region.

The name will be something like:

__omp_offloading_DD_FFFF_PP_lBB[_CC]

where DD_FFFF is an ID unique to the file (device and file IDs), PP is the mangled name of the function that encloses the target region and BB is the line number of the target region. CC is a count added when more than one region is located at the same location.

If this target outline function is not an offload entry, we don't need to register it. This may happen if it is guarded by an if clause that is false at compile time, or no target archs have been specified.

The created target region ID is used by the runtime library to identify the current target region, so it only has to be unique and not necessarily point to anything. It could be the pointer to the outlined function that implements the target region, but we aren't using that so that the compiler doesn't need to keep that, and could therefore inline the host function if proven worthwhile during optimization. In the other hand, if emitting code for the device, the ID has to be the function address so that it can retrieved from the offloading entry and launched by the runtime library. We also mark the outlined function to have external linkage in case we are emitting code for the device, because these functions will be entry points to the device.

Parameters
InfoManagerThe info manager keeping track of the offload entries
EntryInfoThe entry information about the function
GenerateFunctionCallbackThe callback function to generate the code
OutlinedFunctionPointer to the outlined function
EntryFnIDNameName of the ID o be created

Definition at line 7795 of file OMPIRBuilder.cpp.

References Config, createPlatformSpecificName(), OffloadInfoManager, registerTargetRegionFunction(), llvm::Error::success(), and llvm::Expected< T >::takeError().

Referenced by emitTargetOutlinedFunction().

◆ emitTargetTask()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::emitTargetTask ( TargetTaskBodyCallbackTy TaskBodyCB,
Value * DeviceID,
Value * RTLoc,
OpenMPIRBuilder::InsertPointTy AllocaIP,
const SmallVector< llvm::OpenMPIRBuilder::DependData > & Dependencies,
const TargetDataRTArgs & RTArgs,
bool HasNoWait )

Generate a target-task for the target construct.

Parameters
TaskBodyCBCallback to generate the actual body of the target task.
DeviceIDIdentifier for the device via the 'device' clause.
RTLocSource location identifier
AllocaIPThe insertion point to be used for alloca instructions.
DependenciesVector of DependData objects holding information of dependencies as specified by the 'depend' clause.
HasNoWaitTrue if the target construct had 'nowait' on it, false otherwise

Definition at line 8560 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::CallBase::arg_size(), assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, llvm::BasicBlock::begin(), Builder, llvm::cast(), llvm::BasicBlock::Create(), createFakeIntVal(), createRuntimeFunctionCall(), createTaskWithPrivatesTy(), llvm::dbgs(), llvm::dyn_cast(), emitBlock(), emitTargetTaskProxyFunction(), emitTaskDependencies(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::ConstantPointerNull::get(), llvm::AllocaInst::getAllocatedType(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Instruction::getDebugLoc(), llvm::StructType::getElementType(), getOffloadingArrayType(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getPointerAlignment(), llvm::PointerType::getUnqual(), llvm::Value::hasOneUse(), I, llvm::isa(), LLVM_DEBUG, loadSharedDataFromTaskDescriptor(), M, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArrayEnd, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::reverse(), llvm::Instruction::setDebugLoc(), llvm::SmallVectorTemplateCommon< T, typename >::size(), llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray, llvm::splitBB(), and llvm::Value::user_back().

Referenced by createTargetData(), and emitTargetCall().

◆ emitTaskwaitImpl()

void OpenMPIRBuilder::emitTaskwaitImpl ( const LocationDescription & Loc)

Generate a taskwait runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1924 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), and getOrCreateThreadID().

Referenced by createTaskwait().

◆ emitTaskyieldImpl()

void OpenMPIRBuilder::emitTaskyieldImpl ( const LocationDescription & Loc)

Generate a taskyield runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1943 of file OMPIRBuilder.cpp.

References createRuntimeFunctionCall(), llvm::Constant::getNullValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), and getOrCreateThreadID().

Referenced by createTaskyield().

◆ emitUsed()

◆ emitUserDefinedMapper()

Expected< Function * > OpenMPIRBuilder::emitUserDefinedMapper ( function_ref< MapInfosOrErrorTy(InsertPointTy CodeGenIP, llvm::Value *PtrPHI, llvm::Value *BeginArg)> PrivAndGenMapInfoCB,
llvm::Type * ElemTy,
StringRef FuncName,
CustomMapperCallbackTy CustomMapperCB )

Emit the user-defined mapper function.

The code generation follows the pattern in the example below.

void .omp_mapper.<type_name>.<mapper_id>.(void *rt_mapper_handle,
void *base, void *begin,
int64_t size, int64_t type,
void *name = nullptr) {
// Allocate space for an array section first or add a base/begin for
// pointer dereference.
if ((size > 1 || (base != begin && maptype.IsPtrAndObj)) &&
!maptype.IsDelete)
__tgt_push_mapper_component(rt_mapper_handle, base, begin,
size*sizeof(Ty), clearToFromMember(type));
// Map members.
for (unsigned i = 0; i < size; i++) {
// For each component specified by this mapper:
for (auto c : begin[i]->all_components) {
if (c.hasMapper())
(*c.Mapper())(rt_mapper_handle, c.arg_base, c.arg_begin,
c.arg_size,
c.arg_type, c.arg_name);
else
__tgt_push_mapper_component(rt_mapper_handle, c.arg_base,
c.arg_begin, c.arg_size, c.arg_type,
c.arg_name);
}
}
// Delete the array section.
if (size > 1 && maptype.IsDelete)
__tgt_push_mapper_component(rt_mapper_handle, base, begin,
size*sizeof(Ty), clearToFromMember(type));
}
static const char * name
Parameters
PrivAndGenMapInfoCBCallback that privatizes code and populates the MapInfos and returns.
ElemTyDeclareMapper element type.
FuncNameOptional param to specify mapper function name.
CustomMapperCBOptional callback to generate code related to custom mappers.

Definition at line 9517 of file OMPIRBuilder.cpp.

References llvm::Function::addFnAttr(), llvm::PHINode::addIncoming(), llvm::Function::addParamAttr(), Builder, llvm::BasicBlock::Create(), llvm::Function::Create(), createRuntimeFunctionCall(), emitBlock(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::FunctionType::get(), llvm::Function::getArg(), getFlagMemberOffset(), llvm::Constant::getNullValue(), getOrCreateRuntimeFunction(), llvm::Value::getType(), I, llvm::GlobalValue::InternalLinkage, M, llvm::CallBase::setDoesNotThrow(), and Size.

◆ finalize()

◆ getAddrOfDeclareTargetVar()

Constant * OpenMPIRBuilder::getAddrOfDeclareTargetVar ( OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause,
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause,
bool IsDeclaration,
bool IsExternallyVisible,
TargetRegionEntryInfo EntryInfo,
StringRef MangledName,
std::vector< GlobalVariable * > & GeneratedRefs,
bool OpenMPSIMD,
std::vector< Triple > TargetTriple,
Type * LlvmPtrTy,
std::function< Constant *()> GlobalInitializer,
std::function< GlobalValue::LinkageTypes()> VariableLinkage )

Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables.

Parameters
CaptureClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter).
DeviceClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any)
IsDeclaration- boolean stating if the variable being registered is a declaration-only and not a definition
IsExternallyVisible- boolean stating if the variable is externally visible
EntryInfo- Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable
MangledName- the mangled name of the variable being registered
GeneratedRefs- references generated by invocations of registerTargetGlobalVariable invoked from getAddrOfDeclareTargetVar, these are required by Clang for book keeping.
OpenMPSIMD- if OpenMP SIMD mode is currently enabled
TargetTriple- The OpenMP device target triple we are compiling for
LlvmPtrTy- The type of the variable we are generating or retrieving an address for
GlobalInitializer- a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference.
VariableLinkage- a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule.

Definition at line 11144 of file OMPIRBuilder.cpp.

References llvm::cast(), Config, llvm::TargetRegionEntryInfo::FileID, llvm::format(), getOrCreateInternalVariable(), M, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, registerTargetGlobalVariable(), and llvm::GlobalValue::WeakAnyLinkage.

Referenced by registerTargetGlobalVariable().

◆ getFlagMemberOffset()

unsigned OpenMPIRBuilder::getFlagMemberOffset ( )

Get the offset of the OMP_MAP_MEMBER_OF field.

Definition at line 11102 of file OMPIRBuilder.cpp.

References llvm::Offset, and llvm::omp::OMP_MAP_MEMBER_OF.

Referenced by emitUserDefinedMapper(), and getMemberOfFlag().

◆ getInsertionPoint()

InsertPointTy llvm::OpenMPIRBuilder::getInsertionPoint ( )
inline

}

Return the insertion point used by the underlying IRBuilder.

Definition at line 2297 of file OMPIRBuilder.h.

References Builder.

◆ getKernelArgsVector()

◆ getMemberOfFlag()

omp::OpenMPOffloadMappingFlags OpenMPIRBuilder::getMemberOfFlag ( unsigned Position)

Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given.

Parameters
Position- A value indicating the position of the parent of the member in the kernel argument structure, often retrieved by the parents position in the combined information vectors used to generate the structure itself. Multiple children (member's of) with the same parent will use the same returned member flag.

Definition at line 11113 of file OMPIRBuilder.cpp.

References getFlagMemberOffset().

◆ getOpenMPDefaultSimdAlign()

unsigned OpenMPIRBuilder::getOpenMPDefaultSimdAlign ( const Triple & TargetTriple,
const StringMap< bool > & Features )
static

Get the default alignment value for given target.

Parameters
TargetTripleTarget triple
FeaturesStringMap which describes extra CPU features

Definition at line 6662 of file OMPIRBuilder.cpp.

References llvm::Triple::isPPC(), llvm::Triple::isWasm(), llvm::Triple::isX86(), and llvm::StringMap< ValueTy, AllocatorTy >::lookup().

◆ getOrCreateDefaultSrcLocStr()

Constant * OpenMPIRBuilder::getOrCreateDefaultSrcLocStr ( uint32_t & SrcLocStrSize)

Return the (LLVM-IR) string describing the default source location.

Definition at line 1086 of file OMPIRBuilder.cpp.

References getOrCreateSrcLocStr().

Referenced by emitTargetCall(), and getOrCreateSrcLocStr().

◆ getOrCreateIdent()

◆ getOrCreateInternalVariable()

GlobalVariable * OpenMPIRBuilder::getOrCreateInternalVariable ( Type * Ty,
const StringRef & Name,
std::optional< unsigned > AddressSpace = {} )

Gets (if variable with the given name already exist) or creates internal global variable with the specified Name.

The created variable has linkage CommonLinkage by default and is initialized by null value.

Parameters
TyType of the global variable. If it is exist already the type must be the same.
NameName of the variable.

Definition at line 9207 of file OMPIRBuilder.cpp.

References assert(), llvm::GlobalValue::CommonLinkage, DL, llvm::Constant::getNullValue(), llvm::GlobalValue::InternalLinkage, InternalVars, M, llvm::GlobalValue::NotThreadLocal, and llvm::Triple::wasm32.

Referenced by createCachedThreadPrivate(), getAddrOfDeclareTargetVar(), and registerTargetGlobalVariable().

◆ getOrCreateRuntimeFunction()

◆ getOrCreateRuntimeFunctionPtr()

◆ getOrCreateSrcLocStr() [1/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( const LocationDescription & Loc,
uint32_t & SrcLocStrSize )

Return the (LLVM-IR) string describing the source location Loc.

Definition at line 1108 of file OMPIRBuilder.cpp.

References getOrCreateSrcLocStr().

◆ getOrCreateSrcLocStr() [2/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( DebugLoc DL,
uint32_t & SrcLocStrSize,
Function * F = nullptr )

Return the (LLVM-IR) string describing the DebugLoc DL.

Use F as fallback if DL does not specify the function name.

Definition at line 1091 of file OMPIRBuilder.cpp.

References DL, llvm::Function::empty(), F, getOrCreateDefaultSrcLocStr(), getOrCreateSrcLocStr(), and M.

◆ getOrCreateSrcLocStr() [3/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( StringRef FunctionName,
StringRef FileName,
unsigned Line,
unsigned Column,
uint32_t & SrcLocStrSize )

Return the (LLVM-IR) string describing the source location identified by the arguments.

Definition at line 1067 of file OMPIRBuilder.cpp.

References llvm::SmallString< InternalLen >::append(), getOrCreateSrcLocStr(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and llvm::SmallString< InternalLen >::str().

◆ getOrCreateSrcLocStr() [4/4]

◆ getOrCreateThreadID()

◆ getSizeInBytes()

Value * OpenMPIRBuilder::getSizeInBytes ( Value * BasePtr)

Computes the size of type in bytes.

Definition at line 9250 of file OMPIRBuilder.cpp.

References Builder, llvm::Type::getInt64Ty(), llvm::Constant::getNullValue(), llvm::PointerType::getUnqual(), and llvm::Null.

◆ getTargetEntryUniqueInfo()

TargetRegionEntryInfo OpenMPIRBuilder::getTargetEntryUniqueInfo ( FileIdentifierInfoCallbackTy CallBack,
vfs::FileSystem & VFS,
StringRef ParentName = "" )
static

Creates a unique info for a target entry when provided a filename and line number from.

Parameters
CallBackA callback function which should return filename the entry resides in as well as the line number for the target entry
ParentNameThe name of the parent the target entry resides in, if any.

Definition at line 11083 of file OMPIRBuilder.cpp.

References llvm::hash_value(), and llvm::vfs::FileSystem::status().

◆ initialize()

void OpenMPIRBuilder::initialize ( )

Initialize the internal state, this will put structures types and potentially other helpers into the underlying module.

Must be called before any other method and only once! This internal state includes types used in the OpenMPIRBuilder generated from OMPKinds.def.

Definition at line 743 of file OMPIRBuilder.cpp.

References M.

◆ isFinalized()

bool OpenMPIRBuilder::isFinalized ( )

Check whether the finalize function has already run.

Returns
true if the finalize function has already run

Definition at line 949 of file OMPIRBuilder.cpp.

◆ isLastFinalizationInfoCancellable()

bool llvm::OpenMPIRBuilder::isLastFinalizationInfoCancellable ( omp::Directive DK)
inline

Return true if the last entry in the finalization stack is of kind DK and cancellable.

Definition at line 2397 of file OMPIRBuilder.h.

References FinalizationStack.

Referenced by createBarrier(), and emitCancelationCheckImpl().

◆ loadOffloadInfoMetadata() [1/2]

void OpenMPIRBuilder::loadOffloadInfoMetadata ( Module & M)

Loads all the offload entries information from the host IR metadata.

This function is only meant to be used with device code generation.

Parameters
MModule to load Metadata info from. Module passed maybe loaded from bitcode file, i.e, different from OpenMPIRBuilder::M module.

Definition at line 11281 of file OMPIRBuilder.cpp.

References llvm::cast(), llvm_unreachable, M, OffloadInfoManager, ompOffloadInfoName, and llvm::NamedMDNode::operands().

Referenced by loadOffloadInfoMetadata().

◆ loadOffloadInfoMetadata() [2/2]

void OpenMPIRBuilder::loadOffloadInfoMetadata ( vfs::FileSystem & VFS,
StringRef HostFilePath )

Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument.

This function is only meant to be used with device code generation.

Parameters
HostFilePathThe path to the host IR file, used to load in offload metadata for the device, allowing host and device to maintain the same metadata mapping.

Definition at line 11327 of file OMPIRBuilder.cpp.

References llvm::StringRef::empty(), llvm::expectedToErrorOrAndEmitErrors(), llvm::vfs::FileSystem::getBufferForFile(), loadOffloadInfoMetadata(), M, llvm::parseBitcodeFile(), and llvm::report_fatal_error().

◆ popFinalizationCB()

void llvm::OpenMPIRBuilder::popFinalizationCB ( )
inline

Pop the last finalization callback from the finalization stack.

NOTE: Temporary solution until Clang CG is gone.

Definition at line 621 of file OMPIRBuilder.h.

References FinalizationStack.

◆ pushFinalizationCB()

void llvm::OpenMPIRBuilder::pushFinalizationCB ( const FinalizationInfo & FI)
inline

Push a finalization callback on the finalization stack.

NOTE: Temporary solution until Clang CG is gone.

Definition at line 614 of file OMPIRBuilder.h.

References FinalizationStack.

◆ readTeamBoundsForKernel()

std::pair< int32_t, int32_t > OpenMPIRBuilder::readTeamBoundsForKernel ( const Triple & T,
Function & Kernel )
static

Read/write a bounds on teams for Kernel.

Read will return 0 if none is set.

Definition at line 7739 of file OMPIRBuilder.cpp.

◆ readThreadBoundsForKernel()

std::pair< int32_t, int32_t > OpenMPIRBuilder::readThreadBoundsForKernel ( const Triple & T,
Function & Kernel )
static

}

Helpers to read/write kernel annotations from the IR.

{ Read/write a bounds on threads for Kernel. Read will return 0 if none is set.

Definition at line 7699 of file OMPIRBuilder.cpp.

References T, and llvm::to_integer().

◆ registerTargetGlobalVariable()

void OpenMPIRBuilder::registerTargetGlobalVariable ( OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause,
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause,
bool IsDeclaration,
bool IsExternallyVisible,
TargetRegionEntryInfo EntryInfo,
StringRef MangledName,
std::vector< GlobalVariable * > & GeneratedRefs,
bool OpenMPSIMD,
std::vector< Triple > TargetTriple,
std::function< Constant *()> GlobalInitializer,
std::function< GlobalValue::LinkageTypes()> VariableLinkage,
Type * LlvmPtrTy,
Constant * Addr )

Registers a target variable for device or host.

Parameters
CaptureClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter).
DeviceClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any)
IsDeclaration- boolean stating if the variable being registered is a declaration-only and not a definition
IsExternallyVisible- boolean stating if the variable is externally visible
EntryInfo- Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable
MangledName- the mangled name of the variable being registered
GeneratedRefs- references generated by invocations of registerTargetGlobalVariable these are required by Clang for book keeping.
OpenMPSIMD- if OpenMP SIMD mode is currently enabled
TargetTriple- The OpenMP device target triple we are compiling for
GlobalInitializer- a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference.
VariableLinkage- a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule.
LlvmPtrTy- The type of the variable we are generating or retrieving an address for
Addr- the original llvm value (addr) of the variable to be registered

Definition at line 11200 of file OMPIRBuilder.cpp.

References llvm::cast(), Config, createPlatformSpecificName(), llvm::divideCeil(), getAddrOfDeclareTargetVar(), llvm::GlobalValue::getLinkage(), llvm::Value::getName(), getOrCreateInternalVariable(), llvm::Value::getType(), llvm::GlobalValue::getValueType(), llvm::GlobalValue::InternalLinkage, llvm::GlobalValue::LinkOnceODRLinkage, M, OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetDeviceClauseAny, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, and llvm::GlobalValue::WeakAnyLinkage.

Referenced by getAddrOfDeclareTargetVar().

◆ registerTargetRegionFunction()

Constant * OpenMPIRBuilder::registerTargetRegionFunction ( TargetRegionEntryInfo & EntryInfo,
Function * OutlinedFunction,
StringRef EntryFnName,
StringRef EntryFnIDName )

Registers the given function and sets up the attribtues of the function Returns the FunctionID.

Parameters
InfoManagerThe info manager keeping track of the offload entries
EntryInfoThe entry information about the function
OutlinedFunctionPointer to the outlined function
EntryFnNameName of the outlined function
EntryFnIDNameName of the ID o be created

Definition at line 7828 of file OMPIRBuilder.cpp.

References OffloadInfoManager, and llvm::OffloadEntriesInfoManager::OMPTargetRegionEntryTargetRegion.

Referenced by emitTargetRegionFunction().

◆ scanInfoInitialize()

Expected< ScanInfo * > OpenMPIRBuilder::scanInfoInitialize ( )

Creates a ScanInfo object, allocates and returns the pointer.

Definition at line 5145 of file OMPIRBuilder.cpp.

References ScanInfos.

◆ setConfig()

void llvm::OpenMPIRBuilder::setConfig ( OpenMPIRBuilderConfig C)
inline

Definition at line 540 of file OMPIRBuilder.h.

References llvm::CallingConv::C, and Config.

◆ setCorrectMemberOfFlag()

void OpenMPIRBuilder::setCorrectMemberOfFlag ( omp::OpenMPOffloadMappingFlags & Flags,
omp::OpenMPOffloadMappingFlags MemberOfFlag )

Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function.

The results are dependent on the existing flag bits set in the original flag set.

Parameters
Flags- The original set of flags to be modified with the passed in MemberOfFlag.
MemberOfFlag- A modified OMP_MAP_MEMBER_OF flag, adjusted slightly based on the getMemberOfFlag which adjusts the flag bits based on the members position in its parent.

Definition at line 11119 of file OMPIRBuilder.cpp.

References llvm::omp::OMP_MAP_ATTACH, llvm::omp::OMP_MAP_MEMBER_OF, and llvm::omp::OMP_MAP_PTR_AND_OBJ.

◆ tileLoops()

std::vector< CanonicalLoopInfo * > OpenMPIRBuilder::tileLoops ( DebugLoc DL,
ArrayRef< CanonicalLoopInfo * > Loops,
ArrayRef< Value * > TileSizes )

Tile a loop nest.

Tiles the loops of Loops by the tile sizes in TileSizes. Loops in / Loops must be perfectly nested, from outermost to innermost loop (i.e. Loops.front() is the outermost loop). The trip count llvm::Value of every loop and every tile sizes must be usable in the outermost loop's preheader. This implies that the loop nest is rectangular.

Example:

for (int i = 0; i < 15; ++i) // Canonical loop "i"
for (int j = 0; j < 14; ++j) // Canonical loop "j"
body(i, j);

After tiling with Loops={i,j} and TileSizes={5,7}, the loop is changed to

for (int i1 = 0; i1 < 3; ++i1)
for (int j1 = 0; j1 < 2; ++j1)
for (int i2 = 0; i2 < 5; ++i2)
for (int j2 = 0; j2 < 7; ++j2)
body(i1*3+i2, j1*3+j2);

The returned vector are the loops {i1,j1,i2,j2}. The loops i1 and j1 are referred to the floor, and the loops i2 and j2 are the tiles. Tiling also handles non-constant trip counts, non-constant tile sizes and trip counts that are not multiples of the tile size. In the latter case the tile loop of the last floor-loop iteration will have fewer iterations than specified as its tile size.

Parameters
DLDebug location for instructions added by tiling, for instance the floor- and tile trip count computation.
LoopsLoops to tile. The CanonicalLoopInfo objects are invalidated by this method, i.e. should not used after tiling.
TileSizesFor each loop in Loops, the tile size for that dimensions.
Returns
A list of generated loops. Contains twice as many loops as the input loop nest; the first half are the floor loops and the second half are the tile loops.

Definition at line 6318 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::Continue, createLoopSkeleton(), DL, llvm::SmallVectorImpl< T >::emplace_back(), llvm::enumerate(), F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getExit(), llvm::CanonicalLoopInfo::getIndVar(), llvm::CanonicalLoopInfo::getLatch(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), Loops, P, llvm::SmallVectorTemplateBase< T, bool >::push_back(), redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::Value::replaceAllUsesWith(), llvm::SmallVectorImpl< T >::reserve(), Size, llvm::ArrayRef< T >::size(), and TileSize.

Referenced by unrollLoopPartial().

◆ unrollLoopFull()

void OpenMPIRBuilder::unrollLoopFull ( DebugLoc DL,
CanonicalLoopInfo * Loop )

Fully unroll a loop.

Instead of unrolling the loop immediately (and duplicating its body instructions), it is deferred to LLVM's LoopUnrollPass by adding loop metadata.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.

Definition at line 6558 of file OMPIRBuilder.cpp.

References addLoopMetadata(), Builder, llvm::MDNode::get(), and llvm::MDString::get().

◆ unrollLoopHeuristic()

void OpenMPIRBuilder::unrollLoopHeuristic ( DebugLoc DL,
CanonicalLoopInfo * Loop )

Fully or partially unroll a loop.

How the loop is unrolled is determined using LLVM's LoopUnrollPass.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.

Definition at line 6565 of file OMPIRBuilder.cpp.

References addLoopMetadata(), Builder, llvm::MDNode::get(), and llvm::MDString::get().

◆ unrollLoopPartial()

void OpenMPIRBuilder::unrollLoopPartial ( DebugLoc DL,
CanonicalLoopInfo * Loop,
int32_t Factor,
CanonicalLoopInfo ** UnrolledCLI )

Partially unroll a loop.

The CanonicalLoopInfo of the unrolled loop for use with chained loop-associated directive can be requested using UnrolledCLI. Not needing the CanonicalLoopInfo allows more efficient code generation by deferring the actual unrolling to the LoopUnrollPass using loop metadata. A loop-associated directive applied to the unrolled loop needs to know the new trip count which means that if using a heuristically determined unroll factor (Factor == 0), that factor must be computed immediately. We are using the same logic as the LoopUnrollPass to derived the unroll factor, but which assumes that some canonicalization has taken place (e.g. Mem2Reg, LICM, GVN, Inlining, etc.). That is, the heuristic will perform better when the unrolled loop's CanonicalLoopInfo is not needed.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.
FactorThe factor to unroll the loop by. A factor of 0 indicates that a heuristic should be used to determine the unroll-factor.
UnrolledCLIIf non-null, receives the CanonicalLoopInfo of the partially unrolled loop. Otherwise, uses loop metadata to defer unrolling to the LoopUnrollPass.

Definition at line 6935 of file OMPIRBuilder.cpp.

References addLoopMetadata(), assert(), computeHeuristicUnrollFactor(), DL, F, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::Type::getInt32Ty(), llvm::Type::getIntegerBitWidth(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and tileLoops().

◆ updateToLocation()

◆ writeTeamsForKernel()

void OpenMPIRBuilder::writeTeamsForKernel ( const Triple & T,
Function & Kernel,
int32_t LB,
int32_t UB )
static

Definition at line 7744 of file OMPIRBuilder.cpp.

References T, and llvm::utostr().

Referenced by createTargetInit().

◆ writeThreadBoundsForKernel()

void OpenMPIRBuilder::writeThreadBoundsForKernel ( const Triple & T,
Function & Kernel,
int32_t LB,
int32_t UB )
static

Definition at line 7724 of file OMPIRBuilder.cpp.

References T, updateNVPTXAttr(), and llvm::utostr().

Referenced by createTargetInit().

Member Data Documentation

◆ Builder

IRBuilder llvm::OpenMPIRBuilder::Builder

The LLVM-IR Builder used to create IR.

Definition at line 2425 of file OMPIRBuilder.h.

Referenced by applySimd(), calculateCanonicalLoopTripCount(), collapseLoops(), createAtomicCapture(), createAtomicCompare(), createAtomicRead(), createAtomicUpdate(), createAtomicWrite(), createBarrier(), createCachedThreadPrivate(), createCancel(), createCancellationPoint(), createCanonicalLoop(), createCanonicalLoop(), createCanonicalScanLoops(), createCopyinClauseBlocks(), createCopyPrivate(), createDistribute(), createLoopSkeleton(), createMapperAllocas(), createOffloadEntriesAndInfoMetadata(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createParallel(), createReductions(), createReductionsGPU(), createRuntimeFunctionCall(), createScan(), createSection(), createSections(), createSingle(), createTarget(), createTargetData(), createTargetDeinit(), createTargetInit(), createTargetLoopWorkshareCall(), createTask(), createTaskgroup(), createTeams(), emitBlock(), emitBranch(), emitCancelationCheckImpl(), emitIfClause(), emitKernelExecutionMode(), emitKernelLaunch(), emitMapperCall(), emitNonContiguousDescriptor(), emitOffloadingArrays(), emitOffloadingArraysAndArgs(), emitOffloadingArraysArgument(), emitScanReduction(), emitTargetCall(), emitTargetKernel(), emitTargetTask(), emitTaskDependencies(), emitUsed(), emitUserDefinedMapper(), finalize(), llvm::OpenMPIRBuilder::FinalizationInfo::getFiniBB(), getInsertionPoint(), getKernelArgsVector(), getOrCreateSrcLocStr(), getSizeInBytes(), hostParallelCallback(), llvm::OpenMPIRBuilder::FinalizationInfo::mergeFiniBB(), OpenMPIRBuilder(), targetParallelCallback(), tileLoops(), unrollLoopFull(), unrollLoopHeuristic(), updateToLocation(), and workshareLoopTargetCallback().

◆ Config

◆ ConstantAllocaRaiseCandidates

SmallVector<llvm::Function *, 16> llvm::OpenMPIRBuilder::ConstantAllocaRaiseCandidates

A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed.

Definition at line 2465 of file OMPIRBuilder.h.

Referenced by createOutlinedFunction(), and finalize().

◆ FinalizationStack

SmallVector<FinalizationInfo, 8> llvm::OpenMPIRBuilder::FinalizationStack

The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive.

Definition at line 2393 of file OMPIRBuilder.h.

Referenced by createParallel(), createSections(), emitCancelationCheckImpl(), isLastFinalizationInfoCancellable(), popFinalizationCB(), and pushFinalizationCB().

◆ IdentMap

DenseMap<std::pair<Constant *, uint64_t>, Constant *> llvm::OpenMPIRBuilder::IdentMap

Map to remember existing ident_t*.

Definition at line 2431 of file OMPIRBuilder.h.

Referenced by getOrCreateIdent().

◆ InternalVars

StringMap<GlobalVariable *, BumpPtrAllocator> llvm::OpenMPIRBuilder::InternalVars

An ordered map of auto-generated variables to their unique names.

It stores variables with the following names: 1) ".gomp_critical_user_" + <critical_section_name> + ".var" for "omp critical" directives; 2) <mangled_name_for_global_var> + ".cache." for cache for threadprivate variables.

Definition at line 2482 of file OMPIRBuilder.h.

Referenced by getOrCreateInternalVariable().

◆ LoopInfos

std::forward_list<CanonicalLoopInfo> llvm::OpenMPIRBuilder::LoopInfos

Collection of owned canonical loop objects that eventually need to be free'd.

Definition at line 2469 of file OMPIRBuilder.h.

Referenced by createLoopSkeleton().

◆ M

◆ OffloadInfoManager

OffloadEntriesInfoManager llvm::OpenMPIRBuilder::OffloadInfoManager

◆ ompOffloadInfoName

const std::string llvm::OpenMPIRBuilder::ompOffloadInfoName = "omp_offload.info"

OMP Offload Info Metadata name string.

Definition at line 3828 of file OMPIRBuilder.h.

Referenced by loadOffloadInfoMetadata().

◆ OutlineInfos

SmallVector<OutlineInfo, 16> llvm::OpenMPIRBuilder::OutlineInfos

Collection of regions that need to be outlined during finalization.

Definition at line 2460 of file OMPIRBuilder.h.

Referenced by addOutlineInfo(), finalize(), and ~OpenMPIRBuilder().

◆ ScanInfos

std::forward_list<ScanInfo> llvm::OpenMPIRBuilder::ScanInfos

Collection of owned ScanInfo objects that eventually need to be free'd.

Definition at line 2472 of file OMPIRBuilder.h.

Referenced by scanInfoInitialize().

◆ SrcLocStrMap

StringMap<Constant *> llvm::OpenMPIRBuilder::SrcLocStrMap

Map to remember source location strings.

Definition at line 2428 of file OMPIRBuilder.h.

Referenced by getOrCreateSrcLocStr().

◆ T

const Triple llvm::OpenMPIRBuilder::T

The documentation for this class was generated from the following files: