This file implements the targeting of the RegisterBankInfo class for AMDGPU.

#include "AMDGPURegisterBankInfo.h"
#include "AMDGPU.h"
#include "AMDGPUGlobalISelUtils.h"
#include "AMDGPUInstrInfo.h"
#include "GCNSubtarget.h"
#include "SIMachineFunctionInfo.h"
#include "SIRegisterInfo.h"
#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
#include "llvm/CodeGen/GlobalISel/LegalizerHelper.h"
#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
#include "llvm/CodeGen/RegisterBank.h"
#include "llvm/IR/IntrinsicsAMDGPU.h"
#include "AMDGPUGenRegisterBank.inc"
#include "AMDGPUGenRegisterBankInfo.def"

Macros
#define	GET_TARGET_REGBANK_IMPL

Functions
static bool	isVectorRegisterBank (const RegisterBank &Bank)

static void	setRegsToType (MachineRegisterInfo &MRI, ArrayRef< Register > Regs, LLT NewTy)
	Replace the current type each register in `Regs` has with `NewTy`.

static LLT	getHalfSizedType (LLT Ty)

static std::pair< LLT, LLT >	splitUnequalType (LLT Ty, unsigned FirstSize)
	Split `Ty` into 2 pieces.

static LLT	widen96To128 (LLT Ty)

static unsigned	getExtendOp (unsigned Opc)

static std::pair< Register, Register >	unpackV2S16ToS32 (MachineIRBuilder &B, Register Src, unsigned ExtOpcode)

static bool	substituteSimpleCopyRegs (const AMDGPURegisterBankInfo::OperandsMapper &OpdMapper, unsigned OpIdx)

static std::pair< Register, unsigned >	getBaseWithConstantOffset (MachineRegisterInfo &MRI, Register Reg)

static void	reinsertVectorIndexAdd (MachineIRBuilder &B, MachineInstr &IdxUseInstr, unsigned OpIdx, unsigned ConstOffset)
	Utility function for pushing dynamic vector indexes with a constant offset into waterfall loops.

static void	extendLow32IntoHigh32 (MachineIRBuilder &B, Register Hi32Reg, Register Lo32Reg, unsigned ExtOpc, const RegisterBank &RegBank, bool IsBooleanSrc=false)
	Implement extending a 32-bit value to a 64-bit value.

static Register	constrainRegToBank (MachineRegisterInfo &MRI, MachineIRBuilder &B, Register &Reg, const RegisterBank &Bank)

static unsigned	regBankUnion (unsigned RB0, unsigned RB1)

static unsigned	regBankBoolUnion (unsigned RB0, unsigned RB1)

Detailed Description

This file implements the targeting of the RegisterBankInfo class for AMDGPU.

AMDGPU has unique register bank constraints that require special high-level strategies to deal with. There are two main true physical register banks: VGPR (vector) and SGPR (scalar). Additionally, the VCC register bank is a sort of pseudo-register bank needed to represent SGPRs used in a vector boolean context. There is also the AGPR bank, which is a special-purpose physical register bank present on some subtargets.

Copying from VGPR to SGPR is generally illegal, unless the value is known to be uniform. It is generally not valid to legalize operands by inserting copies as on other targets. Operations which require uniform, SGPR operands generally require scalarization by repeatedly executing the instruction, activating each set of lanes using a unique set of input values. This is referred to as a waterfall loop.
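The waterfall loop described above can be modeled on plain integers. The following is a minimal sketch, not the code this file emits: the function name `waterfallLoop`, the 64-lane limit, and the representation of the execution mask as a `uint64_t` are all assumptions made for illustration. Each iteration takes the value held by the first remaining active lane, treats it as the uniform operand, and retires every lane that holds the same value.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a waterfall loop over one wave.
// perLaneValue[i] is the (possibly divergent) operand held by lane i, and
// bit i of exec says whether lane i is active. The return value lists the
// uniform operand used on each loop iteration, in execution order.
std::vector<uint32_t> waterfallLoop(const std::vector<uint32_t> &perLaneValue,
                                    uint64_t exec) {
  assert(perLaneValue.size() <= 64 && "model supports at most 64 lanes");
  assert((perLaneValue.size() == 64 || (exec >> perLaneValue.size()) == 0) &&
         "exec must not enable lanes that have no value");

  std::vector<uint32_t> uniformValuesUsed;
  uint64_t remaining = exec;
  while (remaining != 0) {
    // Pick the value of the first lane that still needs to run.
    unsigned first = 0;
    while (((remaining >> first) & 1) == 0)
      ++first;
    const uint32_t uniform = perLaneValue[first];

    // Collect every remaining lane that holds the same value.
    uint64_t match = 0;
    for (unsigned lane = 0; lane < perLaneValue.size(); ++lane)
      if (((remaining >> lane) & 1) != 0 && perLaneValue[lane] == uniform)
        match |= uint64_t{1} << lane;

    // The instruction would execute here with only the lanes in `match`
    // enabled and `uniform` as its scalar operand.
    uniformValuesUsed.push_back(uniform);
    remaining &= ~match;
  }
  return uniformValuesUsed;
}
```

In this model the loop runs once per distinct value among the active lanes, so a value that is already uniform costs a single iteration.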

Booleans

Booleans (s1 values) require special consideration. A vector compare result is naturally a bitmask with one bit per lane, in a 32-bit or 64-bit register. These are represented with the VCC bank. During selection, we need to be able to unambiguously go back from a register class to a register bank. To distinguish whether an SGPR should use the SGPR or VCC register bank, we need to know the use context type. An SGPR s1 value always means a VCC bank value; any other SGPR value uses the SGPR bank. A scalar compare sets SCC, which is a 1-bit unaddressable register. This will need to be copied to a 32-bit virtual register. Taken together, this means we need to adjust the type of boolean operations to be regbank legal. All SALU booleans need to be widened to 32 bits, and all VALU booleans need to be s1 values.
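To make the two boolean representations concrete, here is a small sketch on plain integers. The function names are hypothetical and the code only models the data layout: a vector compare produces one bit per lane packed into a mask (the VCC-bank form), while a scalar compare produces a single uniform result held as a 32-bit 0 or 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a vector (VALU) compare: the result is a lane mask
// in which bit i is set when the comparison holds in lane i.
uint64_t vectorCompareLess(const std::vector<int32_t> &lhs,
                           const std::vector<int32_t> &rhs) {
  assert(lhs.size() == rhs.size() && lhs.size() <= 64);
  uint64_t laneMask = 0;
  for (std::size_t lane = 0; lane < lhs.size(); ++lane)
    if (lhs[lane] < rhs[lane])
      laneMask |= uint64_t{1} << lane;
  return laneMask;
}

// Hypothetical model of a scalar (SALU) compare: one uniform result,
// widened to a 32-bit value that is either 0 or 1.
uint32_t scalarCompareLess(int32_t lhs, int32_t rhs) {
  return lhs < rhs ? 1u : 0u;
}
```

The two results are not interchangeable, which is why the bank of an s1 value has to be recoverable from context.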

A noteworthy exception to the s1-means-vcc rule is for legalization artifact casts. G_TRUNC s1 results, and G_SEXT/G_ZEXT/G_ANYEXT sources, are never vcc bank. A non-boolean source (such as a truncate from a 1-bit load from memory) will require a copy to the VCC bank, which will require clearing the high bits and inserting a compare.
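The "clear the high bits and compare" step can be pictured as follows. This is an illustrative model only: `copyToLaneMask` is a hypothetical name, and the per-lane 32-bit inputs stand in for values whose upper 31 bits are not guaranteed to be zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of copying a non-boolean value into lane-mask form.
// Only bit 0 of each lane's value is meaningful, so the high bits are
// cleared first and the result is then compared against zero.
uint64_t copyToLaneMask(const std::vector<uint32_t> &perLaneValue) {
  assert(perLaneValue.size() <= 64);
  uint64_t laneMask = 0;
  for (std::size_t lane = 0; lane < perLaneValue.size(); ++lane) {
    const uint32_t lowBit = perLaneValue[lane] & 1u; // clear the high bits
    if (lowBit != 0)                                 // the inserted compare
      laneMask |= uint64_t{1} << lane;
  }
  return laneMask;
}
```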

Constant bus restriction

VALU instructions have a limitation known as the constant bus restriction. Most VALU instructions can use SGPR operands, but may read at most 1 SGPR or constant literal value (this limit rises to 2 on gfx10 for most instructions). This is one unique SGPR, so the same SGPR may be used for multiple operands. From a register bank perspective, any combination of operands should be legal as an SGPR, but this is contextually dependent on the SGPR operands all being the same register. It is therefore optimal to choose the SGPR with the most uses to minimize the number of copies.
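As a rough illustration of why the most-used SGPR is the one worth keeping, the sketch below counts unique SGPRs among an instruction's source operands and the number of operands that would have to be rewritten to VGPRs if only one unique SGPR may remain. The types and function names are hypothetical, literal constants are ignored, and the limit of one unique SGPR is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Hypothetical operand model: a bank tag plus a register number.
enum class OperandBank { SGPR, VGPR };
struct SourceOperand {
  OperandBank bank;
  unsigned reg;
};

// Number of distinct SGPRs read by the operands.
unsigned countUniqueSGPRs(const std::vector<SourceOperand> &ops) {
  std::map<unsigned, unsigned> uses;
  for (const SourceOperand &op : ops)
    if (op.bank == OperandBank::SGPR)
      ++uses[op.reg];
  return static_cast<unsigned>(uses.size());
}

// Number of SGPR operands that must be rewritten to VGPRs when the SGPR
// with the most uses is the single one allowed to stay.
unsigned operandsToRewrite(const std::vector<SourceOperand> &ops) {
  std::map<unsigned, unsigned> uses;
  unsigned totalSGPROperands = 0;
  unsigned mostUses = 0;
  for (const SourceOperand &op : ops) {
    if (op.bank != OperandBank::SGPR)
      continue;
    ++totalSGPROperands;
    mostUses = std::max(mostUses, ++uses[op.reg]);
  }
  return totalSGPROperands - mostUses;
}
```

For operands that read s5, s5, and s7, keeping s5 leaves one operand to rewrite, whereas keeping s7 would leave two.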

We avoid trying to solve this problem in RegBankSelect. Any VALU G_* operation should have its source operands all mapped to VGPRs (except for VCC), inserting copies from any SGPR operands. This is the most trivial legal mapping. Anything beyond the simplest 1:1 instruction selection would be too complicated to solve here. Every optimization pattern or instruction selected to multiple outputs would have to enforce this rule, and there would be additional complexity in tracking this rule for every G_* operation. By forcing all inputs to VGPRs, it also simplifies the task of picking the optimal operand combination from a post-isel optimization pass.
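The trivial mapping rule can be stated in a few lines. This is a simplified model under assumed names (`RegBankKind`, `MappingResult`, `mapValuSourcesToVGPR`), not the RegisterBankInfo API: every SGPR source becomes a VGPR source at the cost of one copy, while VGPR and VCC sources are left alone.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bank tags for the source operands of one VALU operation.
enum class RegBankKind { SGPR, VGPR, VCC };

struct MappingResult {
  std::vector<RegBankKind> banks; // bank of each source after mapping
  unsigned copiesInserted;        // SGPR-to-VGPR copies the mapping needs
};

// Map every source to VGPR, except VCC sources, which keep their bank.
MappingResult mapValuSourcesToVGPR(const std::vector<RegBankKind> &sources) {
  MappingResult result{{}, 0};
  for (RegBankKind bank : sources) {
    if (bank == RegBankKind::SGPR) {
      result.banks.push_back(RegBankKind::VGPR);
      ++result.copiesInserted;
    } else {
      result.banks.push_back(bank);
    }
  }
  return result;
}
```

Because the result never contains an SGPR source, the constant bus restriction cannot be violated by construction; folding SGPRs back into operands is left to a later pass.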

Definition in file AMDGPURegisterBankInfo.cpp.

Macro Definition Documentation

◆ GET_TARGET_REGBANK_IMPL

#define GET_TARGET_REGBANK_IMPL

Definition at line 86 of file AMDGPURegisterBankInfo.cpp.

Function Documentation

◆ constrainRegToBank()

static Register constrainRegToBank	(	MachineRegisterInfo &	MRI,
		MachineIRBuilder &	B,
		Register &	Reg,
		const RegisterBank &	Bank
	)

Definition at line 2006 of file AMDGPURegisterBankInfo.cpp.

References B, and MRI.

◆ extendLow32IntoHigh32()

static void extendLow32IntoHigh32	(	MachineIRBuilder &	B,
		Register	Hi32Reg,
		Register	Lo32Reg,
		unsigned	ExtOpc,
		const RegisterBank &	RegBank,
		bool	IsBooleanSrc = `false`
	)

Implement extending a 32-bit value to a 64-bit value.

Lo32Reg is the original 32-bit source value (to be inserted in the low part of the combined 64-bit result), and Hi32Reg is the high half of the combined 64-bit value.
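The relationship between the two halves can be sketched with ordinary integers. This models the arithmetic only and does not use MachineIRBuilder: `ExtKind`, `high32ForExtend`, and `combineHalves` are hypothetical names, and the treatment of a boolean source (repeating the low half, on the assumption that a sign-extended s1 is already all zeros or all ones) is an assumption of this model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the G_ZEXT, G_SEXT, and G_ANYEXT opcodes.
enum class ExtKind { ZExt, SExt, AnyExt };

// Compute the high 32 bits that extend `lo` to a 64-bit value.
uint32_t high32ForExtend(uint32_t lo, ExtKind kind, bool isBooleanSrc = false) {
  switch (kind) {
  case ExtKind::ZExt:
    return 0u; // zero extension: high half is all zeros
  case ExtKind::SExt:
    if (isBooleanSrc) {
      // Assumption: lo is a sign-extended boolean, so it is already
      // 0 or 0xFFFFFFFF and the high half can repeat it.
      assert(lo == 0u || lo == 0xFFFFFFFFu);
      return lo;
    }
    // Replicate the sign bit of the low half across the high half.
    return (lo & 0x80000000u) != 0 ? 0xFFFFFFFFu : 0u;
  case ExtKind::AnyExt:
    return 0u; // any value is acceptable; this model picks zero
  }
  return 0u;
}

// Combine the two halves, with `lo` in the low part of the result.
uint64_t combineHalves(uint32_t hi, uint32_t lo) {
  return (uint64_t{hi} << 32) | lo;
}
```

For example, sign-extending 0x80000000 in this model yields 0xFFFFFFFF80000000, while zero-extending it yields 0x0000000080000000.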

Definition at line 1898 of file AMDGPURegisterBankInfo.cpp.

References assert(), B, and llvm::LLT::scalar().

Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().