LLVM  17.0.0git
AArch64FrameLowering.cpp
1 //===- AArch64FrameLowering.cpp - AArch64 Frame Lowering -------*- C++ -*-====//
2 //
3 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4 // See https://llvm.org/LICENSE.txt for license information.
5 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains the AArch64 implementation of TargetFrameLowering class.
10 //
11 // On AArch64, stack frames are structured as follows:
12 //
13 // The stack grows downward.
14 //
15 // All of the individual frame areas on the frame below are optional, i.e. it's
16 // possible to create a function so that the particular area isn't present
17 // in the frame.
18 //
19 // At function entry, the "frame" looks as follows:
20 //
21 // | | Higher address
22 // |-----------------------------------|
23 // | |
24 // | arguments passed on the stack |
25 // | |
26 // |-----------------------------------| <- sp
27 // | | Lower address
28 //
29 //
30 // After the prologue has run, the frame has the following general structure.
31 // Note that this doesn't depict the case where a red-zone is used. Also,
32 // technically the last frame area (VLAs) isn't created until the main
33 // function body, after the prologue has run. However, it's depicted here
34 // for completeness.
35 //
36 // | | Higher address
37 // |-----------------------------------|
38 // | |
39 // | arguments passed on the stack |
40 // | |
41 // |-----------------------------------|
42 // | |
43 // | (Win64 only) varargs from reg |
44 // | |
45 // |-----------------------------------|
46 // | |
47 // | callee-saved gpr registers | <--.
48 // | | | On Darwin platforms these
49 // |- - - - - - - - - - - - - - - - - -| | callee saves are swapped,
50 // | prev_lr | | (frame record first)
51 // | prev_fp | <--'
52 // | async context if needed |
53 // | (a.k.a. "frame record") |
54 // |-----------------------------------| <- fp(=x29)
55 // | |
56 // | callee-saved fp/simd/SVE regs |
57 // | |
58 // |-----------------------------------|
59 // | |
60 // | SVE stack objects |
61 // | |
62 // |-----------------------------------|
63 // |.empty.space.to.make.part.below....|
64 // |.aligned.in.case.it.needs.more.than| (size of this area is unknown at
65 // |.the.standard.16-byte.alignment....| compile time; if present)
66 // |-----------------------------------|
67 // | |
68 // | local variables of fixed size |
69 // | including spill slots |
70 // |-----------------------------------| <- bp(not defined by ABI,
71 // |.variable-sized.local.variables....| LLVM chooses X19)
72 // |.(VLAs)............................| (size of this area is unknown at
73 // |...................................| compile time)
74 // |-----------------------------------| <- sp
75 // | | Lower address
76 //
77 //
78 // To access data in a frame, a constant offset from one of the pointers
79 // (fp, bp, sp) must be computable at compile time. The size of the areas
80 // with a dotted background cannot be computed at compile time if they are
81 // present, so all three of fp, bp and sp must be set up to access all
82 // contents of the frame areas, assuming all of the frame areas are
83 // non-empty.
84 //
85 // For most functions, some of the frame areas are empty. For those functions,
86 // it may not be necessary to set up fp or bp:
87 // * A base pointer is definitely needed when there are both VLAs and local
88 // variables with more-than-default alignment requirements.
89 // * A frame pointer is definitely needed when there are local variables with
90 // more-than-default alignment requirements.
91 //
92 // For Darwin platforms the frame-record (fp, lr) is stored at the top of the
93 // callee-saved area, since the unwind encoding does not allow for encoding
94 // this dynamically and existing tools depend on this layout. For other
95 // platforms, the frame-record is stored at the bottom of the (gpr) callee-saved
96 // area to allow SVE stack objects (allocated directly below the callee-saves,
97 // if available) to be accessed directly from the framepointer.
98 // The SVE spill/fill instructions have VL-scaled addressing modes such
99 // as:
100 // ldr z8, [fp, #-7 mul vl]
101 // For SVE the size of the vector length (VL) is not known at compile-time, so
102 // '#-7 mul vl' is an offset that can only be evaluated at runtime. With this
103 // layout, we don't need to add an unscaled offset to the framepointer before
104 // accessing the SVE object in the frame.
105 //
106 // In some cases when a base pointer is not strictly needed, it is generated
107 // anyway when offsets from the frame pointer to access local variables become
108 // so large that the offset can't be encoded in the immediate fields of loads
109 // or stores.
110 //
111 // Outgoing function arguments must be at the bottom of the stack frame when
112 // calling another function. If we do not have variable-sized stack objects, we
113 // can allocate a "reserved call frame" area at the bottom of the local
114 // variable area, large enough for all outgoing calls. If we do have VLAs, then
115 // the stack pointer must be decremented and incremented around each call to
116 // make space for the arguments below the VLAs.
117 //
118 // FIXME: also explain the redzone concept.
119 //
120 // An example of the prologue:
121 //
122 // .globl __foo
123 // .align 2
124 // __foo:
125 // Ltmp0:
126 // .cfi_startproc
127 // .cfi_personality 155, ___gxx_personality_v0
128 // Leh_func_begin:
129 // .cfi_lsda 16, Lexception33
130 //
131 // stp xa,bx, [sp, #-offset]!
132 // ...
133 // stp x28, x27, [sp, #offset-32]
134 // stp fp, lr, [sp, #offset-16]
135 // add fp, sp, #offset - 16
136 // sub sp, sp, #1360
137 //
138 // The Stack:
139 // +-------------------------------------------+
140 // 10000 | ........ | ........ | ........ | ........ |
141 // 10004 | ........ | ........ | ........ | ........ |
142 // +-------------------------------------------+
143 // 10008 | ........ | ........ | ........ | ........ |
144 // 1000c | ........ | ........ | ........ | ........ |
145 // +===========================================+
146 // 10010 | X28 Register |
147 // 10014 | X28 Register |
148 // +-------------------------------------------+
149 // 10018 | X27 Register |
150 // 1001c | X27 Register |
151 // +===========================================+
152 // 10020 | Frame Pointer |
153 // 10024 | Frame Pointer |
154 // +-------------------------------------------+
155 // 10028 | Link Register |
156 // 1002c | Link Register |
157 // +===========================================+
158 // 10030 | ........ | ........ | ........ | ........ |
159 // 10034 | ........ | ........ | ........ | ........ |
160 // +-------------------------------------------+
161 // 10038 | ........ | ........ | ........ | ........ |
162 // 1003c | ........ | ........ | ........ | ........ |
163 // +-------------------------------------------+
164 //
165 // [sp] = 10030 :: >>initial value<<
166 // sp = 10020 :: stp fp, lr, [sp, #-16]!
167 // fp = sp == 10020 :: mov fp, sp
168 // [sp] == 10020 :: stp x28, x27, [sp, #-16]!
169 // sp == 10010 :: >>final value<<
170 //
171 // The frame pointer (w29) points to address 10020. If we use an offset of
172 // '16' from 'w29', we get the CFI offsets of -8 for w30, -16 for w29, -24
173 // for w27, and -32 for w28:
174 //
175 // Ltmp1:
176 // .cfi_def_cfa w29, 16
177 // Ltmp2:
178 // .cfi_offset w30, -8
179 // Ltmp3:
180 // .cfi_offset w29, -16
181 // Ltmp4:
182 // .cfi_offset w27, -24
183 // Ltmp5:
184 // .cfi_offset w28, -32
185 //
186 //===----------------------------------------------------------------------===//
187 
188 #include "AArch64FrameLowering.h"
189 #include "AArch64InstrInfo.h"
191 #include "AArch64RegisterInfo.h"
192 #include "AArch64Subtarget.h"
193 #include "AArch64TargetMachine.h"
196 #include "llvm/ADT/ScopeExit.h"
197 #include "llvm/ADT/SmallVector.h"
198 #include "llvm/ADT/Statistic.h"
214 #include "llvm/IR/Attributes.h"
215 #include "llvm/IR/CallingConv.h"
216 #include "llvm/IR/DataLayout.h"
217 #include "llvm/IR/DebugLoc.h"
218 #include "llvm/IR/Function.h"
219 #include "llvm/MC/MCAsmInfo.h"
220 #include "llvm/MC/MCDwarf.h"
222 #include "llvm/Support/Debug.h"
224 #include "llvm/Support/MathExtras.h"
228 #include <cassert>
229 #include <cstdint>
230 #include <iterator>
231 #include <optional>
232 #include <vector>
233 
234 using namespace llvm;
235 
236 #define DEBUG_TYPE "frame-info"
237 
238 static cl::opt<bool> EnableRedZone("aarch64-redzone",
239  cl::desc("enable use of redzone on AArch64"),
240  cl::init(false), cl::Hidden);
241 
242 static cl::opt<bool>
243  ReverseCSRRestoreSeq("reverse-csr-restore-seq",
244  cl::desc("reverse the CSR restore sequence"),
245  cl::init(false), cl::Hidden);
246 
248  "stack-tagging-merge-settag",
249  cl::desc("merge settag instruction in function epilog"), cl::init(true),
250  cl::Hidden);
251 
252 static cl::opt<bool> OrderFrameObjects("aarch64-order-frame-objects",
253  cl::desc("sort stack allocations"),
254  cl::init(true), cl::Hidden);
255 
257  "homogeneous-prolog-epilog", cl::Hidden,
258  cl::desc("Emit homogeneous prologue and epilogue for the size "
259  "optimization (default = off)"));
260 
261 STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");
262 
263 /// Returns how much of the incoming argument stack area (in bytes) we should
264 /// clean up in an epilogue. For the C calling convention this will be 0, for
265 /// guaranteed tail call conventions it can be positive (a normal return or a
266 /// tail call to a function that uses less stack space for arguments) or
267 /// negative (for a tail call to a function that needs more stack space than us
268 /// for arguments).
272  bool IsTailCallReturn = false;
273  if (MBB.end() != MBBI) {
274  unsigned RetOpcode = MBBI->getOpcode();
275  IsTailCallReturn = RetOpcode == AArch64::TCRETURNdi ||
276  RetOpcode == AArch64::TCRETURNri ||
277  RetOpcode == AArch64::TCRETURNriBTI;
278  }
280 
281  int64_t ArgumentPopSize = 0;
282  if (IsTailCallReturn) {
283  MachineOperand &StackAdjust = MBBI->getOperand(1);
284 
285  // For a tail-call in a callee-pops-arguments environment, some or all of
286  // the stack may actually be in use for the call's arguments, this is
287  // calculated during LowerCall and consumed here...
288  ArgumentPopSize = StackAdjust.getImm();
289  } else {
290  // ... otherwise the amount to pop is *all* of the argument space,
291  // conveniently stored in the MachineFunctionInfo by
292  // LowerFormalArguments. This will, of course, be zero for the C calling
293  // convention.
294  ArgumentPopSize = AFI->getArgumentStackToRestore();
295  }
296 
297  return ArgumentPopSize;
298 }
299 
301 static bool needsWinCFI(const MachineFunction &MF);
304 
305 /// Returns true if a homogeneous prolog or epilog code can be emitted
306 /// for the size optimization. If possible, a frame helper call is injected.
307 /// When Exit block is given, this check is for epilog.
308 bool AArch64FrameLowering::homogeneousPrologEpilog(
309  MachineFunction &MF, MachineBasicBlock *Exit) const {
310  if (!MF.getFunction().hasMinSize())
311  return false;
313  return false;
315  return false;
316  if (EnableRedZone)
317  return false;
318 
319  // TODO: Windows is not supported yet.
320  if (needsWinCFI(MF))
321  return false;
322  // TODO: SVE is not supported yet.
323  if (getSVEStackSize(MF))
324  return false;
325 
326  // Bail on stack adjustment needed on return for simplicity.
327  const MachineFrameInfo &MFI = MF.getFrameInfo();
328  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
329  if (MFI.hasVarSizedObjects() || RegInfo->hasStackRealignment(MF))
330  return false;
331  if (Exit && getArgumentStackToRestore(MF, *Exit))
332  return false;
333 
334  return true;
335 }
336 
337 /// Returns true if CSRs should be paired.
338 bool AArch64FrameLowering::producePairRegisters(MachineFunction &MF) const {
339  return produceCompactUnwindFrame(MF) || homogeneousPrologEpilog(MF);
340 }
341 
342 /// This is the biggest offset to the stack pointer we can encode in aarch64
343 /// instructions (without using a separate calculation and a temp register).
344 /// Note that the exceptions are vector stores/loads, which cannot encode any
345 /// displacements (see estimateRSStackSizeLimit(), isAArch64FrameOffsetLegal()).
346 static const unsigned DefaultSafeSPDisplacement = 255;
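// Illustrative sketch, not part of this file. It assumes that the limit of
// 255 reflects the signed 9-bit immediate of the unscaled load/store forms
// (LDUR/STUR), whose range is [-256, 255]. The helper name fitsUnscaledImm9
// is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a signed 9-bit immediate covers offsets in [-256, 255].
constexpr bool fitsUnscaledImm9(int64_t Offset) {
  return Offset >= -256 && Offset <= 255;
}
```

// Under that assumption, any positive sp displacement up to 255 can be
// encoded directly, while 256 already needs a separate address calculation.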
347 
348 /// Look at each instruction that references stack frames and return the stack
349 /// size limit beyond which some of these instructions will require a scratch
350 /// register during their expansion later.
352  // FIXME: For now, just conservatively guesstimate based on unscaled indexing
353  // range. We'll end up allocating an unnecessary spill slot a lot, but
354  // realistically that's not a big deal at this stage of the game.
355  for (MachineBasicBlock &MBB : MF) {
356  for (MachineInstr &MI : MBB) {
357  if (MI.isDebugInstr() || MI.isPseudo() ||
358  MI.getOpcode() == AArch64::ADDXri ||
359  MI.getOpcode() == AArch64::ADDSXri)
360  continue;
361 
362  for (const MachineOperand &MO : MI.operands()) {
363  if (!MO.isFI())
364  continue;
365 
367  if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==
369  return 0;
370  }
371  }
372  }
374 }
375 
379 }
380 
381 /// Returns the size of the fixed object area (allocated next to sp on entry)
382 /// On Win64 this may include a var args area and an UnwindHelp object for EH.
383 static unsigned getFixedObjectSize(const MachineFunction &MF,
384  const AArch64FunctionInfo *AFI, bool IsWin64,
385  bool IsFunclet) {
386  if (!IsWin64 || IsFunclet) {
387  return AFI->getTailCallReservedStack();
388  } else {
389  if (AFI->getTailCallReservedStack() != 0)
390  report_fatal_error("cannot generate ABI-changing tail call for Win64");
391  // Var args are stored here in the primary function.
392  const unsigned VarArgsArea = AFI->getVarArgsGPRSize();
393  // To support EH funclets we allocate an UnwindHelp object
394  const unsigned UnwindHelpObject = (MF.hasEHFunclets() ? 8 : 0);
395  return alignTo(VarArgsArea + UnwindHelpObject, 16);
396  }
397 }
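// Illustrative sketch, not part of this file: the Win64 (non-funclet) branch
// of getFixedObjectSize rounds the var args area plus an optional 8-byte
// UnwindHelp object up to 16 bytes. alignUp is a stand-in for llvm::alignTo,
// and win64FixedObjectSize is a hypothetical name, so that the sketch has no
// LLVM dependency.

```cpp
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be non-zero).
constexpr uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical mirror of the Win64 branch: var args area, plus 8 bytes for
// the UnwindHelp object when EH funclets are present, aligned to 16.
inline uint64_t win64FixedObjectSize(uint64_t VarArgsGPRSize,
                                     bool HasEHFunclets) {
  const uint64_t UnwindHelpObject = HasEHFunclets ? 8 : 0;
  return alignUp(VarArgsGPRSize + UnwindHelpObject, 16);
}
```

// For example, a 56-byte var args area occupies 64 bytes with or without the
// UnwindHelp object, while a 64-byte area grows to 80 bytes once it is added.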
398 
399 /// Returns the size of the entire SVE stackframe (calleesaves + spills).
402  return StackOffset::getScalable((int64_t)AFI->getStackSizeSVE());
403 }
404 
406  if (!EnableRedZone)
407  return false;
408 
409  // Don't use the red zone if the function explicitly asks us not to.
410  // This is typically used for kernel code.
411  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
412  const unsigned RedZoneSize =
413  Subtarget.getTargetLowering()->getRedZoneSize(MF.getFunction());
414  if (!RedZoneSize)
415  return false;
416 
417  const MachineFrameInfo &MFI = MF.getFrameInfo();
419  uint64_t NumBytes = AFI->getLocalStackSize();
420 
421  return !(MFI.hasCalls() || hasFP(MF) || NumBytes > RedZoneSize ||
422  getSVEStackSize(MF));
423 }
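// Illustrative sketch, not part of this file: the red zone decision above
// reduces to a predicate over a handful of facts about the function. The
// struct RedZoneFacts and the function canUseRedZoneSketch are hypothetical;
// the real code queries MachineFrameInfo and AArch64FunctionInfo instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the inputs consulted by canUseRedZone.
struct RedZoneFacts {
  bool RedZoneEnabled;     // the -aarch64-redzone option
  unsigned RedZoneSize;    // 0 if the function opts out of the red zone
  bool HasCalls;           // function contains calls
  bool HasFP;              // function needs a frame pointer
  uint64_t LocalStackSize; // bytes of locals and spill slots
  uint64_t SVEStackSize;   // non-zero if an SVE area is present
};

inline bool canUseRedZoneSketch(const RedZoneFacts &F) {
  if (!F.RedZoneEnabled || F.RedZoneSize == 0)
    return false;
  return !(F.HasCalls || F.HasFP || F.LocalStackSize > F.RedZoneSize ||
           F.SVEStackSize != 0);
}
```

// A leaf function whose locals fit within the red zone qualifies; adding a
// call, a frame pointer, an SVE area, or one byte too many disqualifies it.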
424 
425 /// hasFP - Return true if the specified function should have a dedicated frame
426 /// pointer register.
428  const MachineFrameInfo &MFI = MF.getFrameInfo();
429  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
430  // Win64 EH requires a frame pointer if funclets are present, as the locals
431  // are accessed off the frame pointer in both the parent function and the
432  // funclets.
433  if (MF.hasEHFunclets())
434  return true;
435  // Retain behavior of always omitting the FP for leaf functions when possible.
437  return true;
438  if (MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() ||
439  MFI.hasStackMap() || MFI.hasPatchPoint() ||
440  RegInfo->hasStackRealignment(MF))
441  return true;
442  // With large callframes around we may need to use FP to access the scavenging
443  // emergency spillslot.
444  //
445  // Unfortunately some calls to hasFP() like machine verifier ->
446  // getReservedReg() -> hasFP in the middle of global isel are too early
447  // to know the max call frame size. Hopefully conservatively returning "true"
448  // in those cases is fine.
449  // DefaultSafeSPDisplacement is fine as we only emergency spill GP regs.
450  if (!MFI.isMaxCallFrameSizeComputed() ||
452  return true;
453 
454  return false;
455 }
456 
457 /// hasReservedCallFrame - Under normal circumstances, when a frame pointer is
458 /// not required, we reserve argument space for call sites in the function
459 /// immediately on entry to the current function. This eliminates the need for
460 /// add/sub sp brackets around call sites. Returns true if the call frame is
461 /// included as part of the stack frame.
462 bool
464  return !MF.getFrameInfo().hasVarSizedObjects();
465 }
466 
470  const AArch64InstrInfo *TII =
471  static_cast<const AArch64InstrInfo *>(MF.getSubtarget().getInstrInfo());
472  DebugLoc DL = I->getDebugLoc();
473  unsigned Opc = I->getOpcode();
474  bool IsDestroy = Opc == TII->getCallFrameDestroyOpcode();
475  uint64_t CalleePopAmount = IsDestroy ? I->getOperand(1).getImm() : 0;
476 
477  if (!hasReservedCallFrame(MF)) {
478  int64_t Amount = I->getOperand(0).getImm();
479  Amount = alignTo(Amount, getStackAlign());
480  if (!IsDestroy)
481  Amount = -Amount;
482 
483  // N.b. if CalleePopAmount is valid but zero (i.e. callee would pop, but it
484  // doesn't have to pop anything), then the first operand will be zero too so
485  // this adjustment is a no-op.
486  if (CalleePopAmount == 0) {
487  // FIXME: in-function stack adjustment for calls is limited to 24-bits
488  // because there's no guaranteed temporary register available.
489  //
490  // ADD/SUB (immediate) has only LSL #0 and LSL #12 available.
491  // 1) For offset <= 12-bit, we use LSL #0
492  // 2) For 12-bit < offset <= 24-bit, we use two instructions. One uses
493  // LSL #0, and the other uses LSL #12.
494  //
495  // Most call frames will be allocated at the start of a function so
496  // this is OK, but it is a limitation that needs dealing with.
497  assert(Amount > -0xffffff && Amount < 0xffffff && "call frame too large");
498  emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
499  StackOffset::getFixed(Amount), TII);
500  }
501  } else if (CalleePopAmount != 0) {
502  // If the calling convention demands that the callee pops arguments from the
503  // stack, we want to add it back if we have a reserved call frame.
504  assert(CalleePopAmount < 0xffffff && "call frame too large");
505  emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
506  StackOffset::getFixed(-(int64_t)CalleePopAmount), TII);
507  }
508  return MBB.erase(I);
509 }
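// Illustrative sketch, not part of this file: the FIXME above notes that an
// adjustment of up to 24 bits is emitted as at most two ADD/SUB (immediate)
// instructions, one using LSL #12 and one using LSL #0. The struct SplitImm
// and the function splitAddSubImm are hypothetical; the real splitting is
// done inside emitFrameOffset.

```cpp
#include <cassert>
#include <cstdint>

// The two 12-bit immediates that together encode a 24-bit adjustment.
struct SplitImm {
  uint32_t Shifted12; // immediate for the instruction using LSL #12
  uint32_t Low12;     // immediate for the instruction using LSL #0
};

// Split the magnitude of a stack adjustment into its two 12-bit halves.
inline SplitImm splitAddSubImm(uint32_t Amount) {
  assert(Amount <= 0xffffff && "amount must fit in 24 bits");
  return {Amount >> 12, Amount & 0xfff};
}
```

// An amount that fits in 12 bits leaves Shifted12 at zero, so a single
// instruction suffices; larger amounts need both halves.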
510 
511 void AArch64FrameLowering::emitCalleeSavedGPRLocations(
513  MachineFunction &MF = *MBB.getParent();
514  MachineFrameInfo &MFI = MF.getFrameInfo();
515 
516  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
517  if (CSI.empty())
518  return;
519 
520  const TargetSubtargetInfo &STI = MF.getSubtarget();
521  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
522  const TargetInstrInfo &TII = *STI.getInstrInfo();
524 
525  for (const auto &Info : CSI) {
526  if (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector)
527  continue;
528 
529  assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
530  unsigned DwarfReg = TRI.getDwarfRegNum(Info.getReg(), true);
531 
532  int64_t Offset =
533  MFI.getObjectOffset(Info.getFrameIdx()) - getOffsetOfLocalArea();
534  unsigned CFIIndex = MF.addFrameInst(
535  MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
536  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
537  .addCFIIndex(CFIIndex)
539  }
540 }
541 
542 void AArch64FrameLowering::emitCalleeSavedSVELocations(
544  MachineFunction &MF = *MBB.getParent();
545  MachineFrameInfo &MFI = MF.getFrameInfo();
546 
547  // Add callee saved registers to move list.
548  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
549  if (CSI.empty())
550  return;
551 
552  const TargetSubtargetInfo &STI = MF.getSubtarget();
553  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
554  const TargetInstrInfo &TII = *STI.getInstrInfo();
557 
558  for (const auto &Info : CSI) {
559  if (!(MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
560  continue;
561 
562  // Not all unwinders may know about SVE registers, so assume the lowest
563  // common denominator.
564  assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
565  unsigned Reg = Info.getReg();
566  if (!static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
567  continue;
568 
570  StackOffset::getScalable(MFI.getObjectOffset(Info.getFrameIdx())) -
572 
573  unsigned CFIIndex = MF.addFrameInst(createCFAOffset(TRI, Reg, Offset));
574  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
575  .addCFIIndex(CFIIndex)
577  }
578 }
579 
580 static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF,
583  unsigned DwarfReg) {
584  unsigned CFIIndex =
585  MF.addFrameInst(MCCFIInstruction::createSameValue(nullptr, DwarfReg));
586  BuildMI(MBB, InsertPt, DebugLoc(), Desc).addCFIIndex(CFIIndex);
587 }
588 
590  MachineBasicBlock &MBB) const {
591 
592  MachineFunction &MF = *MBB.getParent();
593  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
594  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
595  const auto &TRI =
596  static_cast<const AArch64RegisterInfo &>(*Subtarget.getRegisterInfo());
597  const auto &MFI = *MF.getInfo<AArch64FunctionInfo>();
598 
599  const MCInstrDesc &CFIDesc = TII.get(TargetOpcode::CFI_INSTRUCTION);
600  DebugLoc DL;
601 
602  // Reset the CFA to `SP + 0`.
604  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
605  nullptr, TRI.getDwarfRegNum(AArch64::SP, true), 0));
606  BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
607 
608  // Flip the RA sign state.
609  if (MFI.shouldSignReturnAddress(MF)) {
610  CFIIndex = MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
611  BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
612  }
613 
614  // Shadow call stack uses X18, reset it.
616  insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
617  TRI.getDwarfRegNum(AArch64::X18, true));
618 
619  // Emit .cfi_same_value for callee-saved registers.
620  const std::vector<CalleeSavedInfo> &CSI =
622  for (const auto &Info : CSI) {
623  unsigned Reg = Info.getReg();
624  if (!TRI.regNeedsCFI(Reg, Reg))
625  continue;
626  insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
627  TRI.getDwarfRegNum(Reg, true));
628  }
629 }
630 
633  bool SVE) {
634  MachineFunction &MF = *MBB.getParent();
635  MachineFrameInfo &MFI = MF.getFrameInfo();
636 
637  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
638  if (CSI.empty())
639  return;
640 
641  const TargetSubtargetInfo &STI = MF.getSubtarget();
642  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
643  const TargetInstrInfo &TII = *STI.getInstrInfo();
645 
646  for (const auto &Info : CSI) {
647  if (SVE !=
648  (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
649  continue;
650 
651  unsigned Reg = Info.getReg();
652  if (SVE &&
653  !static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
654  continue;
655 
656  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
657  nullptr, TRI.getDwarfRegNum(Info.getReg(), true)));
658  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
659  .addCFIIndex(CFIIndex)
661  }
662 }
663 
664 void AArch64FrameLowering::emitCalleeSavedGPRRestores(
667 }
668 
669 void AArch64FrameLowering::emitCalleeSavedSVERestores(
672 }
673 
675  switch (Reg.id()) {
676  default:
677  // The called routine is expected to preserve r19-r28;
678  // r29 and r30 are used as frame pointer and link register, respectively.
679  return 0;
680 
681  // GPRs
682 #define CASE(n) \
683  case AArch64::W##n: \
684  case AArch64::X##n: \
685  return AArch64::X##n
686  CASE(0);
687  CASE(1);
688  CASE(2);
689  CASE(3);
690  CASE(4);
691  CASE(5);
692  CASE(6);
693  CASE(7);
694  CASE(8);
695  CASE(9);
696  CASE(10);
697  CASE(11);
698  CASE(12);
699  CASE(13);
700  CASE(14);
701  CASE(15);
702  CASE(16);
703  CASE(17);
704  CASE(18);
705 #undef CASE
706 
707  // FPRs
708 #define CASE(n) \
709  case AArch64::B##n: \
710  case AArch64::H##n: \
711  case AArch64::S##n: \
712  case AArch64::D##n: \
713  case AArch64::Q##n: \
714  return HasSVE ? AArch64::Z##n : AArch64::Q##n
715  CASE(0);
716  CASE(1);
717  CASE(2);
718  CASE(3);
719  CASE(4);
720  CASE(5);
721  CASE(6);
722  CASE(7);
723  CASE(8);
724  CASE(9);
725  CASE(10);
726  CASE(11);
727  CASE(12);
728  CASE(13);
729  CASE(14);
730  CASE(15);
731  CASE(16);
732  CASE(17);
733  CASE(18);
734  CASE(19);
735  CASE(20);
736  CASE(21);
737  CASE(22);
738  CASE(23);
739  CASE(24);
740  CASE(25);
741  CASE(26);
742  CASE(27);
743  CASE(28);
744  CASE(29);
745  CASE(30);
746  CASE(31);
747 #undef CASE
748  }
749 }
750 
751 void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero,
752  MachineBasicBlock &MBB) const {
753  // Insertion point.
755 
756  // Fake a debug loc.
757  DebugLoc DL;
758  if (MBBI != MBB.end())
759  DL = MBBI->getDebugLoc();
760 
761  const MachineFunction &MF = *MBB.getParent();
763  const AArch64RegisterInfo &TRI = *STI.getRegisterInfo();
764 
765  BitVector GPRsToZero(TRI.getNumRegs());
766  BitVector FPRsToZero(TRI.getNumRegs());
767  bool HasSVE = STI.hasSVE();
768  for (MCRegister Reg : RegsToZero.set_bits()) {
769  if (TRI.isGeneralPurposeRegister(MF, Reg)) {
770  // For GPRs, we only care to clear out the 64-bit register.
771  if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
772  GPRsToZero.set(XReg);
773  } else if (AArch64::FPR128RegClass.contains(Reg) ||
774  AArch64::FPR64RegClass.contains(Reg) ||
775  AArch64::FPR32RegClass.contains(Reg) ||
776  AArch64::FPR16RegClass.contains(Reg) ||
777  AArch64::FPR8RegClass.contains(Reg)) {
778  // For FPRs, zero the widest alias (Zn with SVE, otherwise Qn).
779  if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
780  FPRsToZero.set(XReg);
781  }
782  }
783 
784  const AArch64InstrInfo &TII = *STI.getInstrInfo();
785 
786  // Zero out GPRs.
787  for (MCRegister Reg : GPRsToZero.set_bits())
788  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), Reg).addImm(0);
789 
790  // Zero out FP/vector registers.
791  for (MCRegister Reg : FPRsToZero.set_bits())
792  if (HasSVE)
793  BuildMI(MBB, MBBI, DL, TII.get(AArch64::DUP_ZI_D), Reg)
794  .addImm(0)
795  .addImm(0);
796  else
797  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVIv2d_ns), Reg).addImm(0);
798 
799  if (HasSVE) {
800  for (MCRegister PReg :
801  {AArch64::P0, AArch64::P1, AArch64::P2, AArch64::P3, AArch64::P4,
802  AArch64::P5, AArch64::P6, AArch64::P7, AArch64::P8, AArch64::P9,
803  AArch64::P10, AArch64::P11, AArch64::P12, AArch64::P13, AArch64::P14,
804  AArch64::P15}) {
805  if (RegsToZero[PReg])
806  BuildMI(MBB, MBBI, DL, TII.get(AArch64::PFALSE), PReg);
807  }
808  }
809 }
810 
811 // Find a scratch register that we can use at the start of the prologue to
812 // re-align the stack pointer. We avoid using callee-save registers since they
813 // may appear to be free when this is called from canUseAsPrologue (during
814 // shrink wrapping), but then no longer be free when this is called from
815 // emitPrologue.
816 //
817 // FIXME: This is a bit conservative, since in the above case we could use one
818 // of the callee-save registers as a scratch temp to re-align the stack pointer,
819 // but we would then have to make sure that we were in fact saving at least one
820 // callee-save register in the prologue, which is additional complexity that
821 // doesn't seem worth the benefit.
823  MachineFunction *MF = MBB->getParent();
824 
825  // If MBB is an entry block, use X9 as the scratch register
826  if (&MF->front() == MBB)
827  return AArch64::X9;
828 
829  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
830  const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo();
831  LivePhysRegs LiveRegs(TRI);
832  LiveRegs.addLiveIns(*MBB);
833 
834  // Mark callee saved registers as used so we will not choose them.
835  const MCPhysReg *CSRegs = MF->getRegInfo().getCalleeSavedRegs();
836  for (unsigned i = 0; CSRegs[i]; ++i)
837  LiveRegs.addReg(CSRegs[i]);
838 
839  // Prefer X9 since it was historically used for the prologue scratch reg.
840  const MachineRegisterInfo &MRI = MF->getRegInfo();
841  if (LiveRegs.available(MRI, AArch64::X9))
842  return AArch64::X9;
843 
844  for (unsigned Reg : AArch64::GPR64RegClass) {
845  if (LiveRegs.available(MRI, Reg))
846  return Reg;
847  }
848  return AArch64::NoRegister;
849 }
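// Illustrative sketch, not part of this file: the scratch register search
// above prefers X9 and otherwise takes the first available register. The
// model below is a simplification with hypothetical names: registers X0..X30
// are represented by the integers 0..30, the caller is assumed to have
// already marked live-ins and callee-saved registers as unavailable, and the
// scan order is an assumption rather than the real GPR64 class order.

```cpp
#include <bitset>
#include <cassert>

// Returns the chosen register number, or -1 if none is available
// (the counterpart of AArch64::NoRegister in this model).
inline int pickScratchRegister(const std::bitset<31> &Unavailable) {
  // Prefer X9, mirroring the historical choice for the prologue scratch reg.
  if (!Unavailable.test(9))
    return 9;
  for (int Reg = 0; Reg < 31; ++Reg)
    if (!Unavailable.test(Reg))
      return Reg;
  return -1;
}
```

// A result of -1 corresponds to the case in which the block cannot be used
// as a prologue when the stack needs to be re-aligned.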
850 
852  const MachineBasicBlock &MBB) const {
853  const MachineFunction *MF = MBB.getParent();
854  MachineBasicBlock *TmpMBB = const_cast<MachineBasicBlock *>(&MBB);
855  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
856  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
857 
858  // Don't need a scratch register if we're not going to re-align the stack.
859  if (!RegInfo->hasStackRealignment(*MF))
860  return true;
861  // Otherwise, we can use any block as long as it has a scratch register
862  // available.
863  return findScratchNonCalleeSaveRegister(TmpMBB) != AArch64::NoRegister;
864 }
865 
867  uint64_t StackSizeInBytes) {
868  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
869  if (!Subtarget.isTargetWindows())
870  return false;
871  const Function &F = MF.getFunction();
872  // TODO: When implementing stack protectors, take that into account
873  // for the probe threshold.
874  unsigned StackProbeSize =
875  F.getFnAttributeAsParsedInteger("stack-probe-size", 4096);
876  return (StackSizeInBytes >= StackProbeSize) &&
877  !F.hasFnAttribute("no-stack-arg-probe");
878 }
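// Illustrative sketch, not part of this file: the stack probe check above is
// a threshold comparison guarded by the target and by an opt-out attribute.
// The function requiresStackProbeSketch and its parameters are hypothetical
// stand-ins for the subtarget query and the "stack-probe-size" and
// "no-stack-arg-probe" function attributes.

```cpp
#include <cassert>
#include <cstdint>

inline bool requiresStackProbeSketch(bool IsTargetWindows,
                                     uint64_t StackSizeInBytes,
                                     uint64_t StackProbeSize = 4096,
                                     bool NoStackArgProbe = false) {
  if (!IsTargetWindows)
    return false;
  return StackSizeInBytes >= StackProbeSize && !NoStackArgProbe;
}
```

// With the default threshold of 4096 bytes, a 4096-byte frame on Windows is
// probed while a 4095-byte frame is not.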
879 
880 static bool needsWinCFI(const MachineFunction &MF) {
881  const Function &F = MF.getFunction();
882  return MF.getTarget().getMCAsmInfo()->usesWindowsCFI() &&
883  F.needsUnwindTableEntry();
884 }
885 
886 bool AArch64FrameLowering::shouldCombineCSRLocalStackBump(
887  MachineFunction &MF, uint64_t StackBumpBytes) const {
889  const MachineFrameInfo &MFI = MF.getFrameInfo();
890  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
891  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
892  if (homogeneousPrologEpilog(MF))
893  return false;
894 
895  if (AFI->getLocalStackSize() == 0)
896  return false;
897 
898  // For WinCFI, if optimizing for size, prefer to not combine the stack bump
899  // (to force a stp with predecrement) to match the packed unwind format,
900  // provided that there actually are any callee saved registers to merge the
901  // decrement with.
902  // This is potentially marginally slower, but allows using the packed
903  // unwind format for functions that both have a local area and callee saved
904  // registers. Using the packed unwind format notably reduces the size of
905  // the unwind info.
906  if (needsWinCFI(MF) && AFI->getCalleeSavedStackSize() > 0 &&
907  MF.getFunction().hasOptSize())
908  return false;
909 
910  // 512 is the maximum immediate for stp/ldp that will be used for
911  // callee-save save/restores
912  if (StackBumpBytes >= 512 || windowsRequiresStackProbe(MF, StackBumpBytes))
913  return false;
914 
915  if (MFI.hasVarSizedObjects())
916  return false;
917 
918  if (RegInfo->hasStackRealignment(MF))
919  return false;
920 
921  // This isn't strictly necessary, but it simplifies things a bit since the
922  // current RedZone handling code assumes the SP is adjusted by the
923  // callee-save save/restore code.
924  if (canUseRedZone(MF))
925  return false;
926 
927  // When there is an SVE area on the stack, always allocate the
928  // callee-saves and spills/locals separately.
929  if (getSVEStackSize(MF))
930  return false;
931 
932  return true;
933 }
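// Illustrative sketch (not part of this file): the function above is a chain
// of early-outs, and the SP bump is only combined when none of them fires.
// FrameFacts, withLocals() and shouldCombineBump() are hypothetical stand-ins
// for the MachineFunction queries used by the real code; they model only the
// order and meaning of the checks, under the assumption that each query can be
// reduced to a plain flag or size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of the per-function facts the real code queries.
struct FrameFacts {
  bool HomogeneousPrologEpilog = false;
  uint64_t LocalStackSize = 0;
  uint64_t CalleeSavedStackSize = 0;
  bool NeedsWinCFI = false;
  bool OptForSize = false;
  bool RequiresStackProbe = false;
  bool HasVarSizedObjects = false;
  bool HasStackRealignment = false;
  bool CanUseRedZone = false;
  bool HasSVEStack = false;
};

// Convenience constructor for a frame that only has fixed-size locals.
FrameFacts withLocals(uint64_t LocalBytes) {
  FrameFacts F;
  F.LocalStackSize = LocalBytes;
  return F;
}

// Mirrors the order of the early-outs in the listing above.
bool shouldCombineBump(const FrameFacts &F, uint64_t StackBumpBytes) {
  if (F.HomogeneousPrologEpilog)
    return false;
  if (F.LocalStackSize == 0)
    return false;
  // Prefer a pre-decrementing stp so the packed unwind format can be used.
  if (F.NeedsWinCFI && F.CalleeSavedStackSize > 0 && F.OptForSize)
    return false;
  // The combined bump must stay below the limit used for stp/ldp offsets.
  if (StackBumpBytes >= 512 || F.RequiresStackProbe)
    return false;
  if (F.HasVarSizedObjects || F.HasStackRealignment)
    return false;
  if (F.CanUseRedZone)
    return false;
  if (F.HasSVEStack)
    return false;
  return true;
}
```

// For example, a frame with 32 bytes of locals and a 48-byte total bump is
// combined in this model, while the same frame with a 512-byte bump is not.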
934 
935 bool AArch64FrameLowering::shouldCombineCSRLocalStackBumpInEpilogue(
936  MachineBasicBlock &MBB, unsigned StackBumpBytes) const {
937  if (!shouldCombineCSRLocalStackBump(*MBB.getParent(), StackBumpBytes))
938  return false;
939 
940  if (MBB.empty())
941  return true;
942 
943  // Disable combined SP bump if the last instruction is an MTE tag store. It
944  // is almost always better to merge SP adjustment into those instructions.
947  while (LastI != Begin) {
948  --LastI;
949  if (LastI->isTransient())
950  continue;
951  if (!LastI->getFlag(MachineInstr::FrameDestroy))
952  break;
953  }
954  switch (LastI->getOpcode()) {
955  case AArch64::STGloop:
956  case AArch64::STZGloop:
957  case AArch64::STGOffset:
958  case AArch64::STZGOffset:
959  case AArch64::ST2GOffset:
960  case AArch64::STZ2GOffset:
961  return false;
962  default:
963  return true;
964  }
965  llvm_unreachable("unreachable");
966 }
967 
968 // Given a load or a store instruction, generate an appropriate unwinding SEH
969 // code on Windows.
971  const TargetInstrInfo &TII,
973  unsigned Opc = MBBI->getOpcode();
975  MachineFunction &MF = *MBB->getParent();
976  DebugLoc DL = MBBI->getDebugLoc();
977  unsigned ImmIdx = MBBI->getNumOperands() - 1;
978  int Imm = MBBI->getOperand(ImmIdx).getImm();
980  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
981  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
982 
983  switch (Opc) {
984  default:
985  llvm_unreachable("No SEH Opcode for this instruction");
986  case AArch64::LDPDpost:
987  Imm = -Imm;
988  [[fallthrough]];
989  case AArch64::STPDpre: {
990  unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
991  unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(2).getReg());
992  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP_X))
993  .addImm(Reg0)
994  .addImm(Reg1)
995  .addImm(Imm * 8)
996  .setMIFlag(Flag);
997  break;
998  }
999  case AArch64::LDPXpost:
1000  Imm = -Imm;
1001  [[fallthrough]];
1002  case AArch64::STPXpre: {
1003  Register Reg0 = MBBI->getOperand(1).getReg();
1004  Register Reg1 = MBBI->getOperand(2).getReg();
1005  if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
1006  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR_X))
1007  .addImm(Imm * 8)
1008  .setMIFlag(Flag);
1009  else
1010  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP_X))
1011  .addImm(RegInfo->getSEHRegNum(Reg0))
1012  .addImm(RegInfo->getSEHRegNum(Reg1))
1013  .addImm(Imm * 8)
1014  .setMIFlag(Flag);
1015  break;
1016  }
1017  case AArch64::LDRDpost:
1018  Imm = -Imm;
1019  [[fallthrough]];
1020  case AArch64::STRDpre: {
1021  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1022  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg_X))
1023  .addImm(Reg)
1024  .addImm(Imm)
1025  .setMIFlag(Flag);
1026  break;
1027  }
1028  case AArch64::LDRXpost:
1029  Imm = -Imm;
1030  [[fallthrough]];
1031  case AArch64::STRXpre: {
1032  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1033  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg_X))
1034  .addImm(Reg)
1035  .addImm(Imm)
1036  .setMIFlag(Flag);
1037  break;
1038  }
1039  case AArch64::STPDi:
1040  case AArch64::LDPDi: {
1041  unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1042  unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1043  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP))
1044  .addImm(Reg0)
1045  .addImm(Reg1)
1046  .addImm(Imm * 8)
1047  .setMIFlag(Flag);
1048  break;
1049  }
1050  case AArch64::STPXi:
1051  case AArch64::LDPXi: {
1052  Register Reg0 = MBBI->getOperand(0).getReg();
1053  Register Reg1 = MBBI->getOperand(1).getReg();
1054  if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
1055  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR))
1056  .addImm(Imm * 8)
1057  .setMIFlag(Flag);
1058  else
1059  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP))
1060  .addImm(RegInfo->getSEHRegNum(Reg0))
1061  .addImm(RegInfo->getSEHRegNum(Reg1))
1062  .addImm(Imm * 8)
1063  .setMIFlag(Flag);
1064  break;
1065  }
1066  case AArch64::STRXui:
1067  case AArch64::LDRXui: {
1068  int Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1069  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg))
1070  .addImm(Reg)
1071  .addImm(Imm * 8)
1072  .setMIFlag(Flag);
1073  break;
1074  }
1075  case AArch64::STRDui:
1076  case AArch64::LDRDui: {
1077  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1078  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg))
1079  .addImm(Reg)
1080  .addImm(Imm * 8)
1081  .setMIFlag(Flag);
1082  break;
1083  }
1084  }
1085  auto I = MBB->insertAfter(MBBI, MIB);
1086  return I;
1087 }
1088 
1089 // Fix up the SEH opcode associated with the save/restore instruction.
1091  unsigned LocalStackSize) {
1092  MachineOperand *ImmOpnd = nullptr;
1093  unsigned ImmIdx = MBBI->getNumOperands() - 1;
1094  switch (MBBI->getOpcode()) {
1095  default:
1096  llvm_unreachable("Fix the offset in the SEH instruction");
1097  case AArch64::SEH_SaveFPLR:
1098  case AArch64::SEH_SaveRegP:
1099  case AArch64::SEH_SaveReg:
1100  case AArch64::SEH_SaveFRegP:
1101  case AArch64::SEH_SaveFReg:
1102  ImmOpnd = &MBBI->getOperand(ImmIdx);
1103  break;
1104  }
1105  if (ImmOpnd)
1106  ImmOpnd->setImm(ImmOpnd->getImm() + LocalStackSize);
1107 }
1108 
1109 // Convert callee-save register save/restore instruction to do stack pointer
1110 // decrement/increment to allocate/deallocate the callee-save stack area by
1111 // converting store/load to use pre/post increment version.
1114  const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc,
1115  bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI,
1117  int CFAOffset = 0) {
1118  unsigned NewOpc;
1119  switch (MBBI->getOpcode()) {
1120  default:
1121  llvm_unreachable("Unexpected callee-save save/restore opcode!");
1122  case AArch64::STPXi:
1123  NewOpc = AArch64::STPXpre;
1124  break;
1125  case AArch64::STPDi:
1126  NewOpc = AArch64::STPDpre;
1127  break;
1128  case AArch64::STPQi:
1129  NewOpc = AArch64::STPQpre;
1130  break;
1131  case AArch64::STRXui:
1132  NewOpc = AArch64::STRXpre;
1133  break;
1134  case AArch64::STRDui:
1135  NewOpc = AArch64::STRDpre;
1136  break;
1137  case AArch64::STRQui:
1138  NewOpc = AArch64::STRQpre;
1139  break;
1140  case AArch64::LDPXi:
1141  NewOpc = AArch64::LDPXpost;
1142  break;
1143  case AArch64::LDPDi:
1144  NewOpc = AArch64::LDPDpost;
1145  break;
1146  case AArch64::LDPQi:
1147  NewOpc = AArch64::LDPQpost;
1148  break;
1149  case AArch64::LDRXui:
1150  NewOpc = AArch64::LDRXpost;
1151  break;
1152  case AArch64::LDRDui:
1153  NewOpc = AArch64::LDRDpost;
1154  break;
1155  case AArch64::LDRQui:
1156  NewOpc = AArch64::LDRQpost;
1157  break;
1158  }
1159  // Get rid of the SEH code associated with the old instruction.
1160  if (NeedsWinCFI) {
1161  auto SEH = std::next(MBBI);
1163  SEH->eraseFromParent();
1164  }
1165 
1166  TypeSize Scale = TypeSize::Fixed(1);
1167  unsigned Width;
1168  int64_t MinOffset, MaxOffset;
1169  bool Success = static_cast<const AArch64InstrInfo *>(TII)->getMemOpInfo(
1170  NewOpc, Scale, Width, MinOffset, MaxOffset);
1171  (void)Success;
1172  assert(Success && "unknown load/store opcode");
1173 
1174  // If the first store isn't right where we want SP then we can't fold the
1175  // update in so create a normal arithmetic instruction instead.
1176  MachineFunction &MF = *MBB.getParent();
1177  if (MBBI->getOperand(MBBI->getNumOperands() - 1).getImm() != 0 ||
1178  CSStackSizeInc < MinOffset || CSStackSizeInc > MaxOffset) {
1179  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1180  StackOffset::getFixed(CSStackSizeInc), TII, FrameFlag,
1181  false, false, nullptr, EmitCFI,
1182  StackOffset::getFixed(CFAOffset));
1183 
1184  return std::prev(MBBI);
1185  }
1186 
1187  MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII->get(NewOpc));
1188  MIB.addReg(AArch64::SP, RegState::Define);
1189 
1190  // Copy all operands other than the immediate offset.
1191  unsigned OpndIdx = 0;
1192  for (unsigned OpndEnd = MBBI->getNumOperands() - 1; OpndIdx < OpndEnd;
1193  ++OpndIdx)
1194  MIB.add(MBBI->getOperand(OpndIdx));
1195 
1196  assert(MBBI->getOperand(OpndIdx).getImm() == 0 &&
1197  "Unexpected immediate offset in first/last callee-save save/restore "
1198  "instruction!");
1199  assert(MBBI->getOperand(OpndIdx - 1).getReg() == AArch64::SP &&
1200  "Unexpected base register in callee-save save/restore instruction!");
1201  assert(CSStackSizeInc % Scale == 0);
1202  MIB.addImm(CSStackSizeInc / (int)Scale);
1203 
1204  MIB.setMIFlags(MBBI->getFlags());
1205  MIB.setMemRefs(MBBI->memoperands());
1206 
1207  // Generate a new SEH code that corresponds to the new instruction.
1208  if (NeedsWinCFI) {
1209  *HasWinCFI = true;
1210  InsertSEH(*MIB, *TII, FrameFlag);
1211  }
1212 
1213  if (EmitCFI) {
1214  unsigned CFIIndex = MF.addFrameInst(
1215  MCCFIInstruction::cfiDefCfaOffset(nullptr, CFAOffset - CSStackSizeInc));
1216  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1217  .addCFIIndex(CFIIndex)
1218  .setMIFlags(FrameFlag);
1219  }
1220 
1221  return std::prev(MBB.erase(MBBI));
1222 }
1223 
1224 // Fixup callee-save register save/restore instructions to take into account
1225 // combined SP bump by adding the local stack size to the stack offsets.
1227  uint64_t LocalStackSize,
1228  bool NeedsWinCFI,
1229  bool *HasWinCFI) {
1231  return;
1232 
1233  unsigned Opc = MI.getOpcode();
1234  unsigned Scale;
1235  switch (Opc) {
1236  case AArch64::STPXi:
1237  case AArch64::STRXui:
1238  case AArch64::STPDi:
1239  case AArch64::STRDui:
1240  case AArch64::LDPXi:
1241  case AArch64::LDRXui:
1242  case AArch64::LDPDi:
1243  case AArch64::LDRDui:
1244  Scale = 8;
1245  break;
1246  case AArch64::STPQi:
1247  case AArch64::STRQui:
1248  case AArch64::LDPQi:
1249  case AArch64::LDRQui:
1250  Scale = 16;
1251  break;
1252  default:
1253  llvm_unreachable("Unexpected callee-save save/restore opcode!");
1254  }
1255 
1256  unsigned OffsetIdx = MI.getNumExplicitOperands() - 1;
1257  assert(MI.getOperand(OffsetIdx - 1).getReg() == AArch64::SP &&
1258  "Unexpected base register in callee-save save/restore instruction!");
1259  // Last operand is immediate offset that needs fixing.
1260  MachineOperand &OffsetOpnd = MI.getOperand(OffsetIdx);
1261  // All generated opcodes have scaled offsets.
1262  assert(LocalStackSize % Scale == 0);
1263  OffsetOpnd.setImm(OffsetOpnd.getImm() + LocalStackSize / Scale);
1264 
1265  if (NeedsWinCFI) {
1266  *HasWinCFI = true;
1267  auto MBBI = std::next(MachineBasicBlock::iterator(MI));
1268  assert(MBBI != MI.getParent()->end() && "Expecting a valid instruction");
1270  "Expecting a SEH instruction");
1271  fixupSEHOpcode(MBBI, LocalStackSize);
1272  }
1273 }
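// Illustrative sketch (not part of this file): when the SP bump is combined,
// each callee-save access moves up by LocalStackSize bytes. The load/store
// immediate is scaled by the access size, while the SEH immediate fixed up by
// fixupSEHOpcode() is adjusted by the unscaled byte count. The helper names
// below are hypothetical and only model that arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// New scaled immediate for an SP-relative callee-save load/store.
// Scale is 8 for X/D register forms and 16 for Q register forms.
int64_t fixupScaledOffset(int64_t ScaledImm, uint64_t LocalStackSize,
                          unsigned Scale) {
  assert(Scale != 0 && LocalStackSize % Scale == 0 &&
         "local stack size must be a multiple of the access scale");
  return ScaledImm + static_cast<int64_t>(LocalStackSize / Scale);
}

// New byte offset for the matching SEH save opcode.
int64_t fixupSEHByteOffset(int64_t ByteImm, uint64_t LocalStackSize) {
  return ByteImm + static_cast<int64_t>(LocalStackSize);
}
```

// For example, with 32 bytes of locals an stp at scaled offset 2 (16 bytes)
// moves to scaled offset 6 (48 bytes), and its SEH offset goes from 16 to 48.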
1274 
1275 static bool isTargetWindows(const MachineFunction &MF) {
1277 }
1278 
1279 // Convenience function to determine whether I is an SVE callee save.
1281  switch (I->getOpcode()) {
1282  default:
1283  return false;
1284  case AArch64::STR_ZXI:
1285  case AArch64::STR_PXI:
1286  case AArch64::LDR_ZXI:
1287  case AArch64::LDR_PXI:
1288  return I->getFlag(MachineInstr::FrameSetup) ||
1289  I->getFlag(MachineInstr::FrameDestroy);
1290  }
1291 }
1292 
1294  if (!(llvm::any_of(
1296  [](const auto &Info) { return Info.getReg() == AArch64::LR; }) &&
1297  MF.getFunction().hasFnAttribute(Attribute::ShadowCallStack)))
1298  return false;
1299 
1301  report_fatal_error("Must reserve x18 to use shadow call stack");
1302 
1303  return true;
1304 }
1305 
1307  MachineFunction &MF,
1310  const DebugLoc &DL, bool NeedsWinCFI,
1311  bool NeedsUnwindInfo) {
1312  // Shadow call stack prolog: str x30, [x18], #8
1313  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXpost))
1314  .addReg(AArch64::X18, RegState::Define)
1315  .addReg(AArch64::LR)
1316  .addReg(AArch64::X18)
1317  .addImm(8)
1319 
1320  // This instruction also makes x18 live-in to the entry block.
1321  MBB.addLiveIn(AArch64::X18);
1322 
1323  if (NeedsWinCFI)
1324  BuildMI(MBB, MBBI, DL, TII.get(AArch64::SEH_Nop))
1326 
1327  if (NeedsUnwindInfo) {
1328  // Emit a CFI instruction that causes 8 to be subtracted from the value of
1329  // x18 when unwinding past this frame.
1330  static const char CFIInst[] = {
1331  dwarf::DW_CFA_val_expression,
1332  18, // register
1333  2, // length
1334  static_cast<char>(unsigned(dwarf::DW_OP_breg18)),
1335  static_cast<char>(-8) & 0x7f, // addend (sleb128)
1336  };
1337  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createEscape(
1338  nullptr, StringRef(CFIInst, sizeof(CFIInst))));
1339  BuildMI(MBB, MBBI, DL, TII.get(AArch64::CFI_INSTRUCTION))
1340  .addCFIIndex(CFIIndex)
1342  }
1343 }
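// Illustrative sketch (not part of this file): the escape sequence above
// hand-encodes the addend -8 as a single SLEB128 byte via
// static_cast<char>(-8) & 0x7f, which is 0x78. The generic encoder below is a
// hypothetical helper, written only to show why one byte suffices. It assumes
// that right-shifting a negative value is an arithmetic shift (guaranteed
// since C++20, implementation-defined before).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode a signed value as SLEB128: 7 payload bits per byte, with the high
// bit set on every byte except the last.
std::vector<uint8_t> encodeSLEB128(int64_t Value) {
  std::vector<uint8_t> Out;
  bool More = true;
  while (More) {
    uint8_t Byte = static_cast<uint8_t>(Value & 0x7f);
    Value >>= 7; // arithmetic shift keeps the sign
    bool SignBit = (Byte & 0x40) != 0;
    // Stop once the remaining value is pure sign extension of this byte.
    if ((Value == 0 && !SignBit) || (Value == -1 && SignBit))
      More = false;
    else
      Byte |= 0x80;
    Out.push_back(Byte);
  }
  return Out;
}
```

// Encoding -8 with this helper yields the single byte 0x78, which matches the
// two-byte expression (DW_OP_breg18 followed by the addend) in the listing.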
1344 
1346  MachineFunction &MF,
1349  const DebugLoc &DL) {
1350  // Shadow call stack epilog: ldr x30, [x18, #-8]!
1351  BuildMI(MBB, MBBI, DL, TII.get(AArch64::LDRXpre))
1352  .addReg(AArch64::X18, RegState::Define)
1353  .addReg(AArch64::LR, RegState::Define)
1354  .addReg(AArch64::X18)
1355  .addImm(-8)
1357 
1359  unsigned CFIIndex =
1361  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
1362  .addCFIIndex(CFIIndex)
1364  }
1365 }
1366 
1368  MachineBasicBlock &MBB) const {
1370  const MachineFrameInfo &MFI = MF.getFrameInfo();
1371  const Function &F = MF.getFunction();
1372  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1373  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
1374  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1375  MachineModuleInfo &MMI = MF.getMMI();
1377  bool EmitCFI = AFI->needsDwarfUnwindInfo(MF);
1378  bool HasFP = hasFP(MF);
1379  bool NeedsWinCFI = needsWinCFI(MF);
1380  bool HasWinCFI = false;
1381  auto Cleanup = make_scope_exit([&]() { MF.setHasWinCFI(HasWinCFI); });
1382 
1383  bool IsFunclet = MBB.isEHFuncletEntry();
1384 
1385  // At this point, we're going to decide whether or not the function uses a
1386  // redzone. In most cases, the function doesn't have a redzone so let's
1387  // assume that's false and set it to true in the case that there's a redzone.
1388  AFI->setHasRedZone(false);
1389 
1390  // Debug location must be unknown since the first debug location is used
1391  // to determine the end of the prologue.
1392  DebugLoc DL;
1393 
1394  const auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();
1396  emitShadowCallStackPrologue(*TII, MF, MBB, MBBI, DL, NeedsWinCFI,
1397  MFnI.needsDwarfUnwindInfo(MF));
1398 
1399  if (MFnI.shouldSignReturnAddress(MF)) {
1400  if (MFnI.shouldSignWithBKey()) {
1401  BuildMI(MBB, MBBI, DL, TII->get(AArch64::EMITBKEY))
1403  }
1404 
1405  // No SEH opcode for this one; it doesn't materialize into an
1406  // instruction on Windows.
1407  BuildMI(MBB, MBBI, DL,
1408  TII->get(MFnI.shouldSignWithBKey() ? AArch64::PACIBSP
1409  : AArch64::PACIASP))
1411 
1412  if (EmitCFI) {
1413  unsigned CFIIndex =
1415  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1416  .addCFIIndex(CFIIndex)
1418  } else if (NeedsWinCFI) {
1419  HasWinCFI = true;
1420  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PACSignLR))
1422  }
1423  }
1424  if (EmitCFI && MFnI.isMTETagged()) {
1425  BuildMI(MBB, MBBI, DL, TII->get(AArch64::EMITMTETAGGED))
1427  }
1428 
1429  // We signal the presence of a Swift extended frame to external tools by
1430  // storing FP with 0b0001 in bits 63:60. In normal userland operation a simple
1431  // ORR is sufficient; it is assumed a Swift kernel would initialize the TBI
1432  // bits so that is still true.
1433  if (HasFP && AFI->hasSwiftAsyncContext()) {
1434  switch (MF.getTarget().Options.SwiftAsyncFramePointer) {
1436  if (Subtarget.swiftAsyncContextIsDynamicallySet()) {
1437  // The special symbol below is absolute and has a *value* that can be
1438  // combined with the frame pointer to signal an extended frame.
1439  BuildMI(MBB, MBBI, DL, TII->get(AArch64::LOADgot), AArch64::X16)
1440  .addExternalSymbol("swift_async_extendedFramePointerFlags",
1442  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrs), AArch64::FP)
1443  .addUse(AArch64::FP)
1444  .addUse(AArch64::X16)
1445  .addImm(Subtarget.isTargetILP32() ? 32 : 0);
1446  break;
1447  }
1448  [[fallthrough]];
1449 
1451  // ORR x29, x29, #0x1000_0000_0000_0000
1452  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXri), AArch64::FP)
1453  .addUse(AArch64::FP)
1454  .addImm(0x1100)
1456  break;
1457 
1459  break;
1460  }
1461  }
1462 
1463  // All calls are tail calls in GHC calling conv, and functions have no
1464  // prologue/epilogue.
1466  return;
1467 
1468  // Set tagged base pointer to the requested stack slot.
1469  // Ideally it should match SP value after prologue.
1470  std::optional<int> TBPI = AFI->getTaggedBasePointerIndex();
1471  if (TBPI)
1472  AFI->setTaggedBasePointerOffset(-MFI.getObjectOffset(*TBPI));
1473  else
1475 
1476  const StackOffset &SVEStackSize = getSVEStackSize(MF);
1477 
1478  // getStackSize() includes all the locals in its size calculation. We don't
1479  // include these locals when computing the stack size of a funclet, as they
1480  // are allocated in the parent's stack frame and accessed via the frame
1481  // pointer from the funclet. We only save the callee saved registers in the
1482  // funclet, which are really the callee saved registers of the parent
1483  // function, including the funclet.
1484  int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
1485  : MFI.getStackSize();
1486  if (!AFI->hasStackFrame() && !windowsRequiresStackProbe(MF, NumBytes)) {
1487  assert(!HasFP && "unexpected function without stack frame but with FP");
1488  assert(!SVEStackSize &&
1489  "unexpected function without stack frame but with SVE objects");
1490  // All of the stack allocation is for locals.
1491  AFI->setLocalStackSize(NumBytes);
1492  if (!NumBytes)
1493  return;
1494  // REDZONE: If the stack size is less than 128 bytes, we don't need
1495  // to actually allocate.
1496  if (canUseRedZone(MF)) {
1497  AFI->setHasRedZone(true);
1498  ++NumRedZoneFunctions;
1499  } else {
1500  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1501  StackOffset::getFixed(-NumBytes), TII,
1502  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1503  if (EmitCFI) {
1504  // Label used to tie together the PROLOG_LABEL and the MachineMoves.
1505  MCSymbol *FrameLabel = MMI.getContext().createTempSymbol();
1506  // Encode the stack size of the leaf function.
1507  unsigned CFIIndex = MF.addFrameInst(
1508  MCCFIInstruction::cfiDefCfaOffset(FrameLabel, NumBytes));
1509  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1510  .addCFIIndex(CFIIndex)
1512  }
1513  }
1514 
1515  if (NeedsWinCFI) {
1516  HasWinCFI = true;
1517  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1519  }
1520 
1521  return;
1522  }
1523 
1524  bool IsWin64 =
1525  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
1526  unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
1527 
1528  auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
1529  // All of the remaining stack allocations are for locals.
1530  AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
1531  bool CombineSPBump = shouldCombineCSRLocalStackBump(MF, NumBytes);
1532  bool HomPrologEpilog = homogeneousPrologEpilog(MF);
1533  if (CombineSPBump) {
1534  assert(!SVEStackSize && "Cannot combine SP bump with SVE");
1535  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1536  StackOffset::getFixed(-NumBytes), TII,
1537  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI,
1538  EmitCFI);
1539  NumBytes = 0;
1540  } else if (HomPrologEpilog) {
1541  // Stack has been already adjusted.
1542  NumBytes -= PrologueSaveSize;
1543  } else if (PrologueSaveSize != 0) {
1545  MBB, MBBI, DL, TII, -PrologueSaveSize, NeedsWinCFI, &HasWinCFI,
1546  EmitCFI);
1547  NumBytes -= PrologueSaveSize;
1548  }
1549  assert(NumBytes >= 0 && "Negative stack allocation size!?");
1550 
1551  // Move past the saves of the callee-saved registers, fixing up the offsets
1552  // and pre-inc if we decided to combine the callee-save and local stack
1553  // pointer bump above.
1555  while (MBBI != End && MBBI->getFlag(MachineInstr::FrameSetup) &&
1556  !IsSVECalleeSave(MBBI)) {
1557  if (CombineSPBump)
1559  NeedsWinCFI, &HasWinCFI);
1560  ++MBBI;
1561  }
1562 
1563  // For funclets the FP belongs to the containing function.
1564  if (!IsFunclet && HasFP) {
1565  // Only set up FP if we actually need to.
1566  int64_t FPOffset = AFI->getCalleeSaveBaseToFrameRecordOffset();
1567 
1568  if (CombineSPBump)
1569  FPOffset += AFI->getLocalStackSize();
1570 
1571  if (AFI->hasSwiftAsyncContext()) {
1572  // Before we update the live FP we have to ensure there's a valid (or
1573  // null) asynchronous context in its slot just before FP in the frame
1574  // record, so store it now.
1575  const auto &Attrs = MF.getFunction().getAttributes();
1576  bool HaveInitialContext = Attrs.hasAttrSomewhere(Attribute::SwiftAsync);
1577  if (HaveInitialContext)
1578  MBB.addLiveIn(AArch64::X22);
1579  BuildMI(MBB, MBBI, DL, TII->get(AArch64::StoreSwiftAsyncContext))
1580  .addUse(HaveInitialContext ? AArch64::X22 : AArch64::XZR)
1581  .addUse(AArch64::SP)
1582  .addImm(FPOffset - 8)
1584  }
1585 
1586  if (HomPrologEpilog) {
1587  auto Prolog = MBBI;
1588  --Prolog;
1589  assert(Prolog->getOpcode() == AArch64::HOM_Prolog);
1590  Prolog->addOperand(MachineOperand::CreateImm(FPOffset));
1591  } else {
1592  // Issue sub fp, sp, FPOffset or
1593  // mov fp,sp when FPOffset is zero.
1594  // Note: All stores of callee-saved registers are marked as "FrameSetup".
1595  // This code marks the instruction(s) that set the FP also.
1596  emitFrameOffset(MBB, MBBI, DL, AArch64::FP, AArch64::SP,
1597  StackOffset::getFixed(FPOffset), TII,
1598  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1599  if (NeedsWinCFI && HasWinCFI) {
1600  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1602  // After setting up the FP, the rest of the prolog doesn't need to be
1603  // included in the SEH unwind info.
1604  NeedsWinCFI = false;
1605  }
1606  }
1607  if (EmitCFI) {
1608  // Define the current CFA rule to use the provided FP.
1609  const int OffsetToFirstCalleeSaveFromFP =
1611  AFI->getCalleeSavedStackSize();
1612  Register FramePtr = RegInfo->getFrameRegister(MF);
1613  unsigned Reg = RegInfo->getDwarfRegNum(FramePtr, true);
1614  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
1615  nullptr, Reg, FixedObject - OffsetToFirstCalleeSaveFromFP));
1616  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1617  .addCFIIndex(CFIIndex)
1619  }
1620  }
1621 
1622  // Now emit the moves for whatever callee saved regs we have (including FP,
1623  // LR if those are saved). Frame instructions for SVE registers are emitted
1624  // later, after the instructions which actually save SVE regs.
1625  if (EmitCFI)
1626  emitCalleeSavedGPRLocations(MBB, MBBI);
1627 
1628  // Alignment is required for the parent frame, not the funclet
1629  const bool NeedsRealignment =
1630  NumBytes && !IsFunclet && RegInfo->hasStackRealignment(MF);
1631  int64_t RealignmentPadding =
1632  (NeedsRealignment && MFI.getMaxAlign() > Align(16))
1633  ? MFI.getMaxAlign().value() - 16
1634  : 0;
1635 
1636  if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) {
1637  uint64_t NumWords = (NumBytes + RealignmentPadding) >> 4;
1638  if (NeedsWinCFI) {
1639  HasWinCFI = true;
1640  // alloc_l can hold at most 256MB, so assume that NumBytes doesn't
1641  // exceed this amount. We need to move at most 2^24 - 1 into x15.
1642  // This is at most two instructions, MOVZ followed by MOVK.
1643  // TODO: Fix to use multiple stack alloc unwind codes for stacks
1644  // exceeding 256MB in size.
1645  if (NumBytes >= (1 << 28))
1646  report_fatal_error("Stack size cannot exceed 256MB for stack "
1647  "unwinding purposes");
1648 
1649  uint32_t LowNumWords = NumWords & 0xFFFF;
1650  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVZXi), AArch64::X15)
1651  .addImm(LowNumWords)
1654  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1656  if ((NumWords & 0xFFFF0000) != 0) {
1657  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVKXi), AArch64::X15)
1658  .addReg(AArch64::X15)
1659  .addImm((NumWords & 0xFFFF0000) >> 16) // High half
1662  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1664  }
1665  } else {
1666  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVi64imm), AArch64::X15)
1667  .addImm(NumWords)
1669  }
1670 
1671  const char* ChkStk = Subtarget.getChkStkName();
1672  switch (MF.getTarget().getCodeModel()) {
1673  case CodeModel::Tiny:
1674  case CodeModel::Small:
1675  case CodeModel::Medium:
1676  case CodeModel::Kernel:
1677  BuildMI(MBB, MBBI, DL, TII->get(AArch64::BL))
1678  .addExternalSymbol(ChkStk)
1679  .addReg(AArch64::X15, RegState::Implicit)
1684  if (NeedsWinCFI) {
1685  HasWinCFI = true;
1686  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1688  }
1689  break;
1690  case CodeModel::Large:
1691  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVaddrEXT))
1692  .addReg(AArch64::X16, RegState::Define)
1693  .addExternalSymbol(ChkStk)
1694  .addExternalSymbol(ChkStk)
1696  if (NeedsWinCFI) {
1697  HasWinCFI = true;
1698  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1700  }
1701 
1702  BuildMI(MBB, MBBI, DL, TII->get(getBLRCallOpcode(MF)))
1703  .addReg(AArch64::X16, RegState::Kill)
1704  .addReg(AArch64::X15, RegState::Implicit | RegState::Define)
1709  if (NeedsWinCFI) {
1710  HasWinCFI = true;
1711  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1713  }
1714  break;
1715  }
1716 
1717  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SUBXrx64), AArch64::SP)
1718  .addReg(AArch64::SP, RegState::Kill)
1719  .addReg(AArch64::X15, RegState::Kill)
1722  if (NeedsWinCFI) {
1723  HasWinCFI = true;
1724  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_StackAlloc))
1725  .addImm(NumBytes)
1727  }
1728  NumBytes = 0;
1729 
1730  if (RealignmentPadding > 0) {
1731  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ADDXri), AArch64::X15)
1732  .addReg(AArch64::SP)
1733  .addImm(RealignmentPadding)
1734  .addImm(0);
1735 
1736  uint64_t AndMask = ~(MFI.getMaxAlign().value() - 1);
1737  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ANDXri), AArch64::SP)
1738  .addReg(AArch64::X15, RegState::Kill)
1740  AFI->setStackRealigned(true);
1741 
1742  // No need for SEH instructions here; if we're realigning the stack,
1743  // we've set a frame pointer and already finished the SEH prologue.
1744  assert(!NeedsWinCFI);
1745  }
1746  }
1747 
1748  StackOffset AllocateBefore = SVEStackSize, AllocateAfter = {};
1749  MachineBasicBlock::iterator CalleeSavesBegin = MBBI, CalleeSavesEnd = MBBI;
1750 
1751  // Process the SVE callee-saves to determine what space needs to be
1752  // allocated.
1753  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
1754  // Find callee save instructions in frame.
1755  CalleeSavesBegin = MBBI;
1756  assert(IsSVECalleeSave(CalleeSavesBegin) && "Unexpected instruction");
1758  ++MBBI;
1759  CalleeSavesEnd = MBBI;
1760 
1761  AllocateBefore = StackOffset::getScalable(CalleeSavedSize);
1762  AllocateAfter = SVEStackSize - AllocateBefore;
1763  }
1764 
1765  // Allocate space for the callee saves (if any).
1767  MBB, CalleeSavesBegin, DL, AArch64::SP, AArch64::SP, -AllocateBefore, TII,
1768  MachineInstr::FrameSetup, false, false, nullptr,
1769  EmitCFI && !HasFP && AllocateBefore,
1770  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
1771 
1772  if (EmitCFI)
1773  emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
1774 
1775  // Finally allocate remaining SVE stack space.
1776  emitFrameOffset(MBB, CalleeSavesEnd, DL, AArch64::SP, AArch64::SP,
1777  -AllocateAfter, TII, MachineInstr::FrameSetup, false, false,
1778  nullptr, EmitCFI && !HasFP && AllocateAfter,
1779  AllocateBefore + StackOffset::getFixed(
1780  (int64_t)MFI.getStackSize() - NumBytes));
1781 
1782  // Allocate space for the rest of the frame.
1783  if (NumBytes) {
1784  unsigned scratchSPReg = AArch64::SP;
1785 
1786  if (NeedsRealignment) {
1787  scratchSPReg = findScratchNonCalleeSaveRegister(&MBB);
1788  assert(scratchSPReg != AArch64::NoRegister);
1789  }
1790 
1791  // If we're a leaf function, try using the red zone.
1792  if (!canUseRedZone(MF)) {
1793  // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
1794  // the correct value here, as NumBytes also includes padding bytes,
1795  // which shouldn't be counted here.
1797  MBB, MBBI, DL, scratchSPReg, AArch64::SP,
1799  false, NeedsWinCFI, &HasWinCFI, EmitCFI && !HasFP,
1800  SVEStackSize +
1801  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
1802  }
1803  if (NeedsRealignment) {
1804  assert(MFI.getMaxAlign() > Align(1));
1805  assert(scratchSPReg != AArch64::SP);
1806 
1807  // SUB X9, SP, NumBytes
1808  // -- X9 is a temporary register, so it shouldn't contain any live data here,
1809  // -- free to use. This is already produced by emitFrameOffset above.
1810  // AND SP, X9, 0b11111...0000
1811  uint64_t AndMask = ~(MFI.getMaxAlign().value() - 1);
1812 
1813  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ANDXri), AArch64::SP)
1814  .addReg(scratchSPReg, RegState::Kill)
1816  AFI->setStackRealigned(true);
1817 
1818  // No need for SEH instructions here; if we're realigning the stack,
1819  // we've set a frame pointer and already finished the SEH prologue.
1820  assert(!NeedsWinCFI);
1821  }
1822  }
1823 
1824  // If we need a base pointer, set it up here. It's whatever the value of the
1825  // stack pointer is at this point. Any variable size objects will be allocated
1826  // after this, so we can still use the base pointer to reference locals.
1827  //
1828  // FIXME: Clarify FrameSetup flags here.
1829  // Note: Use emitFrameOffset() like above for FP if the FrameSetup flag is
1830  // needed.
1831  // For funclets the BP belongs to the containing function.
1832  if (!IsFunclet && RegInfo->hasBasePointer(MF)) {
1833  TII->copyPhysReg(MBB, MBBI, DL, RegInfo->getBaseRegister(), AArch64::SP,
1834  false);
1835  if (NeedsWinCFI) {
1836  HasWinCFI = true;
1837  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1839  }
1840  }
1841 
1842  // The very last FrameSetup instruction indicates the end of prologue. Emit a
1843  // SEH opcode indicating the prologue end.
1844  if (NeedsWinCFI && HasWinCFI) {
1845  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1847  }
1848 
1849  // SEH funclets are passed the frame pointer in X1. If the parent
1850  // function uses the base register, then the base register is used
1851  // directly, and is not retrieved from X1.
1852  if (IsFunclet && F.hasPersonalityFn()) {
1853  EHPersonality Per = classifyEHPersonality(F.getPersonalityFn());
1854  if (isAsynchronousEHPersonality(Per)) {
1855  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::COPY), AArch64::FP)
1856  .addReg(AArch64::X1)
1858  MBB.addLiveIn(AArch64::X1);
1859  }
1860  }
1861 }
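// Illustrative sketch (not part of this file): the stack-probe path in the
// prologue above converts the allocation size into 16-byte units
// (the ">> 4"), splits that count into a low half for MOVZ and a high half
// for MOVK, and, when realigning, pads by MaxAlign - 16 and masks SP down to
// the alignment. The struct and helper names below are hypothetical; they
// model only the integer arithmetic, not the emitted instructions.

```cpp
#include <cassert>
#include <cstdint>

struct ProbeImmediates {
  uint64_t NumWords; // allocation size in 16-byte units
  uint32_t Low16;    // immediate for MOVZ
  uint32_t High16;   // immediate for MOVK (only needed if non-zero)
  bool NeedsMovk;
};

// Extra bytes reserved so SP can be rounded down to MaxAlign afterwards.
int64_t realignmentPadding(uint64_t MaxAlign) {
  return MaxAlign > 16 ? static_cast<int64_t>(MaxAlign - 16) : 0;
}

// Mask that clears the low bits of SP; MaxAlign must be a power of two.
uint64_t alignMask(uint64_t MaxAlign) {
  assert(MaxAlign != 0 && (MaxAlign & (MaxAlign - 1)) == 0);
  return ~(MaxAlign - 1);
}

ProbeImmediates splitProbeSize(int64_t NumBytes, int64_t Padding) {
  // The listing rejects sizes of 256MB or more when WinCFI is needed.
  assert(NumBytes >= 0 && NumBytes < (int64_t(1) << 28));
  uint64_t NumWords = static_cast<uint64_t>(NumBytes + Padding) >> 4;
  ProbeImmediates P;
  P.NumWords = NumWords;
  P.Low16 = static_cast<uint32_t>(NumWords & 0xFFFF);
  P.High16 = static_cast<uint32_t>((NumWords & 0xFFFF0000) >> 16);
  P.NeedsMovk = P.High16 != 0;
  return P;
}
```

// For example, a 4096-byte allocation is 256 units and fits in MOVZ alone,
// while a 2MB allocation is 0x20000 units and needs MOVK for the high half.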
1862 
1864  bool NeedsWinCFI, bool *HasWinCFI) {
1865  const auto &MFI = *MF.getInfo<AArch64FunctionInfo>();
1866  if (!MFI.shouldSignReturnAddress(MF))
1867  return;
1868  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1869  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1870 
1872  DebugLoc DL;
1873  if (MBBI != MBB.end())
1874  DL = MBBI->getDebugLoc();
1875 
1876  // The AUTIASP instruction assembles to a hint instruction before v8.3a so
1877  // this instruction can safely be used for any v8a architecture.
1878  // From v8.3a onwards there are optimised authenticate LR and return
1879  // instructions, namely RETA{A,B}, that can be used instead. In this case the
1880  // DW_CFA_AARCH64_negate_ra_state can't be emitted.
1881  if (Subtarget.hasPAuth() &&
1882  !MF.getFunction().hasFnAttribute(Attribute::ShadowCallStack) &&
1883  MBBI != MBB.end() && MBBI->getOpcode() == AArch64::RET_ReallyLR &&
1884  !NeedsWinCFI) {
1885  BuildMI(MBB, MBBI, DL,
1886  TII->get(MFI.shouldSignWithBKey() ? AArch64::RETAB : AArch64::RETAA))
1887  .copyImplicitOps(*MBBI);
1888  MBB.erase(MBBI);
1889  } else {
1890  BuildMI(
1891  MBB, MBBI, DL,
1892  TII->get(MFI.shouldSignWithBKey() ? AArch64::AUTIBSP : AArch64::AUTIASP))
1894 
1895  unsigned CFIIndex =
1897  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1898  .addCFIIndex(CFIIndex)
1900  if (NeedsWinCFI) {
1901  *HasWinCFI = true;
1902  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PACSignLR))
1904  }
1905  }
1906 }
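// Illustrative sketch only (not part of the LLVM source): the opcode choice
// made above can be modelled as a pure function. The name selectAuthOpcode and
// its boolean parameters are hypothetical stand-ins for the subtarget and
// function queries; the model ignores the CFI and SEH instructions that the
// real code also emits on the AUTI*SP path.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: returns the mnemonic the epilogue would use to
// authenticate the return address.
std::string selectAuthOpcode(bool HasPAuth, bool HasShadowCallStack,
                             bool EndsInPlainRet, bool NeedsWinCFI,
                             bool SignWithBKey) {
  // The combined authenticate-and-return form is only chosen when every
  // condition checked by the original code holds.
  if (HasPAuth && !HasShadowCallStack && EndsInPlainRet && !NeedsWinCFI)
    return SignWithBKey ? "RETAB" : "RETAA";
  // Otherwise a separate authenticate instruction is placed before the
  // terminator.
  return SignWithBKey ? "AUTIBSP" : "AUTIASP";
}
```

// Under these assumptions, a v8.3a function ending in a plain return and
// signed with the A key maps to RETAA, while the same function needing
// Windows CFI falls back to AUTIASP.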
1907 
1908 static bool isFuncletReturnInstr(const MachineInstr &MI) {
1909  switch (MI.getOpcode()) {
1910  default:
1911  return false;
1912  case AArch64::CATCHRET:
1913  case AArch64::CLEANUPRET:
1914  return true;
1915  }
1916 }
1917 
1918 void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
1919  MachineBasicBlock &MBB) const {
1921  MachineFrameInfo &MFI = MF.getFrameInfo();
1922  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1923  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1924  DebugLoc DL;
1925  bool NeedsWinCFI = needsWinCFI(MF);
1926  bool EmitCFI =
1927  MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(MF);
1928  bool HasWinCFI = false;
1929  bool IsFunclet = false;
1930  auto WinCFI = make_scope_exit([&]() { assert(HasWinCFI == MF.hasWinCFI()); });
1931 
1932  if (MBB.end() != MBBI) {
1933  DL = MBBI->getDebugLoc();
1934  IsFunclet = isFuncletReturnInstr(*MBBI);
1935  }
1936 
1937  auto FinishingTouches = make_scope_exit([&]() {
1938  InsertReturnAddressAuth(MF, MBB, NeedsWinCFI, &HasWinCFI);
1941  if (EmitCFI)
1942  emitCalleeSavedGPRRestores(MBB, MBB.getFirstTerminator());
1943  if (HasWinCFI)
1945  TII->get(AArch64::SEH_EpilogEnd))
1947  });
1948 
1949  int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
1950  : MFI.getStackSize();
1951  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
1952 
1953  // All calls are tail calls in GHC calling conv, and functions have no
1954  // prologue/epilogue.
1955  if (MF.getFunction().getCallingConv() == CallingConv::GHC)
1956  return;
1957 
1958  // How much of the stack used by incoming arguments this function is expected
1959  // to restore in this particular epilogue.
1960  int64_t ArgumentStackToRestore = getArgumentStackToRestore(MF, MBB);
1961  bool IsWin64 =
1962  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
1963  unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
1964 
1965  int64_t AfterCSRPopSize = ArgumentStackToRestore;
1966  auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
1967  // We cannot rely on the local stack size set in emitPrologue if the function
1968  // has funclets, as funclets have different local stack size requirements, and
1969  // the current value set in emitPrologue may be that of the containing
1970  // function.
1971  if (MF.hasEHFunclets())
1972  AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
1973  if (homogeneousPrologEpilog(MF, &MBB)) {
1974  assert(!NeedsWinCFI);
1975  auto LastPopI = MBB.getFirstTerminator();
1976  if (LastPopI != MBB.begin()) {
1977  auto HomogeneousEpilog = std::prev(LastPopI);
1978  if (HomogeneousEpilog->getOpcode() == AArch64::HOM_Epilog)
1979  LastPopI = HomogeneousEpilog;
1980  }
1981 
1982  // Adjust local stack
1983  emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
1985  MachineInstr::FrameDestroy, false, NeedsWinCFI);
1986 
1987  // SP has been already adjusted while restoring callee save regs.
1988  // We have already bailed out of the case that adjusts SP for arguments.
1989  assert(AfterCSRPopSize == 0);
1990  return;
1991  }
1992  bool CombineSPBump = shouldCombineCSRLocalStackBumpInEpilogue(MBB, NumBytes);
1993  // Assume we can't combine the last pop with the sp restore.
1994 
1995  bool CombineAfterCSRBump = false;
1996  if (!CombineSPBump && PrologueSaveSize != 0) {
1998  while (Pop->getOpcode() == TargetOpcode::CFI_INSTRUCTION ||
2000  Pop = std::prev(Pop);
2001  // Converting the last ldp to a post-index ldp is valid only if the last
2002  // ldp's offset is 0.
2003  const MachineOperand &OffsetOp = Pop->getOperand(Pop->getNumOperands() - 1);
2004  // If the offset is 0 and the AfterCSR pop is not actually trying to
2005  // allocate more stack for arguments (in space that an untimely interrupt
2006  // may clobber), convert it to a post-index ldp.
2007  if (OffsetOp.getImm() == 0 && AfterCSRPopSize >= 0) {
2009  MBB, Pop, DL, TII, PrologueSaveSize, NeedsWinCFI, &HasWinCFI, EmitCFI,
2010  MachineInstr::FrameDestroy, PrologueSaveSize);
2011  } else {
2012  // If not, make sure to emit an add after the last ldp.
2013  // We're doing this by transferring the size to be restored from the
2014  // adjustment *before* the CSR pops to the adjustment *after* the CSR
2015  // pops.
2016  AfterCSRPopSize += PrologueSaveSize;
2017  CombineAfterCSRBump = true;
2018  }
2019  }
2020 
2021  // Move past the restores of the callee-saved registers.
2022  // If we plan on combining the sp bump of the local stack size and the callee
2023  // save stack size, we might need to adjust the CSR save and restore offsets.
2026  while (LastPopI != Begin) {
2027  --LastPopI;
2028  if (!LastPopI->getFlag(MachineInstr::FrameDestroy) ||
2029  IsSVECalleeSave(LastPopI)) {
2030  ++LastPopI;
2031  break;
2032  } else if (CombineSPBump)
2034  NeedsWinCFI, &HasWinCFI);
2035  }
2036 
2037  if (MF.hasWinCFI()) {
2038  // If the prologue didn't contain any SEH opcodes and didn't set the
2039  // MF.hasWinCFI() flag, assume the epilogue won't either, and skip the
2040  // EpilogStart - to avoid generating CFI for functions that don't need it.
2041  // (And as we didn't generate any prologue at all, it would be asymmetrical
2042  // to the epilogue.) By the end of the function, we assert that
2043  // HasWinCFI is equal to MF.hasWinCFI(), to verify this assumption.
2044  HasWinCFI = true;
2045  BuildMI(MBB, LastPopI, DL, TII->get(AArch64::SEH_EpilogStart))
2047  }
2048 
2049  if (hasFP(MF) && AFI->hasSwiftAsyncContext()) {
2050  switch (MF.getTarget().Options.SwiftAsyncFramePointer) {
2052  // Avoid the reload as it is GOT relative, and instead fall back to the
2053  // hardcoded value below. This allows a mismatch between the OS and
2054  // application without immediately terminating on the difference.
2055  [[fallthrough]];
2057  // We need to reset FP to its untagged state on return. Bit 60 is
2058  // currently used to show the presence of an extended frame.
2059 
2060  // BIC x29, x29, #0x1000_0000_0000_0000
2061  BuildMI(MBB, MBB.getFirstTerminator(), DL, TII->get(AArch64::ANDXri),
2062  AArch64::FP)
2063  .addUse(AArch64::FP)
2064  .addImm(0x10fe)
2066  break;
2067 
2069  break;
2070  }
2071  }
2072 
2073  const StackOffset &SVEStackSize = getSVEStackSize(MF);
2074 
2075  // If there is a single SP update, insert it before the ret and we're done.
2076  if (CombineSPBump) {
2077  assert(!SVEStackSize && "Cannot combine SP bump with SVE");
2078 
2079  // When we are about to restore the CSRs, the CFA register is SP again.
2080  if (EmitCFI && hasFP(MF)) {
2081  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2082  unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2083  unsigned CFIIndex =
2084  MF.addFrameInst(MCCFIInstruction::cfiDefCfa(nullptr, Reg, NumBytes));
2085  BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2086  .addCFIIndex(CFIIndex)
2088  }
2089 
2090  emitFrameOffset(MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2091  StackOffset::getFixed(NumBytes + (int64_t)AfterCSRPopSize),
2092  TII, MachineInstr::FrameDestroy, false, NeedsWinCFI,
2093  &HasWinCFI, EmitCFI, StackOffset::getFixed(NumBytes));
2094  return;
2095  }
2096 
2097  NumBytes -= PrologueSaveSize;
2098  assert(NumBytes >= 0 && "Negative stack allocation size!?");
2099 
2100  // Process the SVE callee-saves to determine what space needs to be
2101  // deallocated.
2102  StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
2103  MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
2104  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2105  RestoreBegin = std::prev(RestoreEnd);
2106  while (RestoreBegin != MBB.begin() &&
2107  IsSVECalleeSave(std::prev(RestoreBegin)))
2108  --RestoreBegin;
2109 
2110  assert(IsSVECalleeSave(RestoreBegin) &&
2111  IsSVECalleeSave(std::prev(RestoreEnd)) && "Unexpected instruction");
2112 
2113  StackOffset CalleeSavedSizeAsOffset =
2114  StackOffset::getScalable(CalleeSavedSize);
2115  DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
2116  DeallocateAfter = CalleeSavedSizeAsOffset;
2117  }
2118 
2119  // Deallocate the SVE area.
2120  if (SVEStackSize) {
2121  // If we have stack realignment or variable sized objects on the stack,
2122  // restore the stack pointer from the frame pointer prior to SVE CSR
2123  // restoration.
2124  if (AFI->isStackRealigned() || MFI.hasVarSizedObjects()) {
2125  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2126  // Set SP to start of SVE callee-save area from which they can
2127  // be reloaded. The code below will deallocate the stack space
2128  // by moving FP -> SP.
2129  emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
2130  StackOffset::getScalable(-CalleeSavedSize), TII,
2132  }
2133  } else {
2134  if (AFI->getSVECalleeSavedStackSize()) {
2135  // Deallocate the non-SVE locals first before we can deallocate (and
2136  // restore callee saves) from the SVE area.
2138  MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2140  false, false, nullptr, EmitCFI && !hasFP(MF),
2141  SVEStackSize + StackOffset::getFixed(NumBytes + PrologueSaveSize));
2142  NumBytes = 0;
2143  }
2144 
2145  emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2146  DeallocateBefore, TII, MachineInstr::FrameDestroy, false,
2147  false, nullptr, EmitCFI && !hasFP(MF),
2148  SVEStackSize +
2149  StackOffset::getFixed(NumBytes + PrologueSaveSize));
2150 
2151  emitFrameOffset(MBB, RestoreEnd, DL, AArch64::SP, AArch64::SP,
2152  DeallocateAfter, TII, MachineInstr::FrameDestroy, false,
2153  false, nullptr, EmitCFI && !hasFP(MF),
2154  DeallocateAfter +
2155  StackOffset::getFixed(NumBytes + PrologueSaveSize));
2156  }
2157  if (EmitCFI)
2158  emitCalleeSavedSVERestores(MBB, RestoreEnd);
2159  }
2160 
2161  if (!hasFP(MF)) {
2162  bool RedZone = canUseRedZone(MF);
2163  // If this was a redzone leaf function, we don't need to restore the
2164  // stack pointer (but we may need to pop stack args for fastcc).
2165  if (RedZone && AfterCSRPopSize == 0)
2166  return;
2167 
2168  // Pop the local variables off the stack. If there are no callee-saved
2169  // registers, it means we are actually positioned at the terminator and can
2170  // combine stack increment for the locals and the stack increment for
2171  // callee-popped arguments into (possibly) a single instruction and be done.
2172  bool NoCalleeSaveRestore = PrologueSaveSize == 0;
2173  int64_t StackRestoreBytes = RedZone ? 0 : NumBytes;
2174  if (NoCalleeSaveRestore)
2175  StackRestoreBytes += AfterCSRPopSize;
2176 
2178  MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2179  StackOffset::getFixed(StackRestoreBytes), TII,
2180  MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2181  StackOffset::getFixed((RedZone ? 0 : NumBytes) + PrologueSaveSize));
2182 
2183  // If we were able to combine the local stack pop with the argument pop,
2184  // then we're done.
2185  if (NoCalleeSaveRestore || AfterCSRPopSize == 0) {
2186  return;
2187  }
2188 
2189  NumBytes = 0;
2190  }
2191 
2192  // Restore the original stack pointer.
2193  // FIXME: Rather than doing the math here, we should instead just use
2194  // non-post-indexed loads for the restores if we aren't actually going to
2195  // be able to save any instructions.
2196  if (!IsFunclet && (MFI.hasVarSizedObjects() || AFI->isStackRealigned())) {
2198  MBB, LastPopI, DL, AArch64::SP, AArch64::FP,
2200  TII, MachineInstr::FrameDestroy, false, NeedsWinCFI);
2201  } else if (NumBytes)
2202  emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2203  StackOffset::getFixed(NumBytes), TII,
2204  MachineInstr::FrameDestroy, false, NeedsWinCFI);
2205 
2206  // When we are about to restore the CSRs, the CFA register is SP again.
2207  if (EmitCFI && hasFP(MF)) {
2208  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2209  unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2210  unsigned CFIIndex = MF.addFrameInst(
2211  MCCFIInstruction::cfiDefCfa(nullptr, Reg, PrologueSaveSize));
2212  BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2213  .addCFIIndex(CFIIndex)
2215  }
2216 
2217  // This must be placed after the callee-save restore code because that code
2218  // assumes the SP is at the same location as it was after the callee-save save
2219  // code in the prologue.
2220  if (AfterCSRPopSize) {
2221  assert(AfterCSRPopSize > 0 && "attempting to reallocate arg stack that an "
2222  "interrupt may have clobbered");
2223 
2225  MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2227  false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2228  StackOffset::getFixed(CombineAfterCSRBump ? PrologueSaveSize : 0));
2229  }
2230 }
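// Illustrative sketch only (not part of the LLVM source): the decision in
// emitEpilogue about folding the callee-save deallocation into the last ldp
// can be modelled in isolation. PopPlan and planLastPop are hypothetical
// names; the offsets are plain integers rather than MachineOperands.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: whether the last ldp becomes post-indexed, and how
// many bytes remain to be popped after the callee-save restores.
struct PopPlan {
  bool UsePostIndex;
  int64_t AfterCSRPopSize;
};

PopPlan planLastPop(int64_t LastLdpOffset, int64_t AfterCSRPopSize,
                    int64_t PrologueSaveSize) {
  // A post-index ldp is only valid when the last ldp has offset 0 and the
  // later pop does not try to allocate more stack (a negative size).
  if (LastLdpOffset == 0 && AfterCSRPopSize >= 0)
    return {true, AfterCSRPopSize};
  // Otherwise move the callee-save area size into the adjustment emitted
  // after the restores.
  return {false, AfterCSRPopSize + PrologueSaveSize};
}
```

// For example, with a 32-byte callee-save area and a last ldp at offset 16,
// the model keeps the ldp as is and reports 32 bytes to pop afterwards.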
2231 
2232 /// getFrameIndexReference - Provide a base+offset reference to an FI slot for
2233 /// debug info. It's the same as what we use for resolving the code-gen
2234 /// references for now. FIXME: This can go wrong when references are
2235 /// SP-relative and simple call frames aren't used.
2238  Register &FrameReg) const {
2240  MF, FI, FrameReg,
2241  /*PreferFP=*/
2242  MF.getFunction().hasFnAttribute(Attribute::SanitizeHWAddress),
2243  /*ForSimm=*/false);
2244 }
2245 
2248  int FI) const {
2250 }
2251 
2253  int64_t ObjectOffset) {
2254  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2255  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2256  bool IsWin64 =
2257  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
2258  unsigned FixedObject =
2259  getFixedObjectSize(MF, AFI, IsWin64, /*IsFunclet=*/false);
2260  int64_t CalleeSaveSize = AFI->getCalleeSavedStackSize(MF.getFrameInfo());
2261  int64_t FPAdjust =
2262  CalleeSaveSize - AFI->getCalleeSaveBaseToFrameRecordOffset();
2263  return StackOffset::getFixed(ObjectOffset + FixedObject + FPAdjust);
2264 }
2265 
2267  int64_t ObjectOffset) {
2268  const auto &MFI = MF.getFrameInfo();
2269  return StackOffset::getFixed(ObjectOffset + (int64_t)MFI.getStackSize());
2270 }
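// Illustrative sketch only (not part of the LLVM source): the arithmetic in
// getFPOffset and getStackOffset, restated over a plain struct. FrameModel and
// its field values are hypothetical; the real code reads these quantities from
// AArch64FunctionInfo and MachineFrameInfo.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the frame quantities used by the two helpers.
struct FrameModel {
  int64_t StackSize;                         // total frame size in bytes
  int64_t FixedObjectSize;                   // e.g. the Win64 varargs area
  int64_t CalleeSaveSize;                    // callee-save area size
  int64_t CalleeSaveBaseToFrameRecordOffset; // frame record position in it
};

// Offset of an object relative to FP.
int64_t fpOffsetModel(const FrameModel &F, int64_t ObjectOffset) {
  int64_t FPAdjust = F.CalleeSaveSize - F.CalleeSaveBaseToFrameRecordOffset;
  return ObjectOffset + F.FixedObjectSize + FPAdjust;
}

// Offset of the same object relative to SP after the prologue.
int64_t spOffsetModel(const FrameModel &F, int64_t ObjectOffset) {
  return ObjectOffset + F.StackSize;
}
```

// With a 64-byte frame, a 32-byte callee-save area, and the frame record 16
// bytes above the callee-save base, an object at offset -40 is at FP-24 and
// SP+24 in this model.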
2271 
2272  // TODO: This function currently does not work for scalable vectors.
2274  int FI) const {
2275  const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2276  MF.getSubtarget().getRegisterInfo());
2277  int ObjectOffset = MF.getFrameInfo().getObjectOffset(FI);
2278  return RegInfo->getLocalAddressRegister(MF) == AArch64::FP
2279  ? getFPOffset(MF, ObjectOffset).getFixed()
2280  : getStackOffset(MF, ObjectOffset).getFixed();
2281 }
2282 
2284  const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP,
2285  bool ForSimm) const {
2286  const auto &MFI = MF.getFrameInfo();
2287  int64_t ObjectOffset = MFI.getObjectOffset(FI);
2288  bool isFixed = MFI.isFixedObjectIndex(FI);
2289  bool isSVE = MFI.getStackID(FI) == TargetStackID::ScalableVector;
2290  return resolveFrameOffsetReference(MF, ObjectOffset, isFixed, isSVE, FrameReg,
2291  PreferFP, ForSimm);
2292 }
2293 
2295  const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE,
2296  Register &FrameReg, bool PreferFP, bool ForSimm) const {
2297  const auto &MFI = MF.getFrameInfo();
2298  const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2299  MF.getSubtarget().getRegisterInfo());
2300  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2301  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2302 
2303  int64_t FPOffset = getFPOffset(MF, ObjectOffset).getFixed();
2304  int64_t Offset = getStackOffset(MF, ObjectOffset).getFixed();
2305  bool isCSR =
2306  !isFixed && ObjectOffset >= -((int)AFI->getCalleeSavedStackSize(MFI));
2307 
2308  const StackOffset &SVEStackSize = getSVEStackSize(MF);
2309 
2310  // Use frame pointer to reference fixed objects. Use it for locals if
2311  // there are VLAs or a dynamically realigned SP (and thus the SP isn't
2312  // reliable as a base). Make sure useFPForScavengingIndex() does the
2313  // right thing for the emergency spill slot.
2314  bool UseFP = false;
2315  if (AFI->hasStackFrame() && !isSVE) {
2316  // We shouldn't prefer using the FP to access fixed-sized stack objects when
2317  // there are scalable (SVE) objects in between the FP and the fixed-sized
2318  // objects.
2319  PreferFP &= !SVEStackSize;
2320 
2321  // Note: Keeping the following as multiple 'if' statements rather than
2322  // merging to a single expression for readability.
2323  //
2324  // Argument access should always use the FP.
2325  if (isFixed) {
2326  UseFP = hasFP(MF);
2327  } else if (isCSR && RegInfo->hasStackRealignment(MF)) {
2328  // References to the CSR area must use FP if we're re-aligning the stack
2329  // since the dynamically-sized alignment padding is between the SP/BP and
2330  // the CSR area.
2331  assert(hasFP(MF) && "Re-aligned stack must have frame pointer");
2332  UseFP = true;
2333  } else if (hasFP(MF) && !RegInfo->hasStackRealignment(MF)) {
2334  // If the FPOffset is negative and we're producing a signed immediate, we
2335  // have to keep in mind that the available offset range for negative
2336  // offsets is smaller than for positive ones. If an offset is available
2337  // via the FP and the SP, use whichever is closest.
2338  bool FPOffsetFits = !ForSimm || FPOffset >= -256;
2339  PreferFP |= Offset > -FPOffset && !SVEStackSize;
2340 
2341  if (MFI.hasVarSizedObjects()) {
2342  // If we have variable sized objects, we can use either FP or BP, as the
2343  // SP offset is unknown. We can use the base pointer if we have one and
2344  // FP is not preferred. If not, we're stuck with using FP.
2345  bool CanUseBP = RegInfo->hasBasePointer(MF);
2346  if (FPOffsetFits && CanUseBP) // Both are ok. Pick the best.
2347  UseFP = PreferFP;
2348  else if (!CanUseBP) // Can't use BP. Forced to use FP.
2349  UseFP = true;
2350  // else we can use BP and FP, but the offset from FP won't fit.
2351  // That will make us scavenge registers which we can probably avoid by
2352  // using BP. If it won't fit for BP either, we'll scavenge anyway.
2353  } else if (FPOffset >= 0) {
2354  // Use SP or FP, whichever gives us the best chance of the offset
2355  // being in range for direct access. If the FPOffset is positive,
2356  // that'll always be best, as the SP will be even further away.
2357  UseFP = true;
2358  } else if (MF.hasEHFunclets() && !RegInfo->hasBasePointer(MF)) {
2359  // Funclets access the locals contained in the parent's stack frame
2360  // via the frame pointer, so we have to use the FP in the parent
2361  // function.
2362  (void) Subtarget;
2363  assert(
2364  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv()) &&
2365  "Funclets should only be present on Win64");
2366  UseFP = true;
2367  } else {
2368  // We have the choice between FP and (SP or BP).
2369  if (FPOffsetFits && PreferFP) // If FP is the best fit, use it.
2370  UseFP = true;
2371  }
2372  }
2373  }
2374 
2375  assert(
2376  ((isFixed || isCSR) || !RegInfo->hasStackRealignment(MF) || !UseFP) &&
2377  "In the presence of dynamic stack pointer realignment, "
2378  "non-argument/CSR objects cannot be accessed through the frame pointer");
2379 
2380  if (isSVE) {
2381  StackOffset FPOffset =
2383  StackOffset SPOffset =
2384  SVEStackSize +
2385  StackOffset::get(MFI.getStackSize() - AFI->getCalleeSavedStackSize(),
2386  ObjectOffset);
2387  // Always use the FP for SVE spills if available and beneficial.
2388  if (hasFP(MF) && (SPOffset.getFixed() ||
2389  FPOffset.getScalable() < SPOffset.getScalable() ||
2390  RegInfo->hasStackRealignment(MF))) {
2391  FrameReg = RegInfo->getFrameRegister(MF);
2392  return FPOffset;
2393  }
2394 
2395  FrameReg = RegInfo->hasBasePointer(MF) ? RegInfo->getBaseRegister()
2396  : (unsigned)AArch64::SP;
2397  return SPOffset;
2398  }
2399 
2400  StackOffset ScalableOffset = {};
2401  if (UseFP && !(isFixed || isCSR))
2402  ScalableOffset = -SVEStackSize;
2403  if (!UseFP && (isFixed || isCSR))
2404  ScalableOffset = SVEStackSize;
2405 
2406  if (UseFP) {
2407  FrameReg = RegInfo->getFrameRegister(MF);
2408  return StackOffset::getFixed(FPOffset) + ScalableOffset;
2409  }
2410 
2411  // Use the base pointer if we have one.
2412  if (RegInfo->hasBasePointer(MF))
2413  FrameReg = RegInfo->getBaseRegister();
2414  else {
2415  assert(!MFI.hasVarSizedObjects() &&
2416  "Can't use SP when we have var sized objects.");
2417  FrameReg = AArch64::SP;
2418  // If we're using the red zone for this function, the SP won't actually
2419  // be adjusted, so the offsets will be negative. They're also all
2420  // within range of the signed 9-bit immediate instructions.
2421  if (canUseRedZone(MF))
2422  Offset -= AFI->getLocalStackSize();
2423  }
2424 
2425  return StackOffset::getFixed(Offset) + ScalableOffset;
2426 }
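// Illustrative sketch only (not part of the LLVM source): a reduced model of
// the FP-versus-SP choice above for a local object when the function has a
// frame pointer and no stack realignment, VLAs, or funclets. The function name
// and parameters are hypothetical; every other branch of the real code is
// deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: returns true when the object would be addressed via FP.
bool useFPForLocalModel(int64_t FPOffset, int64_t SPOffset, bool ForSimm,
                        bool PreferFP, bool HasSVEObjects) {
  // Negative signed-immediate offsets have a smaller usable range.
  bool FPOffsetFits = !ForSimm || FPOffset >= -256;
  // Prefer FP when it is closer to the object than SP is.
  PreferFP |= (SPOffset > -FPOffset) && !HasSVEObjects;
  // A non-negative FP offset is always the better choice, since SP is even
  // further away.
  if (FPOffset >= 0)
    return true;
  return FPOffsetFits && PreferFP;
}
```

// In this model an object at FP-8 / SP+56 is addressed through FP, while an
// object at FP-24 / SP+24 stays SP-relative unless the caller already
// preferred FP.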
2427 
2428 static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {
2429  // Do not set a kill flag on values that are also marked as live-in. This
2430  // happens with the @llvm.returnaddress intrinsic and with arguments passed in
2431  // callee saved registers.
2432  // Omitting the kill flags is conservatively correct even if the live-in
2433  // is not used after all.
2434  bool IsLiveIn = MF.getRegInfo().isLiveIn(Reg);
2435  return getKillRegState(!IsLiveIn);
2436 }
2437 
2439  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2441  return Subtarget.isTargetMachO() &&
2442  !(Subtarget.getTargetLowering()->supportSwiftError() &&
2443  Attrs.hasAttrSomewhere(Attribute::SwiftError)) &&
2445 }
2446 
2447 static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2,
2448  bool NeedsWinCFI, bool IsFirst,
2449  const TargetRegisterInfo *TRI) {
2450  // If we are generating register pairs for a Windows function that requires
2451  // EH support, then pair consecutive registers only. There are no unwind
2452  // opcodes for saves/restores of non-consecutive register pairs.
2453  // The unwind opcodes are save_regp, save_regp_x, save_fregp, save_fregp_x,
2454  // save_lrpair.
2455  // https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling
2456 
2457  if (Reg2 == AArch64::FP)
2458  return true;
2459  if (!NeedsWinCFI)
2460  return false;
2461  if (TRI->getEncodingValue(Reg2) == TRI->getEncodingValue(Reg1) + 1)
2462  return false;
2463  // If pairing a GPR with LR, the pair can be described by the save_lrpair
2464  // opcode. If this is the first register pair, it would end up with a
2465  // predecrement, but there's no save_lrpair_x opcode, so we can only do this
2466  // if LR is paired with something other than the first register.
2467  // The save_lrpair opcode requires the first register to be an odd one.
2468  if (Reg1 >= AArch64::X19 && Reg1 <= AArch64::X27 &&
2469  (Reg1 - AArch64::X19) % 2 == 0 && Reg2 == AArch64::LR && !IsFirst)
2470  return false;
2471  return true;
2472 }
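// Illustrative sketch only (not part of the LLVM source): the Windows pairing
// rule above, restated over plain register numbers (x19 is 19, FP is 29, LR is
// 30). The name invalidWinPairingModel is hypothetical, and using the register
// number directly in place of TRI->getEncodingValue() is an assumption of this
// model.

```cpp
#include <cassert>

constexpr unsigned ModelFP = 29; // x29
constexpr unsigned ModelLR = 30; // x30

// Hypothetical model: returns true when Reg1 and Reg2 must NOT be paired.
bool invalidWinPairingModel(unsigned Reg1, unsigned Reg2, bool NeedsWinCFI,
                            bool IsFirst) {
  // FP is never accepted as the second register of a pair here.
  if (Reg2 == ModelFP)
    return true;
  // Without Windows CFI there are no further restrictions.
  if (!NeedsWinCFI)
    return false;
  // Consecutive registers can always be described by an unwind opcode.
  if (Reg2 == Reg1 + 1)
    return false;
  // save_lrpair: an odd register in x19..x27 paired with LR, but not as the
  // first pair, because there is no pre-decrement form of that opcode.
  if (Reg1 >= 19 && Reg1 <= 27 && (Reg1 - 19) % 2 == 0 && Reg2 == ModelLR &&
      !IsFirst)
    return false;
  return true;
}
```

// In this model (x21, lr) is an acceptable pair unless it is the first pair,
// and (x20, lr) is always rejected under Windows CFI.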
2473 
2474 /// Returns true if Reg1 and Reg2 cannot be paired using a ldp/stp instruction.
2475 /// WindowsCFI requires that only consecutive registers can be paired.
2476 /// LR and FP need to be allocated together when the frame needs to save
2477 /// the frame-record. This means any other register pairing with LR is invalid.
2478 static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2,
2479  bool UsesWinAAPCS, bool NeedsWinCFI,
2480  bool NeedsFrameRecord, bool IsFirst,
2481  const TargetRegisterInfo *TRI) {
2482  if (UsesWinAAPCS)
2483  return invalidateWindowsRegisterPairing(Reg1, Reg2, NeedsWinCFI, IsFirst,
2484  TRI);
2485 
2486  // If we need to store the frame record, don't pair any register
2487  // with LR other than FP.
2488  if (NeedsFrameRecord)
2489  return Reg2 == AArch64::LR;
2490 
2491  return false;
2492 }
2493 
2494 namespace {
2495 
2496 struct RegPairInfo {
2497  unsigned Reg1 = AArch64::NoRegister;
2498  unsigned Reg2 = AArch64::NoRegister;
2499  int FrameIdx;
2500  int Offset;
2501  enum RegType { GPR, FPR64, FPR128, PPR, ZPR } Type;
2502 
2503  RegPairInfo() = default;
2504 
2505  bool isPaired() const { return Reg2 != AArch64::NoRegister; }
2506 
2507  unsigned getScale() const {
2508  switch (Type) {
2509  case PPR:
2510  return 2;
2511  case GPR:
2512  case FPR64:
2513  return 8;
2514  case ZPR:
2515  case FPR128:
2516  return 16;
2517  }
2518  llvm_unreachable("Unsupported type");
2519  }
2520 
2521  bool isScalable() const { return Type == PPR || Type == ZPR; }
2522 };
2523 
2524 } // end anonymous namespace
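// Illustrative sketch only (not part of the LLVM source): how a byte offset
// relates to the scaled immediate stored in RegPairInfo::Offset. The enum and
// helper names are hypothetical; the scales and immediate ranges are taken
// from getScale() and the range assertion in computeCalleeSaveRegisterPairs.

```cpp
#include <cassert>

enum class RegTypeModel { GPR, FPR64, FPR128, PPR, ZPR };

// Scale in bytes (or multiples of the vector granule for PPR/ZPR).
unsigned scaleOfModel(RegTypeModel T) {
  switch (T) {
  case RegTypeModel::PPR:
    return 2;
  case RegTypeModel::GPR:
  case RegTypeModel::FPR64:
    return 8;
  case RegTypeModel::ZPR:
  case RegTypeModel::FPR128:
    return 16;
  }
  return 0; // not reached for valid enumerators
}

bool isScalableModel(RegTypeModel T) {
  return T == RegTypeModel::PPR || T == RegTypeModel::ZPR;
}

// Convert a byte offset into the scaled immediate; the offset must be a
// multiple of the scale.
int scaledOffsetModel(RegTypeModel T, int ByteOffset) {
  int Scale = static_cast<int>(scaleOfModel(T));
  assert(ByteOffset % Scale == 0);
  return ByteOffset / Scale;
}

// Range check mirroring the assertion on RPI.Offset.
bool fitsImmediateModel(RegTypeModel T, int Scaled) {
  return isScalableModel(T) ? (Scaled >= -256 && Scaled <= 255)
                            : (Scaled >= -64 && Scaled <= 63);
}
```

// For a GPR pair stored at [sp, #32], the model yields a scaled immediate of
// 4, which is within the accepted range.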
2525 
2529  bool NeedsFrameRecord) {
2530 
2531  if (CSI.empty())
2532  return;
2533 
2534  bool IsWindows = isTargetWindows(MF);
2535  bool NeedsWinCFI = needsWinCFI(MF);
2537  MachineFrameInfo &MFI = MF.getFrameInfo();
2539  unsigned Count = CSI.size();
2540  (void)CC;
2541  // MachO's compact unwind format relies on all registers being stored in
2542  // pairs.
2545  (Count & 1) == 0) &&
2546  "Odd number of callee-saved regs to spill!");
2547  int ByteOffset = AFI->getCalleeSavedStackSize();
2548  int StackFillDir = -1;
2549  int RegInc = 1;
2550  unsigned FirstReg = 0;
2551  if (NeedsWinCFI) {
2552  // For WinCFI, fill the stack from the bottom up.
2553  ByteOffset = 0;
2554  StackFillDir = 1;
2555  // As the CSI array is reversed to match PrologEpilogInserter, iterate
2556  // backwards, to pair up registers starting from lower numbered registers.
2557  RegInc = -1;
2558  FirstReg = Count - 1;
2559  }
2560  int ScalableByteOffset = AFI->getSVECalleeSavedStackSize();
2561  bool NeedGapToAlignStack = AFI->hasCalleeSaveStackFreeSpace();
2562 
2563  // When iterating backwards, the loop condition relies on unsigned wraparound.
2564  for (unsigned i = FirstReg; i < Count; i += RegInc) {
2565  RegPairInfo RPI;
2566  RPI.Reg1 = CSI[i].getReg();
2567 
2568  if (AArch64::GPR64RegClass.contains(RPI.Reg1))
2569  RPI.Type = RegPairInfo::GPR;
2570  else if (AArch64::FPR64RegClass.contains(RPI.Reg1))
2571  RPI.Type = RegPairInfo::FPR64;
2572  else if (AArch64::FPR128RegClass.contains(RPI.Reg1))
2573  RPI.Type = RegPairInfo::FPR128;
2574  else if (AArch64::ZPRRegClass.contains(RPI.Reg1))
2575  RPI.Type = RegPairInfo::ZPR;
2576  else if (AArch64::PPRRegClass.contains(RPI.Reg1))
2577  RPI.Type = RegPairInfo::PPR;
2578  else
2579  llvm_unreachable("Unsupported register class.");
2580 
2581  // Add the next reg to the pair if it is in the same register class.
2582  if (unsigned(i + RegInc) < Count) {
2583  Register NextReg = CSI[i + RegInc].getReg();
2584  bool IsFirst = i == FirstReg;
2585  switch (RPI.Type) {
2586  case RegPairInfo::GPR:
2587  if (AArch64::GPR64RegClass.contains(NextReg) &&
2588  !invalidateRegisterPairing(RPI.Reg1, NextReg, IsWindows,
2589  NeedsWinCFI, NeedsFrameRecord, IsFirst,
2590  TRI))
2591  RPI.Reg2 = NextReg;
2592  break;
2593  case RegPairInfo::FPR64:
2594  if (AArch64::FPR64RegClass.contains(NextReg) &&
2595  !invalidateWindowsRegisterPairing(RPI.Reg1, NextReg, NeedsWinCFI,
2596  IsFirst, TRI))
2597  RPI.Reg2 = NextReg;
2598  break;
2599  case RegPairInfo::FPR128:
2600  if (AArch64::FPR128RegClass.contains(NextReg))
2601  RPI.Reg2 = NextReg;
2602  break;
2603  case RegPairInfo::PPR:
2604  case RegPairInfo::ZPR:
2605  break;
2606  }
2607  }
2608 
2609  // GPRs and FPRs are saved in pairs of 64-bit regs. We expect the CSI
2610  // list to come in sorted by frame index so that we can issue the store
2611  // pair instructions directly. Assert if we see anything otherwise.
2612  //
2613  // The order of the registers in the list is controlled by
2614  // getCalleeSavedRegs(), so they will always be in-order, as well.
2615  assert((!RPI.isPaired() ||
2616  (CSI[i].getFrameIdx() + RegInc == CSI[i + RegInc].getFrameIdx())) &&
2617  "Out of order callee saved regs!");
2618 
2619  assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP ||
2620  RPI.Reg1 == AArch64::LR) &&
2621  "FrameRecord must be allocated together with LR");
2622 
2623  // Windows AAPCS has FP and LR reversed.
2624  assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg1 != AArch64::FP ||
2625  RPI.Reg2 == AArch64::LR) &&
2626  "FrameRecord must be allocated together with LR");
2627 
2628  // MachO's compact unwind format relies on all registers being stored in
2629  // adjacent register pairs.
2632  (RPI.isPaired() &&
2633  ((RPI.Reg1 == AArch64::LR && RPI.Reg2 == AArch64::FP) ||
2634  RPI.Reg1 + 1 == RPI.Reg2))) &&
2635  "Callee-save registers not saved as adjacent register pair!");
2636 
2637  RPI.FrameIdx = CSI[i].getFrameIdx();
2638  if (NeedsWinCFI &&
2639  RPI.isPaired()) // RPI.FrameIdx must be the lower index of the pair
2640  RPI.FrameIdx = CSI[i + RegInc].getFrameIdx();
2641 
2642  int Scale = RPI.getScale();
2643 
2644  int OffsetPre = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2645  assert(OffsetPre % Scale == 0);
2646 
2647  if (RPI.isScalable())
2648  ScalableByteOffset += StackFillDir * Scale;
2649  else
2650  ByteOffset += StackFillDir * (RPI.isPaired() ? 2 * Scale : Scale);
2651 
2652  // Swift's async context is directly before FP, so allocate an extra
2653  // 8 bytes for it.
2654  if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2655  RPI.Reg2 == AArch64::FP)
2656  ByteOffset += StackFillDir * 8;
2657 
2658  assert(!(RPI.isScalable() && RPI.isPaired()) &&
2659  "Paired spill/fill instructions don't exist for SVE vectors");
2660 
2661  // Round up size of non-pair to pair size if we need to pad the
2662  // callee-save area to ensure 16-byte alignment.
2663  if (NeedGapToAlignStack && !NeedsWinCFI &&
2664  !RPI.isScalable() && RPI.Type != RegPairInfo::FPR128 &&
2665  !RPI.isPaired() && ByteOffset % 16 != 0) {
2666  ByteOffset += 8 * StackFillDir;
2667  assert(MFI.getObjectAlign(RPI.FrameIdx) <= Align(16));
2668  // A stack frame with a gap looks like this, bottom up:
2669  // d9, d8. x21, gap, x20, x19.
2670  // Set extra alignment on the x21 object to create the gap above it.
2671  MFI.setObjectAlignment(RPI.FrameIdx, Align(16));
2672  NeedGapToAlignStack = false;
2673  }
2674 
2675  int OffsetPost = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2676  assert(OffsetPost % Scale == 0);
2677  // If filling top down (default), we want the offset after incrementing it.
2678  // If filling bottom up (WinCFI) we need the original offset.
2679  int Offset = NeedsWinCFI ? OffsetPre : OffsetPost;
2680 
2681  // The FP, LR pair goes 8 bytes into our expanded 24-byte slot so that the
2682  // Swift context can directly precede FP.
2683  if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2684  RPI.Reg2 == AArch64::FP)
2685  Offset += 8;
2686  RPI.Offset = Offset / Scale;
2687 
2688  assert(((!RPI.isScalable() && RPI.Offset >= -64 && RPI.Offset <= 63) ||
2689  (RPI.isScalable() && RPI.Offset >= -256 && RPI.Offset <= 255)) &&
2690  "Offset out of bounds for LDP/STP immediate");
2691 
2692  // Save the offset to frame record so that the FP register can point to the
2693  // innermost frame record (spilled FP and LR registers).
2694  if (NeedsFrameRecord && ((!IsWindows && RPI.Reg1 == AArch64::LR &&
2695  RPI.Reg2 == AArch64::FP) ||
2696  (IsWindows && RPI.Reg1 == AArch64::FP &&
2697  RPI.Reg2 == AArch64::LR)))
2699 
2700  RegPairs.push_back(RPI);
2701  if (RPI.isPaired())
2702  i += RegInc;
2703  }
2704  if (NeedsWinCFI) {
2705  // If we need an alignment gap in the stack, align the topmost stack
2706  // object. A stack frame with a gap looks like this, bottom up:
2707  // x19, d8. d9, gap.
2708  // Set extra alignment on the topmost stack object (the first element in
2709  // CSI, which goes top down), to create the gap above it.
2710  if (AFI->hasCalleeSaveStackFreeSpace())
2711  MFI.setObjectAlignment(CSI[0].getFrameIdx(), Align(16));
2712  // We iterated bottom up over the registers; flip RegPairs back to top
2713  // down order.
2714  std::reverse(RegPairs.begin(), RegPairs.end());
2715  }
2716 }
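
// Illustrative sketch, not part of LLVM: the function above stores each
// offset in units of Scale and asserts that non-scalable pairs stay within
// [-64, 63]. The standalone helper below mirrors that check on plain
// integers. The name scaledPairImmediate is hypothetical, and Scale is
// assumed to be 8 for 64-bit registers, as the addImm examples in this file
// suggest.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API. Converts a byte offset into the
// scaled immediate of a paired load/store. Returns false when the offset is
// not a multiple of Scale or the scaled value is outside [-64, 63].
bool scaledPairImmediate(int ByteOffset, int Scale, int &Imm) {
  assert(Scale > 0 && "Scale must be positive");
  if (ByteOffset % Scale != 0)
    return false;
  int Scaled = ByteOffset / Scale;
  if (Scaled < -64 || Scaled > 63)
    return false;
  Imm = Scaled;
  return true;
}
```

// For example, a byte offset of 32 with Scale 8 yields the immediate 4,
// while a byte offset of 520 is rejected because 65 does not fit the range.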
2717 
2720  ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
2721  MachineFunction &MF = *MBB.getParent();
2722  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
2723  bool NeedsWinCFI = needsWinCFI(MF);
2724  DebugLoc DL;
2725  SmallVector<RegPairInfo, 8> RegPairs;
2726 
2727  computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
2728 
2729  const MachineRegisterInfo &MRI = MF.getRegInfo();
2730  if (homogeneousPrologEpilog(MF)) {
2731  auto MIB = BuildMI(MBB, MI, DL, TII.get(AArch64::HOM_Prolog))
2733 
2734  for (auto &RPI : RegPairs) {
2735  MIB.addReg(RPI.Reg1);
2736  MIB.addReg(RPI.Reg2);
2737 
2738  // Update register live in.
2739  if (!MRI.isReserved(RPI.Reg1))
2740  MBB.addLiveIn(RPI.Reg1);
2741  if (!MRI.isReserved(RPI.Reg2))
2742  MBB.addLiveIn(RPI.Reg2);
2743  }
2744  return true;
2745  }
2746  for (const RegPairInfo &RPI : llvm::reverse(RegPairs)) {
2747  unsigned Reg1 = RPI.Reg1;
2748  unsigned Reg2 = RPI.Reg2;
2749  unsigned StrOpc;
2750 
2751  // Issue sequence of spills for cs regs. The first spill may be converted
2752  // to a pre-decrement store later by emitPrologue if the callee-save stack
2753  // area allocation can't be combined with the local stack area allocation.
2754  // For example:
2755  // stp x22, x21, [sp, #0] // addImm(+0)
2756  // stp x20, x19, [sp, #16] // addImm(+2)
2757  // stp fp, lr, [sp, #32] // addImm(+4)
2758  // Rationale: This sequence saves uop updates compared to a sequence of
2759  // pre-increment spills like stp xi,xj,[sp,#-16]!
2760  // Note: Similar rationale and sequence for restores in epilog.
2761  unsigned Size;
2762  Align Alignment;
2763  switch (RPI.Type) {
2764  case RegPairInfo::GPR:
2765  StrOpc = RPI.isPaired() ? AArch64::STPXi : AArch64::STRXui;
2766  Size = 8;
2767  Alignment = Align(8);
2768  break;
2769  case RegPairInfo::FPR64:
2770  StrOpc = RPI.isPaired() ? AArch64::STPDi : AArch64::STRDui;
2771  Size = 8;
2772  Alignment = Align(8);
2773  break;
2774  case RegPairInfo::FPR128:
2775  StrOpc = RPI.isPaired() ? AArch64::STPQi : AArch64::STRQui;
2776  Size = 16;
2777  Alignment = Align(16);
2778  break;
2779  case RegPairInfo::ZPR:
2780  StrOpc = AArch64::STR_ZXI;
2781  Size = 16;
2782  Alignment = Align(16);
2783  break;
2784  case RegPairInfo::PPR:
2785  StrOpc = AArch64::STR_PXI;
2786  Size = 2;
2787  Alignment = Align(2);
2788  break;
2789  }
2790  LLVM_DEBUG(dbgs() << "CSR spill: (" << printReg(Reg1, TRI);
2791  if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
2792  dbgs() << ") -> fi#(" << RPI.FrameIdx;
2793  if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
2794  dbgs() << ")\n");
2795 
2796  assert((!NeedsWinCFI || !(Reg1 == AArch64::LR && Reg2 == AArch64::FP)) &&
2797  "Windows unwinding requires a consecutive (FP,LR) pair");
2798  // Windows unwind codes require consecutive registers if registers are
2799  // paired. Make the switch here, so that the code below will save (x,x+1)
2800  // and not (x+1,x).
2801  unsigned FrameIdxReg1 = RPI.FrameIdx;
2802  unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
2803  if (NeedsWinCFI && RPI.isPaired()) {
2804  std::swap(Reg1, Reg2);
2805  std::swap(FrameIdxReg1, FrameIdxReg2);
2806  }
2807  MachineInstrBuilder MIB = BuildMI(MBB, MI, DL, TII.get(StrOpc));
2808  if (!MRI.isReserved(Reg1))
2809  MBB.addLiveIn(Reg1);
2810  if (RPI.isPaired()) {
2811  if (!MRI.isReserved(Reg2))
2812  MBB.addLiveIn(Reg2);
2813  MIB.addReg(Reg2, getPrologueDeath(MF, Reg2));
2815  MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
2816  MachineMemOperand::MOStore, Size, Alignment));
2817  }
2818  MIB.addReg(Reg1, getPrologueDeath(MF, Reg1))
2819  .addReg(AArch64::SP)
2820  .addImm(RPI.Offset) // [sp, #offset*scale],
2821  // where factor*scale is implicit
2824  MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
2825  MachineMemOperand::MOStore, Size, Alignment));
2826  if (NeedsWinCFI)
2828 
2829  // Update the StackIDs of the SVE stack slots.
2830  MachineFrameInfo &MFI = MF.getFrameInfo();
2831  if (RPI.Type == RegPairInfo::ZPR || RPI.Type == RegPairInfo::PPR)
2833 
2834  }
2835  return true;
2836 }
2837 
2841  MachineFunction &MF = *MBB.getParent();
2842  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
2843  DebugLoc DL;
2844  SmallVector<RegPairInfo, 8> RegPairs;
2845  bool NeedsWinCFI = needsWinCFI(MF);
2846 
2847  if (MBBI != MBB.end())
2848  DL = MBBI->getDebugLoc();
2849 
2850  computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
2851 
2852  auto EmitMI = [&](const RegPairInfo &RPI) -> MachineBasicBlock::iterator {
2853  unsigned Reg1 = RPI.Reg1;
2854  unsigned Reg2 = RPI.Reg2;
2855 
2856  // Issue sequence of restores for cs regs. The last restore may be converted
2857  // to a post-increment load later by emitEpilogue if the callee-save stack
2858  // area allocation can't be combined with the local stack area allocation.
2859  // For example:
2860  // ldp fp, lr, [sp, #32] // addImm(+4)
2861  // ldp x20, x19, [sp, #16] // addImm(+2)
2862  // ldp x22, x21, [sp, #0] // addImm(+0)
2863  // Note: see comment in spillCalleeSavedRegisters()
2864  unsigned LdrOpc;
2865  unsigned Size;
2866  Align Alignment;
2867  switch (RPI.Type) {
2868  case RegPairInfo::GPR:
2869  LdrOpc = RPI.isPaired() ? AArch64::LDPXi : AArch64::LDRXui;
2870  Size = 8;
2871  Alignment = Align(8);
2872  break;
2873  case RegPairInfo::FPR64:
2874  LdrOpc = RPI.isPaired() ? AArch64::LDPDi : AArch64::LDRDui;
2875  Size = 8;
2876  Alignment = Align(8);
2877  break;
2878  case RegPairInfo::FPR128:
2879  LdrOpc = RPI.isPaired() ? AArch64::LDPQi : AArch64::LDRQui;
2880  Size = 16;
2881  Alignment = Align(16);
2882  break;
2883  case RegPairInfo::ZPR:
2884  LdrOpc = AArch64::LDR_ZXI;
2885  Size = 16;
2886  Alignment = Align(16);
2887  break;
2888  case RegPairInfo::PPR:
2889  LdrOpc = AArch64::LDR_PXI;
2890  Size = 2;
2891  Alignment = Align(2);
2892  break;
2893  }
2894  LLVM_DEBUG(dbgs() << "CSR restore: (" << printReg(Reg1, TRI);
2895  if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
2896  dbgs() << ") -> fi#(" << RPI.FrameIdx;
2897  if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
2898  dbgs() << ")\n");
2899 
2900  // Windows unwind codes require consecutive registers if registers are
2901  // paired. Make the switch here, so that the code below will restore (x,x+1)
2902  // and not (x+1,x).
2903  unsigned FrameIdxReg1 = RPI.FrameIdx;
2904  unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
2905  if (NeedsWinCFI && RPI.isPaired()) {
2906  std::swap(Reg1, Reg2);
2907  std::swap(FrameIdxReg1, FrameIdxReg2);
2908  }
2909  MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII.get(LdrOpc));
2910  if (RPI.isPaired()) {
2911  MIB.addReg(Reg2, getDefRegState(true));
2913  MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
2914  MachineMemOperand::MOLoad, Size, Alignment));
2915  }
2916  MIB.addReg(Reg1, getDefRegState(true))
2917  .addReg(AArch64::SP)
2918  .addImm(RPI.Offset) // [sp, #offset*scale]
2919  // where factor*scale is implicit
2922  MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
2923  MachineMemOperand::MOLoad, Size, Alignment));
2924  if (NeedsWinCFI)
2926 
2927  return MIB->getIterator();
2928  };
2929 
2930  // SVE objects are always restored in reverse order.
2931  for (const RegPairInfo &RPI : reverse(RegPairs))
2932  if (RPI.isScalable())
2933  EmitMI(RPI);
2934 
2935  if (homogeneousPrologEpilog(MF, &MBB)) {
2936  auto MIB = BuildMI(MBB, MBBI, DL, TII.get(AArch64::HOM_Epilog))
2938  for (auto &RPI : RegPairs) {
2939  MIB.addReg(RPI.Reg1, RegState::Define);
2940  MIB.addReg(RPI.Reg2, RegState::Define);
2941  }
2942  return true;
2943  }
2944 
2945  if (ReverseCSRRestoreSeq) {
2947  for (const RegPairInfo &RPI : reverse(RegPairs)) {
2948  if (RPI.isScalable())
2949  continue;
2950  MachineBasicBlock::iterator It = EmitMI(RPI);
2951  if (First == MBB.end())
2952  First = It;
2953  }
2954  if (First != MBB.end())
2955  MBB.splice(MBBI, &MBB, First);
2956  } else {
2957  for (const RegPairInfo &RPI : RegPairs) {
2958  if (RPI.isScalable())
2959  continue;
2960  (void)EmitMI(RPI);
2961  }
2962  }
2963 
2964  return true;
2965 }
2966 
2968  BitVector &SavedRegs,
2969  RegScavenger *RS) const {
2970  // All calls are tail calls in GHC calling conv, and functions have no
2971  // prologue/epilogue.
2973  return;
2974 
2975  TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS);
2976  const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(
2977  MF.getSubtarget().getRegisterInfo());
2978  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2980  unsigned UnspilledCSGPR = AArch64::NoRegister;
2981  unsigned UnspilledCSGPRPaired = AArch64::NoRegister;
2982 
2983  MachineFrameInfo &MFI = MF.getFrameInfo();
2984  const MCPhysReg *CSRegs = MF.getRegInfo().getCalleeSavedRegs();
2985 
2986  unsigned BasePointerReg = RegInfo->hasBasePointer(MF)
2987  ? RegInfo->getBaseRegister()
2988  : (unsigned)AArch64::NoRegister;
2989 
2990  unsigned ExtraCSSpill = 0;
2991  // Figure out which callee-saved registers to save/restore.
2992  for (unsigned i = 0; CSRegs[i]; ++i) {
2993  const unsigned Reg = CSRegs[i];
2994 
2995  // Add the base pointer register to SavedRegs if it is callee-save.
2996  if (Reg == BasePointerReg)
2997  SavedRegs.set(Reg);
2998 
2999  bool RegUsed = SavedRegs.test(Reg);
3000  unsigned PairedReg = AArch64::NoRegister;
3001  if (AArch64::GPR64RegClass.contains(Reg) ||
3002  AArch64::FPR64RegClass.contains(Reg) ||
3003  AArch64::FPR128RegClass.contains(Reg))
3004  PairedReg = CSRegs[i ^ 1];
3005 
3006  if (!RegUsed) {
3007  if (AArch64::GPR64RegClass.contains(Reg) &&
3008  !RegInfo->isReservedReg(MF, Reg)) {
3009  UnspilledCSGPR = Reg;
3010  UnspilledCSGPRPaired = PairedReg;
3011  }
3012  continue;
3013  }
3014 
3015  // MachO's compact unwind format relies on all registers being stored in
3016  // pairs.
3017  // FIXME: the usual format is actually better if unwinding isn't needed.
3018  if (producePairRegisters(MF) && PairedReg != AArch64::NoRegister &&
3019  !SavedRegs.test(PairedReg)) {
3020  SavedRegs.set(PairedReg);
3021  if (AArch64::GPR64RegClass.contains(PairedReg) &&
3022  !RegInfo->isReservedReg(MF, PairedReg))
3023  ExtraCSSpill = PairedReg;
3024  }
3025  }
3026 
3028  !Subtarget.isTargetWindows()) {
3029  // For Windows calling convention on a non-windows OS, where X18 is treated
3030  // as reserved, back up X18 when entering non-windows code (marked with the
3031  // Windows calling convention) and restore when returning regardless of
3032  // whether the individual function uses it - it might call other functions
3033  // that clobber it.
3034  SavedRegs.set(AArch64::X18);
3035  }
3036 
3037  // Calculates the callee saved stack size.
3038  unsigned CSStackSize = 0;
3039  unsigned SVECSStackSize = 0;
3041  const MachineRegisterInfo &MRI = MF.getRegInfo();
3042  for (unsigned Reg : SavedRegs.set_bits()) {
3043  auto RegSize = TRI->getRegSizeInBits(Reg, MRI) / 8;
3044  if (AArch64::PPRRegClass.contains(Reg) ||
3045  AArch64::ZPRRegClass.contains(Reg))
3046  SVECSStackSize += RegSize;
3047  else
3048  CSStackSize += RegSize;
3049  }
3050 
3051  // Save number of saved regs, so we can easily update CSStackSize later.
3052  unsigned NumSavedRegs = SavedRegs.count();
3053 
3054  // The frame record needs to be created by saving the appropriate registers
3055  uint64_t EstimatedStackSize = MFI.estimateStackSize(MF);
3056  if (hasFP(MF) ||
3057  windowsRequiresStackProbe(MF, EstimatedStackSize + CSStackSize + 16)) {
3058  SavedRegs.set(AArch64::FP);
3059  SavedRegs.set(AArch64::LR);
3060  }
3061 
3062  LLVM_DEBUG(dbgs() << "*** determineCalleeSaves\nSaved CSRs:";
3063  for (unsigned Reg
3064  : SavedRegs.set_bits()) dbgs()
3065  << ' ' << printReg(Reg, RegInfo);
3066  dbgs() << "\n";);
3067 
3068  // If any callee-saved registers are used, the frame cannot be eliminated.
3069  int64_t SVEStackSize =
3070  alignTo(SVECSStackSize + estimateSVEStackObjectOffsets(MFI), 16);
3071  bool CanEliminateFrame = (SavedRegs.count() == 0) && !SVEStackSize;
3072 
3073  // The CSR spill slots have not been allocated yet, so estimateStackSize
3074  // won't include them.
3075  unsigned EstimatedStackSizeLimit = estimateRSStackSizeLimit(MF);
3076 
3077  // Conservatively always assume BigStack when there are SVE spills.
3078  bool BigStack = SVEStackSize ||
3079  (EstimatedStackSize + CSStackSize) > EstimatedStackSizeLimit;
3080  if (BigStack || !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF))
3081  AFI->setHasStackFrame(true);
3082 
3083  // Estimate if we might need to scavenge a register at some point in order
3084  // to materialize a stack offset. If so, either spill one additional
3085  // callee-saved register or reserve a special spill slot to facilitate
3086  // register scavenging. If we already spilled an extra callee-saved register
3087  // above to keep the number of spills even, we don't need to do anything else
3088  // here.
3089  if (BigStack) {
3090  if (!ExtraCSSpill && UnspilledCSGPR != AArch64::NoRegister) {
3091  LLVM_DEBUG(dbgs() << "Spilling " << printReg(UnspilledCSGPR, RegInfo)
3092  << " to get a scratch register.\n");
3093  SavedRegs.set(UnspilledCSGPR);
3094  // MachO's compact unwind format relies on all registers being stored in
3095  // pairs, so if we need to spill one extra for BigStack, then we need to
3096  // store the pair.
3097  if (producePairRegisters(MF))
3098  SavedRegs.set(UnspilledCSGPRPaired);
3099  ExtraCSSpill = UnspilledCSGPR;
3100  }
3101 
3102  // If we didn't find an extra callee-saved register to spill, create
3103  // an emergency spill slot.
3104  if (!ExtraCSSpill || MF.getRegInfo().isPhysRegUsed(ExtraCSSpill)) {
3106  const TargetRegisterClass &RC = AArch64::GPR64RegClass;
3107  unsigned Size = TRI->getSpillSize(RC);
3108  Align Alignment = TRI->getSpillAlign(RC);
3109  int FI = MFI.CreateStackObject(Size, Alignment, false);
3110  RS->addScavengingFrameIndex(FI);
3111  LLVM_DEBUG(dbgs() << "No available CS registers, allocated fi#" << FI
3112  << " as the emergency spill slot.\n");
3113  }
3114  }
3115 
3116  // Add the size of the additional 64-bit GPR saves.
3117  CSStackSize += 8 * (SavedRegs.count() - NumSavedRegs);
3118 
3119  // A Swift asynchronous context extends the frame record with a pointer
3120  // directly before FP.
3121  if (hasFP(MF) && AFI->hasSwiftAsyncContext())
3122  CSStackSize += 8;
3123 
3124  uint64_t AlignedCSStackSize = alignTo(CSStackSize, 16);
3125  LLVM_DEBUG(dbgs() << "Estimated stack frame size: "
3126  << EstimatedStackSize + AlignedCSStackSize
3127  << " bytes.\n");
3128 
3129  assert((!MFI.isCalleeSavedInfoValid() ||
3130  AFI->getCalleeSavedStackSize() == AlignedCSStackSize) &&
3131  "Should not invalidate callee saved info");
3132 
3133  // Round up to register pair alignment to avoid additional SP adjustment
3134  // instructions.
3135  AFI->setCalleeSavedStackSize(AlignedCSStackSize);
3136  AFI->setCalleeSaveStackHasFreeSpace(AlignedCSStackSize != CSStackSize);
3137  AFI->setSVECalleeSavedStackSize(alignTo(SVECSStackSize, 16));
3138 }
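
// Illustrative sketch, not part of LLVM: the end of the function above rounds
// the callee-save area up to 16 bytes and records whether the rounding left
// an unused slot. The helper below reproduces that arithmetic for a simplified
// case in which every saved register is a 64-bit GPR. The names
// CalleeSaveAreaSketch and sizeCalleeSaveArea are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the rounded size and whether padding was added.
struct CalleeSaveAreaSketch {
  uint64_t AlignedSize;
  bool HasFreeSpace;
};

// Rounds Value up to the next multiple of 16.
uint64_t alignTo16(uint64_t Value) { return (Value + 15) / 16 * 16; }

// Assumes 8 bytes per saved register, plus 8 bytes for a Swift async context
// slot when one is requested.
CalleeSaveAreaSketch sizeCalleeSaveArea(unsigned NumGPR64Saves,
                                        bool HasSwiftAsyncContext) {
  uint64_t Size = 8ull * NumGPR64Saves;
  if (HasSwiftAsyncContext)
    Size += 8;
  uint64_t Aligned = alignTo16(Size);
  return {Aligned, Aligned != Size};
}
```

// Three saved registers need 24 bytes, which rounds up to 32 and leaves one
// free 8-byte slot; four saved registers fill 32 bytes exactly.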
3139 
3141  MachineFunction &MF, const TargetRegisterInfo *RegInfo,
3142  std::vector<CalleeSavedInfo> &CSI, unsigned &MinCSFrameIndex,
3143  unsigned &MaxCSFrameIndex) const {
3144  bool NeedsWinCFI = needsWinCFI(MF);
3145  // To match the canonical windows frame layout, reverse the list of
3146  // callee saved registers to get them laid out by PrologEpilogInserter
3147  // in the right order. (PrologEpilogInserter allocates stack objects top
3148  // down. Windows canonical prologs store higher numbered registers at
3149  // the top, thus have the CSI array start from the highest registers.)
3150  if (NeedsWinCFI)
3151  std::reverse(CSI.begin(), CSI.end());
3152 
3153  if (CSI.empty())
3154  return true; // Early exit if no callee saved registers are modified!
3155 
3156  // Now that we know which registers need to be saved and restored, allocate
3157  // stack slots for them.
3158  MachineFrameInfo &MFI = MF.getFrameInfo();
3159  auto *AFI = MF.getInfo<AArch64FunctionInfo>();
3160 
3161  bool UsesWinAAPCS = isTargetWindows(MF);
3162  if (UsesWinAAPCS && hasFP(MF) && AFI->hasSwiftAsyncContext()) {
3163  int FrameIdx = MFI.CreateStackObject(8, Align(16), true);
3164  AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3165  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3166  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3167  }
3168 
3169  for (auto &CS : CSI) {
3170  Register Reg = CS.getReg();
3171  const TargetRegisterClass *RC = RegInfo->getMinimalPhysRegClass(Reg);
3172 
3173  unsigned Size = RegInfo->getSpillSize(*RC);
3174  Align Alignment(RegInfo->getSpillAlign(*RC));
3175  int FrameIdx = MFI.CreateStackObject(Size, Alignment, true);
3176  CS.setFrameIdx(FrameIdx);
3177 
3178  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3179  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3180 
3181  // Grab 8 bytes below FP for the extended asynchronous frame info.
3182  if (hasFP(MF) && AFI->hasSwiftAsyncContext() && !UsesWinAAPCS &&
3183  Reg == AArch64::FP) {
3184  FrameIdx = MFI.CreateStackObject(8, Alignment, true);
3185  AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3186  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3187  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3188  }
3189  }
3190  return true;
3191 }
3192 
3194  const MachineFunction &MF) const {
3195  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
3196  return AFI->hasCalleeSaveStackFreeSpace();
3197 }
3198 
3199 /// Returns true if there are any SVE callee saves.
3201  int &Min, int &Max) {
3204 
3205  if (!MFI.isCalleeSavedInfoValid())
3206  return false;
3207 
3208  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
3209  for (auto &CS : CSI) {
3210  if (AArch64::ZPRRegClass.contains(CS.getReg()) ||
3211  AArch64::PPRRegClass.contains(CS.getReg())) {
3213  Max + 1 == CS.getFrameIdx()) &&
3214  "SVE CalleeSaves are not consecutive");
3215 
3216  Min = std::min(Min, CS.getFrameIdx());
3217  Max = std::max(Max, CS.getFrameIdx());
3218  }
3219  }
3220  return Min != std::numeric_limits<int>::max();
3221 }
3222 
3223 // Process all the SVE stack objects and determine offsets for each
3224 // object. If AssignOffsets is true, the offsets get assigned.
3225 // Fills in the first and last callee-saved frame indices into
3226 // Min/MaxCSFrameIndex, respectively.
3227 // Returns the size of the stack.
3229  int &MinCSFrameIndex,
3230  int &MaxCSFrameIndex,
3231  bool AssignOffsets) {
3232 #ifndef NDEBUG
3233  // First process all fixed stack objects.
3234  for (int I = MFI.getObjectIndexBegin(); I != 0; ++I)
3236  "SVE vectors should never be passed on the stack by value, only by "
3237  "reference.");
3238 #endif
3239 
3240  auto Assign = [&MFI](int FI, int64_t Offset) {
3241  LLVM_DEBUG(dbgs() << "alloc FI(" << FI << ") at SP[" << Offset << "]\n");
3242  MFI.setObjectOffset(FI, Offset);
3243  };
3244 
3245  int64_t Offset = 0;
3246 
3247  // Then process all callee saved slots.
3248  if (getSVECalleeSaveSlotRange(MFI, MinCSFrameIndex, MaxCSFrameIndex)) {
3249  // Assign offsets to the callee save slots.
3250  for (int I = MinCSFrameIndex; I <= MaxCSFrameIndex; ++I) {
3251  Offset += MFI.getObjectSize(I);
3253  if (AssignOffsets)
3254  Assign(I, -Offset);
3255  }
3256  }
3257 
3258  // Ensure that the callee-save area is aligned to 16 bytes.
3259  Offset = alignTo(Offset, Align(16U));
3260 
3261  // Create a buffer of SVE objects to allocate and sort it.
3262  SmallVector<int, 8> ObjectsToAllocate;
3263  // If we have a stack protector, and we've previously decided that we have SVE
3264  // objects on the stack and thus need it to go in the SVE stack area, then it
3265  // needs to go first.
3266  int StackProtectorFI = -1;
3267  if (MFI.hasStackProtectorIndex()) {
3268  StackProtectorFI = MFI.getStackProtectorIndex();
3269  if (MFI.getStackID(StackProtectorFI) == TargetStackID::ScalableVector)
3270  ObjectsToAllocate.push_back(StackProtectorFI);
3271  }
3272  for (int I = 0, E = MFI.getObjectIndexEnd(); I != E; ++I) {
3273  unsigned StackID = MFI.getStackID(I);
3274  if (StackID != TargetStackID::ScalableVector)
3275  continue;
3276  if (I == StackProtectorFI)
3277  continue;
3278  if (MaxCSFrameIndex >= I && I >= MinCSFrameIndex)
3279  continue;
3280  if (MFI.isDeadObjectIndex(I))
3281  continue;
3282 
3283  ObjectsToAllocate.push_back(I);
3284  }
3285 
3286  // Allocate all SVE locals and spills
3287  for (unsigned FI : ObjectsToAllocate) {
3288  Align Alignment = MFI.getObjectAlign(FI);
3289  // FIXME: Given that the length of SVE vectors is not necessarily a power of
3290  // two, we'd need to align every object dynamically at runtime if the
3291  // alignment is larger than 16. This is not yet supported.
3292  if (Alignment > Align(16))
3294  "Alignment of scalable vectors > 16 bytes is not yet supported");
3295 
3296  Offset = alignTo(Offset + MFI.getObjectSize(FI), Alignment);
3297  if (AssignOffsets)
3298  Assign(FI, -Offset);
3299  }
3300 
3301  return Offset;
3302 }
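
// Illustrative sketch, not part of LLVM: the loop over ObjectsToAllocate
// above grows a running offset by each object's size, rounds it up to the
// object's alignment, and assigns the negated result. The standalone version
// below does the same on plain integers. StackObjectSketch, alignUp and
// assignDownwardOffsets are hypothetical names, and alignments are assumed to
// be powers of two.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a stack object: size and alignment in bytes.
struct StackObjectSketch {
  int64_t Size;
  int64_t Alignment;
};

// Rounds Value up to a multiple of Alignment (a power of two).
int64_t alignUp(int64_t Value, int64_t Alignment) {
  assert(Alignment > 0 && (Alignment & (Alignment - 1)) == 0);
  return (Value + Alignment - 1) & ~(Alignment - 1);
}

// Fills Offsets with one negative offset per object, in order, and returns
// the total size consumed.
int64_t assignDownwardOffsets(const std::vector<StackObjectSketch> &Objects,
                              std::vector<int64_t> &Offsets) {
  int64_t Offset = 0;
  Offsets.clear();
  for (const StackObjectSketch &Obj : Objects) {
    Offset = alignUp(Offset + Obj.Size, Obj.Alignment);
    Offsets.push_back(-Offset);
  }
  return Offset;
}
```

// Because the offset is aligned after the size is added, each object ends on
// an aligned boundary below the previous one.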
3303 
3304 int64_t AArch64FrameLowering::estimateSVEStackObjectOffsets(
3305  MachineFrameInfo &MFI) const {
3306  int MinCSFrameIndex, MaxCSFrameIndex;
3307  return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex, false);
3308 }
3309 
3310 int64_t AArch64FrameLowering::assignSVEStackObjectOffsets(
3311  MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex) const {
3312  return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex,
3313  true);
3314 }
3315 
3317  MachineFunction &MF, RegScavenger *RS) const {
3318  MachineFrameInfo &MFI = MF.getFrameInfo();
3319 
3321  "Upwards growing stack unsupported");
3322 
3323  int MinCSFrameIndex, MaxCSFrameIndex;
3324  int64_t SVEStackSize =
3325  assignSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex);
3326 
3328  AFI->setStackSizeSVE(alignTo(SVEStackSize, 16U));
3329  AFI->setMinMaxSVECSFrameIndex(MinCSFrameIndex, MaxCSFrameIndex);
3330 
3331  // If this function isn't doing Win64-style C++ EH, we don't need to do
3332  // anything.
3333  if (!MF.hasEHFunclets())
3334  return;
3335  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
3336  WinEHFuncInfo &EHInfo = *MF.getWinEHFuncInfo();
3337 
3338  MachineBasicBlock &MBB = MF.front();
3339  auto MBBI = MBB.begin();
3340  while (MBBI != MBB.end() && MBBI->getFlag(MachineInstr::FrameSetup))
3341  ++MBBI;
3342 
3343  // Create an UnwindHelp object.
3344  // The UnwindHelp object is allocated at the start of the fixed object area
3345  int64_t FixedObject =
3346  getFixedObjectSize(MF, AFI, /*IsWin64*/ true, /*IsFunclet*/ false);
3347  int UnwindHelpFI = MFI.CreateFixedObject(/*Size*/ 8,
3348  /*SPOffset*/ -FixedObject,
3349  /*IsImmutable=*/false);
3350  EHInfo.UnwindHelpFrameIdx = UnwindHelpFI;
3351 
3352  // We need to store -2 into the UnwindHelp object at the start of the
3353  // function.
3354  DebugLoc DL;
3355  RS->enterBasicBlockEnd(MBB);
3356  RS->backward(std::prev(MBBI));
3357  Register DstReg = RS->FindUnusedReg(&AArch64::GPR64commonRegClass);
3358  assert(DstReg && "There must be a free register after frame setup");
3359  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), DstReg).addImm(-2);
3360  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STURXi))
3361  .addReg(DstReg, getKillRegState(true))
3362  .addFrameIndex(UnwindHelpFI)
3363  .addImm(0);
3364 }
3365 
3366 namespace {
3367 struct TagStoreInstr {
3368  MachineInstr *MI;
3369  int64_t Offset, Size;
3370  explicit TagStoreInstr(MachineInstr *MI, int64_t Offset, int64_t Size)
3371  : MI(MI), Offset(Offset), Size(Size) {}
3372 };
3373 
3374 class TagStoreEdit {
3375  MachineFunction *MF;
3378  // Tag store instructions that are being replaced.
3380  // Combined memref arguments of the above instructions.
3381  SmallVector<MachineMemOperand *, 8> CombinedMemRefs;
3382 
3383  // Replace allocation tags in [FrameReg + FrameRegOffset, FrameReg +
3384  // FrameRegOffset + Size) with the address tag of SP.
3385  Register FrameReg;
3386  StackOffset FrameRegOffset;
3387  int64_t Size;
3388  // If set, move FrameReg to (FrameReg + FrameRegUpdate) at the end.
3389  std::optional<int64_t> FrameRegUpdate;
3390  // MIFlags for any FrameReg updating instructions.
3391  unsigned FrameRegUpdateFlags;
3392 
3393  // Use zeroing instruction variants.
3394  bool ZeroData;
3395  DebugLoc DL;
3396 
3397  void emitUnrolled(MachineBasicBlock::iterator InsertI);
3398  void emitLoop(MachineBasicBlock::iterator InsertI);
3399 
3400 public:
3401  TagStoreEdit(MachineBasicBlock *MBB, bool ZeroData)
3402  : MBB(MBB), ZeroData(ZeroData) {
3403  MF = MBB->getParent();
3404  MRI = &MF->getRegInfo();
3405  }
3406  // Add an instruction to be replaced. Instructions must be added in the
3407  // ascending order of Offset, and have to be adjacent.
3408  void addInstruction(TagStoreInstr I) {
3409  assert((TagStores.empty() ||
3410  TagStores.back().Offset + TagStores.back().Size == I.Offset) &&
3411  "Non-adjacent tag store instructions.");
3412  TagStores.push_back(I);
3413  }
3414  void clear() { TagStores.clear(); }
3415  // Emit equivalent code at the given location, and erase the current set of
3416  // instructions. May skip if the replacement is not profitable. May invalidate
3417  // the input iterator and replace it with a valid one.
3418  void emitCode(MachineBasicBlock::iterator &InsertI,
3419  const AArch64FrameLowering *TFI, bool TryMergeSPUpdate);
3420 };
3421 
3422 void TagStoreEdit::emitUnrolled(MachineBasicBlock::iterator InsertI) {
3423  const AArch64InstrInfo *TII =
3424  MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3425 
3426  const int64_t kMinOffset = -256 * 16;
3427  const int64_t kMaxOffset = 255 * 16;
3428 
3429  Register BaseReg = FrameReg;
3430  int64_t BaseRegOffsetBytes = FrameRegOffset.getFixed();
3431  if (BaseRegOffsetBytes < kMinOffset ||
3432  BaseRegOffsetBytes + (Size - Size % 32) > kMaxOffset) {
3433  Register ScratchReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3434  emitFrameOffset(*MBB, InsertI, DL, ScratchReg, BaseReg,
3435  StackOffset::getFixed(BaseRegOffsetBytes), TII);
3436  BaseReg = ScratchReg;
3437  BaseRegOffsetBytes = 0;
3438  }
3439 
3440  MachineInstr *LastI = nullptr;
3441  while (Size) {
3442  int64_t InstrSize = (Size > 16) ? 32 : 16;
3443  unsigned Opcode =
3444  InstrSize == 16
3445  ? (ZeroData ? AArch64::STZGOffset : AArch64::STGOffset)
3446  : (ZeroData ? AArch64::STZ2GOffset : AArch64::ST2GOffset);
3447  MachineInstr *I = BuildMI(*MBB, InsertI, DL, TII->get(Opcode))
3448  .addReg(AArch64::SP)
3449  .addReg(BaseReg)
3450  .addImm(BaseRegOffsetBytes / 16)
3451  .setMemRefs(CombinedMemRefs);
3452  // A store to [BaseReg, #0] should go last for an opportunity to fold the
3453  // final SP adjustment in the epilogue.
3454  if (BaseRegOffsetBytes == 0)
3455  LastI = I;
3456  BaseRegOffsetBytes += InstrSize;
3457  Size -= InstrSize;
3458  }
3459 
3460  if (LastI)
3461  MBB->splice(InsertI, MBB, LastI);
3462 }
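
// Illustrative sketch, not part of LLVM: emitUnrolled above covers the range
// with 32-byte stores while more than 16 bytes remain, finishes with a
// 16-byte store if needed, and moves the store at offset 0 to the end. The
// helper below computes only that ordering on plain integers; it emits no
// instructions. TagStoreSketch and planUnrolledTagStores are hypothetical
// names, and Size is assumed to be a multiple of 16.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one planned tag store.
struct TagStoreSketch {
  int64_t OffsetBytes;
  int64_t SizeBytes;
};

std::vector<TagStoreSketch> planUnrolledTagStores(int64_t BaseOffsetBytes,
                                                  int64_t Size) {
  assert(Size >= 0 && Size % 16 == 0 && "Size must be a multiple of 16");
  std::vector<TagStoreSketch> Plan;
  int ZeroOffsetIdx = -1;
  while (Size) {
    int64_t InstrSize = (Size > 16) ? 32 : 16;
    if (BaseOffsetBytes == 0)
      ZeroOffsetIdx = static_cast<int>(Plan.size());
    Plan.push_back({BaseOffsetBytes, InstrSize});
    BaseOffsetBytes += InstrSize;
    Size -= InstrSize;
  }
  // Move the store at offset 0, if any, to the end of the sequence.
  if (ZeroOffsetIdx >= 0)
    std::rotate(Plan.begin() + ZeroOffsetIdx,
                Plan.begin() + ZeroOffsetIdx + 1, Plan.end());
  return Plan;
}
```

// With a base offset of 0 and 48 bytes to tag, the plan is a 16-byte store at
// offset 32 followed by the 32-byte store at offset 0.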
3463 
3464 void TagStoreEdit::emitLoop(MachineBasicBlock::iterator InsertI) {
3465  const AArch64InstrInfo *TII =
3466  MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3467 
3468  Register BaseReg = FrameRegUpdate
3469  ? FrameReg
3470  : MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3471  Register SizeReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3472 
3473  emitFrameOffset(*MBB, InsertI, DL, BaseReg, FrameReg, FrameRegOffset, TII);
3474 
3475  int64_t LoopSize = Size;
3476  // If the loop size is not a multiple of 32, split off one 16-byte store at
3477  // the end to fold BaseReg update into.
3478  if (FrameRegUpdate && *FrameRegUpdate)
3479  LoopSize -= LoopSize % 32;
3480  MachineInstr *LoopI = BuildMI(*MBB, InsertI, DL,
3481  TII->get(ZeroData ? AArch64::STZGloop_wback
3482  : AArch64::STGloop_wback))
3483  .addDef(SizeReg)
3484  .addDef(BaseReg)
3485  .addImm(LoopSize)
3486  .addReg(BaseReg)
3487  .setMemRefs(CombinedMemRefs);
3488  if (FrameRegUpdate)
3489  LoopI->setFlags(FrameRegUpdateFlags);
3490 
3491  int64_t ExtraBaseRegUpdate =
3492  FrameRegUpdate ? (*FrameRegUpdate - FrameRegOffset.getFixed() - Size) : 0;
3493  if (LoopSize < Size) {
3494  assert(FrameRegUpdate);
3495  assert(Size - LoopSize == 16);
3496  // Tag 16 more bytes at BaseReg and update BaseReg.
3497  BuildMI(*MBB, InsertI, DL,
3498  TII->get(ZeroData ? AArch64::STZGPostIndex : AArch64::STGPostIndex))
3499  .addDef(BaseReg)
3500  .addReg(BaseReg)
3501  .addReg(BaseReg)
3502  .addImm(1 + ExtraBaseRegUpdate / 16)
3503  .setMemRefs(CombinedMemRefs)
3504  .setMIFlags(FrameRegUpdateFlags);
3505  } else if (ExtraBaseRegUpdate) {
3506  // Update BaseReg.
3507  BuildMI(
3508  *MBB, InsertI, DL,
3509  TII->get(ExtraBaseRegUpdate > 0 ? AArch64::ADDXri : AArch64::SUBXri))
3510  .addDef(BaseReg)
3511  .addReg(BaseReg)
3512  .addImm(std::abs(ExtraBaseRegUpdate))
3513  .addImm(0)
3514  .setMIFlags(FrameRegUpdateFlags);
3515  }
3516 }
3517 
3518 // Check if *II is a register update that can be merged into an STGloop that
3519 // ends at (Reg + Size). On success, *TotalOffset is set to the full adjustment
3520 // that the update applies to Reg.
3521 bool canMergeRegUpdate(MachineBasicBlock::iterator II, unsigned Reg,
3522  int64_t Size, int64_t *TotalOffset) {
3523  MachineInstr &MI = *II;
3524  if ((MI.getOpcode() == AArch64::ADDXri ||
3525  MI.getOpcode() == AArch64::SUBXri) &&
3526  MI.getOperand(0).getReg() == Reg && MI.getOperand(1).getReg() == Reg) {
3527  unsigned Shift = AArch64_AM::getShiftValue(MI.getOperand(3).getImm());
3528  int64_t Offset = MI.getOperand(2).getImm() << Shift;
3529  if (MI.getOpcode() == AArch64::SUBXri)
3530  Offset = -Offset;
3531  int64_t AbsPostOffset = std::abs(Offset - Size);
3532  const int64_t kMaxOffset =
3533  0xFFF; // Max encoding for unshifted ADDXri / SUBXri
3534  if (AbsPostOffset <= kMaxOffset && AbsPostOffset % 16 == 0) {
3535  *TotalOffset = Offset;
3536  return true;
3537  }
3538  }
3539  return false;
3540 }
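
// Illustrative sketch, not part of LLVM: the arithmetic in canMergeRegUpdate
// above can be examined without a MachineInstr. The helper below takes the
// already-decoded signed update offset and applies the same two conditions:
// the adjustment left over after the loop must fit an unshifted 12-bit
// immediate and be a multiple of 16. The name canFoldUpdateSketch is
// hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper, not an LLVM API. UpdateOffset is the signed amount the
// update adds to the register; Size is the distance from the register to the
// end of the tagged range.
bool canFoldUpdateSketch(int64_t UpdateOffset, int64_t Size,
                         int64_t &TotalOffset) {
  const int64_t MaxUnshiftedImm = 0xFFF;
  int64_t AbsPostOffset = std::abs(UpdateOffset - Size);
  if (AbsPostOffset <= MaxUnshiftedImm && AbsPostOffset % 16 == 0) {
    TotalOffset = UpdateOffset;
    return true;
  }
  return false;
}
```

// An update of 512 with Size 256 leaves 256 bytes to adjust and can be
// folded; an update of 264 leaves 8 bytes, which is not a multiple of 16.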
3541 
3542 void mergeMemRefs(const SmallVectorImpl<TagStoreInstr> &TSE,
3544  MemRefs.clear();
3545  for (auto &TS : TSE) {
3546  MachineInstr *MI = TS.MI;
3547  // An instruction without memory operands may access anything. Be
3548  // conservative and return an empty list.
3549  if (MI->memoperands_empty()) {
3550  MemRefs.clear();
3551  return;
3552  }
3553  MemRefs.append(MI->memoperands_begin(), MI->memoperands_end());
3554  }
3555 }

void TagStoreEdit::emitCode(MachineBasicBlock::iterator &InsertI,
                            const AArch64FrameLowering *TFI,
                            bool TryMergeSPUpdate) {
  if (TagStores.empty())
    return;
  TagStoreInstr &FirstTagStore = TagStores[0];
  TagStoreInstr &LastTagStore = TagStores[TagStores.size() - 1];
  Size = LastTagStore.Offset - FirstTagStore.Offset + LastTagStore.Size;
  DL = TagStores[0].MI->getDebugLoc();

  Register Reg;
  FrameRegOffset = TFI->resolveFrameOffsetReference(
      *MF, FirstTagStore.Offset, false /*isFixed*/, false /*isSVE*/, Reg,
      /*PreferFP=*/false, /*ForSimm=*/true);
  FrameReg = Reg;
  FrameRegUpdate = std::nullopt;

  mergeMemRefs(TagStores, CombinedMemRefs);

  LLVM_DEBUG(dbgs() << "Replacing adjacent STG instructions:\n";
             for (const auto &Instr
                  : TagStores) { dbgs() << " " << *Instr.MI; });

  // Size threshold where a loop becomes shorter than a linear sequence of
  // tagging instructions.
  const int kSetTagLoopThreshold = 176;
  if (Size < kSetTagLoopThreshold) {
    if (TagStores.size() < 2)
      return;
    emitUnrolled(InsertI);
  } else {
    MachineInstr *UpdateInstr = nullptr;
    int64_t TotalOffset = 0;
    if (TryMergeSPUpdate) {
      // See if we can merge base register update into the STGloop.
      // This is done in AArch64LoadStoreOptimizer for "normal" stores,
      // but STGloop is way too unusual for that, and also it only
      // realistically happens in function epilogue. Also, STGloop is expanded
      // before that pass.
      if (InsertI != MBB->end() &&
          canMergeRegUpdate(InsertI, FrameReg, FrameRegOffset.getFixed() + Size,
                            &TotalOffset)) {
        UpdateInstr = &*InsertI++;
        LLVM_DEBUG(dbgs() << "Folding SP update into loop:\n "
                          << *UpdateInstr);
      }
    }

    if (!UpdateInstr && TagStores.size() < 2)
      return;

    if (UpdateInstr) {
      FrameRegUpdate = TotalOffset;
      FrameRegUpdateFlags = UpdateInstr->getFlags();
    }
    emitLoop(InsertI);
    if (UpdateInstr)
      UpdateInstr->eraseFromParent();
  }

  for (auto &TS : TagStores)
    TS.MI->eraseFromParent();
}

bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
                                        int64_t &Size, bool &ZeroData) {
  MachineFunction &MF = *MI.getParent()->getParent();
  const MachineFrameInfo &MFI = MF.getFrameInfo();

  unsigned Opcode = MI.getOpcode();
  ZeroData = (Opcode == AArch64::STZGloop || Opcode == AArch64::STZGOffset ||
              Opcode == AArch64::STZ2GOffset);

  if (Opcode == AArch64::STGloop || Opcode == AArch64::STZGloop) {
    if (!MI.getOperand(0).isDead() || !MI.getOperand(1).isDead())
      return false;
    if (!MI.getOperand(2).isImm() || !MI.getOperand(3).isFI())
      return false;
    Offset = MFI.getObjectOffset(MI.getOperand(3).getIndex());
    Size = MI.getOperand(2).getImm();
    return true;
  }

  if (Opcode == AArch64::STGOffset || Opcode == AArch64::STZGOffset)
    Size = 16;
  else if (Opcode == AArch64::ST2GOffset || Opcode == AArch64::STZ2GOffset)
    Size = 32;
  else
    return false;

  if (MI.getOperand(0).getReg() != AArch64::SP || !MI.getOperand(1).isFI())
    return false;

  Offset = MFI.getObjectOffset(MI.getOperand(1).getIndex()) +
           16 * MI.getOperand(2).getImm();
  return true;
}

// Detect a run of memory tagging instructions for adjacent stack frame slots,
// and replace them with a shorter instruction sequence:
// * replace STG + STG with ST2G
// * replace STGloop + STGloop with STGloop
// This code needs to run when stack slot offsets are already known, but before
// FrameIndex operands in STG instructions are eliminated.
MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
                                                const AArch64FrameLowering *TFI,
                                                RegScavenger *RS) {
  bool FirstZeroData;
  int64_t Size, Offset;
  MachineInstr &MI = *II;
  MachineBasicBlock *MBB = MI.getParent();
  MachineBasicBlock::iterator NextI = ++II;
  if (&MI == &MBB->instr_back())
    return II;
  if (!isMergeableStackTaggingInstruction(MI, Offset, Size, FirstZeroData))
    return II;

  SmallVector<TagStoreInstr, 4> Instrs;
  Instrs.emplace_back(&MI, Offset, Size);

  constexpr int kScanLimit = 10;
  int Count = 0;
  for (MachineBasicBlock::iterator E = MBB->end();
       NextI != E && Count < kScanLimit; ++NextI) {
    MachineInstr &MI = *NextI;
    bool ZeroData;
    int64_t Size, Offset;
    // Collect instructions that update memory tags with a FrameIndex operand
    // and (when applicable) constant size, and whose output registers are dead
    // (the latter is almost always the case in practice). Since these
    // instructions effectively have no inputs or outputs, we are free to skip
    // any non-aliasing instructions in between without tracking used registers.
    if (isMergeableStackTaggingInstruction(MI, Offset, Size, ZeroData)) {
      if (ZeroData != FirstZeroData)
        break;
      Instrs.emplace_back(&MI, Offset, Size);
      continue;
    }

    // Only count non-transient, non-tagging instructions toward the scan
    // limit.
    if (!MI.isTransient())
      ++Count;

    // Just in case, stop before the epilogue code starts.
    if (MI.getFlag(MachineInstr::FrameSetup) ||
        MI.getFlag(MachineInstr::FrameDestroy))
      break;

    // Reject anything that may alias the collected instructions.
    if (MI.mayLoadOrStore() || MI.hasUnmodeledSideEffects())
      break;
  }

  // New code will be inserted after the last tagging instruction we've found.
  MachineBasicBlock::iterator InsertI = Instrs.back().MI;
  InsertI++;

  llvm::stable_sort(Instrs,
                    [](const TagStoreInstr &Left, const TagStoreInstr &Right) {
                      return Left.Offset < Right.Offset;
                    });

  // Make sure that we don't have any overlapping stores.
  int64_t CurOffset = Instrs[0].Offset;
  for (auto &Instr : Instrs) {
    if (CurOffset > Instr.Offset)
      return NextI;
    CurOffset = Instr.Offset + Instr.Size;
  }

  // Find contiguous runs of tagged memory and emit shorter instruction
  // sequences for them when possible.
  TagStoreEdit TSE(MBB, FirstZeroData);
  std::optional<int64_t> EndOffset;
  for (auto &Instr : Instrs) {
    if (EndOffset && *EndOffset != Instr.Offset) {
      // Found a gap.
      TSE.emitCode(InsertI, TFI, /*TryMergeSPUpdate = */ false);
      TSE.clear();
    }

    TSE.addInstruction(Instr);
    EndOffset = Instr.Offset + Instr.Size;
  }

  const MachineFunction *MF = MBB->getParent();
  // Multiple FP/SP updates in a loop cannot be described by CFI instructions.
  TSE.emitCode(
      InsertI, TFI, /*TryMergeSPUpdate = */
      !MF->getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(*MF));

  return InsertI;
}
} // namespace
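
// Editorial sketch, not part of the LLVM sources: the sort, overlap check, and
// gap detection performed by tryMergeAdjacentSTG, reduced to plain
// (offset, size) pairs. The names TagStore and splitIntoRuns are assumptions
// made for illustration. Where the pass emits code for each finished run, this
// version simply collects the runs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

struct TagStore {
  int64_t Offset;
  int64_t Size;
};

// Sorts the stores by offset and splits them into maximal contiguous runs.
// Returns std::nullopt if any two stores overlap.
std::optional<std::vector<std::vector<TagStore>>>
splitIntoRuns(std::vector<TagStore> Stores) {
  if (Stores.empty())
    return std::vector<std::vector<TagStore>>{};

  std::stable_sort(Stores.begin(), Stores.end(),
                   [](const TagStore &L, const TagStore &R) {
                     return L.Offset < R.Offset;
                   });

  // Overlap check: each store must begin at or after the end of the previous.
  int64_t CurOffset = Stores[0].Offset;
  for (const TagStore &S : Stores) {
    assert(S.Size > 0 && "a tag store covers at least one granule");
    if (CurOffset > S.Offset)
      return std::nullopt;
    CurOffset = S.Offset + S.Size;
  }

  // A store that does not start where the previous one ended opens a new run.
  std::vector<std::vector<TagStore>> Runs;
  std::optional<int64_t> EndOffset;
  for (const TagStore &S : Stores) {
    if (!EndOffset || *EndOffset != S.Offset)
      Runs.emplace_back();
    Runs.back().push_back(S);
    EndOffset = S.Offset + S.Size;
  }
  return Runs;
}
```

// Stores at offsets 0, 16, and 32 (16 bytes each) plus one at offset 64 yield
// two runs, because nothing covers bytes 48 to 63.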

void AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced(
    MachineFunction &MF, RegScavenger *RS = nullptr) const {
  if (StackTaggingMergeSetTag)
    for (auto &BB : MF)
      for (MachineBasicBlock::iterator II = BB.begin(); II != BB.end();)
        II = tryMergeAdjacentSTG(II, this, RS);
}

/// For Win64 AArch64 EH, the offset to the Unwind object is from the SP
/// before the update. This is easily retrieved as it is exactly the offset
/// that is set in processFunctionBeforeFrameFinalized.
StackOffset AArch64FrameLowering::getFrameIndexReferencePreferSP(
    const MachineFunction &MF, int FI, Register &FrameReg,
    bool IgnoreSPUpdates) const {
  const MachineFrameInfo &MFI = MF.getFrameInfo();
  if (IgnoreSPUpdates) {
    LLVM_DEBUG(dbgs() << "Offset from the SP for " << FI << " is "
                      << MFI.getObjectOffset(FI) << "\n");
    FrameReg = AArch64::SP;
    return StackOffset::getFixed(MFI.getObjectOffset(FI));
  }

  // Go to common code if we cannot provide sp + offset.
  if (MFI.hasVarSizedObjects() ||
      MF.getInfo<AArch64FunctionInfo>()->getStackSizeSVE() ||
      MF.getSubtarget().getRegisterInfo()->hasStackRealignment(MF))
    return getFrameIndexReference(MF, FI, FrameReg);

  FrameReg = AArch64::SP;
  return getStackOffset(MF, MFI.getObjectOffset(FI));
}

/// The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve
/// the parent's frame pointer.
unsigned AArch64FrameLowering::getWinEHParentFrameOffset(
    const MachineFunction &MF) const {
  return 0;
}

/// Funclets only need to account for space for the callee saved registers,
/// as the locals are accounted for in the parent's stack frame.
unsigned AArch64FrameLowering::getWinEHFuncletFrameSize(
    const MachineFunction &MF) const {
  // This is the size of the pushed CSRs.
  unsigned CSSize =
      MF.getInfo<AArch64FunctionInfo>()->getCalleeSavedStackSize();
  // This is the amount of stack a funclet needs to allocate.
  return alignTo(CSSize + MF.getFrameInfo().getMaxCallFrameSize(),
                 getStackAlign());
}
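
// Editorial sketch, not part of the LLVM sources: the funclet frame-size
// computation above is a sum rounded up to the stack alignment. The helper
// names alignToPow2 and funcletFrameSize are assumptions made for
// illustration, and the 16-byte stack alignment is hard-coded here rather than
// queried from the target.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Value up to the next multiple of Align, which must be a power of two.
uint64_t alignToPow2(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Value + Align - 1) & ~(Align - 1);
}

// Callee-saved area plus the largest outgoing call frame, rounded up to a
// 16-byte boundary.
uint64_t funcletFrameSize(uint64_t CSSize, uint64_t MaxCallFrameSize) {
  return alignToPow2(CSSize + MaxCallFrameSize, 16);
}
```

// A funclet that pushes 40 bytes of callee-saved registers and makes no
// stack-argument calls therefore allocates 48 bytes.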

namespace {
struct FrameObject {
  bool IsValid = false;
  // Index of the object in MFI.
  int ObjectIndex = 0;
  // Group ID this object belongs to.
  int GroupIndex = -1;
  // This object should be placed first (closest to SP).
  bool ObjectFirst = false;
  // This object's group (which always contains the object with
  // ObjectFirst==true) should be placed first.
  bool GroupFirst = false;
};

class GroupBuilder {
  SmallVector<int, 8> CurrentMembers;
  int NextGroupIndex = 0;
  std::vector<FrameObject> &Objects;

public:
  GroupBuilder(std::vector<FrameObject> &Objects) : Objects(Objects) {}
  void AddMember(int Index) { CurrentMembers.push_back(Index); }
  void EndCurrentGroup() {
    if (CurrentMembers.size() > 1) {
      // Create a new group with the current member list. This might remove them
      // from their pre-existing groups. That's OK, dealing with overlapping
      // groups is too hard and unlikely to make a difference.
      LLVM_DEBUG(dbgs() << "group:");
      for (int Index : CurrentMembers) {
        Objects[Index].GroupIndex = NextGroupIndex;
        LLVM_DEBUG(dbgs() << " " << Index);
      }
      LLVM_DEBUG(dbgs() << "\n");
      NextGroupIndex++;
    }
    CurrentMembers.clear();
  }
};

bool FrameObjectCompare(const FrameObject &A, const FrameObject &B) {
  // Objects at a lower index are closer to FP; objects at a higher index are
  // closer to SP.
  //
  // For consistency in our comparison, all invalid objects are placed
  // at the end. This also allows us to stop walking when we hit the
  // first invalid item after it's all sorted.
  //
  // The "first" object goes first (closest to SP), followed by the members of
  // the "first" group.
  //
  // The rest are sorted by the group index to keep the groups together.
  // Higher numbered groups are more likely to be around longer (i.e. untagged
  // in the function epilogue and not at some earlier point). Place them closer
  // to SP.
  //
  // If all else equal, sort by the object index to keep the objects in the
  // original order.
  return std::make_tuple(!A.IsValid, A.ObjectFirst, A.GroupFirst, A.GroupIndex,
                         A.ObjectIndex) <
         std::make_tuple(!B.IsValid, B.ObjectFirst, B.GroupFirst, B.GroupIndex,
                         B.ObjectIndex);
}
} // namespace
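
// Editorial sketch, not part of the LLVM sources: the tuple comparison in
// FrameObjectCompare can be exercised on its own. The names Obj, objLess, and
// sortedOrder are assumptions made for illustration. Because false orders
// before true, invalid objects sink to the end, and among valid objects the
// one flagged ObjectFirst ends up last, which per the comment above is the
// position closest to SP.

```cpp
#include <algorithm>
#include <cassert>
#include <tuple>
#include <vector>

struct Obj {
  bool IsValid = false;
  int ObjectIndex = 0;
  int GroupIndex = -1;
  bool ObjectFirst = false;
  bool GroupFirst = false;
};

// Lexicographic comparison over the same five keys as FrameObjectCompare.
bool objLess(const Obj &A, const Obj &B) {
  return std::make_tuple(!A.IsValid, A.ObjectFirst, A.GroupFirst, A.GroupIndex,
                         A.ObjectIndex) <
         std::make_tuple(!B.IsValid, B.ObjectFirst, B.GroupFirst, B.GroupIndex,
                         B.ObjectIndex);
}

// Returns the object indices of the valid entries in sorted order.
std::vector<int> sortedOrder(std::vector<Obj> Objs) {
  std::stable_sort(Objs.begin(), Objs.end(), objLess);
  std::vector<int> Order;
  for (const Obj &O : Objs) {
    if (!O.IsValid)
      break; // invalid entries are all at the end
    Order.push_back(O.ObjectIndex);
  }
  return Order;
}
```

// With an ungrouped object 0, objects 1 and 3 in group 0, objects 2 and 5 in
// the "first" group 1, and object 2 flagged ObjectFirst, the resulting order
// is 0, 1, 3, 5, 2.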

void AArch64FrameLowering::orderFrameObjects(
    const MachineFunction &MF, SmallVectorImpl<int> &ObjectsToAllocate) const {
  if (!OrderFrameObjects || ObjectsToAllocate.empty())
    return;

  const MachineFrameInfo &MFI = MF.getFrameInfo();
  std::vector<FrameObject> FrameObjects(MFI.getObjectIndexEnd());
  for (auto &Obj : ObjectsToAllocate) {
    FrameObjects[Obj].IsValid = true;
    FrameObjects[Obj].ObjectIndex = Obj;
  }

  // Identify stack slots that are tagged at the same time.
  GroupBuilder GB(FrameObjects);
  for (auto &MBB : MF) {
    for (auto &MI : MBB) {
      if (MI.isDebugInstr())
        continue;
      int OpIndex;
      switch (MI.getOpcode()) {
      case AArch64::STGloop:
      case AArch64::STZGloop:
        OpIndex = 3;
        break;
      case AArch64::STGOffset:
      case AArch64::STZGOffset:
      case AArch64::ST2GOffset:
      case AArch64::STZ2GOffset:
        OpIndex = 1;
        break;
      default:
        OpIndex = -1;
      }

      int TaggedFI = -1;
      if (OpIndex >= 0) {
        const MachineOperand &MO = MI.getOperand(OpIndex);
        if (MO.isFI()) {
          int FI = MO.getIndex();
          if (FI >= 0 && FI < MFI.getObjectIndexEnd() &&
              FrameObjects[FI].IsValid)
            TaggedFI = FI;
        }
      }

      // If this is a stack tagging instruction for a slot that is not part of a
      // group yet, either start a new group or add it to the current one.
      if (TaggedFI >= 0)
        GB.AddMember(TaggedFI);
      else
        GB.EndCurrentGroup();
    }
    // Groups should never span multiple basic blocks.
    GB.EndCurrentGroup();
  }

  // If the function's tagged base pointer is pinned to a stack slot, we want to
  // put that slot first when possible. This will likely place it at SP + 0,
  // and save one instruction when generating the base pointer because IRG does
  // not allow an immediate offset.
  const AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
  std::optional<int> TBPI = AFI.getTaggedBasePointerIndex();
  if (TBPI) {
    FrameObjects[*TBPI].ObjectFirst = true;
    FrameObjects[*TBPI].GroupFirst = true;
    int FirstGroupIndex = FrameObjects[*TBPI].GroupIndex;
    if (FirstGroupIndex >= 0)
      for (FrameObject &Object : FrameObjects)
        if (Object.GroupIndex == FirstGroupIndex)
          Object.GroupFirst = true;
  }

  llvm::stable_sort(FrameObjects, FrameObjectCompare);

  int i = 0;
  for (auto &Obj : FrameObjects) {
    // All invalid items are sorted at the end, so it's safe to stop.
    if (!Obj.IsValid)
      break;
    ObjectsToAllocate[i++] = Obj.ObjectIndex;
  }

  LLVM_DEBUG(dbgs() << "Final frame order:\n"; for (auto &Obj
                                                    : FrameObjects) {
    if (!Obj.IsValid)
      break;
    dbgs() << " " << Obj.ObjectIndex << ": group " << Obj.GroupIndex;
    if (Obj.ObjectFirst)
      dbgs() << ", first";
    if (Obj.GroupFirst)
      dbgs() << ", group-first";
    dbgs() << "\n";
  });
}
llvm::Check::Size
@ Size
Definition: FileCheck.h:77
llvm::MachineFunction::hasWinCFI
bool hasWinCFI() const
Definition: MachineFunction.h:754
i
i
Definition: README.txt:29
llvm::alignTo
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition: Alignment.h:155
llvm::isAsynchronousEHPersonality
bool isAsynchronousEHPersonality(EHPersonality Pers)
Returns true if this personality function catches asynchronous exceptions.
Definition: EHPersonalities.h:49
llvm::AArch64ISD::LOADgot
@ LOADgot
Definition: AArch64ISelLowering.h:78
llvm::MachineFrameInfo::isMaxCallFrameSizeComputed
bool isMaxCallFrameSizeComputed() const
Definition: MachineFrameInfo.h:661
llvm::MachineFrameInfo::hasVarSizedObjects
bool hasVarSizedObjects() const
This method may be called any time after instruction selection is complete to determine if the stack ...
Definition: MachineFrameInfo.h:355
llvm::AArch64Subtarget::isTargetWindows
bool isTargetWindows() const
Definition: AArch64Subtarget.h:264
AArch64RegisterInfo.h
Attrs
Function Attrs
Definition: README_ALTIVEC.txt:215
MCDwarf.h
MI
IRTranslator LLVM IR MI
Definition: IRTranslator.cpp:109
MachineInstr.h
MathExtras.h
llvm::MachineInstrBuilder::addImm
const MachineInstrBuilder & addImm(int64_t Val) const
Add a new immediate operand.
Definition: MachineInstrBuilder.h:131
llvm::MachineFrameInfo::estimateStackSize
uint64_t estimateStackSize(const MachineFunction &MF) const
Estimate and return the size of the stack frame.
Definition: MachineFrameInfo.cpp:137
llvm
This is an optimization pass for GlobalISel generic memory operations.
Definition: AddressRanges.h:18
llvm::MachineInstrBuilder::copyImplicitOps
const MachineInstrBuilder & copyImplicitOps(const MachineInstr &OtherMI) const
Copy all the implicit operands from OtherMI onto this one.
Definition: MachineInstrBuilder.h:321
AArch64MachineFunctionInfo.h
llvm::MachineRegisterInfo::isPhysRegUsed
bool isPhysRegUsed(MCRegister PhysReg, bool SkipRegMaskTest=false) const
Return true if the specified register is modified or read in this function.
Definition: MachineRegisterInfo.cpp:589
llvm::CallingConv::Win64
@ Win64
The C convention as implemented on Windows/x86-64 and AArch64.
Definition: CallingConv.h:156
llvm::MCSymbol
MCSymbol - Instances of this class represent a symbol name in the MC file, and MCSymbols are created ...
Definition: MCSymbol.h:41
llvm::AArch64Subtarget::swiftAsyncContextIsDynamicallySet
bool swiftAsyncContextIsDynamicallySet() const
Return whether FrameLowering should always set the "extended frame present" bit in FP,...
Definition: AArch64Subtarget.h:340
getFixedObjectSize
static unsigned getFixedObjectSize(const MachineFunction &MF, const AArch64FunctionInfo *AFI, bool IsWin64, bool IsFunclet)
Returns the size of the fixed object area (allocated next to sp on entry) On Win64 this may include a...
Definition: AArch64FrameLowering.cpp:383
llvm::LivePhysRegs::addReg
void addReg(MCPhysReg Reg)
Adds a physical register and all its sub-registers to the set.
Definition: LivePhysRegs.h:81
DefaultSafeSPDisplacement
static const unsigned DefaultSafeSPDisplacement
This is the biggest offset to the stack pointer we can encode in aarch64 instructions (without using ...
Definition: AArch64FrameLowering.cpp:346
llvm::AArch64_AM::LSL
@ LSL
Definition: AArch64AddressingModes.h:35
llvm::MachineRegisterInfo::createVirtualRegister
Register createVirtualRegister(const TargetRegisterClass *RegClass, StringRef Name="")
createVirtualRegister - Create and return a new virtual register in the function with the specified r...
Definition: MachineRegisterInfo.cpp:157
llvm::MachineModuleInfo::getContext
const MCContext & getContext() const
Definition: MachineModuleInfo.h:139
llvm::TargetRegisterInfo::isGeneralPurposeRegister
virtual bool isGeneralPurposeRegister(const MachineFunction &MF, MCRegister PhysReg) const
Returns true if PhysReg is a general purpose register.
Definition: TargetRegisterInfo.h:588
llvm::StackOffset::get
static StackOffset get(int64_t Fixed, int64_t Scalable)
Definition: TypeSize.h:47
llvm::MachineRegisterInfo
MachineRegisterInfo - Keep track of information for virtual and physical registers,...
Definition: MachineRegisterInfo.h:51
produceCompactUnwindFrame
static bool produceCompactUnwindFrame(MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:2438
RegSize
unsigned RegSize
Definition: AArch64MIPeepholeOpt.cpp:123
llvm::MachineInstrBuilder::add
const MachineInstrBuilder & add(const MachineOperand &MO) const
Definition: MachineInstrBuilder.h:224
llvm::Function
Definition: Function.h:59
llvm::AArch64FunctionInfo::needsDwarfUnwindInfo
bool needsDwarfUnwindInfo(const MachineFunction &MF) const
Definition: AArch64MachineFunctionInfo.cpp:132
llvm::BitVector::set
BitVector & set()
Definition: BitVector.h:344
llvm::TargetSubtargetInfo::getInstrInfo
virtual const TargetInstrInfo * getInstrInfo() const
Definition: TargetSubtargetInfo.h:95
llvm::MachineInstrBuilder::addCFIIndex
const MachineInstrBuilder & addCFIIndex(unsigned CFIIndex) const
Definition: MachineInstrBuilder.h:247
llvm::MachineBasicBlock::isEHFuncletEntry
bool isEHFuncletEntry() const
Returns true if this is the entry block of an EH funclet.
Definition: MachineBasicBlock.h:606
contains
return AArch64::GPR64RegClass contains(Reg)
llvm::CodeModel::Medium
@ Medium
Definition: CodeGen.h:31
llvm::SmallVector
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
Definition: SmallVector.h:1199
Statistic.h
llvm::MachineFunction::getMachineMemOperand
MachineMemOperand * getMachineMemOperand(MachinePointerInfo PtrInfo, MachineMemOperand::Flags f, uint64_t s, Align base_alignment, const AAMDNodes &AAInfo=AAMDNodes(), const MDNode *Ranges=nullptr, SyncScope::ID SSID=SyncScope::System, AtomicOrdering Ordering=AtomicOrdering::NotAtomic, AtomicOrdering FailureOrdering=AtomicOrdering::NotAtomic)
getMachineMemOperand - Allocate a new MachineMemOperand.
Definition: MachineFunction.cpp:469
llvm::HexagonISD::PFALSE
@ PFALSE
Definition: HexagonISelLowering.h:81
llvm::CallingConv::PreserveMost
@ PreserveMost
Used for runtime calls that preserves most registers.
Definition: CallingConv.h:63
ErrorHandling.h
llvm::MCRegisterInfo::getDwarfRegNum
int getDwarfRegNum(MCRegister RegNum, bool isEH) const
Map a target register to an equivalent dwarf register number.
Definition: MCRegisterInfo.cpp:68
llvm::NVPTX::PTXCvtMode::RPI
@ RPI
Definition: NVPTX.h:136
llvm::MCRegisterInfo::getNumRegs
unsigned getNumRegs() const
Return the number of registers this target has (useful for sizing arrays holding per register informa...
Definition: MCRegisterInfo.h:491
llvm::getBLRCallOpcode
unsigned getBLRCallOpcode(const MachineFunction &MF)
Return opcode to be used for indirect calls.
Definition: AArch64InstrInfo.cpp:8308
llvm::AArch64RegisterInfo::getBaseRegister
unsigned getBaseRegister() const
Definition: AArch64RegisterInfo.cpp:494
llvm::X86Disassembler::Reg
Reg
All possible values of the reg field in the ModR/M byte.
Definition: X86DisassemblerDecoder.h:462
llvm::BitVector::set_bits
iterator_range< const_set_bits_iterator > set_bits() const
Definition: BitVector.h:133
llvm::AArch64FunctionInfo::setSwiftAsyncContextFrameIdx
void setSwiftAsyncContextFrameIdx(int FI)
Definition: AArch64MachineFunctionInfo.h:442
MachineBasicBlock.h
llvm::LivePhysRegs
A set of physical registers with utility functions to track liveness when walking backward/forward th...
Definition: LivePhysRegs.h:50
Right
Vector Shift Left Right
Definition: README_P9.txt:118
invalidateRegisterPairing
static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2, bool UsesWinAAPCS, bool NeedsWinCFI, bool NeedsFrameRecord, bool IsFirst, const TargetRegisterInfo *TRI)
Returns true if Reg1 and Reg2 cannot be paired using a ldp/stp instruction.
Definition: AArch64FrameLowering.cpp:2478
llvm::TargetSubtargetInfo::getRegisterInfo
virtual const TargetRegisterInfo * getRegisterInfo() const
getRegisterInfo - If register information is available, return it.
Definition: TargetSubtargetInfo.h:127
llvm::cl::Hidden
@ Hidden
Definition: CommandLine.h:138
llvm::MachineOperand::setImm
void setImm(int64_t immVal)
Definition: MachineOperand.h:681
llvm::MachineBasicBlock::findDebugLoc
DebugLoc findDebugLoc(instr_iterator MBBI)
Find the next valid DebugLoc starting at MBBI, skipping any DBG_VALUE and DBG_LABEL instructions.
Definition: MachineBasicBlock.cpp:1397
llvm::TargetRegisterInfo
TargetRegisterInfo base class - We assume that the target defines a static array of TargetRegisterDes...
Definition: TargetRegisterInfo.h:236
llvm::MCRegisterInfo::getEncodingValue
uint16_t getEncodingValue(MCRegister RegNo) const
Returns the encoding for RegNo.
Definition: MCRegisterInfo.h:553
Shift
bool Shift
Definition: README.txt:468
llvm::AArch64FrameLowering::enableStackSlotScavenging
bool enableStackSlotScavenging(const MachineFunction &MF) const override
Returns true if the stack slot holes in the fixed and callee-save stack area should be used when allo...
Definition: AArch64FrameLowering.cpp:3193
llvm::AArch64Subtarget::getInstrInfo
const AArch64InstrInfo * getInstrInfo() const override
Definition: AArch64Subtarget.h:181
llvm::Type
The instances of the Type class are immutable: once they are created, they are never changed.
Definition: Type.h:45
llvm::AttributeList
Definition: Attributes.h:432
TargetInstrInfo.h
fixupSEHOpcode
static void fixupSEHOpcode(MachineBasicBlock::iterator MBBI, unsigned LocalStackSize)
Definition: AArch64FrameLowering.cpp:1090
getPrologueDeath
static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg)
Definition: AArch64FrameLowering.cpp:2428
llvm::AArch64FrameLowering::assignCalleeSavedSpillSlots
bool assignCalleeSavedSpillSlots(MachineFunction &MF, const TargetRegisterInfo *TRI, std::vector< CalleeSavedInfo > &CSI, unsigned &MinCSFrameIndex, unsigned &MaxCSFrameIndex) const override
assignCalleeSavedSpillSlots - Allows target to override spill slot assignment logic.
Definition: AArch64FrameLowering.cpp:3140
llvm::max
Expected< ExpressionValue > max(const ExpressionValue &Lhs, const ExpressionValue &Rhs)
Definition: FileCheck.cpp:337
llvm::AArch64FrameLowering::hasFP
bool hasFP(const MachineFunction &MF) const override
hasFP - Return true if the specified function should have a dedicated frame pointer register.
Definition: AArch64FrameLowering.cpp:427
llvm::MachineFrameInfo::getObjectIndexEnd
int getObjectIndexEnd() const
Return one past the maximum frame object index.
Definition: MachineFrameInfo.h:410
determineSVEStackObjectOffsets
static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex, bool AssignOffsets)
Definition: AArch64FrameLowering.cpp:3228
llvm::CodeModel::Kernel
@ Kernel
Definition: CodeGen.h:31
llvm::MachineOperand::isFI
bool isFI() const
isFI - Tests if this is a MO_FrameIndex operand.
Definition: MachineOperand.h:336
llvm::AArch64FunctionInfo::setTaggedBasePointerOffset
void setTaggedBasePointerOffset(unsigned Offset)
Definition: AArch64MachineFunctionInfo.h:418
llvm::MCCFIInstruction::createSameValue
static MCCFIInstruction createSameValue(MCSymbol *L, unsigned Register)
.cfi_same_value Current value of Register is the same as in the previous frame.
Definition: MCDwarf.h:616
TRI
unsigned const TargetRegisterInfo * TRI
Definition: MachineSink.cpp:1628
llvm::RegScavenger::FindUnusedReg
Register FindUnusedReg(const TargetRegisterClass *RC) const
Find an unused register of the specified register class.
Definition: RegisterScavenging.cpp:266
llvm::MachineFrameInfo::getMaxCallFrameSize
unsigned getMaxCallFrameSize() const
Return the maximum size of a call frame that must be allocated for an outgoing function call.
Definition: MachineFrameInfo.h:654
OpIndex
unsigned OpIndex
Definition: SPIRVModuleAnalysis.cpp:46
llvm::ARCISD::BL
@ BL
Definition: ARCISelLowering.h:34
llvm::TypeSize::Fixed
static constexpr TypeSize Fixed(ScalarTy ExactSize)
Definition: TypeSize.h:331
llvm::ArrayRef::empty
bool empty() const
empty - Check if the array is empty.
Definition: ArrayRef.h:158
LLVM_DEBUG
#define LLVM_DEBUG(X)
Definition: Debug.h:101
llvm::AArch64Subtarget::getTargetLowering
const AArch64TargetLowering * getTargetLowering() const override
Definition: AArch64Subtarget.h:178
F
#define F(x, y, z)
Definition: MD5.cpp:55
llvm::MCCFIInstruction::cfiDefCfaOffset
static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int Offset)
.cfi_def_cfa_offset modifies a rule for computing CFA.
Definition: MCDwarf.h:547
MachineRegisterInfo.h
llvm::MachineInstr::FrameDestroy
@ FrameDestroy
Definition: MachineInstr.h:86
isTargetWindows
static bool isTargetWindows(const MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:1275
llvm::MachineBasicBlock::erase
instr_iterator erase(instr_iterator I)
Remove an instruction from the instruction list and delete it.
Definition: MachineBasicBlock.cpp:1323
llvm::SwiftAsyncFramePointerMode::Never
@ Never
Never set the bit.
llvm::dbgs
raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition: Debug.cpp:163
llvm::AArch64RegisterInfo::cannotEliminateFrame
bool cannotEliminateFrame(const MachineFunction &MF) const
Definition: AArch64RegisterInfo.cpp:635
clear
static void clear(coro::Shape &Shape)
Definition: Coroutines.cpp:149
llvm::classifyEHPersonality
EHPersonality classifyEHPersonality(const Value *Pers)
See if the given exception handling personality function is one that we understand.
Definition: EHPersonalities.cpp:22
llvm::AArch64FrameLowering
Definition: AArch64FrameLowering.h:21
llvm::AArch64FrameOffsetCannotUpdate
@ AArch64FrameOffsetCannotUpdate
Offset cannot apply.
Definition: AArch64InstrInfo.h:437
llvm::MachineInstr::getFlags
uint16_t getFlags() const
Return the MI flags bitvector.
Definition: MachineInstr.h:352
llvm::AlignStyle::Left
@ Left
ReverseCSRRestoreSeq
static cl::opt< bool > ReverseCSRRestoreSeq("reverse-csr-restore-seq", cl::desc("reverse the CSR restore sequence"), cl::init(false), cl::Hidden)
llvm::VFISAKind::SVE
@ SVE
CommandLine.h
llvm::AArch64FrameLowering::processFunctionBeforeFrameFinalized
void processFunctionBeforeFrameFinalized(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameFinalized - This method is called immediately before the specified function...
Definition: AArch64FrameLowering.cpp:3316
llvm::MachineInstrBuilder::addDef
const MachineInstrBuilder & addDef(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register definition operand.
Definition: MachineInstrBuilder.h:116
llvm::getDefRegState
unsigned getDefRegState(bool B)
Definition: MachineInstrBuilder.h:525
llvm::MachineFunction::front
const MachineBasicBlock & front() const
Definition: MachineFunction.h:882
llvm::MachineFunction::getRegInfo
MachineRegisterInfo & getRegInfo()
getRegInfo - Return information about the registers currently in use.
Definition: MachineFunction.h:682
OrderFrameObjects
static cl::opt< bool > OrderFrameObjects("aarch64-order-frame-objects", cl::desc("sort stack allocations"), cl::init(true), cl::Hidden)
llvm::MachineBasicBlock::insertAfter
iterator insertAfter(iterator I, MachineInstr *MI)
Insert MI into the instruction list after I.
Definition: MachineBasicBlock.h:958
llvm::TargetInstrInfo
TargetInstrInfo - Interface to description of machine instruction set.
Definition: TargetInstrInfo.h:98
AArch64TargetMachine.h
AArch64InstrInfo.h
llvm::AArch64FrameLowering::resolveFrameOffsetReference
StackOffset resolveFrameOffsetReference(const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE, Register &FrameReg, bool PreferFP, bool ForSimm) const
Definition: AArch64FrameLowering.cpp:2294
llvm::TargetFrameLowering::getOffsetOfLocalArea
int getOffsetOfLocalArea() const
getOffsetOfLocalArea - This method returns the offset of the local area from the stack pointer on ent...
Definition: TargetFrameLowering.h:140
TargetMachine.h
llvm::MutableArrayRef
MutableArrayRef - Represent a mutable reference to an array (0 or more elements consecutively in memo...
Definition: ArrayRef.h:27
llvm::AArch64FunctionInfo::isStackRealigned
bool isStackRealigned() const
Definition: AArch64MachineFunctionInfo.h:236
llvm::MachineOperand::CreateImm
static MachineOperand CreateImm(int64_t Val)
Definition: MachineOperand.h:812
llvm::AArch64InstrInfo
Definition: AArch64InstrInfo.h:35
P2
This might compile to this xmm1 xorps xmm0 movss xmm0 ret Now consider if the code caused xmm1 to get spilled This might produce this xmm1 movaps xmm0 movaps xmm1 movss xmm0 ret since the reload is only used by these we could fold it into the producing something like xmm1 movaps xmm0 ret saving two instructions The basic idea is that a reload from a spill if only one byte chunk is bring in zeros the one element instead of elements This can be used to simplify a variety of shuffle where the elements are fixed zeros This code generates ugly probably due to costs being off or< 4 x float > * P2
Definition: README-SSE.txt:278
E
static GCRegistry::Add< CoreCLRGC > E("coreclr", "CoreCLR-compatible GC")
CASE
#define CASE(n)
llvm::MachineOperand::getImm
int64_t getImm() const
Definition: MachineOperand.h:553
llvm::AArch64FrameLowering::eliminateCallFramePseudoInstr
MachineBasicBlock::iterator eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator I) const override
This method is called during prolog/epilog code insertion to eliminate call frame setup and destroy p...
Definition: AArch64FrameLowering.cpp:467
llvm::MachineFunction::getInfo
Ty * getInfo()
getInfo - Keep track of various per-function pieces of information for backends that would like to do...
Definition: MachineFunction.h:770
llvm::LivePhysRegs::addLiveIns
void addLiveIns(const MachineBasicBlock &MBB)
Adds all live-in registers of basic block MBB.
Definition: LivePhysRegs.cpp:238
llvm::ISD::CATCHRET
@ CATCHRET
CATCHRET - Represents a return from a catch block funclet.
Definition: ISDOpcodes.h:1043
llvm::AArch64Subtarget::isTargetILP32
bool isTargetILP32() const
Definition: AArch64Subtarget.h:273
llvm::TargetFrameLowering::getStackAlign
Align getStackAlign() const
getStackAlignment - This method returns the number of bytes to which the stack pointer must be aligne...
Definition: TargetFrameLowering.h:100
llvm::ARM_PROC::A
@ A
Definition: ARMBaseInfo.h:34
llvm::MachineRegisterInfo::isReserved
bool isReserved(MCRegister PhysReg) const
isReserved - Returns true when PhysReg is a reserved register.
Definition: MachineRegisterInfo.h:956
llvm::BitVector::count
size_type count() const
count - Returns the number of bits which are set.
Definition: BitVector.h:155
llvm::AArch64TargetLowering::supportSwiftError
bool supportSwiftError() const override
Return true if the target supports swifterror attribute.
Definition: AArch64ISelLowering.h:850
InsertReturnAddressAuth
static void InsertReturnAddressAuth(MachineFunction &MF, MachineBasicBlock &MBB, bool NeedsWinCFI, bool *HasWinCFI)
Definition: AArch64FrameLowering.cpp:1863
EnableHomogeneousPrologEpilog
cl::opt< bool > EnableHomogeneousPrologEpilog("homogeneous-prolog-epilog", cl::Hidden, cl::desc("Emit homogeneous prologue and epilogue for the size " "optimization (default = off)"))
llvm::AArch64FunctionInfo::setHasRedZone
void setHasRedZone(bool s)
Definition: AArch64MachineFunctionInfo.h:337
getStackOffset
static StackOffset getStackOffset(const MachineFunction &MF, int64_t ObjectOffset)
Definition: AArch64FrameLowering.cpp:2266
llvm::TargetRegisterClass
Definition: TargetRegisterInfo.h:45
TII
const HexagonInstrInfo * TII
Definition: HexagonCopyToCombine.cpp:125
llvm::dwarf::Index
Index
Definition: Dwarf.h:550
llvm::MCInstrDesc
Describe properties that are true of each instruction in the target description file.
Definition: MCInstrDesc.h:198
B
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
llvm::MachineOperand
MachineOperand class - Representation of each machine instruction operand.
Definition: MachineOperand.h:48
llvm::CodeModel::Small
@ Small
Definition: CodeGen.h:31
llvm::MachineInstr::FrameSetup
@ FrameSetup
Definition: MachineInstr.h:84
llvm::MCID::Flag
Flag
These should be considered private to the implementation of the MCInstrDesc class.
Definition: MCInstrDesc.h:148
llvm::MachineModuleInfo
This class contains meta information specific to a module.
Definition: MachineModuleInfo.h:74
getSVEStackSize
static StackOffset getSVEStackSize(const MachineFunction &MF)
Returns the size of the entire SVE stackframe (calleesaves + spills).
Definition: AArch64FrameLowering.cpp:400
llvm::StackOffset::getFixed
static StackOffset getFixed(int64_t Fixed)
Definition: TypeSize.h:45
findScratchNonCalleeSaveRegister
static unsigned findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB)
Definition: AArch64FrameLowering.cpp:822
llvm::AArch64_AM::getShifterImm
static unsigned getShifterImm(AArch64_AM::ShiftExtendType ST, unsigned Imm)
getShifterImm - Encode the shift type and amount: imm: 6-bit shift amount shifter: 000 ==> lsl 001 ==...
Definition: AArch64AddressingModes.h:99
llvm::RegScavenger::backward
void backward()
Update internal register state and move MBB iterator backwards.
Definition: RegisterScavenging.cpp:239
llvm::report_fatal_error
void report_fatal_error(Error Err, bool gen_crash_diag=true)
Report a serious error, calling any installed error handler.
Definition: Error.cpp:145
llvm::STATISTIC
STATISTIC(NumFunctions, "Total number of functions")
llvm::MachineFrameInfo::getStackID
uint8_t getStackID(int ObjectIdx) const
Definition: MachineFrameInfo.h:731
llvm::RegScavenger::enterBasicBlockEnd
void enterBasicBlockEnd(MachineBasicBlock &MBB)
Start tracking liveness from the end of basic block MBB.
Definition: RegisterScavenging.cpp:87
llvm::MachineBasicBlock::instr_back
MachineInstr & instr_back()
Definition: MachineBasicBlock.h:284
llvm::AArch64_AM::getShiftValue
static unsigned getShiftValue(unsigned Imm)
getShiftValue - Extract the shift value.
Definition: AArch64AddressingModes.h:86
llvm::MachineFunction::setHasWinCFI
void setHasWinCFI(bool v)
Definition: MachineFunction.h:757
llvm::MachineFrameInfo::getStackSize
uint64_t getStackSize() const
Return the number of bytes that must be allocated to hold all of the fixed size frame objects.
Definition: MachineFrameInfo.h:585
DebugLoc.h
emitShadowCallStackPrologue
static void emitShadowCallStackPrologue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, bool NeedsWinCFI, bool NeedsUnwindInfo)
Definition: AArch64FrameLowering.cpp:1306
llvm::MachineFrameInfo::getObjectOffset
int64_t getObjectOffset(int ObjectIdx) const
Return the assigned stack offset of the specified object from the incoming stack pointer.
Definition: MachineFrameInfo.h:526
llvm::AArch64FrameLowering::getFrameIndexReference
StackOffset getFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg) const override
getFrameIndexReference - Provide a base+offset reference to an FI slot for debug info.
Definition: AArch64FrameLowering.cpp:2237
Info
Analysis containing CSE Info
Definition: CSEInfo.cpp:27
llvm::BitVector
Definition: BitVector.h:75
llvm::SmallVectorImpl::append
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
Definition: SmallVector.h:687
Align
uint64_t Align
Definition: ELFObjHandler.cpp:82
llvm::MCCFIInstruction::createEscape
static MCCFIInstruction createEscape(MCSymbol *L, StringRef Vals, StringRef Comment="")
.cfi_escape Allows the user to add arbitrary bytes to the unwind info.
Definition: MCDwarf.h:632
llvm::AArch64FunctionInfo::hasStackFrame
bool hasStackFrame() const
Definition: AArch64MachineFunctionInfo.h:233
llvm::Align
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition: Alignment.h:39
llvm::MachineFrameInfo::getObjectIndexBegin
int getObjectIndexBegin() const
Return the minimum frame object index.
Definition: MachineFrameInfo.h:407
llvm::MachineInstrBuilder::addExternalSymbol
const MachineInstrBuilder & addExternalSymbol(const char *FnName, unsigned TargetFlags=0) const
Definition: MachineInstrBuilder.h:184
convertCalleeSaveRestoreToSPPrePostIncDec
static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc, bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI, MachineInstr::MIFlag FrameFlag=MachineInstr::FrameSetup, int CFAOffset=0)
Definition: AArch64FrameLowering.cpp:1112
llvm::MachineFrameInfo::isCalleeSavedInfoValid
bool isCalleeSavedInfoValid() const
Has the callee saved info been calculated yet?
Definition: MachineFrameInfo.h:800
llvm::CallingConv::ID
unsigned ID
LLVM IR allows arbitrary numbers to be used as calling convention identifiers.
Definition: CallingConv.h:24
fixupCalleeSaveRestoreStackOffset
static void fixupCalleeSaveRestoreStackOffset(MachineInstr &MI, uint64_t LocalStackSize, bool NeedsWinCFI, bool *HasWinCFI)
Definition: AArch64FrameLowering.cpp:1226
llvm::MachineBasicBlock
Definition: MachineBasicBlock.h:94
llvm::MachineFrameInfo::isDeadObjectIndex
bool isDeadObjectIndex(int ObjectIdx) const
Returns true if the specified index corresponds to a dead object.
Definition: MachineFrameInfo.h:745
llvm::Function::getAttributes
AttributeList getAttributes() const
Return the attribute list for this Function.
Definition: Function.h:313
computeCalleeSaveRegisterPairs
static void computeCalleeSaveRegisterPairs(MachineFunction &MF, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI, SmallVectorImpl< RegPairInfo > &RegPairs, bool NeedsFrameRecord)
Definition: AArch64FrameLowering.cpp:2526
AArch64AddressingModes.h
llvm::OutputFileType::Object
@ Object
StackTaggingMergeSetTag
static cl::opt< bool > StackTaggingMergeSetTag("stack-tagging-merge-settag", cl::desc("merge settag instruction in function epilog"), cl::init(true), cl::Hidden)
llvm::TargetOptions::DisableFramePointerElim
bool DisableFramePointerElim(const MachineFunction &MF) const
DisableFramePointerElim - This returns true if frame pointer elimination optimization should be disab...
Definition: TargetOptionsImpl.cpp:23
llvm::MachineFunction::getMMI
MachineModuleInfo & getMMI() const
Definition: MachineFunction.h:623
llvm::TargetRegisterInfo::getSpillAlign
Align getSpillAlign(const TargetRegisterClass &RC) const
Return the minimum required alignment in bytes for a spill slot for a register of this class.
Definition: TargetRegisterInfo.h:291
llvm::AArch64FrameLowering::resolveFrameIndexReference
StackOffset resolveFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP, bool ForSimm) const
Definition: AArch64FrameLowering.cpp:2283
llvm::MachineFunction::getSubtarget
const TargetSubtargetInfo & getSubtarget() const
getSubtarget - Return the subtarget for which this machine code is being compiled.
Definition: MachineFunction.h:672
llvm::MachineInstrBuilder::addFrameIndex
const MachineInstrBuilder & addFrameIndex(int Idx) const
Definition: MachineInstrBuilder.h:152
llvm::MachineInstrBuilder::setMIFlag
const MachineInstrBuilder & setMIFlag(MachineInstr::MIFlag Flag) const
Definition: MachineInstrBuilder.h:278
llvm::isAArch64FrameOffsetLegal
int isAArch64FrameOffsetLegal(const MachineInstr &MI, StackOffset &Offset, bool *OutUseUnscaledOp=nullptr, unsigned *OutUnscaledOp=nullptr, int64_t *EmittableOffset=nullptr)
Check if the Offset is a valid frame offset for MI.
Definition: AArch64InstrInfo.cpp:4646
llvm::AArch64FunctionInfo::getCalleeSaveBaseToFrameRecordOffset
int getCalleeSaveBaseToFrameRecordOffset() const
Definition: AArch64MachineFunctionInfo.h:422
llvm::Function::hasFnAttribute
bool hasFnAttribute(Attribute::AttrKind Kind) const
Return true if the function has the attribute.
Definition: Function.cpp:640
llvm::cl::opt< bool >
llvm::TargetRegisterInfo::getSpillSize
unsigned getSpillSize(const TargetRegisterClass &RC) const
Return the size in bytes of the stack slot allocated to hold a spilled copy of a register from class ...
Definition: TargetRegisterInfo.h:285
llvm::WinEHFuncInfo
Definition: WinEHFuncInfo.h:90
llvm::AArch64FrameLowering::resetCFIToInitialState
void resetCFIToInitialState(MachineBasicBlock &MBB) const override
Emit CFI instructions that recreate the state of the unwind information upon function entry.
Definition: AArch64FrameLowering.cpp:589
getSVECalleeSaveSlotRange
static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI, int &Min, int &Max)
Returns true if there are any SVE callee saves.
Definition: AArch64FrameLowering.cpp:3200
Index
uint32_t Index
Definition: ELFObjHandler.cpp:83
llvm::MachineInstr
Representation of each machine instruction.
Definition: MachineInstr.h:66
llvm::MachineInstrBuilder
Definition: MachineInstrBuilder.h:69
uint64_t
llvm::MachineFrameInfo::getObjectSize
int64_t getObjectSize(int ObjectIdx) const
Return the size of the specified object.
Definition: MachineFrameInfo.h:470
AArch64FrameLowering.h
llvm::Function::getCallingConv
CallingConv::ID getCallingConv() const
getCallingConv()/setCallingConv(CC) - These methods get and set the calling convention of this functio...
Definition: Function.h:237
llvm::CallingConv::CXX_FAST_TLS
@ CXX_FAST_TLS
Used for access functions.
Definition: CallingConv.h:72
llvm::MachineFrameInfo::hasStackProtectorIndex
bool hasStackProtectorIndex() const
Definition: MachineFrameInfo.h:360
llvm::AArch64FrameLowering::spillCalleeSavedRegisters
bool spillCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
spillCalleeSavedRegisters - Issues instruction(s) to spill all callee saved registers and returns tru...
Definition: AArch64FrameLowering.cpp:2718
llvm::AArch64Subtarget::getChkStkName
const char * getChkStkName() const
Definition: AArch64Subtarget.h:393
llvm::TargetOptions::SwiftAsyncFramePointer
SwiftAsyncFramePointerMode SwiftAsyncFramePointer
Control when and how the Swift async frame pointer bit should be set.
Definition: TargetOptions.h:243
llvm::LivePhysRegs::available
bool available(const MachineRegisterInfo &MRI, MCPhysReg Reg) const
Returns true if neither register Reg nor any aliasing register is in the set.
Definition: LivePhysRegs.cpp:141
llvm::MCCFIInstruction::createNegateRAState
static MCCFIInstruction createNegateRAState(MCSymbol *L)
.cfi_negate_ra_state AArch64 negate RA state.
Definition: MCDwarf.h:597
llvm::MachineRegisterInfo::getCalleeSavedRegs
const MCPhysReg * getCalleeSavedRegs() const
Returns list of callee saved registers.
Definition: MachineRegisterInfo.cpp:625
llvm::AArch64FunctionInfo::setCalleeSaveBaseToFrameRecordOffset
void setCalleeSaveBaseToFrameRecordOffset(int Offset)
Definition: AArch64MachineFunctionInfo.h:425
llvm::ISD::CLEANUPRET
@ CLEANUPRET
CLEANUPRET - Represents a return from a cleanup block funclet.
Definition: ISDOpcodes.h:1047
llvm::AArch64FunctionInfo
AArch64FunctionInfo - This class is derived from MachineFunctionInfo and contains private AArch64-spe...
Definition: AArch64MachineFunctionInfo.h:39
llvm::RegState::Implicit
@ Implicit
Not emitted register (e.g. carry, or temporary result).
Definition: MachineInstrBuilder.h:46
I
#define I(x, y, z)
Definition: MD5.cpp:58
llvm::MCPhysReg
uint16_t MCPhysReg
An unsigned integer type large enough to represent all physical registers, but not necessarily virtua...
Definition: MCRegister.h:21
llvm::RegScavenger
Definition: RegisterScavenging.h:34
llvm::MachineFrameInfo::getObjectAlign
Align getObjectAlign(int ObjectIdx) const
Return the alignment of the specified stack object.
Definition: MachineFrameInfo.h:484
llvm::cl::init
initializer< Ty > init(const Ty &Val)
Definition: CommandLine.h:445
llvm::MachineFrameInfo::setObjectOffset
void setObjectOffset(int ObjectIdx, int64_t SPOffset)
Set the stack frame offset of the specified object.
Definition: MachineFrameInfo.h:560
llvm::TargetStackID::ScalableVector
@ ScalableVector
Definition: TargetFrameLowering.h:30
windowsRequiresStackProbe
static bool windowsRequiresStackProbe(MachineFunction &MF, uint64_t StackSizeInBytes)
Definition: AArch64FrameLowering.cpp:866
llvm::MachineBasicBlock::getLastNonDebugInstr
iterator getLastNonDebugInstr(bool SkipPseudoOp=true)
Returns an iterator to the last non-debug instruction in the basic block, or end().
Definition: MachineBasicBlock.cpp:269
llvm::AArch64FrameLowering::getStackIDForScalableVectors
TargetStackID::Value getStackIDForScalableVectors() const override
Returns the StackID that scalable vectors should be associated with.
Definition: AArch64FrameLowering.cpp:377
llvm::TargetMachine::Options
TargetOptions Options
Definition: TargetMachine.h:119
llvm::MachineInstr::setFlags
void setFlags(unsigned flags)
Definition: MachineInstr.h:366
assert
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
llvm::AArch64FunctionInfo::getLocalStackSize
uint64_t getLocalStackSize() const
Definition: AArch64MachineFunctionInfo.h:249
InsertSEH
static MachineBasicBlock::iterator InsertSEH(MachineBasicBlock::iterator MBBI, const TargetInstrInfo &TII, MachineInstr::MIFlag Flag)
Definition: AArch64FrameLowering.cpp:970
llvm::MachineFrameInfo::CreateFixedObject
int CreateFixedObject(uint64_t Size, int64_t SPOffset, bool IsImmutable, bool isAliased=false)
Create a new object at a fixed location on the stack.
Definition: MachineFrameInfo.cpp:83
llvm::AArch64FunctionInfo::setStackRealigned
void setStackRealigned(bool s)
Definition: AArch64MachineFunctionInfo.h:237
std::swap
void swap(llvm::BitVector &LHS, llvm::BitVector &RHS)
Implement std::swap in terms of BitVector swap.
Definition: BitVector.h:853
llvm::MachineFunction::getFrameInfo
MachineFrameInfo & getFrameInfo()
getFrameInfo - Return the frame info object for the current function.
Definition: MachineFunction.h:688
llvm::MachineBasicBlock::getParent
const MachineFunction * getParent() const
Return the MachineFunction containing this basic block.
Definition: MachineBasicBlock.h:265
llvm::RegScavenger::addScavengingFrameIndex
void addScavengingFrameIndex(int FI)
Add a scavenging frame index.
Definition: RegisterScavenging.h:143
llvm::MachineInstrBuilder::addMemOperand
const MachineInstrBuilder & addMemOperand(MachineMemOperand *MMO) const
Definition: MachineInstrBuilder.h:202
llvm::TargetFrameLowering::getStackGrowthDirection
StackDirection getStackGrowthDirection() const
getStackGrowthDirection - Return the direction in which the stack grows.
Definition: TargetFrameLowering.h:89
llvm::AArch64FunctionInfo::getVarArgsGPRSize
unsigned getVarArgsGPRSize() const
Definition: AArch64MachineFunctionInfo.h:348
llvm::AArch64Subtarget::isCallingConvWin64
bool isCallingConvWin64(CallingConv::ID CC) const
Definition: AArch64Subtarget.h:325
llvm::MCCFIInstruction::cfiDefCfa
static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, int Offset)
.cfi_def_cfa defines a rule for computing CFA as: take address from Register and add Offset to it.
Definition: MCDwarf.h:533
llvm::emitFrameOffset
void emitFrameOffset(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, unsigned DestReg, unsigned SrcReg, StackOffset Offset, const TargetInstrInfo *TII, MachineInstr::MIFlag=MachineInstr::NoFlags, bool SetNZCV=false, bool NeedsWinCFI=false, bool *HasWinCFI=nullptr, bool EmitCFAOffset=false, StackOffset InitialOffset={}, unsigned FrameReg=AArch64::SP)
emitFrameOffset - Emit instructions as needed to set DestReg to SrcReg plus Offset.
Definition: AArch64InstrInfo.cpp:4402
MachineModuleInfo.h
llvm::WinEHFuncInfo::UnwindHelpFrameIdx
int UnwindHelpFrameIdx
Definition: WinEHFuncInfo.h:99
llvm::AArch64FrameLowering::getNonLocalFrameIndexReference
StackOffset getNonLocalFrameIndexReference(const MachineFunction &MF, int FI) const override
getNonLocalFrameIndexReference - This method returns the offset used to reference a frame index locat...
Definition: AArch64FrameLowering.cpp:2247
llvm::MachineInstrBuilder::addReg
const MachineInstrBuilder & addReg(Register RegNo, unsigned flags=0, unsigned SubReg=0) const
Add a new virtual register operand.
Definition: MachineInstrBuilder.h:97
llvm::MachineInstrBuilder::addUse
const MachineInstrBuilder & addUse(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register use operand.
Definition: MachineInstrBuilder.h:123
llvm::AArch64FrameLowering::canUseRedZone
bool canUseRedZone(const MachineFunction &MF) const
Can this function use the red zone for local allocations.
Definition: AArch64FrameLowering.cpp:405
llvm::MachineInstr::MIFlag
MIFlag
Definition: MachineInstr.h:82
Cleanup
static const HTTPClientCleanup Cleanup
Definition: HTTPClient.cpp:42
llvm::MachineFunction
Definition: MachineFunction.h:258
TargetOptions.h
llvm::MachineFrameInfo::getCalleeSavedInfo
const std::vector< CalleeSavedInfo > & getCalleeSavedInfo() const
Returns a reference to the callee-saved info vector for the current function.
Definition: MachineFrameInfo.h:787
llvm::TargetMachine::getMCAsmInfo
const MCAsmInfo * getMCAsmInfo() const
Return target specific asm information.
Definition: TargetMachine.h:213
llvm::MachineBasicBlock::getFirstTerminator
iterator getFirstTerminator()
Returns an iterator to the first terminator instruction of this basic block.
Definition: MachineBasicBlock.cpp:240
llvm::AArch64_AM::encodeLogicalImmediate
static uint64_t encodeLogicalImmediate(uint64_t imm, unsigned regSize)
encodeLogicalImmediate - Return the encoded immediate value for a logical immediate instruction of th...
Definition: AArch64AddressingModes.h:283
EnableRedZone
static cl::opt< bool > EnableRedZone("aarch64-redzone", cl::desc("enable use of redzone on AArch64"), cl::init(false), cl::Hidden)
llvm::ArrayRef
ArrayRef - Represent a constant reference to an array (0 or more elements consecutively in memory),...
Definition: APInt.h:33
llvm::AArch64Subtarget::isXRegisterReserved
bool isXRegisterReserved(size_t i) const
Definition: AArch64Subtarget.h:211
llvm::MachineFrameInfo::setStackID
void setStackID(int ObjectIdx, uint8_t ID)
Definition: MachineFrameInfo.h:736
llvm::MachineFrameInfo::hasPatchPoint
bool hasPatchPoint() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:389
llvm::min
Expected< ExpressionValue > min(const ExpressionValue &Lhs, const ExpressionValue &Rhs)
Definition: FileCheck.cpp:357
MCAsmInfo.h
llvm::any_of
bool any_of(R &&range, UnaryPredicate P)
Provide wrappers to std::any_of which take ranges instead of having to pass begin/end explicitly.
Definition: STLExtras.h:1742
DataLayout.h
llvm::MachineFrameInfo::CreateStackObject
int CreateStackObject(uint64_t Size, Align Alignment, bool isSpillSlot, const AllocaInst *Alloca=nullptr, uint8_t ID=0)
Create a new statically sized stack object, returning a nonnegative identifier to represent it.
Definition: MachineFrameInfo.cpp:51
llvm::StringRef
StringRef - Represent a constant reference to a string, i.e.
Definition: StringRef.h:50
llvm::MachineBasicBlock::splice
void splice(iterator Where, MachineBasicBlock *Other, iterator From)
Take an instruction from MBB 'Other' at the position From, and insert it into this MBB right before '...
Definition: MachineBasicBlock.h:1037
MBBI
MachineBasicBlock MachineBasicBlock::iterator MBBI
Definition: AArch64SLSHardening.cpp:75
llvm::Offset
@ Offset
Definition: DWP.cpp:406
llvm_unreachable
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
Definition: ErrorHandling.h:143
getRegisterOrZero
static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE)
Definition: AArch64FrameLowering.cpp:674
uint32_t
llvm::StackOffset
StackOffset holds a fixed and a scalable offset in bytes.
Definition: TypeSize.h:36
llvm::ilist_node_impl::getIterator
self_iterator getIterator()
Definition: ilist_node.h:82
TargetSubtargetInfo.h
DL
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
Definition: AArch64SLSHardening.cpp:76
llvm::SwiftAsyncFramePointerMode::DeploymentBased
@ DeploymentBased
Determine whether to set the bit statically or dynamically based on the deployment target.
llvm::AArch64FunctionInfo::getTailCallReservedStack
unsigned getTailCallReservedStack() const
Definition: AArch64MachineFunctionInfo.h:219
CC
auto CC
Definition: RISCVRedundantCopyElimination.cpp:79
llvm::AArch64RegisterInfo
Definition: AArch64RegisterInfo.h:26
llvm::AArch64FunctionInfo::setMinMaxSVECSFrameIndex
void setMinMaxSVECSFrameIndex(int Min, int Max)
Definition: AArch64MachineFunctionInfo.h:323
llvm::MCContext::createTempSymbol
MCSymbol * createTempSymbol()
Create a temporary symbol with a unique name.
Definition: MCContext.cpp:318
llvm::CodeModel::Tiny
@ Tiny
Definition: CodeGen.h:31
llvm::EHPersonality
EHPersonality
Definition: EHPersonalities.h:21
llvm::TargetSubtargetInfo
TargetSubtargetInfo - Generic base class for all target subtargets.
Definition: TargetSubtargetInfo.h:62
llvm::StackOffset::getScalable
static StackOffset getScalable(int64_t Scalable)
Definition: TypeSize.h:46
Prolog
@ Prolog
Definition: AArch64LowerHomogeneousPrologEpilog.cpp:126
llvm::MachineMemOperand::MOLoad
@ MOLoad
The memory access reads data.
Definition: MachineMemOperand.h:134
MRI
unsigned const MachineRegisterInfo * MRI
Definition: AArch64AdvSIMDScalarPass.cpp:105
llvm::MachineFrameInfo::getMaxAlign
Align getMaxAlign() const
Return the alignment in bytes that this function must be aligned to, which is greater than the defaul...
Definition: MachineFrameInfo.h:601
llvm::Register
Wrapper class representing virtual and physical registers.
Definition: Register.h:19
llvm::MachineFunction::addFrameInst
unsigned addFrameInst(const MCCFIInstruction &Inst)
Definition: MachineFunction.cpp:317
llvm::make_scope_exit
detail::scope_exit< std::decay_t< Callable > > make_scope_exit(Callable &&F)
Definition: ScopeExit.h:59
llvm::Function::hasOptSize
bool hasOptSize() const
Optimize this function for size (-Os) or minimum size (-Oz).
Definition: Function.h:644
llvm::MachineBasicBlock::addLiveIn
void addLiveIn(MCRegister PhysReg, LaneBitmask LaneMask=LaneBitmask::getAll())
Adds the specified register as a live in.
Definition: MachineBasicBlock.h:408
llvm::MachineFrameInfo::isFrameAddressTaken
bool isFrameAddressTaken() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:371
llvm::AArch64_AM::getArithExtendImm
static unsigned getArithExtendImm(AArch64_AM::ShiftExtendType ET, unsigned Imm)
getArithExtendImm - Encode the extend type and shift amount for an arithmetic instruction: imm: 3-bit...
Definition: AArch64AddressingModes.h:171
llvm::MachineFrameInfo::hasCalls
bool hasCalls() const
Return true if the current function has any function calls.
Definition: MachineFrameInfo.h:613
llvm::AArch64FrameLowering::hasReservedCallFrame
bool hasReservedCallFrame(const MachineFunction &MF) const override
hasReservedCallFrame - Under normal circumstances, when a frame pointer is not required,...
Definition: AArch64FrameLowering.cpp:463
emitShadowCallStackEpilogue
static void emitShadowCallStackEpilogue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL)
Definition: AArch64FrameLowering.cpp:1345
CallingConv.h
MBB
MachineBasicBlock & MBB
Definition: AArch64SLSHardening.cpp:74
llvm::AArch64Subtarget::getRegisterInfo
const AArch64RegisterInfo * getRegisterInfo() const override
Definition: AArch64Subtarget.h:182
Attributes.h
llvm::AArch64FunctionInfo::needsAsyncDwarfUnwindInfo
bool needsAsyncDwarfUnwindInfo(const MachineFunction &MF) const
Definition: AArch64MachineFunctionInfo.cpp:141
emitCalleeSavedRestores
static void emitCalleeSavedRestores(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, bool SVE)
Definition: AArch64FrameLowering.cpp:631
isFuncletReturnInstr
static bool isFuncletReturnInstr(const MachineInstr &MI)
Definition: AArch64FrameLowering.cpp:1908
llvm::BitVector::test
bool test(unsigned Idx) const
Definition: BitVector.h:454
llvm::stable_sort
void stable_sort(R &&Range)
Definition: STLExtras.h:1948
llvm::AArch64_AM::UXTX
@ UXTX
Definition: AArch64AddressingModes.h:44
llvm::AArch64FrameLowering::getWinEHParentFrameOffset
unsigned getWinEHParentFrameOffset(const MachineFunction &MF) const override
The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve the parent's frame pointer...
Definition: AArch64FrameLowering.cpp:3787
llvm::MachineRegisterInfo::isLiveIn
bool isLiveIn(Register Reg) const
Definition: MachineRegisterInfo.cpp:440
kSetTagLoopThreshold
static const int kSetTagLoopThreshold
Definition: AArch64SelectionDAGInfo.cpp:119
llvm::AArch64FunctionInfo::hasCalleeSaveStackFreeSpace
bool hasCalleeSaveStackFreeSpace() const
Definition: AArch64MachineFunctionInfo.h:239
llvm::MachineFunction::getFunction
Function & getFunction()
Return the LLVM function that this machine code represents.
Definition: MachineFunction.h:638
llvm::TargetRegisterInfo::getRegSizeInBits
unsigned getRegSizeInBits(const TargetRegisterClass &RC) const
Return the size in bits of a register from class RC.
Definition: TargetRegisterInfo.h:279
uint16_t
llvm::AArch64FunctionInfo::getStackSizeSVE
uint64_t getStackSizeSVE() const
Definition: AArch64MachineFunctionInfo.h:231
llvm::MachineFunction::getTarget
const LLVMTargetMachine & getTarget() const
getTarget - Return the target machine this machine code is compiled with.
Definition: MachineFunction.h:668
MachineFrameInfo.h
llvm::Align::value
uint64_t value() const
This is a hole in the type system and should not be abused.
Definition: Alignment.h:85
llvm::MachineFunction::getWinEHFuncInfo
const WinEHFuncInfo * getWinEHFuncInfo() const
getWinEHFuncInfo - Return information about how the current function uses Windows exception handling.
Definition: MachineFunction.h:716
llvm::AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced
void processFunctionBeforeFrameIndicesReplaced(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameIndicesReplaced - This method is called immediately before MO_FrameIndex op...
Definition: AArch64FrameLowering.cpp:3753
llvm::MachineOperand::getIndex
int getIndex() const
Definition: MachineOperand.h:573
llvm::CallingConv::SwiftTail
@ SwiftTail
This follows the Swift calling convention in how arguments are passed but guarantees tail calls will ...
Definition: CallingConv.h:87
Success
#define Success
Definition: AArch64Disassembler.cpp:300
llvm::TypeSize
Definition: TypeSize.h:314
Function.h
needsShadowCallStackPrologueEpilogue
static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:1293
llvm::TargetStackID::Value
Value
Definition: TargetFrameLowering.h:27
insertCFISameValue
static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator InsertPt, unsigned DwarfReg)
Definition: AArch64FrameLowering.cpp:580
llvm::AArch64FunctionInfo::setLocalStackSize
void setLocalStackSize(uint64_t Size)
Definition: AArch64MachineFunctionInfo.h:248
llvm::MachineInstrBuilder::setMemRefs
const MachineInstrBuilder & setMemRefs(ArrayRef< MachineMemOperand * > MMOs) const
Definition: MachineInstrBuilder.h:208
AArch64MCTargetDesc.h
llvm::AArch64FunctionInfo::getSVECalleeSavedStackSize
unsigned getSVECalleeSavedStackSize() const
Definition: AArch64MachineFunctionInfo.h:319
llvm::SmallVectorImpl::clear
void clear()
Definition: SmallVector.h:614
llvm::AArch64FrameLowering::getSEHFrameIndexOffset
int getSEHFrameIndexOffset(const MachineFunction &MF, int FI) const
Definition: AArch64FrameLowering.cpp:2273
llvm::AArch64FrameLowering::getFrameIndexReferencePreferSP
StackOffset getFrameIndexReferencePreferSP(const MachineFunction &MF, int FI, Register &FrameReg, bool IgnoreSPUpdates) const override
For Win64 AArch64 EH, the offset to the Unwind object is from the SP before the update.
Definition: AArch64FrameLowering.cpp:3764
llvm::MachineFunction::hasEHFunclets
bool hasEHFunclets() const
Definition: MachineFunction.h:1116
llvm::MachineMemOperand::MOStore
@ MOStore
The memory access writes data.
Definition: MachineMemOperand.h:136
llvm::AMDGPU::Hwreg::Width
Width
Definition: SIDefines.h:439
WinEHFuncInfo.h
llvm::AArch64FrameLowering::emitPrologue
void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override
emitProlog/emitEpilog - These methods insert prolog and epilog code into the function.
Definition: AArch64FrameLowering.cpp:1367
llvm::RISCVMatInt::Imm
@ Imm
Definition: RISCVMatInt.h:23
llvm::TargetRegisterInfo::hasStackRealignment
bool hasStackRealignment(const MachineFunction &MF) const
True if stack realignment is required and still possible.
Definition: TargetRegisterInfo.h:975
llvm::AArch64Subtarget::isTargetMachO
bool isTargetMachO() const
Definition: AArch64Subtarget.h:271
llvm::AArch64FunctionInfo::getArgumentStackToRestore
unsigned getArgumentStackToRestore() const
Definition: AArch64MachineFunctionInfo.h:214
llvm::CodeModel::Large
@ Large
Definition: CodeGen.h:31
llvm::RegState::Kill
@ Kill
The last use of a register.
Definition: MachineInstrBuilder.h:48
llvm::StackOffset::getScalable
int64_t getScalable() const
Returns the scalable component of the stack.
Definition: TypeSize.h:55
llvm::getKillRegState
unsigned getKillRegState(bool B)
Definition: MachineInstrBuilder.h:531
llvm::AArch64TargetLowering::getRedZoneSize
unsigned getRedZoneSize(const Function &F) const
Definition: AArch64ISelLowering.h:894
llvm::MachineFrameInfo
The MachineFrameInfo class represents an abstract stack frame until prolog/epilog code is inserted.
Definition: MachineFrameInfo.h:106
AArch64Subtarget.h
llvm::BuildMI
MachineInstrBuilder BuildMI(MachineFunction &MF, const MIMetadata &MIMD, const MCInstrDesc &MCID)
Builder interface. Specify how to create the initial instruction itself.
Definition: MachineInstrBuilder.h:357
SmallVector.h
estimateRSStackSizeLimit
static unsigned estimateRSStackSizeLimit(MachineFunction &MF)
Look at each instruction that references stack frames and return the stack size limit beyond which so...
Definition: AArch64FrameLowering.cpp:351
llvm::MachinePointerInfo::getFixedStack
static MachinePointerInfo getFixedStack(MachineFunction &MF, int FI, int64_t Offset=0)
Return a MachinePointerInfo record that refers to the specified FrameIndex.
Definition: MachineOperand.cpp:1044
llvm::MachineBasicBlock::begin
iterator begin()
Definition: MachineBasicBlock.h:309
MachineInstrBuilder.h
llvm::AArch64FrameLowering::emitEpilogue
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override
Definition: AArch64FrameLowering.cpp:1918
llvm::MachineInstrBuilder::setMIFlags
const MachineInstrBuilder & setMIFlags(unsigned Flags) const
Definition: MachineInstrBuilder.h:273
llvm::AArch64FunctionInfo::hasSwiftAsyncContext
bool hasSwiftAsyncContext() const
Definition: AArch64MachineFunctionInfo.h:440
getFPOffset
static StackOffset getFPOffset(const MachineFunction &MF, int64_t ObjectOffset)
Definition: AArch64FrameLowering.cpp:2252
llvm::AArch64FrameLowering::canUseAsPrologue
bool canUseAsPrologue(const MachineBasicBlock &MBB) const override
Check whether or not the given MBB can be used as a prologue for the target.
Definition: AArch64FrameLowering.cpp:851
llvm::ArrayRef::size
size_t size() const
size - Get the array size.
Definition: ArrayRef.h:163
llvm::AArch64FrameLowering::getWinEHFuncletFrameSize
unsigned getWinEHFuncletFrameSize(const MachineFunction &MF) const
Funclets only need to account for space for the callee saved registers, as the locals are accounted f...
Definition: AArch64FrameLowering.cpp:3794
llvm::MachineBasicBlock::empty
bool empty() const
Definition: MachineBasicBlock.h:281
llvm::CallingConv::GHC
@ GHC
Used by the Glasgow Haskell Compiler (GHC).
Definition: CallingConv.h:50
ScopeExit.h
llvm::AArch64FunctionInfo::getCalleeSavedStackSize
unsigned getCalleeSavedStackSize(const MachineFrameInfo &MFI) const
Definition: AArch64MachineFunctionInfo.h:266
MachineMemOperand.h
llvm::SmallVectorImpl
This class consists of common code factored out of the SmallVector class to reduce code duplication b...
Definition: APFloat.h:42
llvm::reverse
auto reverse(ContainerTy &&C)
Definition: STLExtras.h:484
MachineOperand.h
llvm::TargetFrameLowering::StackGrowsDown
@ StackGrowsDown
Definition: TargetFrameLowering.h:47
llvm::Function::hasMinSize
bool hasMinSize() const
Optimize this function for minimum size (-Oz).
Definition: Function.h:641
llvm::MCCFIInstruction::createOffset
static MCCFIInstruction createOffset(MCSymbol *L, unsigned Register, int Offset)
.cfi_offset Previous value of Register is saved at offset Offset from CFA.
Definition: MCDwarf.h:571
llvm::AArch64FrameLowering::determineCalleeSaves
void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS) const override
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
Definition: AArch64FrameLowering.cpp:2967
llvm::AArch64FrameLowering::restoreCalleeSavedRegisters
bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, MutableArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
restoreCalleeSavedRegisters - Issues instruction(s) to restore all callee saved registers and returns...
Definition: AArch64FrameLowering.cpp:2838
llvm::MachineFrameInfo::getStackProtectorIndex
int getStackProtectorIndex() const
Return the index for the stack protector object.
Definition: MachineFrameInfo.h:358
llvm::TargetMachine::getCodeModel
CodeModel::Model getCodeModel() const
Returns the code model.
Definition: TargetMachine.h:233
llvm::TargetFrameLowering::determineCalleeSaves
virtual void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS=nullptr) const
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
Definition: TargetFrameLoweringImpl.cpp:83
llvm::MachineFrameInfo::hasStackMap
bool hasStackMap() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:383
llvm::DebugLoc
A debug info location.
Definition: DebugLoc.h:33
llvm::cl::desc
Definition: CommandLine.h:411
llvm::RegState::Dead
@ Dead
Unused definition.
Definition: MachineInstrBuilder.h:50
llvm::TargetRegisterInfo::getMinimalPhysRegClass
const TargetRegisterClass * getMinimalPhysRegClass(MCRegister Reg, MVT VT=MVT::Other) const
Returns the Register Class of a physical register of the given type, picking the most sub register cl...
Definition: TargetRegisterInfo.cpp:212
RegisterScavenging.h
llvm::AArch64Subtarget
Definition: AArch64Subtarget.h:38
raw_ostream.h
MachineFunction.h
llvm::printReg
Printable printReg(Register Reg, const TargetRegisterInfo *TRI=nullptr, unsigned SubIdx=0, const MachineRegisterInfo *MRI=nullptr)
Prints virtual and physical registers with or without a TRI instance.
Definition: TargetRegisterInfo.cpp:111
llvm::StackOffset::getFixed
int64_t getFixed() const
Returns the fixed component of the stack.
Definition: TypeSize.h:52
llvm::AArch64FrameLowering::orderFrameObjects
void orderFrameObjects(const MachineFunction &MF, SmallVectorImpl< int > &ObjectsToAllocate) const override
Order the symbols in the local stack frame.
Definition: AArch64FrameLowering.cpp:3868
llvm::AArch64RegisterInfo::hasBasePointer
bool hasBasePointer(const MachineFunction &MF) const
Definition: AArch64RegisterInfo.cpp:496
llvm::createCFAOffset
MCCFIInstruction createCFAOffset(const TargetRegisterInfo &MRI, unsigned Reg, const StackOffset &OffsetFromDefCFA)
Definition: AArch64InstrInfo.cpp:4239
llvm::MachineInstr::eraseFromParent
void eraseFromParent()
Unlink 'this' from the containing basic block and delete it.
Definition: MachineInstr.cpp:705
llvm::MachineInstrBundleIterator< MachineInstr >
llvm::MCAsmInfo::usesWindowsCFI
bool usesWindowsCFI() const
Definition: MCAsmInfo.h:797