1 //===- AArch64FrameLowering.cpp - AArch64 Frame Lowering -------*- C++ -*-====//
2 //
3 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4 // See https://llvm.org/LICENSE.txt for license information.
5 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // This file contains the AArch64 implementation of TargetFrameLowering class.
10 //
11 // On AArch64, stack frames are structured as follows:
12 //
13 // The stack grows downward.
14 //
15 // All of the individual frame areas on the frame below are optional, i.e. it's
16 // possible to create a function so that the particular area isn't present
17 // in the frame.
18 //
19 // At function entry, the "frame" looks as follows:
20 //
21 // | | Higher address
22 // |-----------------------------------|
23 // | |
24 // | arguments passed on the stack |
25 // | |
26 // |-----------------------------------| <- sp
27 // | | Lower address
28 //
29 //
30 // After the prologue has run, the frame has the following general structure.
31 // Note that this doesn't depict the case where a red-zone is used. Also,
32 // technically the last frame area (VLAs) doesn't get created until the
33 // main function body runs, after the prologue. However, it's depicted here
34 // for completeness.
35 //
36 // | | Higher address
37 // |-----------------------------------|
38 // | |
39 // | arguments passed on the stack |
40 // | |
41 // |-----------------------------------|
42 // | |
43 // | (Win64 only) varargs from reg |
44 // | |
45 // |-----------------------------------|
46 // | |
47 // | callee-saved gpr registers | <--.
48 // | | | On Darwin platforms these
49 // |- - - - - - - - - - - - - - - - - -| | callee saves are swapped,
50 // | prev_lr | | (frame record first)
51 // | prev_fp | <--'
52 // | async context if needed |
53 // | (a.k.a. "frame record") |
54 // |-----------------------------------| <- fp(=x29)
55 // | |
56 // | callee-saved fp/simd/SVE regs |
57 // | |
58 // |-----------------------------------|
59 // | |
60 // | SVE stack objects |
61 // | |
62 // |-----------------------------------|
63 // |.empty.space.to.make.part.below....|
64 // |.aligned.in.case.it.needs.more.than| (size of this area is unknown at
65 // |.the.standard.16-byte.alignment....| compile time; if present)
66 // |-----------------------------------|
67 // | |
68 // | local variables of fixed size |
69 // | including spill slots |
70 // |-----------------------------------| <- bp(not defined by ABI,
71 // |.variable-sized.local.variables....| LLVM chooses X19)
72 // |.(VLAs)............................| (size of this area is unknown at
73 // |...................................| compile time)
74 // |-----------------------------------| <- sp
75 // | | Lower address
76 //
77 //
78 // To access data in a frame, a constant offset from one of the pointers
79 // (fp, bp, sp) must be computable at compile time. The size
80 // of the areas with a dotted background cannot be computed at compile time
81 // when those areas are present, so all three of fp, bp and
82 // sp must be set up in order to reach every part of the frame,
83 // assuming all of the frame areas are non-empty.
84 //
85 // For most functions, some of the frame areas are empty. For those functions,
86 // it may not be necessary to set up fp or bp:
87 // * A base pointer is definitely needed when there are both VLAs and local
88 // variables with more-than-default alignment requirements.
89 // * A frame pointer is definitely needed when there are local variables with
90 // more-than-default alignment requirements.
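// For instance (an illustrative example, not taken from this file):
//   void f(int n) {
//     char vla[n];                 // VLA: sp moves dynamically
//     _Alignas(32) char buf[64];   // over-aligned local: realignment gap
//   }
// Here both conditions above hold, so fp, bp and sp must all be set up.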
91 //
92 // For Darwin platforms the frame-record (fp, lr) is stored at the top of the
93 // callee-saved area, since the unwind encoding does not allow for encoding
94 // this dynamically and existing tools depend on this layout. For other
95 // platforms, the frame-record is stored at the bottom of the (gpr) callee-saved
96 // area to allow SVE stack objects (allocated directly below the callee-saves,
97 // if available) to be accessed directly from the framepointer.
98 // The SVE spill/fill instructions have VL-scaled addressing modes such
99 // as:
100 // ldr z8, [fp, #-7 mul vl]
101 // For SVE the size of the vector length (VL) is not known at compile-time, so
102 // '#-7 mul vl' is an offset that can only be evaluated at runtime. With this
103 // layout, we don't need to add an unscaled offset to the framepointer before
104 // accessing the SVE object in the frame.
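// For example (illustrative), reloading the first SVE callee-save is a
// single instruction with this layout:
//   ldr z8, [fp, #-1 mul vl]
// while a layout with the frame record above the callee-saves would first
// need an unscaled 'sub' from fp before the VL-scaled access.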
105 //
106 // In some cases when a base pointer is not strictly needed, it is generated
107 // anyway when offsets from the frame pointer to access local variables become
108 // so large that the offset can't be encoded in the immediate fields of loads
109 // or stores.
110 //
111 // Outgoing function arguments must be at the bottom of the stack frame when
112 // calling another function. If we do not have variable-sized stack objects, we
113 // can allocate a "reserved call frame" area at the bottom of the local
114 // variable area, large enough for all outgoing calls. If we do have VLAs, then
115 // the stack pointer must be decremented and incremented around each call to
116 // make space for the arguments below the VLAs.
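// For example (illustrative): if the largest call site needs 32 bytes of
// outgoing arguments, the prologue grows the frame by those 32 bytes once,
// instead of bracketing every call with
//   sub sp, sp, #32 ... add sp, sp, #32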
117 //
118 // FIXME: also explain the redzone concept.
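// (In brief: when a leaf function's locals fit in the area immediately below
// sp (128 bytes by default, see canUseRedZone()), the prologue can skip
// adjusting sp altogether and use that area directly.)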
119 //
120 // An example of the prologue:
121 //
122 // .globl __foo
123 // .align 2
124 // __foo:
125 // Ltmp0:
126 // .cfi_startproc
127 // .cfi_personality 155, ___gxx_personality_v0
128 // Leh_func_begin:
129 // .cfi_lsda 16, Lexception33
130 //
131 // stp xa,bx, [sp, -#offset]!
132 // ...
133 // stp x28, x27, [sp, #offset-32]
134 // stp fp, lr, [sp, #offset-16]
135 // add fp, sp, #offset - 16
136 // sub sp, sp, #1360
137 //
138 // The Stack:
139 // +-------------------------------------------+
140 // 10000 | ........ | ........ | ........ | ........ |
141 // 10004 | ........ | ........ | ........ | ........ |
142 // +-------------------------------------------+
143 // 10008 | ........ | ........ | ........ | ........ |
144 // 1000c | ........ | ........ | ........ | ........ |
145 // +===========================================+
146 // 10010 | X28 Register |
147 // 10014 | X28 Register |
148 // +-------------------------------------------+
149 // 10018 | X27 Register |
150 // 1001c | X27 Register |
151 // +===========================================+
152 // 10020 | Frame Pointer |
153 // 10024 | Frame Pointer |
154 // +-------------------------------------------+
155 // 10028 | Link Register |
156 // 1002c | Link Register |
157 // +===========================================+
158 // 10030 | ........ | ........ | ........ | ........ |
159 // 10034 | ........ | ........ | ........ | ........ |
160 // +-------------------------------------------+
161 // 10038 | ........ | ........ | ........ | ........ |
162 // 1003c | ........ | ........ | ........ | ........ |
163 // +-------------------------------------------+
164 //
165 // [sp] = 10030 :: >>initial value<<
166 // sp = 10020 :: stp fp, lr, [sp, #-16]!
167 // fp = sp == 10020 :: mov fp, sp
168 // [sp] == 10020 :: stp x28, x27, [sp, #-16]!
169 // sp == 10010 :: >>final value<<
170 //
171 // The frame pointer (w29) points to address 10020. If we use an offset of
172 // '16' from 'w29', we get the CFI offsets of -8 for w30, -16 for w29, -24
173 // for w27, and -32 for w28:
174 //
175 // Ltmp1:
176 // .cfi_def_cfa w29, 16
177 // Ltmp2:
178 // .cfi_offset w30, -8
179 // Ltmp3:
180 // .cfi_offset w29, -16
181 // Ltmp4:
182 // .cfi_offset w27, -24
183 // Ltmp5:
184 // .cfi_offset w28, -32
185 //
186 //===----------------------------------------------------------------------===//
187 
188 #include "AArch64FrameLowering.h"
189 #include "AArch64InstrInfo.h"
190 #include "AArch64MachineFunctionInfo.h"
191 #include "AArch64RegisterInfo.h"
192 #include "AArch64Subtarget.h"
193 #include "AArch64TargetMachine.h"
196 #include "llvm/ADT/ScopeExit.h"
197 #include "llvm/ADT/SmallVector.h"
198 #include "llvm/ADT/Statistic.h"
214 #include "llvm/IR/Attributes.h"
215 #include "llvm/IR/CallingConv.h"
216 #include "llvm/IR/DataLayout.h"
217 #include "llvm/IR/DebugLoc.h"
218 #include "llvm/IR/Function.h"
219 #include "llvm/MC/MCAsmInfo.h"
220 #include "llvm/MC/MCDwarf.h"
222 #include "llvm/Support/Debug.h"
224 #include "llvm/Support/MathExtras.h"
228 #include <cassert>
229 #include <cstdint>
230 #include <iterator>
231 #include <vector>
232 
233 using namespace llvm;
234 
235 #define DEBUG_TYPE "frame-info"
236 
237 static cl::opt<bool> EnableRedZone("aarch64-redzone",
238  cl::desc("enable use of redzone on AArch64"),
239  cl::init(false), cl::Hidden);
240 
241 static cl::opt<bool>
242  ReverseCSRRestoreSeq("reverse-csr-restore-seq",
243  cl::desc("reverse the CSR restore sequence"),
244  cl::init(false), cl::Hidden);
245 
246 static cl::opt<bool> StackTaggingMergeSetTag(
247  "stack-tagging-merge-settag",
248  cl::desc("merge settag instruction in function epilog"), cl::init(true),
249  cl::Hidden);
250 
251 static cl::opt<bool> OrderFrameObjects("aarch64-order-frame-objects",
252  cl::desc("sort stack allocations"),
253  cl::init(true), cl::Hidden);
254 
255 static cl::opt<bool> EnableHomogeneousPrologEpilog(
256  "homogeneous-prolog-epilog", cl::Hidden,
257  cl::desc("Emit homogeneous prologue and epilogue for the size "
258  "optimization (default = off)"));
259 
260 STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");
261 
262 /// Returns how much of the incoming argument stack area (in bytes) we should
263 /// clean up in an epilogue. For the C calling convention this will be 0, for
264 /// guaranteed tail call conventions it can be positive (a normal return or a
265 /// tail call to a function that uses less stack space for arguments) or
266 /// negative (for a tail call to a function that needs more stack space than us
267 /// for arguments).
268 static int64_t getArgumentStackToRestore(MachineFunction &MF,
269  MachineBasicBlock &MBB) {
270  MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
271  bool IsTailCallReturn = false;
272  if (MBB.end() != MBBI) {
273  unsigned RetOpcode = MBBI->getOpcode();
274  IsTailCallReturn = RetOpcode == AArch64::TCRETURNdi ||
275  RetOpcode == AArch64::TCRETURNri ||
276  RetOpcode == AArch64::TCRETURNriBTI;
277  }
278  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
279 
280  int64_t ArgumentPopSize = 0;
281  if (IsTailCallReturn) {
282  MachineOperand &StackAdjust = MBBI->getOperand(1);
283 
284  // For a tail-call in a callee-pops-arguments environment, some or all of
285  // the stack may actually be in use for the call's arguments, this is
286  // calculated during LowerCall and consumed here...
287  ArgumentPopSize = StackAdjust.getImm();
288  } else {
289  // ... otherwise the amount to pop is *all* of the argument space,
290  // conveniently stored in the MachineFunctionInfo by
291  // LowerFormalArguments. This will, of course, be zero for the C calling
292  // convention.
293  ArgumentPopSize = AFI->getArgumentStackToRestore();
294  }
295 
296  return ArgumentPopSize;
297 }
298 
299 static bool produceCompactUnwindFrame(MachineFunction &MF);
300 static bool needsWinCFI(const MachineFunction &MF);
301 static StackOffset getSVEStackSize(const MachineFunction &MF);
302 static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF);
303 
304 /// Returns true if a homogeneous prolog or epilog code can be emitted
305 /// for the size optimization. If possible, a frame helper call is injected.
306 /// When Exit block is given, this check is for epilog.
307 bool AArch64FrameLowering::homogeneousPrologEpilog(
308  MachineFunction &MF, MachineBasicBlock *Exit) const {
309  if (!MF.getFunction().hasMinSize())
310  return false;
311  if (!EnableHomogeneousPrologEpilog)
312  return false;
313  if (ReverseCSRRestoreSeq)
314  return false;
315  if (EnableRedZone)
316  return false;
317 
318  // TODO: Windows is not supported yet.
319  if (needsWinCFI(MF))
320  return false;
321  // TODO: SVE is not supported yet.
322  if (getSVEStackSize(MF))
323  return false;
324 
325  // Bail on stack adjustment needed on return for simplicity.
326  const MachineFrameInfo &MFI = MF.getFrameInfo();
327  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
328  if (MFI.hasVarSizedObjects() || RegInfo->hasStackRealignment(MF))
329  return false;
330  if (Exit && getArgumentStackToRestore(MF, *Exit))
331  return false;
332 
333  return true;
334 }
335 
336 /// Returns true if CSRs should be paired.
337 bool AArch64FrameLowering::producePairRegisters(MachineFunction &MF) const {
338  return produceCompactUnwindFrame(MF) || homogeneousPrologEpilog(MF);
339 }
340 
341 /// This is the biggest offset to the stack pointer we can encode in aarch64
342 /// instructions (without using a separate calculation and a temp register).
343 /// Note that the exceptions here are vector stores/loads which cannot encode any
344 /// displacements (see estimateRSStackSizeLimit(), isAArch64FrameOffsetLegal()).
345 static const unsigned DefaultSafeSPDisplacement = 255;
346 
347 /// Look at each instruction that references stack frames and return the stack
348 /// size limit beyond which some of these instructions will require a scratch
349 /// register during their expansion later.
350 static unsigned estimateRSStackSizeLimit(MachineFunction &MF) {
351  // FIXME: For now, just conservatively guestimate based on unscaled indexing
352  // range. We'll end up allocating an unnecessary spill slot a lot, but
353  // realistically that's not a big deal at this stage of the game.
354  for (MachineBasicBlock &MBB : MF) {
355  for (MachineInstr &MI : MBB) {
356  if (MI.isDebugInstr() || MI.isPseudo() ||
357  MI.getOpcode() == AArch64::ADDXri ||
358  MI.getOpcode() == AArch64::ADDSXri)
359  continue;
360 
361  for (const MachineOperand &MO : MI.operands()) {
362  if (!MO.isFI())
363  continue;
364 
365  StackOffset Offset;
366  if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==
367  AArch64FrameOffsetCannotUpdate)
368  return 0;
369  }
370  }
371  }
372  return DefaultSafeSPDisplacement;
373 }
374 
375 TargetStackID::Value
376 AArch64FrameLowering::getStackIDForScalableVectors() const {
377  return TargetStackID::ScalableVector;
378 }
379 
380 /// Returns the size of the fixed object area (allocated next to sp on entry)
381 /// On Win64 this may include a var args area and an UnwindHelp object for EH.
382 static unsigned getFixedObjectSize(const MachineFunction &MF,
383  const AArch64FunctionInfo *AFI, bool IsWin64,
384  bool IsFunclet) {
385  if (!IsWin64 || IsFunclet) {
386  return AFI->getTailCallReservedStack();
387  } else {
388  if (AFI->getTailCallReservedStack() != 0)
389  report_fatal_error("cannot generate ABI-changing tail call for Win64");
390  // Var args are stored here in the primary function.
391  const unsigned VarArgsArea = AFI->getVarArgsGPRSize();
392  // To support EH funclets we allocate an UnwindHelp object
393  const unsigned UnwindHelpObject = (MF.hasEHFunclets() ? 8 : 0);
394  return alignTo(VarArgsArea + UnwindHelpObject, 16);
395  }
396 }
397 
398 /// Returns the size of the entire SVE stackframe (calleesaves + spills).
399 static StackOffset getSVEStackSize(const MachineFunction &MF) {
400  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
401  return StackOffset::getScalable((int64_t)AFI->getStackSizeSVE());
402 }
403 
404 bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
405  if (!EnableRedZone)
406  return false;
407 
408  // Don't use the red zone if the function explicitly asks us not to.
409  // This is typically used for kernel code.
410  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
411  const unsigned RedZoneSize =
412  Subtarget.getTargetLowering()->getRedZoneSize(MF.getFunction());
413  if (!RedZoneSize)
414  return false;
415 
416  const MachineFrameInfo &MFI = MF.getFrameInfo();
417  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
418  uint64_t NumBytes = AFI->getLocalStackSize();
419 
420  return !(MFI.hasCalls() || hasFP(MF) || NumBytes > RedZoneSize ||
421  getSVEStackSize(MF));
422 }
423 
424 /// hasFP - Return true if the specified function should have a dedicated frame
425 /// pointer register.
426 bool AArch64FrameLowering::hasFP(const MachineFunction &MF) const {
427  const MachineFrameInfo &MFI = MF.getFrameInfo();
428  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
429  // Win64 EH requires a frame pointer if funclets are present, as the locals
430  // are accessed off the frame pointer in both the parent function and the
431  // funclets.
432  if (MF.hasEHFunclets())
433  return true;
434  // Retain behavior of always omitting the FP for leaf functions when possible.
435  if (MF.getTarget().Options.DisableFramePointerElim(MF))
436  return true;
437  if (MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() ||
438  MFI.hasStackMap() || MFI.hasPatchPoint() ||
439  RegInfo->hasStackRealignment(MF))
440  return true;
441  // With large callframes around we may need to use FP to access the scavenging
442  // emergency spillslot.
443  //
444  // Unfortunately some calls to hasFP() like machine verifier ->
445  // getReservedReg() -> hasFP in the middle of global isel are too early
446  // to know the max call frame size. Hopefully conservatively returning "true"
447  // in those cases is fine.
448  // DefaultSafeSPDisplacement is fine as we only emergency spill GP regs.
449  if (!MFI.isMaxCallFrameSizeComputed() ||
450  MFI.getMaxCallFrameSize() > DefaultSafeSPDisplacement)
451  return true;
452 
453  return false;
454 }
455 
456 /// hasReservedCallFrame - Under normal circumstances, when a frame pointer is
457 /// not required, we reserve argument space for call sites in the function
458 /// immediately on entry to the current function. This eliminates the need for
459 /// add/sub sp brackets around call sites. Returns true if the call frame is
460 /// included as part of the stack frame.
461 bool
462 AArch64FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const {
463  return !MF.getFrameInfo().hasVarSizedObjects();
464 }
465 
466 MachineBasicBlock::iterator AArch64FrameLowering::eliminateCallFramePseudoInstr(
467  MachineFunction &MF, MachineBasicBlock &MBB,
468  MachineBasicBlock::iterator I) const {
469  const AArch64InstrInfo *TII =
470  static_cast<const AArch64InstrInfo *>(MF.getSubtarget().getInstrInfo());
471  DebugLoc DL = I->getDebugLoc();
472  unsigned Opc = I->getOpcode();
473  bool IsDestroy = Opc == TII->getCallFrameDestroyOpcode();
474  uint64_t CalleePopAmount = IsDestroy ? I->getOperand(1).getImm() : 0;
475 
476  if (!hasReservedCallFrame(MF)) {
477  int64_t Amount = I->getOperand(0).getImm();
478  Amount = alignTo(Amount, getStackAlign());
479  if (!IsDestroy)
480  Amount = -Amount;
481 
482  // N.b. if CalleePopAmount is valid but zero (i.e. callee would pop, but it
483  // doesn't have to pop anything), then the first operand will be zero too so
484  // this adjustment is a no-op.
485  if (CalleePopAmount == 0) {
486  // FIXME: in-function stack adjustment for calls is limited to 24-bits
487  // because there's no guaranteed temporary register available.
488  //
489  // ADD/SUB (immediate) has only LSL #0 and LSL #12 available.
490  // 1) For offset <= 12-bit, we use LSL #0
491  // 2) For 12-bit <= offset <= 24-bit, we use two instructions. One uses
492  // LSL #0, and the other uses LSL #12.
493  //
494  // Most call frames will be allocated at the start of a function so
495  // this is OK, but it is a limitation that needs dealing with.
496  assert(Amount > -0xffffff && Amount < 0xffffff && "call frame too large");
497  emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
498  StackOffset::getFixed(Amount), TII);
499  }
500  } else if (CalleePopAmount != 0) {
501  // If the calling convention demands that the callee pops arguments from the
502  // stack, we want to add it back if we have a reserved call frame.
503  assert(CalleePopAmount < 0xffffff && "call frame too large");
504  emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
505  StackOffset::getFixed(-(int64_t)CalleePopAmount), TII);
506  }
507  return MBB.erase(I);
508 }
509 
510 void AArch64FrameLowering::emitCalleeSavedGPRLocations(
511  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
512  MachineFunction &MF = *MBB.getParent();
513  MachineFrameInfo &MFI = MF.getFrameInfo();
514 
515  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
516  if (CSI.empty())
517  return;
518 
519  const TargetSubtargetInfo &STI = MF.getSubtarget();
520  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
521  const TargetInstrInfo &TII = *STI.getInstrInfo();
522  DebugLoc DL = MBB.findDebugLoc(MBBI);
523 
524  for (const auto &Info : CSI) {
525  if (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector)
526  continue;
527 
528  assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
529  unsigned DwarfReg = TRI.getDwarfRegNum(Info.getReg(), true);
530 
531  int64_t Offset =
532  MFI.getObjectOffset(Info.getFrameIdx()) - getOffsetOfLocalArea();
533  unsigned CFIIndex = MF.addFrameInst(
534  MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
535  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
536  .addCFIIndex(CFIIndex)
537  .setMIFlags(MachineInstr::FrameSetup);
538  }
539 }
540 
541 void AArch64FrameLowering::emitCalleeSavedSVELocations(
542  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
543  MachineFunction &MF = *MBB.getParent();
544  MachineFrameInfo &MFI = MF.getFrameInfo();
545 
546  // Add callee saved registers to move list.
547  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
548  if (CSI.empty())
549  return;
550 
551  const TargetSubtargetInfo &STI = MF.getSubtarget();
552  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
553  const TargetInstrInfo &TII = *STI.getInstrInfo();
554  AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
555  DebugLoc DL = MBB.findDebugLoc(MBBI);
556 
557  for (const auto &Info : CSI) {
558  if (!(MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
559  continue;
560 
561  // Not all unwinders may know about SVE registers, so assume the lowest
562  // common denominator.
563  assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
564  unsigned Reg = Info.getReg();
565  if (!static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
566  continue;
567 
568  StackOffset Offset =
569  StackOffset::getScalable(MFI.getObjectOffset(Info.getFrameIdx())) -
570  StackOffset::getFixed(AFI.getCalleeSavedStackSize(MFI));
571 
572  unsigned CFIIndex = MF.addFrameInst(createCFAOffset(TRI, Reg, Offset));
573  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
574  .addCFIIndex(CFIIndex)
575  .setMIFlags(MachineInstr::FrameSetup);
576  }
577 }
578 
579 void AArch64FrameLowering::emitCalleeSavedFrameMoves(
580  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
581  emitCalleeSavedGPRLocations(MBB, MBBI);
582  emitCalleeSavedSVELocations(MBB, MBBI);
583 }
584 
585 static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF,
586  MachineBasicBlock &MBB,
587  MachineBasicBlock::iterator InsertPt,
588  unsigned DwarfReg) {
589  unsigned CFIIndex =
590  MF.addFrameInst(MCCFIInstruction::createSameValue(nullptr, DwarfReg));
591  BuildMI(MBB, InsertPt, DebugLoc(), Desc).addCFIIndex(CFIIndex);
592 }
593 
594 void AArch64FrameLowering::resetCFIToInitialState(
595  MachineBasicBlock &MBB) const {
596 
597  MachineFunction &MF = *MBB.getParent();
598  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
599  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
600  const auto &TRI =
601  static_cast<const AArch64RegisterInfo &>(*Subtarget.getRegisterInfo());
602  const auto &MFI = *MF.getInfo<AArch64FunctionInfo>();
603 
604  const MCInstrDesc &CFIDesc = TII.get(TargetOpcode::CFI_INSTRUCTION);
605  DebugLoc DL;
606 
607  // Reset the CFA to `SP + 0`.
608  MachineBasicBlock::iterator InsertPt = MBB.begin();
609  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
610  nullptr, TRI.getDwarfRegNum(AArch64::SP, true), 0));
611  BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
612 
613  // Flip the RA sign state.
614  if (MFI.shouldSignReturnAddress()) {
615  CFIIndex = MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
616  BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
617  }
618 
619  // Shadow call stack uses X18, reset it.
620  if (needsShadowCallStackPrologueEpilogue(MF))
621  insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
622  TRI.getDwarfRegNum(AArch64::X18, true));
623 
624  // Emit .cfi_same_value for callee-saved registers.
625  const std::vector<CalleeSavedInfo> &CSI =
626  MF.getFrameInfo().getCalleeSavedInfo();
627  for (const auto &Info : CSI) {
628  unsigned Reg = Info.getReg();
629  if (!TRI.regNeedsCFI(Reg, Reg))
630  continue;
631  insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
632  TRI.getDwarfRegNum(Reg, true));
633  }
634 }
635 
636 static void emitCalleeSavedRestores(MachineBasicBlock &MBB,
637  MachineBasicBlock::iterator MBBI,
638  bool SVE) {
639  MachineFunction &MF = *MBB.getParent();
640  MachineFrameInfo &MFI = MF.getFrameInfo();
641 
642  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
643  if (CSI.empty())
644  return;
645 
646  const TargetSubtargetInfo &STI = MF.getSubtarget();
647  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
648  const TargetInstrInfo &TII = *STI.getInstrInfo();
649  DebugLoc DL = MBB.findDebugLoc(MBBI);
650 
651  for (const auto &Info : CSI) {
652  if (SVE !=
653  (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
654  continue;
655 
656  unsigned Reg = Info.getReg();
657  if (SVE &&
658  !static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
659  continue;
660 
661  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
662  nullptr, TRI.getDwarfRegNum(Info.getReg(), true)));
663  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
664  .addCFIIndex(CFIIndex)
665  .setMIFlags(MachineInstr::FrameDestroy);
666  }
667 }
668 
669 void AArch64FrameLowering::emitCalleeSavedGPRRestores(
670  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
671  emitCalleeSavedRestores(MBB, MBBI, false);
672 }
673 
674 void AArch64FrameLowering::emitCalleeSavedSVERestores(
675  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
676  emitCalleeSavedRestores(MBB, MBBI, true);
677 }
678 
679 static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE) {
680  switch (Reg.id()) {
681  default:
682  // The called routine is expected to preserve r19-r28
683  // r29 and r30 are used as frame pointer and link register resp.
684  return 0;
685 
686  // GPRs
687 #define CASE(n) \
688  case AArch64::W##n: \
689  case AArch64::X##n: \
690  return AArch64::X##n
691  CASE(0);
692  CASE(1);
693  CASE(2);
694  CASE(3);
695  CASE(4);
696  CASE(5);
697  CASE(6);
698  CASE(7);
699  CASE(8);
700  CASE(9);
701  CASE(10);
702  CASE(11);
703  CASE(12);
704  CASE(13);
705  CASE(14);
706  CASE(15);
707  CASE(16);
708  CASE(17);
709  CASE(18);
710 #undef CASE
711 
712  // FPRs
713 #define CASE(n) \
714  case AArch64::B##n: \
715  case AArch64::H##n: \
716  case AArch64::S##n: \
717  case AArch64::D##n: \
718  case AArch64::Q##n: \
719  return HasSVE ? AArch64::Z##n : AArch64::Q##n
720  CASE(0);
721  CASE(1);
722  CASE(2);
723  CASE(3);
724  CASE(4);
725  CASE(5);
726  CASE(6);
727  CASE(7);
728  CASE(8);
729  CASE(9);
730  CASE(10);
731  CASE(11);
732  CASE(12);
733  CASE(13);
734  CASE(14);
735  CASE(15);
736  CASE(16);
737  CASE(17);
738  CASE(18);
739  CASE(19);
740  CASE(20);
741  CASE(21);
742  CASE(22);
743  CASE(23);
744  CASE(24);
745  CASE(25);
746  CASE(26);
747  CASE(27);
748  CASE(28);
749  CASE(29);
750  CASE(30);
751  CASE(31);
752 #undef CASE
753  }
754 }
755 
756 void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero,
757  MachineBasicBlock &MBB) const {
758  // Insertion point.
759  MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator();
760 
761  // Fake a debug loc.
762  DebugLoc DL;
763  if (MBBI != MBB.end())
764  DL = MBBI->getDebugLoc();
765 
766  const MachineFunction &MF = *MBB.getParent();
767  const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
768  const AArch64RegisterInfo &TRI = *STI.getRegisterInfo();
769 
770  BitVector GPRsToZero(TRI.getNumRegs());
771  BitVector FPRsToZero(TRI.getNumRegs());
772  bool HasSVE = STI.hasSVE();
773  for (MCRegister Reg : RegsToZero.set_bits()) {
774  if (TRI.isGeneralPurposeRegister(MF, Reg)) {
775  // For GPRs, we only care to clear out the 64-bit register.
776  if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
777  GPRsToZero.set(XReg);
778  } else if (AArch64::FPR128RegClass.contains(Reg) ||
779  AArch64::FPR64RegClass.contains(Reg) ||
780  AArch64::FPR32RegClass.contains(Reg) ||
781  AArch64::FPR16RegClass.contains(Reg) ||
782  AArch64::FPR8RegClass.contains(Reg)) {
783  // For FPRs,
784  if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
785  FPRsToZero.set(XReg);
786  }
787  }
788 
789  const AArch64InstrInfo &TII = *STI.getInstrInfo();
790 
791  // Zero out GPRs.
792  for (MCRegister Reg : GPRsToZero.set_bits())
793  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), Reg).addImm(0);
794 
795  // Zero out FP/vector registers.
796  for (MCRegister Reg : FPRsToZero.set_bits())
797  if (HasSVE)
798  BuildMI(MBB, MBBI, DL, TII.get(AArch64::DUP_ZI_D), Reg)
799  .addImm(0)
800  .addImm(0);
801  else
802  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVIv2d_ns), Reg).addImm(0);
803 
804  if (HasSVE) {
805  for (MCRegister PReg :
806  {AArch64::P0, AArch64::P1, AArch64::P2, AArch64::P3, AArch64::P4,
807  AArch64::P5, AArch64::P6, AArch64::P7, AArch64::P8, AArch64::P9,
808  AArch64::P10, AArch64::P11, AArch64::P12, AArch64::P13, AArch64::P14,
809  AArch64::P15}) {
810  if (RegsToZero[PReg])
811  BuildMI(MBB, MBBI, DL, TII.get(AArch64::PFALSE), PReg);
812  }
813  }
814 }
815 
816 // Find a scratch register that we can use at the start of the prologue to
817 // re-align the stack pointer. We avoid using callee-save registers since they
818 // may appear to be free when this is called from canUseAsPrologue (during
819 // shrink wrapping), but then no longer be free when this is called from
820 // emitPrologue.
821 //
822 // FIXME: This is a bit conservative, since in the above case we could use one
823 // of the callee-save registers as a scratch temp to re-align the stack pointer,
824 // but we would then have to make sure that we were in fact saving at least one
825 // callee-save register in the prologue, which is additional complexity that
826 // doesn't seem worth the benefit.
827 static unsigned findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) {
828  MachineFunction *MF = MBB->getParent();
829 
830  // If MBB is an entry block, use X9 as the scratch register
831  if (&MF->front() == MBB)
832  return AArch64::X9;
833 
834  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
835  const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo();
836  LivePhysRegs LiveRegs(TRI);
837  LiveRegs.addLiveIns(*MBB);
838 
839  // Mark callee saved registers as used so we will not choose them.
840  const MCPhysReg *CSRegs = MF->getRegInfo().getCalleeSavedRegs();
841  for (unsigned i = 0; CSRegs[i]; ++i)
842  LiveRegs.addReg(CSRegs[i]);
843 
844  // Prefer X9 since it was historically used for the prologue scratch reg.
845  const MachineRegisterInfo &MRI = MF->getRegInfo();
846  if (LiveRegs.available(MRI, AArch64::X9))
847  return AArch64::X9;
848 
849  for (unsigned Reg : AArch64::GPR64RegClass) {
850  if (LiveRegs.available(MRI, Reg))
851  return Reg;
852  }
853  return AArch64::NoRegister;
854 }
855 
856 bool AArch64FrameLowering::canUseAsPrologue(
857  const MachineBasicBlock &MBB) const {
858  const MachineFunction *MF = MBB.getParent();
859  MachineBasicBlock *TmpMBB = const_cast<MachineBasicBlock *>(&MBB);
860  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
861  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
862 
863  // Don't need a scratch register if we're not going to re-align the stack.
864  if (!RegInfo->hasStackRealignment(*MF))
865  return true;
866  // Otherwise, we can use any block as long as it has a scratch register
867  // available.
868  return findScratchNonCalleeSaveRegister(TmpMBB) != AArch64::NoRegister;
869 }
870 
871 static bool windowsRequiresStackProbe(MachineFunction &MF,
872  uint64_t StackSizeInBytes) {
873  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
874  if (!Subtarget.isTargetWindows())
875  return false;
876  const Function &F = MF.getFunction();
877  // TODO: When implementing stack protectors, take that into account
878  // for the probe threshold.
879  unsigned StackProbeSize = 4096;
880  if (F.hasFnAttribute("stack-probe-size"))
881  F.getFnAttribute("stack-probe-size")
882  .getValueAsString()
883  .getAsInteger(0, StackProbeSize);
884  return (StackSizeInBytes >= StackProbeSize) &&
885  !F.hasFnAttribute("no-stack-arg-probe");
886 }
887 
888 static bool needsWinCFI(const MachineFunction &MF) {
889  const Function &F = MF.getFunction();
890  return MF.getTarget().getMCAsmInfo()->usesWindowsCFI() &&
891  F.needsUnwindTableEntry();
892 }
893 
894 bool AArch64FrameLowering::shouldCombineCSRLocalStackBump(
895  MachineFunction &MF, uint64_t StackBumpBytes) const {
896  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
897  const MachineFrameInfo &MFI = MF.getFrameInfo();
898  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
899  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
900  if (homogeneousPrologEpilog(MF))
901  return false;
902 
903  if (AFI->getLocalStackSize() == 0)
904  return false;
905 
906  // For WinCFI, if optimizing for size, prefer to not combine the stack bump
907  // (to force a stp with predecrement) to match the packed unwind format,
908  // provided that there actually are any callee saved registers to merge the
909  // decrement with.
910  // This is potentially marginally slower, but allows using the packed
911  // unwind format for functions that both have a local area and callee saved
912  // registers. Using the packed unwind format notably reduces the size of
913  // the unwind info.
914  if (needsWinCFI(MF) && AFI->getCalleeSavedStackSize() > 0 &&
915  MF.getFunction().hasOptSize())
916  return false;
917 
918  // 512 is the maximum immediate for stp/ldp that will be used for
919  // callee-save save/restores
920  if (StackBumpBytes >= 512 || windowsRequiresStackProbe(MF, StackBumpBytes))
921  return false;
922 
923  if (MFI.hasVarSizedObjects())
924  return false;
925 
926  if (RegInfo->hasStackRealignment(MF))
927  return false;
928 
929  // This isn't strictly necessary, but it simplifies things a bit since the
930  // current RedZone handling code assumes the SP is adjusted by the
931  // callee-save save/restore code.
932  if (canUseRedZone(MF))
933  return false;
934 
935  // When there is an SVE area on the stack, always allocate the
936  // callee-saves and spills/locals separately.
937  if (getSVEStackSize(MF))
938  return false;
939 
940  return true;
941 }
942 
943 bool AArch64FrameLowering::shouldCombineCSRLocalStackBumpInEpilogue(
944  MachineBasicBlock &MBB, unsigned StackBumpBytes) const {
945  if (!shouldCombineCSRLocalStackBump(*MBB.getParent(), StackBumpBytes))
946  return false;
947 
948  if (MBB.empty())
949  return true;
950 
951  // Disable combined SP bump if the last instruction is an MTE tag store. It
952  // is almost always better to merge SP adjustment into those instructions.
953  MachineBasicBlock::iterator LastI = MBB.getFirstTerminator();
954  MachineBasicBlock::iterator Begin = MBB.begin();
955  while (LastI != Begin) {
956  --LastI;
957  if (LastI->isTransient())
958  continue;
959  if (!LastI->getFlag(MachineInstr::FrameDestroy))
960  break;
961  }
962  switch (LastI->getOpcode()) {
963  case AArch64::STGloop:
964  case AArch64::STZGloop:
965  case AArch64::STGOffset:
966  case AArch64::STZGOffset:
967  case AArch64::ST2GOffset:
968  case AArch64::STZ2GOffset:
969  return false;
970  default:
971  return true;
972  }
973  llvm_unreachable("unreachable");
974 }
975 
976 // Given a load or a store instruction, generate an appropriate unwinding SEH
977 // code on Windows.
978 static MachineBasicBlock::iterator InsertSEH(MachineBasicBlock::iterator MBBI,
979  const TargetInstrInfo &TII,
980  MachineInstr::MIFlag Flag) {
981  unsigned Opc = MBBI->getOpcode();
982  MachineBasicBlock *MBB = MBBI->getParent();
983  MachineFunction &MF = *MBB->getParent();
984  DebugLoc DL = MBBI->getDebugLoc();
985  unsigned ImmIdx = MBBI->getNumOperands() - 1;
986  int Imm = MBBI->getOperand(ImmIdx).getImm();
987  MachineInstrBuilder MIB;
988  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
989  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
990 
991  switch (Opc) {
992  default:
993  llvm_unreachable("No SEH Opcode for this instruction");
994  case AArch64::LDPDpost:
995  Imm = -Imm;
996  [[fallthrough]];
997  case AArch64::STPDpre: {
998  unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
999  unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(2).getReg());
1000  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP_X))
1001  .addImm(Reg0)
1002  .addImm(Reg1)
1003  .addImm(Imm * 8)
1004  .setMIFlag(Flag);
1005  break;
1006  }
1007  case AArch64::LDPXpost:
1008  Imm = -Imm;
1009  [[fallthrough]];
1010  case AArch64::STPXpre: {
1011  Register Reg0 = MBBI->getOperand(1).getReg();
1012  Register Reg1 = MBBI->getOperand(2).getReg();
1013  if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
1014  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR_X))
1015  .addImm(Imm * 8)
1016  .setMIFlag(Flag);
1017  else
1018  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP_X))
1019  .addImm(RegInfo->getSEHRegNum(Reg0))
1020  .addImm(RegInfo->getSEHRegNum(Reg1))
1021  .addImm(Imm * 8)
1022  .setMIFlag(Flag);
1023  break;
1024  }
1025  case AArch64::LDRDpost:
1026  Imm = -Imm;
1027  [[fallthrough]];
1028  case AArch64::STRDpre: {
1029  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1030  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg_X))
1031  .addImm(Reg)
1032  .addImm(Imm)
1033  .setMIFlag(Flag);
1034  break;
1035  }
1036  case AArch64::LDRXpost:
1037  Imm = -Imm;
1038  [[fallthrough]];
1039  case AArch64::STRXpre: {
1040  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1041  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg_X))
1042  .addImm(Reg)
1043  .addImm(Imm)
1044  .setMIFlag(Flag);
1045  break;
1046  }
1047  case AArch64::STPDi:
1048  case AArch64::LDPDi: {
1049  unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1050  unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
1051  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP))
1052  .addImm(Reg0)
1053  .addImm(Reg1)
1054  .addImm(Imm * 8)
1055  .setMIFlag(Flag);
1056  break;
1057  }
1058  case AArch64::STPXi:
1059  case AArch64::LDPXi: {
1060  Register Reg0 = MBBI->getOperand(0).getReg();
1061  Register Reg1 = MBBI->getOperand(1).getReg();
1062  if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
1063  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR))
1064  .addImm(Imm * 8)
1065  .setMIFlag(Flag);
1066  else
1067  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP))
1068  .addImm(RegInfo->getSEHRegNum(Reg0))
1069  .addImm(RegInfo->getSEHRegNum(Reg1))
1070  .addImm(Imm * 8)
1071  .setMIFlag(Flag);
1072  break;
1073  }
1074  case AArch64::STRXui:
1075  case AArch64::LDRXui: {
1076  int Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1077  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg))
1078  .addImm(Reg)
1079  .addImm(Imm * 8)
1080  .setMIFlag(Flag);
1081  break;
1082  }
1083  case AArch64::STRDui:
1084  case AArch64::LDRDui: {
1085  unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
1086  MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg))
1087  .addImm(Reg)
1088  .addImm(Imm * 8)
1089  .setMIFlag(Flag);
1090  break;
1091  }
1092  }
1093  auto I = MBB->insertAfter(MBBI, MIB);
1094  return I;
1095 }
1096 
1097 // Fix up the SEH opcode associated with the save/restore instruction.
1098 static void fixupSEHOpcode(MachineBasicBlock::iterator MBBI,
1099  unsigned LocalStackSize) {
1100  MachineOperand *ImmOpnd = nullptr;
1101  unsigned ImmIdx = MBBI->getNumOperands() - 1;
1102  switch (MBBI->getOpcode()) {
1103  default:
1104  llvm_unreachable("Fix the offset in the SEH instruction");
1105  case AArch64::SEH_SaveFPLR:
1106  case AArch64::SEH_SaveRegP:
1107  case AArch64::SEH_SaveReg:
1108  case AArch64::SEH_SaveFRegP:
1109  case AArch64::SEH_SaveFReg:
1110  ImmOpnd = &MBBI->getOperand(ImmIdx);
1111  break;
1112  }
1113  if (ImmOpnd)
1114  ImmOpnd->setImm(ImmOpnd->getImm() + LocalStackSize);
1115 }
1116 
1117 // Convert callee-save register save/restore instruction to do stack pointer
1118 // decrement/increment to allocate/deallocate the callee-save stack area by
1119 // converting store/load to use pre/post increment version.
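// For example (illustrative): a first callee-save store plus a separate
// SP adjustment
//   sub sp, sp, #16
//   stp x29, x30, [sp]
// becomes a single pre-decrementing store:
//   stp x29, x30, [sp, #-16]!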
1120 static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(
1121  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
1122  const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc,
1123  bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI,
1124  MachineInstr::MIFlag FrameFlag = MachineInstr::FrameSetup,
1125  int CFAOffset = 0) {
1126  unsigned NewOpc;
1127  switch (MBBI->getOpcode()) {
1128  default:
1129  llvm_unreachable("Unexpected callee-save save/restore opcode!");
1130  case AArch64::STPXi:
1131  NewOpc = AArch64::STPXpre;
1132  break;
1133  case AArch64::STPDi:
1134  NewOpc = AArch64::STPDpre;
1135  break;
1136  case AArch64::STPQi:
1137  NewOpc = AArch64::STPQpre;
1138  break;
1139  case AArch64::STRXui:
1140  NewOpc = AArch64::STRXpre;
1141  break;
1142  case AArch64::STRDui:
1143  NewOpc = AArch64::STRDpre;
1144  break;
1145  case AArch64::STRQui:
1146  NewOpc = AArch64::STRQpre;
1147  break;
1148  case AArch64::LDPXi:
1149  NewOpc = AArch64::LDPXpost;
1150  break;
1151  case AArch64::LDPDi:
1152  NewOpc = AArch64::LDPDpost;
1153  break;
1154  case AArch64::LDPQi:
1155  NewOpc = AArch64::LDPQpost;
1156  break;
1157  case AArch64::LDRXui:
1158  NewOpc = AArch64::LDRXpost;
1159  break;
1160  case AArch64::LDRDui:
1161  NewOpc = AArch64::LDRDpost;
1162  break;
1163  case AArch64::LDRQui:
1164  NewOpc = AArch64::LDRQpost;
1165  break;
1166  }
1167  // Get rid of the SEH code associated with the old instruction.
1168  if (NeedsWinCFI) {
1169  auto SEH = std::next(MBBI);
1170  if (AArch64InstrInfo::isSEHInstruction(*SEH))
1171  SEH->eraseFromParent();
1172  }
1173 
1174  TypeSize Scale = TypeSize::Fixed(1);
1175  unsigned Width;
1176  int64_t MinOffset, MaxOffset;
1177  bool Success = static_cast<const AArch64InstrInfo *>(TII)->getMemOpInfo(
1178  NewOpc, Scale, Width, MinOffset, MaxOffset);
1179  (void)Success;
1180  assert(Success && "unknown load/store opcode");
1181 
1182  // If the first store isn't right where we want SP then we can't fold the
1183  // update in so create a normal arithmetic instruction instead.
1184  MachineFunction &MF = *MBB.getParent();
1185  if (MBBI->getOperand(MBBI->getNumOperands() - 1).getImm() != 0 ||
1186  CSStackSizeInc < MinOffset || CSStackSizeInc > MaxOffset) {
1187  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1188  StackOffset::getFixed(CSStackSizeInc), TII, FrameFlag,
1189  false, false, nullptr, EmitCFI,
1190  StackOffset::getFixed(CFAOffset));
1191 
1192  return std::prev(MBBI);
1193  }
1194 
1195  MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII->get(NewOpc));
1196  MIB.addReg(AArch64::SP, RegState::Define);
1197 
1198  // Copy all operands other than the immediate offset.
1199  unsigned OpndIdx = 0;
1200  for (unsigned OpndEnd = MBBI->getNumOperands() - 1; OpndIdx < OpndEnd;
1201  ++OpndIdx)
1202  MIB.add(MBBI->getOperand(OpndIdx));
1203 
1204  assert(MBBI->getOperand(OpndIdx).getImm() == 0 &&
1205  "Unexpected immediate offset in first/last callee-save save/restore "
1206  "instruction!");
1207  assert(MBBI->getOperand(OpndIdx - 1).getReg() == AArch64::SP &&
1208  "Unexpected base register in callee-save save/restore instruction!");
1209  assert(CSStackSizeInc % Scale == 0);
1210  MIB.addImm(CSStackSizeInc / (int)Scale);
1211 
1212  MIB.setMIFlags(MBBI->getFlags());
1213  MIB.setMemRefs(MBBI->memoperands());
1214 
1215  // Generate a new SEH code that corresponds to the new instruction.
1216  if (NeedsWinCFI) {
1217  *HasWinCFI = true;
1218  InsertSEH(*MIB, *TII, FrameFlag);
1219  }
1220 
1221  if (EmitCFI) {
1222  unsigned CFIIndex = MF.addFrameInst(
1223  MCCFIInstruction::cfiDefCfaOffset(nullptr, CFAOffset - CSStackSizeInc));
1224  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1225  .addCFIIndex(CFIIndex)
1226  .setMIFlags(FrameFlag);
1227  }
1228 
1229  return std::prev(MBB.erase(MBBI));
1230 }
1231 
1232 // Fixup callee-save register save/restore instructions to take into account
1233 // combined SP bump by adding the local stack size to the stack offsets.
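// For example (illustrative), with a 32-byte local area folded into the
// initial SP decrement, a save at byte offset 16 moves to offset 48:
//   stp x29, x30, [sp, #16]  ->  stp x29, x30, [sp, #48]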
1234 static void fixupCalleeSaveRestoreStackOffset(MachineInstr &MI,
1235  uint64_t LocalStackSize,
1236  bool NeedsWinCFI,
1237  bool *HasWinCFI) {
1238  if (AArch64InstrInfo::isSEHInstruction(MI))
1239  return;
1240 
1241  unsigned Opc = MI.getOpcode();
1242  unsigned Scale;
1243  switch (Opc) {
1244  case AArch64::STPXi:
1245  case AArch64::STRXui:
1246  case AArch64::STPDi:
1247  case AArch64::STRDui:
1248  case AArch64::LDPXi:
1249  case AArch64::LDRXui:
1250  case AArch64::LDPDi:
1251  case AArch64::LDRDui:
1252  Scale = 8;
1253  break;
1254  case AArch64::STPQi:
1255  case AArch64::STRQui:
1256  case AArch64::LDPQi:
1257  case AArch64::LDRQui:
1258  Scale = 16;
1259  break;
1260  default:
1261  llvm_unreachable("Unexpected callee-save save/restore opcode!");
1262  }
1263 
1264  unsigned OffsetIdx = MI.getNumExplicitOperands() - 1;
1265  assert(MI.getOperand(OffsetIdx - 1).getReg() == AArch64::SP &&
1266  "Unexpected base register in callee-save save/restore instruction!");
1267  // Last operand is immediate offset that needs fixing.
1268  MachineOperand &OffsetOpnd = MI.getOperand(OffsetIdx);
1269  // All generated opcodes have scaled offsets.
1270  assert(LocalStackSize % Scale == 0);
1271  OffsetOpnd.setImm(OffsetOpnd.getImm() + LocalStackSize / Scale);
1272 
1273  if (NeedsWinCFI) {
1274  *HasWinCFI = true;
1275  auto MBBI = std::next(MachineBasicBlock::iterator(MI));
1276  assert(MBBI != MI.getParent()->end() && "Expecting a valid instruction");
1277  assert(AArch64InstrInfo::isSEHInstruction(*MBBI) &&
1278  "Expecting a SEH instruction");
1279  fixupSEHOpcode(MBBI, LocalStackSize);
1280  }
1281 }
1282 
1283 static bool isTargetWindows(const MachineFunction &MF) {
1284  return MF.getSubtarget<AArch64Subtarget>().isTargetWindows();
1285 }
1286 
1287 // Convenience function to determine whether I is an SVE callee save.
1288 static bool IsSVECalleeSave(MachineBasicBlock::iterator I) {
1289  switch (I->getOpcode()) {
1290  default:
1291  return false;
1292  case AArch64::STR_ZXI:
1293  case AArch64::STR_PXI:
1294  case AArch64::LDR_ZXI:
1295  case AArch64::LDR_PXI:
1296  return I->getFlag(MachineInstr::FrameSetup) ||
1297  I->getFlag(MachineInstr::FrameDestroy);
1298  }
1299 }
1300 
1301 static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF) {
1302  if (!(llvm::any_of(
1303  MF.getFrameInfo().getCalleeSavedInfo(),
1304  [](const auto &Info) { return Info.getReg() == AArch64::LR; }) &&
1305  MF.getFunction().hasFnAttribute(Attribute::ShadowCallStack)))
1306  return false;
1307 
1308  if (!MF.getSubtarget<AArch64Subtarget>().isXRegisterReserved(18))
1309  report_fatal_error("Must reserve x18 to use shadow call stack");
1310 
1311  return true;
1312 }
1313 
1314 static void emitShadowCallStackPrologue(const TargetInstrInfo &TII,
1315  MachineFunction &MF,
1316  MachineBasicBlock &MBB,
1317  MachineBasicBlock::iterator MBBI,
1318  const DebugLoc &DL, bool NeedsWinCFI,
1319  bool NeedsUnwindInfo) {
1320  // Shadow call stack prolog: str x30, [x18], #8
1321  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXpost))
1322  .addReg(AArch64::X18, RegState::Define)
1323  .addReg(AArch64::LR)
1324  .addReg(AArch64::X18)
1325  .addImm(8)
1326  .setMIFlag(MachineInstr::FrameSetup);
1327 
1328  // This instruction also makes x18 live-in to the entry block.
1329  MBB.addLiveIn(AArch64::X18);
1330 
1331  if (NeedsWinCFI)
1332  BuildMI(MBB, MBBI, DL, TII.get(AArch64::SEH_Nop))
1333  .setMIFlag(MachineInstr::FrameSetup);
1334 
1335  if (NeedsUnwindInfo) {
1336  // Emit a CFI instruction that causes 8 to be subtracted from the value of
1337  // x18 when unwinding past this frame.
1338  static const char CFIInst[] = {
1339  dwarf::DW_CFA_val_expression,
1340  18, // register
1341  2, // length
1342  static_cast<char>(unsigned(dwarf::DW_OP_breg18)),
1343  static_cast<char>(-8) & 0x7f, // addend (sleb128)
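 // Note: ((-8) & 0x7f) == 0x78, the single-byte SLEB128 encoding of -8.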
1344  };
1345  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createEscape(
1346  nullptr, StringRef(CFIInst, sizeof(CFIInst))));
1347  BuildMI(MBB, MBBI, DL, TII.get(AArch64::CFI_INSTRUCTION))
1348  .addCFIIndex(CFIIndex)
1349  .setMIFlag(MachineInstr::FrameSetup);
1350  }
1351 }
1352 
1353 static void emitShadowCallStackEpilogue(const TargetInstrInfo &TII,
1354  MachineFunction &MF,
1355  MachineBasicBlock &MBB,
1356  MachineBasicBlock::iterator MBBI,
1357  const DebugLoc &DL) {
1358  // Shadow call stack epilog: ldr x30, [x18, #-8]!
1359  BuildMI(MBB, MBBI, DL, TII.get(AArch64::LDRXpre))
1360  .addReg(AArch64::X18, RegState::Define)
1361  .addReg(AArch64::LR, RegState::Define)
1362  .addReg(AArch64::X18)
1363  .addImm(-8)
1364  .setMIFlag(MachineInstr::FrameDestroy);
1365 
1366  if (MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo()) {
1367  unsigned CFIIndex =
1368  MF.addFrameInst(MCCFIInstruction::createRestore(nullptr, 18));
1369  BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
1370  .addCFIIndex(CFIIndex)
1371  .setMIFlags(MachineInstr::FrameDestroy);
1372  }
1373 }
1374 
1375 void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
1376  MachineBasicBlock &MBB) const {
1377  MachineBasicBlock::iterator MBBI = MBB.begin();
1378  const MachineFrameInfo &MFI = MF.getFrameInfo();
1379  const Function &F = MF.getFunction();
1380  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1381  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
1382  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1383  MachineModuleInfo &MMI = MF.getMMI();
1384  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
1385  bool EmitCFI = AFI->needsDwarfUnwindInfo();
1386  bool HasFP = hasFP(MF);
1387  bool NeedsWinCFI = needsWinCFI(MF);
1388  bool HasWinCFI = false;
1389  auto Cleanup = make_scope_exit([&]() { MF.setHasWinCFI(HasWinCFI); });
1390 
1391  bool IsFunclet = MBB.isEHFuncletEntry();
1392 
1393  // At this point, we're going to decide whether or not the function uses a
1394  // redzone. In most cases, the function doesn't have a redzone so let's
1395  // assume that's false and set it to true in the case that there's a redzone.
1396  AFI->setHasRedZone(false);
1397 
1398  // Debug location must be unknown since the first debug location is used
1399  // to determine the end of the prologue.
1400  DebugLoc DL;
1401 
1402  const auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();
1403  if (needsShadowCallStackPrologueEpilogue(MF))
1404  emitShadowCallStackPrologue(*TII, MF, MBB, MBBI, DL, NeedsWinCFI,
1405  MFnI.needsDwarfUnwindInfo());
1406 
1407  if (MFnI.shouldSignReturnAddress()) {
1408  unsigned PACI;
1409  if (MFnI.shouldSignWithBKey()) {
1410  BuildMI(MBB, MBBI, DL, TII->get(AArch64::EMITBKEY))
1411  .setMIFlag(MachineInstr::FrameSetup);
1412  PACI = Subtarget.hasPAuth() ? AArch64::PACIB : AArch64::PACIBSP;
1413  } else {
1414  PACI = Subtarget.hasPAuth() ? AArch64::PACIA : AArch64::PACIASP;
1415  }
1416 
1417  auto MI = BuildMI(MBB, MBBI, DL, TII->get(PACI));
1418  if (Subtarget.hasPAuth())
1419  MI.addReg(AArch64::LR, RegState::Define)
1420  .addReg(AArch64::LR)
1421  .addReg(AArch64::SP, RegState::InternalRead);
1422  MI.setMIFlag(MachineInstr::FrameSetup);
1423  if (EmitCFI) {
1424  unsigned CFIIndex =
1425  MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
1426  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1427  .addCFIIndex(CFIIndex)
1428  .setMIFlags(MachineInstr::FrameSetup);
1429  }
1430  }
1431  if (EmitCFI && MFnI.isMTETagged()) {
1432  BuildMI(MBB, MBBI, DL, TII->get(AArch64::EMITMTETAGGED))
1433  .setMIFlag(MachineInstr::FrameSetup);
1434  }
1435 
1436  // We signal the presence of a Swift extended frame to external tools by
1437  // storing FP with 0b0001 in bits 63:60. In normal userland operation a simple
1438  // ORR is sufficient, it is assumed a Swift kernel would initialize the TBI
1439  // bits so that is still true.
1440  if (HasFP && AFI->hasSwiftAsyncContext()) {
1441  switch (MF.getTarget().Options.SwiftAsyncFramePointer) {
1442  case SwiftAsyncFramePointerMode::DeploymentBased:
1443  if (Subtarget.swiftAsyncContextIsDynamicallySet()) {
1444  // The special symbol below is absolute and has a *value* that can be
1445  // combined with the frame pointer to signal an extended frame.
1446  BuildMI(MBB, MBBI, DL, TII->get(AArch64::LOADgot), AArch64::X16)
1447  .addExternalSymbol("swift_async_extendedFramePointerFlags",
1448  AArch64II::MO_GOT);
1449  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrs), AArch64::FP)
1450  .addUse(AArch64::FP)
1451  .addUse(AArch64::X16)
1452  .addImm(Subtarget.isTargetILP32() ? 32 : 0);
1453  break;
1454  }
1455  [[fallthrough]];
1456 
1457  case SwiftAsyncFramePointerMode::Always:
1458  // ORR x29, x29, #0x1000_0000_0000_0000
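 // (0x1100 below is the encoded logical-immediate form of that constant,
 // i.e. a single set bit rotated into bit 60.)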
1459  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXri), AArch64::FP)
1460  .addUse(AArch64::FP)
1461  .addImm(0x1100)
1462  .setMIFlag(MachineInstr::FrameSetup);
1463  break;
1464 
1465  case SwiftAsyncFramePointerMode::Never:
1466  break;
1467  }
1468  }
1469 
1470  // All calls are tail calls in GHC calling conv, and functions have no
1471  // prologue/epilogue.
1472  if (MF.getFunction().getCallingConv() == CallingConv::GHC)
1473  return;
1474 
1475  // Set tagged base pointer to the requested stack slot.
1476  // Ideally it should match SP value after prologue.
1477  std::optional<int> TBPI = AFI->getTaggedBasePointerIndex();
1478  if (TBPI)
1479  AFI->setTaggedBasePointerOffset(-MFI.getObjectOffset(*TBPI));
1480  else
1481  AFI->setTaggedBasePointerOffset(MFI.getStackSize());
1482 
1483  const StackOffset &SVEStackSize = getSVEStackSize(MF);
1484 
1485  // getStackSize() includes all the locals in its size calculation. We don't
1486  // include these locals when computing the stack size of a funclet, as they
1487  // are allocated in the parent's stack frame and accessed via the frame
1488  // pointer from the funclet. We only save the callee saved registers in the
1489  // funclet, which are really the callee saved registers of the parent
1490  // function, including the funclet.
1491  int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
1492  : MFI.getStackSize();
1493  if (!AFI->hasStackFrame() && !windowsRequiresStackProbe(MF, NumBytes)) {
1494  assert(!HasFP && "unexpected function without stack frame but with FP");
1495  assert(!SVEStackSize &&
1496  "unexpected function without stack frame but with SVE objects");
1497  // All of the stack allocation is for locals.
1498  AFI->setLocalStackSize(NumBytes);
1499  if (!NumBytes)
1500  return;
1501  // REDZONE: If the stack size is less than 128 bytes, we don't need
1502  // to actually allocate.
1503  if (canUseRedZone(MF)) {
1504  AFI->setHasRedZone(true);
1505  ++NumRedZoneFunctions;
1506  } else {
1507  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1508  StackOffset::getFixed(-NumBytes), TII,
1509  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1510  if (EmitCFI) {
1511  // Label used to tie together the PROLOG_LABEL and the MachineMoves.
1512  MCSymbol *FrameLabel = MMI.getContext().createTempSymbol();
1513  // Encode the stack size of the leaf function.
1514  unsigned CFIIndex = MF.addFrameInst(
1515  MCCFIInstruction::cfiDefCfaOffset(FrameLabel, NumBytes));
1516  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1517  .addCFIIndex(CFIIndex)
1518  .setMIFlags(MachineInstr::FrameSetup);
1519  }
1520  }
1521 
1522  if (NeedsWinCFI) {
1523  HasWinCFI = true;
1524  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1525  .setMIFlag(MachineInstr::FrameSetup);
1526  }
1527 
1528  return;
1529  }
1530 
1531  bool IsWin64 =
1532  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
1533  unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
1534 
1535  auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
1536  // All of the remaining stack allocations are for locals.
1537  AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
1538  bool CombineSPBump = shouldCombineCSRLocalStackBump(MF, NumBytes);
1539  bool HomPrologEpilog = homogeneousPrologEpilog(MF);
1540  if (CombineSPBump) {
1541  assert(!SVEStackSize && "Cannot combine SP bump with SVE");
1542  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1543  StackOffset::getFixed(-NumBytes), TII,
1544  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI,
1545  EmitCFI);
1546  NumBytes = 0;
1547  } else if (HomPrologEpilog) {
1548  // Stack has been already adjusted.
1549  NumBytes -= PrologueSaveSize;
1550  } else if (PrologueSaveSize != 0) {
1551  MBBI = convertCalleeSaveRestoreToSPPrePostIncDec(
1552  MBB, MBBI, DL, TII, -PrologueSaveSize, NeedsWinCFI, &HasWinCFI,
1553  EmitCFI);
1554  NumBytes -= PrologueSaveSize;
1555  }
1556  assert(NumBytes >= 0 && "Negative stack allocation size!?");
1557 
1558  // Move past the saves of the callee-saved registers, fixing up the offsets
1559  // and pre-inc if we decided to combine the callee-save and local stack
1560  // pointer bump above.
1561  MachineBasicBlock::iterator End = MBB.end();
1562  while (MBBI != End && MBBI->getFlag(MachineInstr::FrameSetup) &&
1563  !IsSVECalleeSave(MBBI)) {
1564  if (CombineSPBump)
1565  fixupCalleeSaveRestoreStackOffset(*MBBI, AFI->getLocalStackSize(),
1566  NeedsWinCFI, &HasWinCFI);
1567  ++MBBI;
1568  }
1569 
1570  // For funclets the FP belongs to the containing function.
1571  if (!IsFunclet && HasFP) {
1572  // Only set up FP if we actually need to.
1573  int64_t FPOffset = AFI->getCalleeSaveBaseToFrameRecordOffset();
1574 
1575  if (CombineSPBump)
1576  FPOffset += AFI->getLocalStackSize();
1577 
1578  if (AFI->hasSwiftAsyncContext()) {
1579  // Before we update the live FP we have to ensure there's a valid (or
1580  // null) asynchronous context in its slot just before FP in the frame
1581  // record, so store it now.
1582  const auto &Attrs = MF.getFunction().getAttributes();
1583  bool HaveInitialContext = Attrs.hasAttrSomewhere(Attribute::SwiftAsync);
1584  if (HaveInitialContext)
1585  MBB.addLiveIn(AArch64::X22);
1586  BuildMI(MBB, MBBI, DL, TII->get(AArch64::StoreSwiftAsyncContext))
1587  .addUse(HaveInitialContext ? AArch64::X22 : AArch64::XZR)
1588  .addUse(AArch64::SP)
1589  .addImm(FPOffset - 8)
1590  .setMIFlag(MachineInstr::FrameSetup);
1591  }
1592 
1593  if (HomPrologEpilog) {
1594  auto Prolog = MBBI;
1595  --Prolog;
1596  assert(Prolog->getOpcode() == AArch64::HOM_Prolog);
1597  Prolog->addOperand(MachineOperand::CreateImm(FPOffset));
1598  } else {
1599  // Issue sub fp, sp, FPOffset or
1600  // mov fp,sp when FPOffset is zero.
1601  // Note: All stores of callee-saved registers are marked as "FrameSetup".
1602  // This code marks the instruction(s) that set the FP also.
1603  emitFrameOffset(MBB, MBBI, DL, AArch64::FP, AArch64::SP,
1604  StackOffset::getFixed(FPOffset), TII,
1605  MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1606  }
1607  if (EmitCFI) {
1608  // Define the current CFA rule to use the provided FP.
1609  const int OffsetToFirstCalleeSaveFromFP =
1610  AFI->getCalleeSaveBaseToFrameRecordOffset() -
1611  AFI->getCalleeSavedStackSize();
1612  Register FramePtr = RegInfo->getFrameRegister(MF);
1613  unsigned Reg = RegInfo->getDwarfRegNum(FramePtr, true);
1614  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
1615  nullptr, Reg, FixedObject - OffsetToFirstCalleeSaveFromFP));
1616  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1617  .addCFIIndex(CFIIndex)
1618  .setMIFlags(MachineInstr::FrameSetup);
1619  }
1620  }
1621 
1622  // Now emit the moves for whatever callee saved regs we have (including FP,
1623  // LR if those are saved). Frame instructions for SVE register are emitted
1624  // later, after the instruction which actually save SVE regs.
1625  if (EmitCFI)
1626  emitCalleeSavedGPRLocations(MBB, MBBI);
1627 
1628  if (windowsRequiresStackProbe(MF, NumBytes)) {
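 // (On ARM64 Windows, __chkstk expects the allocation size in x15 in units
 // of 16 bytes, hence the shift below.)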
1629  uint64_t NumWords = NumBytes >> 4;
1630  if (NeedsWinCFI) {
1631  HasWinCFI = true;
1632  // alloc_l can hold at most 256MB, so assume that NumBytes doesn't
1633  // exceed this amount. We need to move at most 2^24 - 1 into x15.
1634  // This is at most two instructions, MOVZ followed by MOVK.
1635  // TODO: Fix to use multiple stack alloc unwind codes for stacks
1636  // exceeding 256MB in size.
1637  if (NumBytes >= (1 << 28))
1638  report_fatal_error("Stack size cannot exceed 256MB for stack "
1639  "unwinding purposes");
1640 
1641  uint32_t LowNumWords = NumWords & 0xFFFF;
1642  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVZXi), AArch64::X15)
1643  .addImm(LowNumWords)
1644  .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0))
1645  .setMIFlag(MachineInstr::FrameSetup);
1646  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1647  .setMIFlag(MachineInstr::FrameSetup);
1648  if ((NumWords & 0xFFFF0000) != 0) {
1649  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVKXi), AArch64::X15)
1650  .addReg(AArch64::X15)
1651  .addImm((NumWords & 0xFFFF0000) >> 16) // High half
1652  .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 16))
1653  .setMIFlag(MachineInstr::FrameSetup);
1654  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1655  .setMIFlag(MachineInstr::FrameSetup);
1656  }
1657  } else {
1658  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVi64imm), AArch64::X15)
1659  .addImm(NumWords)
1660  .setMIFlag(MachineInstr::FrameSetup);
1661  }
1662 
1663  const char* ChkStk = Subtarget.getChkStkName();
1664  switch (MF.getTarget().getCodeModel()) {
1665  case CodeModel::Tiny:
1666  case CodeModel::Small:
1667  case CodeModel::Medium:
1668  case CodeModel::Kernel:
1669  BuildMI(MBB, MBBI, DL, TII->get(AArch64::BL))
1670  .addExternalSymbol(ChkStk)
1671  .addReg(AArch64::X15, RegState::Implicit)
1672  .addReg(AArch64::X16, RegState::Implicit | RegState::Define | RegState::Dead)
1673  .addReg(AArch64::X17, RegState::Implicit | RegState::Define | RegState::Dead)
1674  .addReg(AArch64::NZCV, RegState::Implicit | RegState::Define | RegState::Dead)
1675  .setMIFlags(MachineInstr::FrameSetup);
1676  if (NeedsWinCFI) {
1677  HasWinCFI = true;
1678  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1679  .setMIFlag(MachineInstr::FrameSetup);
1680  }
1681  break;
1682  case CodeModel::Large:
1683  BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVaddrEXT))
1684  .addReg(AArch64::X16, RegState::Define)
1685  .addExternalSymbol(ChkStk)
1686  .addExternalSymbol(ChkStk)
1687  .setMIFlags(MachineInstr::FrameSetup);
1688  if (NeedsWinCFI) {
1689  HasWinCFI = true;
1690  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1691  .setMIFlag(MachineInstr::FrameSetup);
1692  }
1693 
1694  BuildMI(MBB, MBBI, DL, TII->get(getBLRCallOpcode(MF)))
1695  .addReg(AArch64::X16, RegState::Kill)
1696  .addReg(AArch64::X15, RegState::Implicit | RegState::Define)
1697  .addReg(AArch64::X16, RegState::Implicit | RegState::Define | RegState::Dead)
1698  .addReg(AArch64::X17, RegState::Implicit | RegState::Define | RegState::Dead)
1699  .addReg(AArch64::NZCV, RegState::Implicit | RegState::Define | RegState::Dead)
1700  .setMIFlags(MachineInstr::FrameSetup);
1701  if (NeedsWinCFI) {
1702  HasWinCFI = true;
1703  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1704  .setMIFlag(MachineInstr::FrameSetup);
1705  }
1706  break;
1707  }
1708 
1709  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SUBXrx64), AArch64::SP)
1710  .addReg(AArch64::SP, RegState::Kill)
1711  .addReg(AArch64::X15, RegState::Kill)
1712  .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 4))
1713  .setMIFlags(MachineInstr::FrameSetup);
1714  if (NeedsWinCFI) {
1715  HasWinCFI = true;
1716  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_StackAlloc))
1717  .addImm(NumBytes)
1718  .setMIFlag(MachineInstr::FrameSetup);
1719  }
1720  NumBytes = 0;
1721  }
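  // Illustrative summary (assumed shape, not emitted verbatim): for the small
  // code model the probing sequence built above is roughly
  //   mov  x15, #(frame size / 16)   ; MOVZ, plus MOVK for a high half
  //   bl   __chkstk                  ; helper name comes from getChkStkName()
  //   sub  sp, sp, x15, uxtx #4      ; SUBXrx64 scales x15 back to bytes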
1722 
1723  StackOffset AllocateBefore = SVEStackSize, AllocateAfter = {};
1724  MachineBasicBlock::iterator CalleeSavesBegin = MBBI, CalleeSavesEnd = MBBI;
1725 
1726  // Process the SVE callee-saves to determine what space needs to be
1727  // allocated.
1728  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
1729  // Find callee save instructions in frame.
1730  CalleeSavesBegin = MBBI;
1731  assert(IsSVECalleeSave(CalleeSavesBegin) && "Unexpected instruction");
1732  while (IsSVECalleeSave(MBBI) && MBBI != MBB.getFirstTerminator())
1733  ++MBBI;
1734  CalleeSavesEnd = MBBI;
1735 
1736  AllocateBefore = StackOffset::getScalable(CalleeSavedSize);
1737  AllocateAfter = SVEStackSize - AllocateBefore;
1738  }
1739 
1740  // Allocate space for the callee saves (if any).
1741  emitFrameOffset(
1742  MBB, CalleeSavesBegin, DL, AArch64::SP, AArch64::SP, -AllocateBefore, TII,
1743  MachineInstr::FrameSetup, false, false, nullptr,
1744  EmitCFI && !HasFP && AllocateBefore,
1745  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
1746 
1747  if (EmitCFI)
1748  emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
1749 
1750  // Finally allocate remaining SVE stack space.
1751  emitFrameOffset(MBB, CalleeSavesEnd, DL, AArch64::SP, AArch64::SP,
1752  -AllocateAfter, TII, MachineInstr::FrameSetup, false, false,
1753  nullptr, EmitCFI && !HasFP && AllocateAfter,
1754  AllocateBefore + StackOffset::getFixed(
1755  (int64_t)MFI.getStackSize() - NumBytes));
1756 
1757  // Allocate space for the rest of the frame.
1758  if (NumBytes) {
1759  // Alignment is required for the parent frame, not the funclet
1760  const bool NeedsRealignment =
1761  !IsFunclet && RegInfo->hasStackRealignment(MF);
1762  unsigned scratchSPReg = AArch64::SP;
1763 
1764  if (NeedsRealignment) {
1765  scratchSPReg = findScratchNonCalleeSaveRegister(&MBB);
1766  assert(scratchSPReg != AArch64::NoRegister);
1767  }
1768 
1769  // If we're a leaf function, try using the red zone.
1770  if (!canUseRedZone(MF)) {
1771  // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
1772  // the correct value here, as NumBytes also includes padding bytes,
1773  // which shouldn't be counted here.
1774  emitFrameOffset(
1775  MBB, MBBI, DL, scratchSPReg, AArch64::SP,
1776  StackOffset::getFixed(-NumBytes), TII, MachineInstr::FrameSetup,
1777  false, NeedsWinCFI, &HasWinCFI, EmitCFI && !HasFP,
1778  SVEStackSize +
1779  StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes));
1780  }
1781  if (NeedsRealignment) {
1782  const unsigned NrBitsToZero = Log2(MFI.getMaxAlign());
1783  assert(NrBitsToZero > 1);
1784  assert(scratchSPReg != AArch64::SP);
1785 
1786  // SUB X9, SP, NumBytes
1787  // -- X9 is temporary register, so shouldn't contain any live data here,
1788  // -- free to use. This is already produced by emitFrameOffset above.
1789  // AND SP, X9, 0b11111...0000
1790  // The logical immediates have a non-trivial encoding. The following
1791  // formula computes the encoded immediate with all ones but
1792  // NrBitsToZero zero bits as least significant bits.
1793  uint32_t andMaskEncoded = (1 << 12) // = N
1794  | ((64 - NrBitsToZero) << 6) // immr
1795  | ((64 - NrBitsToZero - 1) << 0); // imms
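      // Worked example (illustrative): for a 32-byte realignment,
      // NrBitsToZero = 5, so immr = 59 and imms = 58, which with N = 1 encodes
      // the logical immediate 0xFFFFFFFFFFFFFFE0, i.e. AND SP, X9, #~31.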
1796 
1797  BuildMI(MBB, MBBI, DL, TII->get(AArch64::ANDXri), AArch64::SP)
1798  .addReg(scratchSPReg, RegState::Kill)
1799  .addImm(andMaskEncoded);
1800  AFI->setStackRealigned(true);
1801  if (NeedsWinCFI) {
1802  HasWinCFI = true;
1803  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_StackAlloc))
1804  .addImm(NumBytes & andMaskEncoded)
1805  .setMIFlag(MachineInstr::FrameSetup);
1806  }
1807  }
1808  }
1809 
1810  // If we need a base pointer, set it up here. It's whatever the value of the
1811  // stack pointer is at this point. Any variable size objects will be allocated
1812  // after this, so we can still use the base pointer to reference locals.
1813  //
1814  // FIXME: Clarify FrameSetup flags here.
1815  // Note: Use emitFrameOffset() like above for FP if the FrameSetup flag is
1816  // needed.
1817  // For funclets the BP belongs to the containing function.
1818  if (!IsFunclet && RegInfo->hasBasePointer(MF)) {
1819  TII->copyPhysReg(MBB, MBBI, DL, RegInfo->getBaseRegister(), AArch64::SP,
1820  false);
1821  if (NeedsWinCFI) {
1822  HasWinCFI = true;
1823  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1824  .setMIFlag(MachineInstr::FrameSetup);
1825  }
1826  }
1827 
1828  // The very last FrameSetup instruction indicates the end of prologue. Emit a
1829  // SEH opcode indicating the prologue end.
1830  if (NeedsWinCFI && HasWinCFI) {
1831  BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1832  .setMIFlag(MachineInstr::FrameSetup);
1833  }
1834 
1835  // SEH funclets are passed the frame pointer in X1. If the parent
1836  // function uses the base register, then the base register is used
1837  // directly, and is not retrieved from X1.
1838  if (IsFunclet && F.hasPersonalityFn()) {
1839  EHPersonality Per = classifyEHPersonality(F.getPersonalityFn());
1840  if (isAsynchronousEHPersonality(Per)) {
1841  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::COPY), AArch64::FP)
1842  .addReg(AArch64::X1)
1843  .setMIFlag(MachineInstr::FrameSetup);
1844  MBB.addLiveIn(AArch64::X1);
1845  }
1846  }
1847 }
1848 
1849 static void InsertReturnAddressAuth(MachineFunction &MF,
1850  MachineBasicBlock &MBB) {
1851  const auto &MFI = *MF.getInfo<AArch64FunctionInfo>();
1852  if (!MFI.shouldSignReturnAddress())
1853  return;
1854  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1855  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1856 
1857  MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator();
1858  DebugLoc DL;
1859  if (MBBI != MBB.end())
1860  DL = MBBI->getDebugLoc();
1861 
1862  // The AUTIASP instruction assembles to a hint instruction before v8.3a so
1863  // this instruction can be safely used for any v8a architecture.
1864  // From v8.3a onwards there are optimised authenticate LR and return
1865  // instructions, namely RETA{A,B}, that can be used instead. In this case the
1866  // DW_CFA_AARCH64_negate_ra_state can't be emitted.
1867  if (Subtarget.hasPAuth() && MBBI != MBB.end() &&
1868  MBBI->getOpcode() == AArch64::RET_ReallyLR) {
1869  BuildMI(MBB, MBBI, DL,
1870  TII->get(MFI.shouldSignWithBKey() ? AArch64::RETAB : AArch64::RETAA))
1871  .copyImplicitOps(*MBBI);
1872  MBB.erase(MBBI);
1873  } else {
1874  BuildMI(
1875  MBB, MBBI, DL,
1876  TII->get(MFI.shouldSignWithBKey() ? AArch64::AUTIBSP : AArch64::AUTIASP))
1877  .setMIFlag(MachineInstr::FrameDestroy);
1878 
1879  unsigned CFIIndex =
1880  MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
1881  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1882  .addCFIIndex(CFIIndex)
1883  .setMIFlags(MachineInstr::FrameDestroy);
1884  }
1885 }
1886 
1887 static bool isFuncletReturnInstr(const MachineInstr &MI) {
1888  switch (MI.getOpcode()) {
1889  default:
1890  return false;
1891  case AArch64::CATCHRET:
1892  case AArch64::CLEANUPRET:
1893  return true;
1894  }
1895 }
1896 
1897 void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
1898  MachineBasicBlock &MBB) const {
1899  MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
1900  MachineFrameInfo &MFI = MF.getFrameInfo();
1901  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1902  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1903  DebugLoc DL;
1904  bool NeedsWinCFI = needsWinCFI(MF);
1905  bool EmitCFI = MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo();
1906  bool HasWinCFI = false;
1907  bool IsFunclet = false;
1908  auto WinCFI = make_scope_exit([&]() { assert(HasWinCFI == MF.hasWinCFI()); });
1909 
1910  if (MBB.end() != MBBI) {
1911  DL = MBBI->getDebugLoc();
1912  IsFunclet = isFuncletReturnInstr(*MBBI);
1913  }
1914 
1915  auto FinishingTouches = make_scope_exit([&]() {
1916  InsertReturnAddressAuth(MF, MBB);
1917  if (needsShadowCallStackPrologueEpilogue(MF))
1918  emitShadowCallStackEpilogue(*TII, MF, MBB, MBB.getFirstTerminator(), DL);
1919  if (EmitCFI)
1920  emitCalleeSavedGPRRestores(MBB, MBB.getFirstTerminator());
1921  });
1922 
1923  int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
1924  : MFI.getStackSize();
1925  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
1926 
1927  // All calls are tail calls in GHC calling conv, and functions have no
1928  // prologue/epilogue.
1929  if (MF.getFunction().getCallingConv() == CallingConv::GHC)
1930  return;
1931 
1932  // How much of the stack used by incoming arguments this function is expected
1933  // to restore in this particular epilogue.
1934  int64_t ArgumentStackToRestore = getArgumentStackToRestore(MF, MBB);
1935  bool IsWin64 =
1936  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
1937  unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
1938 
1939  int64_t AfterCSRPopSize = ArgumentStackToRestore;
1940  auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
1941  // We cannot rely on the local stack size set in emitPrologue if the function
1942  // has funclets, as funclets have different local stack size requirements, and
1943  // the current value set in emitPrologue may be that of the containing
1944  // function.
1945  if (MF.hasEHFunclets())
1946  AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
1947  if (homogeneousPrologEpilog(MF, &MBB)) {
1948  assert(!NeedsWinCFI);
1949  auto LastPopI = MBB.getFirstTerminator();
1950  if (LastPopI != MBB.begin()) {
1951  auto HomogeneousEpilog = std::prev(LastPopI);
1952  if (HomogeneousEpilog->getOpcode() == AArch64::HOM_Epilog)
1953  LastPopI = HomogeneousEpilog;
1954  }
1955 
1956  // Adjust local stack
1957  emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
1958  StackOffset::getFixed(AFI->getLocalStackSize()), TII,
1959  MachineInstr::FrameDestroy, false, NeedsWinCFI);
1960 
1961  // SP has been already adjusted while restoring callee save regs.
1962  // We've already bailed out of the case that adjusts SP for arguments.
1963  assert(AfterCSRPopSize == 0);
1964  return;
1965  }
1966  bool CombineSPBump = shouldCombineCSRLocalStackBumpInEpilogue(MBB, NumBytes);
1967  // Assume we can't combine the last pop with the sp restore.
1968 
1969  bool CombineAfterCSRBump = false;
1970  if (!CombineSPBump && PrologueSaveSize != 0) {
1971  MachineBasicBlock::iterator Pop = std::prev(MBB.getFirstTerminator());
1972  while (Pop->getOpcode() == TargetOpcode::CFI_INSTRUCTION ||
1973  AArch64InstrInfo::isSEHInstruction(*Pop))
1974  Pop = std::prev(Pop);
1975  // Converting the last ldp to a post-index ldp is valid only if the last
1976  // ldp's offset is 0.
1977  const MachineOperand &OffsetOp = Pop->getOperand(Pop->getNumOperands() - 1);
1978  // If the offset is 0 and the AfterCSR pop is not actually trying to
1979  // allocate more stack for arguments (in space that an untimely interrupt
1980  // may clobber), convert it to a post-index ldp.
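    // Illustrative example (not part of the original source): a final restore
    //   ldp x29, x30, [sp, #0]
    // is rewritten to the post-indexed form
    //   ldp x29, x30, [sp], #PrologueSaveSize
    // which reloads the pair and pops the CSR area in a single instruction.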
1981  if (OffsetOp.getImm() == 0 && AfterCSRPopSize >= 0) {
1982  convertCalleeSaveRestoreToSPPrePostIncDec(
1983  MBB, Pop, DL, TII, PrologueSaveSize, NeedsWinCFI, &HasWinCFI, EmitCFI,
1984  MachineInstr::FrameDestroy, PrologueSaveSize);
1985  } else {
1986  // If not, make sure to emit an add after the last ldp.
1987  // We're doing this by transferring the size to be restored from the
1988  // adjustment *before* the CSR pops to the adjustment *after* the CSR
1989  // pops.
1990  AfterCSRPopSize += PrologueSaveSize;
1991  CombineAfterCSRBump = true;
1992  }
1993  }
1994 
1995  // Move past the restores of the callee-saved registers.
1996  // If we plan on combining the sp bump of the local stack size and the callee
1997  // save stack size, we might need to adjust the CSR save and restore offsets.
1998  MachineBasicBlock::iterator LastPopI = MBB.getFirstTerminator();
1999  MachineBasicBlock::iterator Begin = MBB.begin();
2000  while (LastPopI != Begin) {
2001  --LastPopI;
2002  if (!LastPopI->getFlag(MachineInstr::FrameDestroy) ||
2003  IsSVECalleeSave(LastPopI)) {
2004  ++LastPopI;
2005  break;
2006  } else if (CombineSPBump)
2007  fixupCalleeSaveRestoreStackOffset(*LastPopI, AFI->getLocalStackSize(),
2008  NeedsWinCFI, &HasWinCFI);
2009  }
2010 
2011  if (MF.hasWinCFI()) {
2012  // If the prologue didn't contain any SEH opcodes and didn't set the
2013  // MF.hasWinCFI() flag, assume the epilogue won't either, and skip the
2014  // EpilogStart - to avoid generating CFI for functions that don't need it.
2015  // (And as we didn't generate any prologue at all, it would be asymmetrical
2016  // to the epilogue.) By the end of the function, we assert that
2017  // HasWinCFI is equal to MF.hasWinCFI(), to verify this assumption.
2018  HasWinCFI = true;
2019  BuildMI(MBB, LastPopI, DL, TII->get(AArch64::SEH_EpilogStart))
2020  .setMIFlag(MachineInstr::FrameDestroy);
2021  }
2022 
2023  if (hasFP(MF) && AFI->hasSwiftAsyncContext()) {
2024  switch (MF.getTarget().Options.SwiftAsyncFramePointer) {
2025  case SwiftAsyncFramePointerMode::DeploymentBased:
2026  // Avoid the reload as it is GOT relative, and instead fall back to the
2027  // hardcoded value below. This allows a mismatch between the OS and
2028  // application without immediately terminating on the difference.
2029  [[fallthrough]];
2030  case SwiftAsyncFramePointerMode::Always:
2031  // We need to reset FP to its untagged state on return. Bit 60 is
2032  // currently used to show the presence of an extended frame.
2033 
2034  // BIC x29, x29, #0x1000_0000_0000_0000
2035  BuildMI(MBB, MBB.getFirstTerminator(), DL, TII->get(AArch64::ANDXri),
2036  AArch64::FP)
2037  .addUse(AArch64::FP)
2038  .addImm(0x10fe)
2039  .setMIFlag(MachineInstr::FrameDestroy);
2040  break;
2041 
2042  case SwiftAsyncFramePointerMode::Never:
2043  break;
2044  }
2045  }
2046 
2047  const StackOffset &SVEStackSize = getSVEStackSize(MF);
2048 
2049  // If there is a single SP update, insert it before the ret and we're done.
2050  if (CombineSPBump) {
2051  assert(!SVEStackSize && "Cannot combine SP bump with SVE");
2052 
2053  // When we are about to restore the CSRs, the CFA register is SP again.
2054  if (EmitCFI && hasFP(MF)) {
2055  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2056  unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2057  unsigned CFIIndex =
2058  MF.addFrameInst(MCCFIInstruction::cfiDefCfa(nullptr, Reg, NumBytes));
2059  BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2060  .addCFIIndex(CFIIndex)
2061  .setMIFlags(MachineInstr::FrameDestroy);
2062  }
2063 
2064  emitFrameOffset(MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2065  StackOffset::getFixed(NumBytes + (int64_t)AfterCSRPopSize),
2066  TII, MachineInstr::FrameDestroy, false, NeedsWinCFI,
2067  &HasWinCFI, EmitCFI, StackOffset::getFixed(NumBytes));
2068  if (HasWinCFI)
2069  BuildMI(MBB, MBB.getFirstTerminator(), DL,
2070  TII->get(AArch64::SEH_EpilogEnd))
2071  .setMIFlag(MachineInstr::FrameDestroy);
2072  return;
2073  }
2074 
2075  NumBytes -= PrologueSaveSize;
2076  assert(NumBytes >= 0 && "Negative stack allocation size!?");
2077 
2078  // Process the SVE callee-saves to determine what space needs to be
2079  // deallocated.
2080  StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
2081  MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
2082  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2083  RestoreBegin = std::prev(RestoreEnd);
2084  while (RestoreBegin != MBB.begin() &&
2085  IsSVECalleeSave(std::prev(RestoreBegin)))
2086  --RestoreBegin;
2087 
2088  assert(IsSVECalleeSave(RestoreBegin) &&
2089  IsSVECalleeSave(std::prev(RestoreEnd)) && "Unexpected instruction");
2090 
2091  StackOffset CalleeSavedSizeAsOffset =
2092  StackOffset::getScalable(CalleeSavedSize);
2093  DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
2094  DeallocateAfter = CalleeSavedSizeAsOffset;
2095  }
2096 
2097  // Deallocate the SVE area.
2098  if (SVEStackSize) {
2099  // If we have stack realignment or variable sized objects on the stack,
2100  // restore the stack pointer from the frame pointer prior to SVE CSR
2101  // restoration.
2102  if (AFI->isStackRealigned() || MFI.hasVarSizedObjects()) {
2103  if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2104  // Set SP to start of SVE callee-save area from which they can
2105  // be reloaded. The code below will deallocate the stack
2106  // space by moving FP -> SP.
2107  emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
2108  StackOffset::getScalable(-CalleeSavedSize), TII,
2109  MachineInstr::FrameDestroy);
2110  }
2111  } else {
2112  if (AFI->getSVECalleeSavedStackSize()) {
2113  // Deallocate the non-SVE locals first before we can deallocate (and
2114  // restore callee saves) from the SVE area.
2115  emitFrameOffset(
2116  MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2117  StackOffset::getFixed(NumBytes), TII, MachineInstr::FrameDestroy,
2118  false, false, nullptr, EmitCFI && !hasFP(MF),
2119  SVEStackSize + StackOffset::getFixed(NumBytes + PrologueSaveSize));
2120  NumBytes = 0;
2121  }
2122 
2123  emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2124  DeallocateBefore, TII, MachineInstr::FrameDestroy, false,
2125  false, nullptr, EmitCFI && !hasFP(MF),
2126  SVEStackSize +
2127  StackOffset::getFixed(NumBytes + PrologueSaveSize));
2128 
2129  emitFrameOffset(MBB, RestoreEnd, DL, AArch64::SP, AArch64::SP,
2130  DeallocateAfter, TII, MachineInstr::FrameDestroy, false,
2131  false, nullptr, EmitCFI && !hasFP(MF),
2132  DeallocateAfter +
2133  StackOffset::getFixed(NumBytes + PrologueSaveSize));
2134  }
2135  if (EmitCFI)
2136  emitCalleeSavedSVERestores(MBB, RestoreEnd);
2137  }
2138 
2139  if (!hasFP(MF)) {
2140  bool RedZone = canUseRedZone(MF);
2141  // If this was a redzone leaf function, we don't need to restore the
2142  // stack pointer (but we may need to pop stack args for fastcc).
2143  if (RedZone && AfterCSRPopSize == 0)
2144  return;
2145 
2146  // Pop the local variables off the stack. If there are no callee-saved
2147  // registers, it means we are actually positioned at the terminator and can
2148  // combine stack increment for the locals and the stack increment for
2149  // callee-popped arguments into (possibly) a single instruction and be done.
2150  bool NoCalleeSaveRestore = PrologueSaveSize == 0;
2151  int64_t StackRestoreBytes = RedZone ? 0 : NumBytes;
2152  if (NoCalleeSaveRestore)
2153  StackRestoreBytes += AfterCSRPopSize;
2154 
2155  emitFrameOffset(
2156  MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2157  StackOffset::getFixed(StackRestoreBytes), TII,
2158  MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2159  StackOffset::getFixed((RedZone ? 0 : NumBytes) + PrologueSaveSize));
2160 
2161  // If we were able to combine the local stack pop with the argument pop,
2162  // then we're done.
2163  if (NoCalleeSaveRestore || AfterCSRPopSize == 0) {
2164  if (HasWinCFI) {
2165  BuildMI(MBB, MBB.getFirstTerminator(), DL,
2166  TII->get(AArch64::SEH_EpilogEnd))
2167  .setMIFlag(MachineInstr::FrameDestroy);
2168  }
2169  return;
2170  }
2171 
2172  NumBytes = 0;
2173  }
2174 
2175  // Restore the original stack pointer.
2176  // FIXME: Rather than doing the math here, we should instead just use
2177  // non-post-indexed loads for the restores if we aren't actually going to
2178  // be able to save any instructions.
2179  if (!IsFunclet && (MFI.hasVarSizedObjects() || AFI->isStackRealigned())) {
2180  emitFrameOffset(
2181  MBB, LastPopI, DL, AArch64::SP, AArch64::FP,
2182  StackOffset::getFixed(-AFI->getCalleeSaveBaseToFrameRecordOffset()),
2183  TII, MachineInstr::FrameDestroy, false, NeedsWinCFI);
2184  } else if (NumBytes)
2185  emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2186  StackOffset::getFixed(NumBytes), TII,
2187  MachineInstr::FrameDestroy, false, NeedsWinCFI);
2188 
2189  // When we are about to restore the CSRs, the CFA register is SP again.
2190  if (EmitCFI && hasFP(MF)) {
2191  const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2192  unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2193  unsigned CFIIndex = MF.addFrameInst(
2194  MCCFIInstruction::cfiDefCfa(nullptr, Reg, PrologueSaveSize));
2195  BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2196  .addCFIIndex(CFIIndex)
2197  .setMIFlags(MachineInstr::FrameDestroy);
2198  }
2199 
2200  // This must be placed after the callee-save restore code because that code
2201  // assumes the SP is at the same location as it was after the callee-save save
2202  // code in the prologue.
2203  if (AfterCSRPopSize) {
2204  assert(AfterCSRPopSize > 0 && "attempting to reallocate arg stack that an "
2205  "interrupt may have clobbered");
2206 
2207  emitFrameOffset(
2208  MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2209  StackOffset::getFixed(AfterCSRPopSize), TII, MachineInstr::FrameDestroy,
2210  false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2211  StackOffset::getFixed(CombineAfterCSRBump ? PrologueSaveSize : 0));
2212  }
2213  if (HasWinCFI)
2214  BuildMI(MBB, MBB.getFirstTerminator(), DL, TII->get(AArch64::SEH_EpilogEnd))
2215  .setMIFlag(MachineInstr::FrameDestroy);
2216 }
2217 
2218 /// getFrameIndexReference - Provide a base+offset reference to an FI slot for
2219 /// debug info. It's the same as what we use for resolving the code-gen
2220 /// references for now. FIXME: This can go wrong when references are
2221 /// SP-relative and simple call frames aren't used.
2222 StackOffset
2223 AArch64FrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
2224  Register &FrameReg) const {
2225  return resolveFrameIndexReference(
2226  MF, FI, FrameReg,
2227  /*PreferFP=*/
2228  MF.getFunction().hasFnAttribute(Attribute::SanitizeHWAddress),
2229  /*ForSimm=*/false);
2230 }
2231 
2232 StackOffset
2233 AArch64FrameLowering::getNonLocalFrameIndexReference(const MachineFunction &MF,
2234  int FI) const {
2235  return StackOffset::getFixed(getSEHFrameIndexOffset(MF, FI));
2236 }
2237 
2238 static StackOffset getFPOffset(const MachineFunction &MF,
2239  int64_t ObjectOffset) {
2240  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2241  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2242  bool IsWin64 =
2243  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
2244  unsigned FixedObject =
2245  getFixedObjectSize(MF, AFI, IsWin64, /*IsFunclet=*/false);
2246  int64_t CalleeSaveSize = AFI->getCalleeSavedStackSize(MF.getFrameInfo());
2247  int64_t FPAdjust =
2248  CalleeSaveSize - AFI->getCalleeSaveBaseToFrameRecordOffset();
2249  return StackOffset::getFixed(ObjectOffset + FixedObject + FPAdjust);
2250 }
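 // Illustrative example (values assumed): with FixedObject = 16,
 // CalleeSaveSize = 64 and a frame-record offset of 16, FPAdjust = 48, so an
 // object at ObjectOffset 0 resolves to fp + 16 + 48 = fp + 64.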
2251 
2252 static StackOffset getStackOffset(const MachineFunction &MF,
2253  int64_t ObjectOffset) {
2254  const auto &MFI = MF.getFrameInfo();
2255  return StackOffset::getFixed(ObjectOffset + (int64_t)MFI.getStackSize());
2256 }
2257 
2258  // TODO: This function currently does not work for scalable vectors.
2259 int AArch64FrameLowering::getSEHFrameIndexOffset(const MachineFunction &MF,
2260  int FI) const {
2261  const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2262  MF.getSubtarget().getRegisterInfo());
2263  int ObjectOffset = MF.getFrameInfo().getObjectOffset(FI);
2264  return RegInfo->getLocalAddressRegister(MF) == AArch64::FP
2265  ? getFPOffset(MF, ObjectOffset).getFixed()
2266  : getStackOffset(MF, ObjectOffset).getFixed();
2267 }
2268 
2269 StackOffset AArch64FrameLowering::resolveFrameIndexReference(
2270  const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP,
2271  bool ForSimm) const {
2272  const auto &MFI = MF.getFrameInfo();
2273  int64_t ObjectOffset = MFI.getObjectOffset(FI);
2274  bool isFixed = MFI.isFixedObjectIndex(FI);
2275  bool isSVE = MFI.getStackID(FI) == TargetStackID::ScalableVector;
2276  return resolveFrameOffsetReference(MF, ObjectOffset, isFixed, isSVE, FrameReg,
2277  PreferFP, ForSimm);
2278 }
2279 
2280 StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
2281  const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE,
2282  Register &FrameReg, bool PreferFP, bool ForSimm) const {
2283  const auto &MFI = MF.getFrameInfo();
2284  const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2285  MF.getSubtarget().getRegisterInfo());
2286  const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2287  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2288 
2289  int64_t FPOffset = getFPOffset(MF, ObjectOffset).getFixed();
2290  int64_t Offset = getStackOffset(MF, ObjectOffset).getFixed();
2291  bool isCSR =
2292  !isFixed && ObjectOffset >= -((int)AFI->getCalleeSavedStackSize(MFI));
2293 
2294  const StackOffset &SVEStackSize = getSVEStackSize(MF);
2295 
2296  // Use frame pointer to reference fixed objects. Use it for locals if
2297  // there are VLAs or a dynamically realigned SP (and thus the SP isn't
2298  // reliable as a base). Make sure useFPForScavengingIndex() does the
2299  // right thing for the emergency spill slot.
2300  bool UseFP = false;
2301  if (AFI->hasStackFrame() && !isSVE) {
2302  // We shouldn't prefer using the FP to access fixed-sized stack objects when
2303  // there are scalable (SVE) objects in between the FP and the fixed-sized
2304  // objects.
2305  PreferFP &= !SVEStackSize;
2306 
2307  // Note: Keeping the following as multiple 'if' statements rather than
2308  // merging to a single expression for readability.
2309  //
2310  // Argument access should always use the FP.
2311  if (isFixed) {
2312  UseFP = hasFP(MF);
2313  } else if (isCSR && RegInfo->hasStackRealignment(MF)) {
2314  // References to the CSR area must use FP if we're re-aligning the stack
2315  // since the dynamically-sized alignment padding is between the SP/BP and
2316  // the CSR area.
2317  assert(hasFP(MF) && "Re-aligned stack must have frame pointer");
2318  UseFP = true;
2319  } else if (hasFP(MF) && !RegInfo->hasStackRealignment(MF)) {
2320  // If the FPOffset is negative and we're producing a signed immediate, we
2321  // have to keep in mind that the available offset range for negative
2322  // offsets is smaller than for positive ones. If an offset is available
2323  // via the FP and the SP, use whichever is closest.
2324  bool FPOffsetFits = !ForSimm || FPOffset >= -256;
2325  PreferFP |= Offset > -FPOffset && !SVEStackSize;
2326 
2327  if (MFI.hasVarSizedObjects()) {
2328  // If we have variable sized objects, we can use either FP or BP, as the
2329  // SP offset is unknown. We can use the base pointer if we have one and
2330  // FP is not preferred. If not, we're stuck with using FP.
2331  bool CanUseBP = RegInfo->hasBasePointer(MF);
2332  if (FPOffsetFits && CanUseBP) // Both are ok. Pick the best.
2333  UseFP = PreferFP;
2334  else if (!CanUseBP) // Can't use BP. Forced to use FP.
2335  UseFP = true;
2336  // else we can use BP and FP, but the offset from FP won't fit.
2337  // That will make us scavenge registers which we can probably avoid by
2338  // using BP. If it won't fit for BP either, we'll scavenge anyway.
2339  } else if (FPOffset >= 0) {
2340  // Use SP or FP, whichever gives us the best chance of the offset
2341  // being in range for direct access. If the FPOffset is positive,
2342  // that'll always be best, as the SP will be even further away.
2343  UseFP = true;
2344  } else if (MF.hasEHFunclets() && !RegInfo->hasBasePointer(MF)) {
2345  // Funclets access the locals contained in the parent's stack frame
2346  // via the frame pointer, so we have to use the FP in the parent
2347  // function.
2348  (void) Subtarget;
2349  assert(
2350  Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv()) &&
2351  "Funclets should only be present on Win64");
2352  UseFP = true;
2353  } else {
2354  // We have the choice between FP and (SP or BP).
2355  if (FPOffsetFits && PreferFP) // If FP is the best fit, use it.
2356  UseFP = true;
2357  }
2358  }
2359  }
2360 
2361  assert(
2362  ((isFixed || isCSR) || !RegInfo->hasStackRealignment(MF) || !UseFP) &&
2363  "In the presence of dynamic stack pointer realignment, "
2364  "non-argument/CSR objects cannot be accessed through the frame pointer");
2365 
2366  if (isSVE) {
2367  StackOffset FPOffset =
2368  StackOffset::get(-AFI->getCalleeSaveBaseToFrameRecordOffset(), ObjectOffset);
2369  StackOffset SPOffset =
2370  SVEStackSize +
2371  StackOffset::get(MFI.getStackSize() - AFI->getCalleeSavedStackSize(),
2372  ObjectOffset);
2373  // Always use the FP for SVE spills if available and beneficial.
2374  if (hasFP(MF) && (SPOffset.getFixed() ||
2375  FPOffset.getScalable() < SPOffset.getScalable() ||
2376  RegInfo->hasStackRealignment(MF))) {
2377  FrameReg = RegInfo->getFrameRegister(MF);
2378  return FPOffset;
2379  }
2380 
2381  FrameReg = RegInfo->hasBasePointer(MF) ? RegInfo->getBaseRegister()
2382  : (unsigned)AArch64::SP;
2383  return SPOffset;
2384  }
2385 
2386  StackOffset ScalableOffset = {};
2387  if (UseFP && !(isFixed || isCSR))
2388  ScalableOffset = -SVEStackSize;
2389  if (!UseFP && (isFixed || isCSR))
2390  ScalableOffset = SVEStackSize;
2391 
2392  if (UseFP) {
2393  FrameReg = RegInfo->getFrameRegister(MF);
2394  return StackOffset::getFixed(FPOffset) + ScalableOffset;
2395  }
2396 
2397  // Use the base pointer if we have one.
2398  if (RegInfo->hasBasePointer(MF))
2399  FrameReg = RegInfo->getBaseRegister();
2400  else {
2401  assert(!MFI.hasVarSizedObjects() &&
2402  "Can't use SP when we have var sized objects.");
2403  FrameReg = AArch64::SP;
2404  // If we're using the red zone for this function, the SP won't actually
2405  // be adjusted, so the offsets will be negative. They're also all
2406  // within range of the signed 9-bit immediate instructions.
2407  if (canUseRedZone(MF))
2408  Offset -= AFI->getLocalStackSize();
2409  }
2410 
2411  return StackOffset::getFixed(Offset) + ScalableOffset;
2412 }
2413 
2414 static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {
2415  // Do not set a kill flag on values that are also marked as live-in. This
2416  // happens with the @llvm.returnaddress intrinsic and with arguments passed in
2417  // callee saved registers.
2418  // Omitting the kill flags is conservatively correct even if the live-in
2419  // is not used after all.
2420  bool IsLiveIn = MF.getRegInfo().isLiveIn(Reg);
2421  return getKillRegState(!IsLiveIn);
2422 }
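 // Illustrative use (assumed scenario): if x20 is a callee save that also
 // carries an incoming argument (a live-in), getPrologueDeath(MF, AArch64::X20)
 // returns 0 instead of the kill flag, so the spill leaves the value live.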
2423 
2424 static bool produceCompactUnwindFrame(MachineFunction &MF) {
2425  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2426  AttributeList Attrs = MF.getFunction().getAttributes();
2427  return Subtarget.isTargetMachO() &&
2428  !(Subtarget.getTargetLowering()->supportSwiftError() &&
2429  Attrs.hasAttrSomewhere(Attribute::SwiftError)) &&
2430  MF.getFunction().getCallingConv() != CallingConv::SwiftTail;
2431 }
2432 
2433 static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2,
2434  bool NeedsWinCFI, bool IsFirst) {
2435  // If we are generating register pairs for a Windows function that requires
2436  // EH support, then pair consecutive registers only. There are no unwind
2437  // opcodes for saves/restores of non-consecutive register pairs.
2438  // The unwind opcodes are save_regp, save_regp_x, save_fregp, save_fregp_x,
2439  // save_lrpair.
2440  // https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling
2441 
2442  if (Reg2 == AArch64::FP)
2443  return true;
2444  if (!NeedsWinCFI)
2445  return false;
2446  if (Reg2 == Reg1 + 1)
2447  return false;
2448  // If pairing a GPR with LR, the pair can be described by the save_lrpair
2449  // opcode. If this is the first register pair, it would end up with a
2450  // predecrement, but there's no save_lrpair_x opcode, so we can only do this
2451  // if LR is paired with something other than the first register.
2452  // The save_lrpair opcode requires the first register to be an odd one.
2453  if (Reg1 >= AArch64::X19 && Reg1 <= AArch64::X27 &&
2454  (Reg1 - AArch64::X19) % 2 == 0 && Reg2 == AArch64::LR && !IsFirst)
2455  return false;
2456  return true;
2457 }
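 // Illustrative behaviour (assumed inputs, WinCFI enabled): (x19, x20) stays
 // paired (consecutive, save_regp); (x19, x22) is rejected; (x21, lr) is
 // allowed via save_lrpair because x21 is an odd-numbered GPR and the pair is
 // not the first one saved.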
2458 
2459 /// Returns true if Reg1 and Reg2 cannot be paired using a ldp/stp instruction.
2460 /// WindowsCFI requires that only consecutive registers can be paired.
2461 /// LR and FP need to be allocated together when the frame needs to save
2462 /// the frame-record. This means any other register pairing with LR is invalid.
2463 static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2,
2464  bool UsesWinAAPCS, bool NeedsWinCFI,
2465  bool NeedsFrameRecord, bool IsFirst) {
2466  if (UsesWinAAPCS)
2467  return invalidateWindowsRegisterPairing(Reg1, Reg2, NeedsWinCFI, IsFirst);
2468 
2469  // If we need to store the frame record, don't pair any register
2470  // with LR other than FP.
2471  if (NeedsFrameRecord)
2472  return Reg2 == AArch64::LR;
2473 
2474  return false;
2475 }
2476 
2477 namespace {
2478 
2479 struct RegPairInfo {
2480  unsigned Reg1 = AArch64::NoRegister;
2481  unsigned Reg2 = AArch64::NoRegister;
2482  int FrameIdx;
2483  int Offset;
2484  enum RegType { GPR, FPR64, FPR128, PPR, ZPR } Type;
2485 
2486  RegPairInfo() = default;
2487 
2488  bool isPaired() const { return Reg2 != AArch64::NoRegister; }
2489 
2490  unsigned getScale() const {
2491  switch (Type) {
2492  case PPR:
2493  return 2;
2494  case GPR:
2495  case FPR64:
2496  return 8;
2497  case ZPR:
2498  case FPR128:
2499  return 16;
2500  }
2501  llvm_unreachable("Unsupported type");
2502  }
2503 
2504  bool isScalable() const { return Type == PPR || Type == ZPR; }
2505 };
2506 
2507 } // end anonymous namespace
2508 
2509 static void computeCalleeSaveRegisterPairs(
2510  MachineFunction &MF, ArrayRef<CalleeSavedInfo> CSI,
2511  const TargetRegisterInfo *TRI, SmallVectorImpl<RegPairInfo> &RegPairs,
2512  bool NeedsFrameRecord) {
2513 
2514  if (CSI.empty())
2515  return;
2516 
2517  bool IsWindows = isTargetWindows(MF);
2518  bool NeedsWinCFI = needsWinCFI(MF);
2519  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
2520  MachineFrameInfo &MFI = MF.getFrameInfo();
2521  CallingConv::ID CC = MF.getFunction().getCallingConv();
2522  unsigned Count = CSI.size();
2523  (void)CC;
2524  // MachO's compact unwind format relies on all registers being stored in
2525  // pairs.
2526  assert((!produceCompactUnwindFrame(MF) || CC == CallingConv::PreserveMost ||
2527  CC == CallingConv::CXX_FAST_TLS || CC == CallingConv::Win64 ||
2528  (Count & 1) == 0) &&
2529  "Odd number of callee-saved regs to spill!");
2530  int ByteOffset = AFI->getCalleeSavedStackSize();
2531  int StackFillDir = -1;
2532  int RegInc = 1;
2533  unsigned FirstReg = 0;
2534  if (NeedsWinCFI) {
2535  // For WinCFI, fill the stack from the bottom up.
2536  ByteOffset = 0;
2537  StackFillDir = 1;
2538  // As the CSI array is reversed to match PrologEpilogInserter, iterate
2539  // backwards, to pair up registers starting from lower numbered registers.
2540  RegInc = -1;
2541  FirstReg = Count - 1;
2542  }
2543  int ScalableByteOffset = AFI->getSVECalleeSavedStackSize();
2544  bool NeedGapToAlignStack = AFI->hasCalleeSaveStackFreeSpace();
2545 
2546  // When iterating backwards, the loop condition relies on unsigned wraparound.
2547  for (unsigned i = FirstReg; i < Count; i += RegInc) {
2548  RegPairInfo RPI;
2549  RPI.Reg1 = CSI[i].getReg();
2550 
2551  if (AArch64::GPR64RegClass.contains(RPI.Reg1))
2552  RPI.Type = RegPairInfo::GPR;
2553  else if (AArch64::FPR64RegClass.contains(RPI.Reg1))
2554  RPI.Type = RegPairInfo::FPR64;
2555  else if (AArch64::FPR128RegClass.contains(RPI.Reg1))
2556  RPI.Type = RegPairInfo::FPR128;
2557  else if (AArch64::ZPRRegClass.contains(RPI.Reg1))
2558  RPI.Type = RegPairInfo::ZPR;
2559  else if (AArch64::PPRRegClass.contains(RPI.Reg1))
2560  RPI.Type = RegPairInfo::PPR;
2561  else
2562  llvm_unreachable("Unsupported register class.");
2563 
2564  // Add the next reg to the pair if it is in the same register class.
2565  if (unsigned(i + RegInc) < Count) {
2566  Register NextReg = CSI[i + RegInc].getReg();
2567  bool IsFirst = i == FirstReg;
2568  switch (RPI.Type) {
2569  case RegPairInfo::GPR:
2570  if (AArch64::GPR64RegClass.contains(NextReg) &&
2571  !invalidateRegisterPairing(RPI.Reg1, NextReg, IsWindows,
2572  NeedsWinCFI, NeedsFrameRecord, IsFirst))
2573  RPI.Reg2 = NextReg;
2574  break;
2575  case RegPairInfo::FPR64:
2576  if (AArch64::FPR64RegClass.contains(NextReg) &&
2577  !invalidateWindowsRegisterPairing(RPI.Reg1, NextReg, NeedsWinCFI,
2578  IsFirst))
2579  RPI.Reg2 = NextReg;
2580  break;
2581  case RegPairInfo::FPR128:
2582  if (AArch64::FPR128RegClass.contains(NextReg))
2583  RPI.Reg2 = NextReg;
2584  break;
2585  case RegPairInfo::PPR:
2586  case RegPairInfo::ZPR:
2587  break;
2588  }
2589  }
2590 
2591  // GPRs and FPRs are saved in pairs of 64-bit regs. We expect the CSI
2592  // list to come in sorted by frame index so that we can issue the store
2593  // pair instructions directly. Assert if we see anything otherwise.
2594  //
2595  // The order of the registers in the list is controlled by
2596  // getCalleeSavedRegs(), so they will always be in-order, as well.
2597  assert((!RPI.isPaired() ||
2598  (CSI[i].getFrameIdx() + RegInc == CSI[i + RegInc].getFrameIdx())) &&
2599  "Out of order callee saved regs!");
2600 
2601  assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP ||
2602  RPI.Reg1 == AArch64::LR) &&
2603  "FrameRecord must be allocated together with LR");
2604 
2605  // Windows AAPCS has FP and LR reversed.
2606  assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg1 != AArch64::FP ||
2607  RPI.Reg2 == AArch64::LR) &&
2608  "FrameRecord must be allocated together with LR");
2609 
2610  // MachO's compact unwind format relies on all registers being stored in
2611  // adjacent register pairs.
2612  assert((!produceCompactUnwindFrame(MF) || CC == CallingConv::PreserveMost ||
2613  CC == CallingConv::CXX_FAST_TLS || CC == CallingConv::Win64 ||
2614  (RPI.isPaired() &&
2615  ((RPI.Reg1 == AArch64::LR && RPI.Reg2 == AArch64::FP) ||
2616  RPI.Reg1 + 1 == RPI.Reg2))) &&
2617  "Callee-save registers not saved as adjacent register pair!");
2618 
2619  RPI.FrameIdx = CSI[i].getFrameIdx();
2620  if (NeedsWinCFI &&
2621  RPI.isPaired()) // RPI.FrameIdx must be the lower index of the pair
2622  RPI.FrameIdx = CSI[i + RegInc].getFrameIdx();
2623 
2624  int Scale = RPI.getScale();
2625 
2626  int OffsetPre = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2627  assert(OffsetPre % Scale == 0);
2628 
2629  if (RPI.isScalable())
2630  ScalableByteOffset += StackFillDir * Scale;
2631  else
2632  ByteOffset += StackFillDir * (RPI.isPaired() ? 2 * Scale : Scale);
2633 
2634  // Swift's async context is directly before FP, so allocate an extra
2635  // 8 bytes for it.
2636  if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2637  RPI.Reg2 == AArch64::FP)
2638  ByteOffset += StackFillDir * 8;
2639 
2640  assert(!(RPI.isScalable() && RPI.isPaired()) &&
2641  "Paired spill/fill instructions don't exist for SVE vectors");
2642 
2643  // Round up size of non-pair to pair size if we need to pad the
2644  // callee-save area to ensure 16-byte alignment.
2645  if (NeedGapToAlignStack && !NeedsWinCFI &&
2646  !RPI.isScalable() && RPI.Type != RegPairInfo::FPR128 &&
2647  !RPI.isPaired() && ByteOffset % 16 != 0) {
2648  ByteOffset += 8 * StackFillDir;
2649  assert(MFI.getObjectAlign(RPI.FrameIdx) <= Align(16));
2650  // A stack frame with a gap looks like this, bottom up:
2651  // d9, d8. x21, gap, x20, x19.
2652  // Set extra alignment on the x21 object to create the gap above it.
2653  MFI.setObjectAlignment(RPI.FrameIdx, Align(16));
2654  NeedGapToAlignStack = false;
2655  }
2656 
2657  int OffsetPost = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2658  assert(OffsetPost % Scale == 0);
2659  // If filling top down (default), we want the offset after incrementing it.
2660  // If filling bottom up (WinCFI) we need the original offset.
2661  int Offset = NeedsWinCFI ? OffsetPre : OffsetPost;
2662 
2663  // The FP, LR pair goes 8 bytes into our expanded 24-byte slot so that the
2664  // Swift context can directly precede FP.
2665  if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2666  RPI.Reg2 == AArch64::FP)
2667  Offset += 8;
2668  RPI.Offset = Offset / Scale;
2669 
2670  assert(((!RPI.isScalable() && RPI.Offset >= -64 && RPI.Offset <= 63) ||
2671  (RPI.isScalable() && RPI.Offset >= -256 && RPI.Offset <= 255)) &&
2672  "Offset out of bounds for LDP/STP immediate");
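    // Illustrative check (numbers assumed): a paired GPR save at byte offset
    // 496 yields RPI.Offset = 496 / 8 = 62, inside the signed 7-bit scaled
    // immediate range [-64, 63] that STP/LDP can encode.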
2673 
2674  // Save the offset to frame record so that the FP register can point to the
2675  // innermost frame record (spilled FP and LR registers).
2676  if (NeedsFrameRecord && ((!IsWindows && RPI.Reg1 == AArch64::LR &&
2677  RPI.Reg2 == AArch64::FP) ||
2678  (IsWindows && RPI.Reg1 == AArch64::FP &&
2679  RPI.Reg2 == AArch64::LR)))
2680  AFI->setCalleeSaveBaseToFrameRecordOffset(Offset);
2681 
2682  RegPairs.push_back(RPI);
2683  if (RPI.isPaired())
2684  i += RegInc;
2685  }
2686  if (NeedsWinCFI) {
2687  // If we need an alignment gap in the stack, align the topmost stack
2688  // object. A stack frame with a gap looks like this, bottom up:
2689  // x19, d8. d9, gap.
2690  // Set extra alignment on the topmost stack object (the first element in
2691  // CSI, which goes top down), to create the gap above it.
2692  if (AFI->hasCalleeSaveStackFreeSpace())
2693  MFI.setObjectAlignment(CSI[0].getFrameIdx(), Align(16));
2694  // We iterated bottom up over the registers; flip RegPairs back to top
2695  // down order.
2696  std::reverse(RegPairs.begin(), RegPairs.end());
2697  }
2698 }
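 // Illustrative outcome (assumed input): for CSI = {x19, x20, d8, d9} with no
 // frame record needed, the loop above yields two RegPairInfo entries,
 // (x19, x20) of type GPR and (d8, d9) of type FPR64, each spillable with a
 // single STP.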
2699 
2700 bool AArch64FrameLowering::spillCalleeSavedRegisters(
2701  MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
2702  ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
2703  MachineFunction &MF = *MBB.getParent();
2704  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
2705  bool NeedsWinCFI = needsWinCFI(MF);
2706  DebugLoc DL;
2707  SmallVector<RegPairInfo, 8> RegPairs;
2708 
2709  computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
2710 
2711  const MachineRegisterInfo &MRI = MF.getRegInfo();
2712  if (homogeneousPrologEpilog(MF)) {
2713  auto MIB = BuildMI(MBB, MI, DL, TII.get(AArch64::HOM_Prolog))
2714  .setMIFlag(MachineInstr::FrameSetup);
2715 
2716  for (auto &RPI : RegPairs) {
2717  MIB.addReg(RPI.Reg1);
2718  MIB.addReg(RPI.Reg2);
2719 
2720  // Update register live in.
2721  if (!MRI.isReserved(RPI.Reg1))
2722  MBB.addLiveIn(RPI.Reg1);
2723  if (!MRI.isReserved(RPI.Reg2))
2724  MBB.addLiveIn(RPI.Reg2);
2725  }
2726  return true;
2727  }
2728  for (const RegPairInfo &RPI : llvm::reverse(RegPairs)) {
2729  unsigned Reg1 = RPI.Reg1;
2730  unsigned Reg2 = RPI.Reg2;
2731  unsigned StrOpc;
2732 
2733  // Issue sequence of spills for cs regs. The first spill may be converted
2734  // to a pre-decrement store later by emitPrologue if the callee-save stack
2735  // area allocation can't be combined with the local stack area allocation.
2736  // For example:
2737  // stp x22, x21, [sp, #0] // addImm(+0)
2738  // stp x20, x19, [sp, #16] // addImm(+2)
2739  // stp fp, lr, [sp, #32] // addImm(+4)
2740  // Rationale: This sequence saves uop updates compared to a sequence of
2741  // pre-increment spills like stp xi,xj,[sp,#-16]!
2742  // Note: Similar rationale and sequence for restores in epilog.
2743  unsigned Size;
2744  Align Alignment;
2745  switch (RPI.Type) {
2746  case RegPairInfo::GPR:
2747  StrOpc = RPI.isPaired() ? AArch64::STPXi : AArch64::STRXui;
2748  Size = 8;
2749  Alignment = Align(8);
2750  break;
2751  case RegPairInfo::FPR64:
2752  StrOpc = RPI.isPaired() ? AArch64::STPDi : AArch64::STRDui;
2753  Size = 8;
2754  Alignment = Align(8);
2755  break;
2756  case RegPairInfo::FPR128:
2757  StrOpc = RPI.isPaired() ? AArch64::STPQi : AArch64::STRQui;
2758  Size = 16;
2759  Alignment = Align(16);
2760  break;
2761  case RegPairInfo::ZPR:
2762  StrOpc = AArch64::STR_ZXI;
2763  Size = 16;
2764  Alignment = Align(16);
2765  break;
2766  case RegPairInfo::PPR:
2767  StrOpc = AArch64::STR_PXI;
2768  Size = 2;
2769  Alignment = Align(2);
2770  break;
2771  }
2772  LLVM_DEBUG(dbgs() << "CSR spill: (" << printReg(Reg1, TRI);
2773  if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
2774  dbgs() << ") -> fi#(" << RPI.FrameIdx;
2775  if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
2776  dbgs() << ")\n");
2777 
2778  assert((!NeedsWinCFI || !(Reg1 == AArch64::LR && Reg2 == AArch64::FP)) &&
2779  "Windows unwinding requires a consecutive (FP,LR) pair");
2780  // Windows unwind codes require consecutive registers if registers are
2781  // paired. Make the switch here, so that the code below will save (x,x+1)
2782  // and not (x+1,x).
2783  unsigned FrameIdxReg1 = RPI.FrameIdx;
2784  unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
2785  if (NeedsWinCFI && RPI.isPaired()) {
2786  std::swap(Reg1, Reg2);
2787  std::swap(FrameIdxReg1, FrameIdxReg2);
2788  }
2789  MachineInstrBuilder MIB = BuildMI(MBB, MI, DL, TII.get(StrOpc));
2790  if (!MRI.isReserved(Reg1))
2791  MBB.addLiveIn(Reg1);
2792  if (RPI.isPaired()) {
2793  if (!MRI.isReserved(Reg2))
2794  MBB.addLiveIn(Reg2);
2795  MIB.addReg(Reg2, getPrologueDeath(MF, Reg2));
2796  MIB.addMemOperand(MF.getMachineMemOperand(
2797  MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
2798  MachineMemOperand::MOStore, Size, Alignment));
2799  }
2800  MIB.addReg(Reg1, getPrologueDeath(MF, Reg1))
2801  .addReg(AArch64::SP)
2802  .addImm(RPI.Offset) // [sp, #offset*scale],
2803  // where factor*scale is implicit
2804  .setMIFlag(MachineInstr::FrameSetup);
2805  MIB.addMemOperand(MF.getMachineMemOperand(
2806  MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
2807  MachineMemOperand::MOStore, Size, Alignment));
2808  if (NeedsWinCFI)
2809  InsertSEH(MIB, TII, MachineInstr::FrameSetup);
2810 
2811  // Update the StackIDs of the SVE stack slots.
2812  MachineFrameInfo &MFI = MF.getFrameInfo();
2813  if (RPI.Type == RegPairInfo::ZPR || RPI.Type == RegPairInfo::PPR)
2814  MFI.setStackID(RPI.FrameIdx, TargetStackID::ScalableVector);
2815 
2816  }
2817  return true;
2818 }
2819 
2820 bool AArch64FrameLowering::restoreCalleeSavedRegisters(
2821  MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
2822  MutableArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
2823  MachineFunction &MF = *MBB.getParent();
2824  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
2825  DebugLoc DL;
2826  SmallVector<RegPairInfo, 8> RegPairs;
2827  bool NeedsWinCFI = needsWinCFI(MF);
2828 
2829  if (MBBI != MBB.end())
2830  DL = MBBI->getDebugLoc();
2831 
2832  computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
2833 
2834  auto EmitMI = [&](const RegPairInfo &RPI) -> MachineBasicBlock::iterator {
2835  unsigned Reg1 = RPI.Reg1;
2836  unsigned Reg2 = RPI.Reg2;
2837 
2838  // Issue sequence of restores for cs regs. The last restore may be converted
2839  // to a post-increment load later by emitEpilogue if the callee-save stack
2840  // area allocation can't be combined with the local stack area allocation.
2841  // For example:
2842  // ldp fp, lr, [sp, #32] // addImm(+4)
2843  // ldp x20, x19, [sp, #16] // addImm(+2)
2844  // ldp x22, x21, [sp, #0] // addImm(+0)
2845  // Note: see comment in spillCalleeSavedRegisters()
2846  unsigned LdrOpc;
2847  unsigned Size;
2848  Align Alignment;
2849  switch (RPI.Type) {
2850  case RegPairInfo::GPR:
2851  LdrOpc = RPI.isPaired() ? AArch64::LDPXi : AArch64::LDRXui;
2852  Size = 8;
2853  Alignment = Align(8);
2854  break;
2855  case RegPairInfo::FPR64:
2856  LdrOpc = RPI.isPaired() ? AArch64::LDPDi : AArch64::LDRDui;
2857  Size = 8;
2858  Alignment = Align(8);
2859  break;
2860  case RegPairInfo::FPR128:
2861  LdrOpc = RPI.isPaired() ? AArch64::LDPQi : AArch64::LDRQui;
2862  Size = 16;
2863  Alignment = Align(16);
2864  break;
2865  case RegPairInfo::ZPR:
2866  LdrOpc = AArch64::LDR_ZXI;
2867  Size = 16;
2868  Alignment = Align(16);
2869  break;
2870  case RegPairInfo::PPR:
2871  LdrOpc = AArch64::LDR_PXI;
2872  Size = 2;
2873  Alignment = Align(2);
2874  break;
2875  }
2876  LLVM_DEBUG(dbgs() << "CSR restore: (" << printReg(Reg1, TRI);
2877  if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
2878  dbgs() << ") -> fi#(" << RPI.FrameIdx;
2879  if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
2880  dbgs() << ")\n");
2881 
2882  // Windows unwind codes require consecutive registers if registers are
2883  // paired. Make the switch here, so that the code below will save (x,x+1)
2884  // and not (x+1,x).
2885  unsigned FrameIdxReg1 = RPI.FrameIdx;
2886  unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
2887  if (NeedsWinCFI && RPI.isPaired()) {
2888  std::swap(Reg1, Reg2);
2889  std::swap(FrameIdxReg1, FrameIdxReg2);
2890  }
2891  MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII.get(LdrOpc));
2892  if (RPI.isPaired()) {
2893  MIB.addReg(Reg2, getDefRegState(true));
2894  MIB.addMemOperand(MF.getMachineMemOperand(
2895  MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
2896  MachineMemOperand::MOLoad, Size, Alignment));
2897  }
2898  MIB.addReg(Reg1, getDefRegState(true))
2899  .addReg(AArch64::SP)
2900  .addImm(RPI.Offset) // [sp, #offset*scale]
2901  // where factor*scale is implicit
2902  .setMIFlag(MachineInstr::FrameDestroy);
2903  MIB.addMemOperand(MF.getMachineMemOperand(
2904  MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
2905  MachineMemOperand::MOLoad, Size, Alignment));
2906  if (NeedsWinCFI)
2907  InsertSEH(MIB, TII, MachineInstr::FrameDestroy);
2908 
2909  return MIB->getIterator();
2910  };
2911 
2912  // SVE objects are always restored in reverse order.
2913  for (const RegPairInfo &RPI : reverse(RegPairs))
2914  if (RPI.isScalable())
2915  EmitMI(RPI);
2916 
2917  if (homogeneousPrologEpilog(MF, &MBB)) {
2918  auto MIB = BuildMI(MBB, MBBI, DL, TII.get(AArch64::HOM_Epilog))
2919  .setMIFlag(MachineInstr::FrameDestroy);
2920  for (auto &RPI : RegPairs) {
2921  MIB.addReg(RPI.Reg1, RegState::Define);
2922  MIB.addReg(RPI.Reg2, RegState::Define);
2923  }
2924  return true;
2925  }
2926 
2927  if (ReverseCSRRestoreSeq) {
2928  MachineBasicBlock::iterator First = MBB.end();
2929  for (const RegPairInfo &RPI : reverse(RegPairs)) {
2930  if (RPI.isScalable())
2931  continue;
2932  MachineBasicBlock::iterator It = EmitMI(RPI);
2933  if (First == MBB.end())
2934  First = It;
2935  }
2936  if (First != MBB.end())
2937  MBB.splice(MBBI, &MBB, First);
2938  } else {
2939  for (const RegPairInfo &RPI : RegPairs) {
2940  if (RPI.isScalable())
2941  continue;
2942  (void)EmitMI(RPI);
2943  }
2944  }
2945 
2946  return true;
2947 }
2948 
2949 void AArch64FrameLowering::determineCalleeSaves(MachineFunction &MF,
2950  BitVector &SavedRegs,
2951  RegScavenger *RS) const {
2952  // All calls are tail calls in GHC calling conv, and functions have no
2953  // prologue/epilogue.
2954  if (MF.getFunction().getCallingConv() == CallingConv::GHC)
2955  return;
2956 
2957  TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS);
2958  const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(
2959  MF.getSubtarget().getRegisterInfo());
2960  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2961  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
2962  unsigned UnspilledCSGPR = AArch64::NoRegister;
2963  unsigned UnspilledCSGPRPaired = AArch64::NoRegister;
2964 
2965  MachineFrameInfo &MFI = MF.getFrameInfo();
2966  const MCPhysReg *CSRegs = MF.getRegInfo().getCalleeSavedRegs();
2967 
2968  unsigned BasePointerReg = RegInfo->hasBasePointer(MF)
2969  ? RegInfo->getBaseRegister()
2970  : (unsigned)AArch64::NoRegister;
2971 
2972  unsigned ExtraCSSpill = 0;
2973  // Figure out which callee-saved registers to save/restore.
2974  for (unsigned i = 0; CSRegs[i]; ++i) {
2975  const unsigned Reg = CSRegs[i];
2976 
2977  // Add the base pointer register to SavedRegs if it is callee-save.
2978  if (Reg == BasePointerReg)
2979  SavedRegs.set(Reg);
2980 
2981  bool RegUsed = SavedRegs.test(Reg);
2982  unsigned PairedReg = AArch64::NoRegister;
2983  if (AArch64::GPR64RegClass.contains(Reg) ||
2984  AArch64::FPR64RegClass.contains(Reg) ||
2985  AArch64::FPR128RegClass.contains(Reg))
2986  PairedReg = CSRegs[i ^ 1];
2987 
2988  if (!RegUsed) {
2989  if (AArch64::GPR64RegClass.contains(Reg) &&
2990  !RegInfo->isReservedReg(MF, Reg)) {
2991  UnspilledCSGPR = Reg;
2992  UnspilledCSGPRPaired = PairedReg;
2993  }
2994  continue;
2995  }
2996 
2997  // MachO's compact unwind format relies on all registers being stored in
2998  // pairs.
2999  // FIXME: the usual format is actually better if unwinding isn't needed.
3000  if (producePairRegisters(MF) && PairedReg != AArch64::NoRegister &&
3001  !SavedRegs.test(PairedReg)) {
3002  SavedRegs.set(PairedReg);
3003  if (AArch64::GPR64RegClass.contains(PairedReg) &&
3004  !RegInfo->isReservedReg(MF, PairedReg))
3005  ExtraCSSpill = PairedReg;
3006  }
3007  }
3008 
3009  if (MF.getFunction().getCallingConv() == CallingConv::Win64 &&
3010  !Subtarget.isTargetWindows()) {
3011  // For Windows calling convention on a non-windows OS, where X18 is treated
3012  // as reserved, back up X18 when entering non-windows code (marked with the
3013  // Windows calling convention) and restore when returning regardless of
3014  // whether the individual function uses it - it might call other functions
3015  // that clobber it.
3016  SavedRegs.set(AArch64::X18);
3017  }
3018 
3019  // Calculates the callee saved stack size.
3020  unsigned CSStackSize = 0;
3021  unsigned SVECSStackSize = 0;
3022  const TargetRegisterInfo *TRI = Subtarget.getRegisterInfo();
3023  const MachineRegisterInfo &MRI = MF.getRegInfo();
3024  for (unsigned Reg : SavedRegs.set_bits()) {
3025  auto RegSize = TRI->getRegSizeInBits(Reg, MRI) / 8;
3026  if (AArch64::PPRRegClass.contains(Reg) ||
3027  AArch64::ZPRRegClass.contains(Reg))
3028  SVECSStackSize += RegSize;
3029  else
3030  CSStackSize += RegSize;
3031  }
3032 
3033  // Save number of saved regs, so we can easily update CSStackSize later.
3034  unsigned NumSavedRegs = SavedRegs.count();
3035 
3036  // The frame record needs to be created by saving the appropriate registers
3037  uint64_t EstimatedStackSize = MFI.estimateStackSize(MF);
3038  if (hasFP(MF) ||
3039  windowsRequiresStackProbe(MF, EstimatedStackSize + CSStackSize + 16)) {
3040  SavedRegs.set(AArch64::FP);
3041  SavedRegs.set(AArch64::LR);
3042  }
3043 
3044  LLVM_DEBUG(dbgs() << "*** determineCalleeSaves\nSaved CSRs:";
3045  for (unsigned Reg
3046  : SavedRegs.set_bits()) dbgs()
3047  << ' ' << printReg(Reg, RegInfo);
3048  dbgs() << "\n";);
3049 
3050  // If any callee-saved registers are used, the frame cannot be eliminated.
3051  int64_t SVEStackSize =
3052  alignTo(SVECSStackSize + estimateSVEStackObjectOffsets(MFI), 16);
3053  bool CanEliminateFrame = (SavedRegs.count() == 0) && !SVEStackSize;
3054 
3055  // The CSR spill slots have not been allocated yet, so estimateStackSize
3056  // won't include them.
3057  unsigned EstimatedStackSizeLimit = estimateRSStackSizeLimit(MF);
3058 
3059  // Conservatively always assume BigStack when there are SVE spills.
3060  bool BigStack = SVEStackSize ||
3061  (EstimatedStackSize + CSStackSize) > EstimatedStackSizeLimit;
3062  if (BigStack || !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF))
3063  AFI->setHasStackFrame(true);
3064 
3065  // Estimate if we might need to scavenge a register at some point in order
3066  // to materialize a stack offset. If so, either spill one additional
3067  // callee-saved register or reserve a special spill slot to facilitate
3068  // register scavenging. If we already spilled an extra callee-saved register
3069  // above to keep the number of spills even, we don't need to do anything else
3070  // here.
3071  if (BigStack) {
3072  if (!ExtraCSSpill && UnspilledCSGPR != AArch64::NoRegister) {
3073  LLVM_DEBUG(dbgs() << "Spilling " << printReg(UnspilledCSGPR, RegInfo)
3074  << " to get a scratch register.\n");
3075  SavedRegs.set(UnspilledCSGPR);
3076  // MachO's compact unwind format relies on all registers being stored in
3077  // pairs, so if we need to spill one extra for BigStack, then we need to
3078  // store the pair.
3079  if (producePairRegisters(MF))
3080  SavedRegs.set(UnspilledCSGPRPaired);
3081  ExtraCSSpill = UnspilledCSGPR;
3082  }
3083 
3084  // If we didn't find an extra callee-saved register to spill, create
3085  // an emergency spill slot.
3086  if (!ExtraCSSpill || MF.getRegInfo().isPhysRegUsed(ExtraCSSpill)) {
3087  const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
3088  const TargetRegisterClass &RC = AArch64::GPR64RegClass;
3089  unsigned Size = TRI->getSpillSize(RC);
3090  Align Alignment = TRI->getSpillAlign(RC);
3091  int FI = MFI.CreateStackObject(Size, Alignment, false);
3092  RS->addScavengingFrameIndex(FI);
3093  LLVM_DEBUG(dbgs() << "No available CS registers, allocated fi#" << FI
3094  << " as the emergency spill slot.\n");
3095  }
3096  }
3097 
3098  // Add the size of any additional 64-bit GPR saves.
3099  CSStackSize += 8 * (SavedRegs.count() - NumSavedRegs);
3100 
3101  // A Swift asynchronous context extends the frame record with a pointer
3102  // directly before FP.
3103  if (hasFP(MF) && AFI->hasSwiftAsyncContext())
3104  CSStackSize += 8;
3105 
3106  uint64_t AlignedCSStackSize = alignTo(CSStackSize, 16);
3107  LLVM_DEBUG(dbgs() << "Estimated stack frame size: "
3108  << EstimatedStackSize + AlignedCSStackSize
3109  << " bytes.\n");
3110 
3111  assert((!MFI.isCalleeSavedInfoValid() ||
3112  AFI->getCalleeSavedStackSize() == AlignedCSStackSize) &&
3113  "Should not invalidate callee saved info");
3114 
3115  // Round up to register pair alignment to avoid additional SP adjustment
3116  // instructions.
3117  AFI->setCalleeSavedStackSize(AlignedCSStackSize);
3118  AFI->setCalleeSaveStackHasFreeSpace(AlignedCSStackSize != CSStackSize);
3119  AFI->setSVECalleeSavedStackSize(alignTo(SVECSStackSize, 16));
3120 }
3121 
3122 bool AArch64FrameLowering::assignCalleeSavedSpillSlots(
3123  MachineFunction &MF, const TargetRegisterInfo *RegInfo,
3124  std::vector<CalleeSavedInfo> &CSI, unsigned &MinCSFrameIndex,
3125  unsigned &MaxCSFrameIndex) const {
3126  bool NeedsWinCFI = needsWinCFI(MF);
3127  // To match the canonical windows frame layout, reverse the list of
3128  // callee saved registers to get them laid out by PrologEpilogInserter
3129  // in the right order. (PrologEpilogInserter allocates stack objects top
3130  // down. Windows canonical prologs store higher numbered registers at
3131  // the top, thus have the CSI array start from the highest registers.)
3132  if (NeedsWinCFI)
3133  std::reverse(CSI.begin(), CSI.end());
3134 
3135  if (CSI.empty())
3136  return true; // Early exit if no callee saved registers are modified!
3137 
3138  // Now that we know which registers need to be saved and restored, allocate
3139  // stack slots for them.
3140  MachineFrameInfo &MFI = MF.getFrameInfo();
3141  auto *AFI = MF.getInfo<AArch64FunctionInfo>();
3142 
3143  bool UsesWinAAPCS = isTargetWindows(MF);
3144  if (UsesWinAAPCS && hasFP(MF) && AFI->hasSwiftAsyncContext()) {
3145  int FrameIdx = MFI.CreateStackObject(8, Align(16), true);
3146  AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3147  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3148  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3149  }
3150 
3151  for (auto &CS : CSI) {
3152  Register Reg = CS.getReg();
3153  const TargetRegisterClass *RC = RegInfo->getMinimalPhysRegClass(Reg);
3154 
3155  unsigned Size = RegInfo->getSpillSize(*RC);
3156  Align Alignment(RegInfo->getSpillAlign(*RC));
3157  int FrameIdx = MFI.CreateStackObject(Size, Alignment, true);
3158  CS.setFrameIdx(FrameIdx);
3159 
3160  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3161  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3162 
3163  // Grab 8 bytes below FP for the extended asynchronous frame info.
3164  if (hasFP(MF) && AFI->hasSwiftAsyncContext() && !UsesWinAAPCS &&
3165  Reg == AArch64::FP) {
3166  FrameIdx = MFI.CreateStackObject(8, Alignment, true);
3167  AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3168  if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3169  if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3170  }
3171  }
3172  return true;
3173 }
3174 
3175 bool AArch64FrameLowering::enableStackSlotScavenging(
3176  const MachineFunction &MF) const {
3177  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
3178  return AFI->hasCalleeSaveStackFreeSpace();
3179 }
3180 
3181 /// returns true if there are any SVE callee saves.
3182 static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI,
3183  int &Min, int &Max) {
3184  Min = std::numeric_limits<int>::max();
3185  Max = std::numeric_limits<int>::min();
3186 
3187  if (!MFI.isCalleeSavedInfoValid())
3188  return false;
3189 
3190  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
3191  for (auto &CS : CSI) {
3192  if (AArch64::ZPRRegClass.contains(CS.getReg()) ||
3193  AArch64::PPRRegClass.contains(CS.getReg())) {
3194  assert((Max == std::numeric_limits<int>::min() ||
3195  Max + 1 == CS.getFrameIdx()) &&
3196  "SVE CalleeSaves are not consecutive");
3197 
3198  Min = std::min(Min, CS.getFrameIdx());
3199  Max = std::max(Max, CS.getFrameIdx());
3200  }
3201  }
3202  return Min != std::numeric_limits<int>::max();
3203 }
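// [Editor's note] The helper above relies on SVE callee-save slots having been
// created back-to-back, so their frame indices form one consecutive run. A
// standalone sketch (not LLVM code, illustrative names) of the same scan over
// a plain array:
#if 0
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Returns false when there are no SVE callee saves.
static bool findSlotRange(const std::vector<int> &SVEFrameIndices, int &Min,
                          int &Max) {
  Min = std::numeric_limits<int>::max();
  Max = std::numeric_limits<int>::min();
  for (int FI : SVEFrameIndices) {
    assert((Max == std::numeric_limits<int>::min() || Max + 1 == FI) &&
           "slots are expected to be consecutive");
    Min = std::min(Min, FI);
    Max = std::max(Max, FI);
  }
  return Min != std::numeric_limits<int>::max();
}
#endif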
3204 
3205 // Process all the SVE stack objects and determine offsets for each
3206 // object. If AssignOffsets is true, the offsets get assigned.
3207 // Fills in the first and last callee-saved frame indices into
3208 // Min/MaxCSFrameIndex, respectively.
3209 // Returns the size of the stack.
3210 static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI,
3211  int &MinCSFrameIndex,
3212  int &MaxCSFrameIndex,
3213  bool AssignOffsets) {
3214 #ifndef NDEBUG
3215  // First process all fixed stack objects.
3216  for (int I = MFI.getObjectIndexBegin(); I != 0; ++I)
3218  "SVE vectors should never be passed on the stack by value, only by "
3219  "reference.");
3220 #endif
3221 
3222  auto Assign = [&MFI](int FI, int64_t Offset) {
3223  LLVM_DEBUG(dbgs() << "alloc FI(" << FI << ") at SP[" << Offset << "]\n");
3224  MFI.setObjectOffset(FI, Offset);
3225  };
3226 
3227  int64_t Offset = 0;
3228 
3229  // Then process all callee saved slots.
3230  if (getSVECalleeSaveSlotRange(MFI, MinCSFrameIndex, MaxCSFrameIndex)) {
3231  // Assign offsets to the callee save slots.
3232  for (int I = MinCSFrameIndex; I <= MaxCSFrameIndex; ++I) {
3233  Offset += MFI.getObjectSize(I);
3234  Offset = alignTo(Offset, MFI.getObjectAlign(I));
3235  if (AssignOffsets)
3236  Assign(I, -Offset);
3237  }
3238  }
3239 
3240  // Ensure that the callee-save area is aligned to 16 bytes.
3241  Offset = alignTo(Offset, Align(16U));
3242 
3243  // Create a buffer of SVE objects to allocate and sort it.
3244  SmallVector<int, 8> ObjectsToAllocate;
3245  // If we have a stack protector, and we've previously decided that we have SVE
3246  // objects on the stack and thus need it to go in the SVE stack area, then it
3247  // needs to go first.
3248  int StackProtectorFI = -1;
3249  if (MFI.hasStackProtectorIndex()) {
3250  StackProtectorFI = MFI.getStackProtectorIndex();
3251  if (MFI.getStackID(StackProtectorFI) == TargetStackID::ScalableVector)
3252  ObjectsToAllocate.push_back(StackProtectorFI);
3253  }
3254  for (int I = 0, E = MFI.getObjectIndexEnd(); I != E; ++I) {
3255  unsigned StackID = MFI.getStackID(I);
3256  if (StackID != TargetStackID::ScalableVector)
3257  continue;
3258  if (I == StackProtectorFI)
3259  continue;
3260  if (MaxCSFrameIndex >= I && I >= MinCSFrameIndex)
3261  continue;
3262  if (MFI.isDeadObjectIndex(I))
3263  continue;
3264 
3265  ObjectsToAllocate.push_back(I);
3266  }
3267 
3268  // Allocate all SVE locals and spills
3269  for (unsigned FI : ObjectsToAllocate) {
3270  Align Alignment = MFI.getObjectAlign(FI);
3271  // FIXME: Given that the length of SVE vectors is not necessarily a power of
3272  // two, we'd need to align every object dynamically at runtime if the
3273  // alignment is larger than 16. This is not yet supported.
3274  if (Alignment > Align(16))
3276  "Alignment of scalable vectors > 16 bytes is not yet supported");
3277 
3278  Offset = alignTo(Offset + MFI.getObjectSize(FI), Alignment);
3279  if (AssignOffsets)
3280  Assign(FI, -Offset);
3281  }
3282 
3283  return Offset;
3284 }
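// [Editor's note] Offsets above are assigned downward from the top of the SVE
// area, so each object lands at a negative offset that already covers its own
// size (these byte counts are "scalable": multiplied by the runtime vector
// length when materialized). A worked example with three locals, all 16-byte
// aligned (illustrative):
//   Offset = 0
//   FI#0 (16 bytes): Offset = align(0 + 16, 16)  = 16  -> assigned SP[-16]
//   FI#1 (16 bytes): Offset = align(16 + 16, 16) = 32  -> assigned SP[-32]
//   FI#2 (32 bytes): Offset = align(32 + 32, 16) = 64  -> assigned SP[-64]
//   returned SVE stack size = 64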
3285 
3286 int64_t AArch64FrameLowering::estimateSVEStackObjectOffsets(
3287  MachineFrameInfo &MFI) const {
3288  int MinCSFrameIndex, MaxCSFrameIndex;
3289  return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex, false);
3290 }
3291 
3292 int64_t AArch64FrameLowering::assignSVEStackObjectOffsets(
3293  MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex) const {
3294  return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex,
3295  true);
3296 }
3297 
3298 void AArch64FrameLowering::processFunctionBeforeFrameFinalized(
3299  MachineFunction &MF, RegScavenger *RS) const {
3300  MachineFrameInfo &MFI = MF.getFrameInfo();
3301 
3303  "Upwards growing stack unsupported");
3304 
3305  int MinCSFrameIndex, MaxCSFrameIndex;
3306  int64_t SVEStackSize =
3307  assignSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex);
3308 
3310  AFI->setStackSizeSVE(alignTo(SVEStackSize, 16U));
3311  AFI->setMinMaxSVECSFrameIndex(MinCSFrameIndex, MaxCSFrameIndex);
3312 
3313  // If this function isn't doing Win64-style C++ EH, we don't need to do
3314  // anything.
3315  if (!MF.hasEHFunclets())
3316  return;
3317  const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
3318  WinEHFuncInfo &EHInfo = *MF.getWinEHFuncInfo();
3319 
3320  MachineBasicBlock &MBB = MF.front();
3321  auto MBBI = MBB.begin();
3322  while (MBBI != MBB.end() && MBBI->getFlag(MachineInstr::FrameSetup))
3323  ++MBBI;
3324 
3325  // Create an UnwindHelp object.
3326  // The UnwindHelp object is allocated at the start of the fixed object area.
3327  int64_t FixedObject =
3328  getFixedObjectSize(MF, AFI, /*IsWin64*/ true, /*IsFunclet*/ false);
3329  int UnwindHelpFI = MFI.CreateFixedObject(/*Size*/ 8,
3330  /*SPOffset*/ -FixedObject,
3331  /*IsImmutable=*/false);
3332  EHInfo.UnwindHelpFrameIdx = UnwindHelpFI;
3333 
3334  // We need to store -2 into the UnwindHelp object at the start of the
3335  // function.
3336  DebugLoc DL;
3337  RS->enterBasicBlockEnd(MBB);
3338  RS->backward(std::prev(MBBI));
3339  Register DstReg = RS->FindUnusedReg(&AArch64::GPR64commonRegClass);
3340  assert(DstReg && "There must be a free register after frame setup");
3341  BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), DstReg).addImm(-2);
3342  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STURXi))
3343  .addReg(DstReg, getKillRegState(true))
3344  .addFrameIndex(UnwindHelpFI)
3345  .addImm(0);
3346 }
3347 
3348 namespace {
3349 struct TagStoreInstr {
3350  MachineInstr *MI;
3351  int64_t Offset, Size;
3352  explicit TagStoreInstr(MachineInstr *MI, int64_t Offset, int64_t Size)
3353  : MI(MI), Offset(Offset), Size(Size) {}
3354 };
3355 
3356 class TagStoreEdit {
3357  MachineFunction *MF;
3358  MachineBasicBlock *MBB;
3359  MachineRegisterInfo *MRI;
3360  // Tag store instructions that are being replaced.
3361  SmallVector<TagStoreInstr, 8> TagStores;
3362  // Combined memref arguments of the above instructions.
3363  SmallVector<MachineMemOperand *, 8> CombinedMemRefs;
3364 
3365  // Replace allocation tags in [FrameReg + FrameRegOffset, FrameReg +
3366  // FrameRegOffset + Size) with the address tag of SP.
3367  Register FrameReg;
3368  StackOffset FrameRegOffset;
3369  int64_t Size;
3370  // If not None, move FrameReg to (FrameReg + FrameRegUpdate) at the end.
3371  Optional<int64_t> FrameRegUpdate;
3372  // MIFlags for any FrameReg updating instructions.
3373  unsigned FrameRegUpdateFlags;
3374 
3375  // Use zeroing instruction variants.
3376  bool ZeroData;
3377  DebugLoc DL;
3378 
3379  void emitUnrolled(MachineBasicBlock::iterator InsertI);
3380  void emitLoop(MachineBasicBlock::iterator InsertI);
3381 
3382 public:
3383  TagStoreEdit(MachineBasicBlock *MBB, bool ZeroData)
3384  : MBB(MBB), ZeroData(ZeroData) {
3385  MF = MBB->getParent();
3386  MRI = &MF->getRegInfo();
3387  }
3388  // Add an instruction to be replaced. Instructions must be added in the
3389  // ascending order of Offset, and have to be adjacent.
3390  void addInstruction(TagStoreInstr I) {
3391  assert((TagStores.empty() ||
3392  TagStores.back().Offset + TagStores.back().Size == I.Offset) &&
3393  "Non-adjacent tag store instructions.");
3394  TagStores.push_back(I);
3395  }
3396  void clear() { TagStores.clear(); }
3397  // Emit equivalent code at the given location, and erase the current set of
3398  // instructions. May skip if the replacement is not profitable. May invalidate
3399  // the input iterator and replace it with a valid one.
3400  void emitCode(MachineBasicBlock::iterator &InsertI,
3401  const AArch64FrameLowering *TFI, bool TryMergeSPUpdate);
3402 };
3403 
3404 void TagStoreEdit::emitUnrolled(MachineBasicBlock::iterator InsertI) {
3405  const AArch64InstrInfo *TII =
3406  MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3407 
3408  const int64_t kMinOffset = -256 * 16;
3409  const int64_t kMaxOffset = 255 * 16;
3410 
3411  Register BaseReg = FrameReg;
3412  int64_t BaseRegOffsetBytes = FrameRegOffset.getFixed();
3413  if (BaseRegOffsetBytes < kMinOffset ||
3414  BaseRegOffsetBytes + (Size - Size % 32) > kMaxOffset) {
3415  Register ScratchReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3416  emitFrameOffset(*MBB, InsertI, DL, ScratchReg, BaseReg,
3417  StackOffset::getFixed(BaseRegOffsetBytes), TII);
3418  BaseReg = ScratchReg;
3419  BaseRegOffsetBytes = 0;
3420  }
3421 
3422  MachineInstr *LastI = nullptr;
3423  while (Size) {
3424  int64_t InstrSize = (Size > 16) ? 32 : 16;
3425  unsigned Opcode =
3426  InstrSize == 16
3427  ? (ZeroData ? AArch64::STZGOffset : AArch64::STGOffset)
3428  : (ZeroData ? AArch64::STZ2GOffset : AArch64::ST2GOffset);
3429  MachineInstr *I = BuildMI(*MBB, InsertI, DL, TII->get(Opcode))
3430  .addReg(AArch64::SP)
3431  .addReg(BaseReg)
3432  .addImm(BaseRegOffsetBytes / 16)
3433  .setMemRefs(CombinedMemRefs);
3434  // A store to [BaseReg, #0] should go last for an opportunity to fold the
3435  // final SP adjustment in the epilogue.
3436  if (BaseRegOffsetBytes == 0)
3437  LastI = I;
3438  BaseRegOffsetBytes += InstrSize;
3439  Size -= InstrSize;
3440  }
3441 
3442  if (LastI)
3443  MBB->splice(InsertI, MBB, LastI);
3444 }
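// [Editor's note] For example, with ZeroData == false, BaseRegOffsetBytes == 0
// and Size == 48, the loop above emits one ST2G (tags 32 bytes) at offset 0
// and one STG (tags 16 bytes) at offset 32; the splice then moves the store
// touching [BaseReg, #0] to the end so the epilogue SP adjustment can later
// be folded into it:
//   stg  sp, [BaseReg, #32]   ; 16 bytes at offsets 32..47
//   st2g sp, [BaseReg, #0]    ; 32 bytes at offsets 0..31 (kept last)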
3445 
3446 void TagStoreEdit::emitLoop(MachineBasicBlock::iterator InsertI) {
3447  const AArch64InstrInfo *TII =
3448  MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3449 
3450  Register BaseReg = FrameRegUpdate
3451  ? FrameReg
3452  : MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3453  Register SizeReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3454 
3455  emitFrameOffset(*MBB, InsertI, DL, BaseReg, FrameReg, FrameRegOffset, TII);
3456 
3457  int64_t LoopSize = Size;
3458  // If the loop size is not a multiple of 32, split off one 16-byte store at
3459  // the end to fold BaseReg update into.
3460  if (FrameRegUpdate && *FrameRegUpdate)
3461  LoopSize -= LoopSize % 32;
3462  MachineInstr *LoopI = BuildMI(*MBB, InsertI, DL,
3463  TII->get(ZeroData ? AArch64::STZGloop_wback
3464  : AArch64::STGloop_wback))
3465  .addDef(SizeReg)
3466  .addDef(BaseReg)
3467  .addImm(LoopSize)
3468  .addReg(BaseReg)
3469  .setMemRefs(CombinedMemRefs);
3470  if (FrameRegUpdate)
3471  LoopI->setFlags(FrameRegUpdateFlags);
3472 
3473  int64_t ExtraBaseRegUpdate =
3474  FrameRegUpdate ? (*FrameRegUpdate - FrameRegOffset.getFixed() - Size) : 0;
3475  if (LoopSize < Size) {
3476  assert(FrameRegUpdate);
3477  assert(Size - LoopSize == 16);
3478  // Tag 16 more bytes at BaseReg and update BaseReg.
3479  BuildMI(*MBB, InsertI, DL,
3480  TII->get(ZeroData ? AArch64::STZGPostIndex : AArch64::STGPostIndex))
3481  .addDef(BaseReg)
3482  .addReg(BaseReg)
3483  .addReg(BaseReg)
3484  .addImm(1 + ExtraBaseRegUpdate / 16)
3485  .setMemRefs(CombinedMemRefs)
3486  .setMIFlags(FrameRegUpdateFlags);
3487  } else if (ExtraBaseRegUpdate) {
3488  // Update BaseReg.
3489  BuildMI(
3490  *MBB, InsertI, DL,
3491  TII->get(ExtraBaseRegUpdate > 0 ? AArch64::ADDXri : AArch64::SUBXri))
3492  .addDef(BaseReg)
3493  .addReg(BaseReg)
3494  .addImm(std::abs(ExtraBaseRegUpdate))
3495  .addImm(0)
3496  .setMIFlags(FrameRegUpdateFlags);
3497  }
3498 }
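// [Editor's note] Worked example of the split above: Size == 80 with a pending
// base-register update. 80 % 32 == 16, so LoopSize becomes 64 and is handled
// by the STGloop_wback pseudo; the remaining 16 bytes are tagged by a single
// post-indexed STG whose write-back immediate also absorbs
// ExtraBaseRegUpdate, folding the trailing SP adjustment into the last store.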
3499 
3500 // Check if *II is a register update that can be merged into the STGloop that
3501 // ends at (Reg + Size). On success, *TotalOffset is set to the required
3502 // adjustment to Reg after the end of the loop.
3503 bool canMergeRegUpdate(MachineBasicBlock::iterator II, unsigned Reg,
3504  int64_t Size, int64_t *TotalOffset) {
3505  MachineInstr &MI = *II;
3506  if ((MI.getOpcode() == AArch64::ADDXri ||
3507  MI.getOpcode() == AArch64::SUBXri) &&
3508  MI.getOperand(0).getReg() == Reg && MI.getOperand(1).getReg() == Reg) {
3509  unsigned Shift = AArch64_AM::getShiftValue(MI.getOperand(3).getImm());
3510  int64_t Offset = MI.getOperand(2).getImm() << Shift;
3511  if (MI.getOpcode() == AArch64::SUBXri)
3512  Offset = -Offset;
3513  int64_t AbsPostOffset = std::abs(Offset - Size);
3514  const int64_t kMaxOffset =
3515  0xFFF; // Max encoding for unshifted ADDXri / SUBXri
3516  if (AbsPostOffset <= kMaxOffset && AbsPostOffset % 16 == 0) {
3517  *TotalOffset = Offset;
3518  return true;
3519  }
3520  }
3521  return false;
3522 }
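// [Editor's note] Standalone sketch (not LLVM code, illustrative names) of the
// mergeability test above: decode the ADD/SUB immediate (optionally shifted
// left by 12), then check that the leftover adjustment after the loop fits an
// unshifted 12-bit immediate and keeps 16-byte granularity.
#if 0
#include <cstdint>
#include <cstdlib>

static bool canFoldUpdate(int64_t Imm, unsigned Shift, bool IsSub,
                          int64_t Size, int64_t *TotalOffset) {
  int64_t Offset = Imm << Shift; // unshifted or LSL #12 encoding
  if (IsSub)
    Offset = -Offset;
  int64_t AbsPostOffset = std::abs(Offset - Size);
  if (AbsPostOffset <= 0xFFF && AbsPostOffset % 16 == 0) {
    *TotalOffset = Offset;
    return true;
  }
  return false;
}
#endif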
3523 
3524 void mergeMemRefs(const SmallVectorImpl<TagStoreInstr> &TSE,
3525  SmallVectorImpl<MachineMemOperand *> &MemRefs) {
3526  MemRefs.clear();
3527  for (auto &TS : TSE) {
3528  MachineInstr *MI = TS.MI;
3529  // An instruction without memory operands may access anything. Be
3530  // conservative and return an empty list.
3531  if (MI->memoperands_empty()) {
3532  MemRefs.clear();
3533  return;
3534  }
3535  MemRefs.append(MI->memoperands_begin(), MI->memoperands_end());
3536  }
3537 }
3538 
3539 void TagStoreEdit::emitCode(MachineBasicBlock::iterator &InsertI,
3540  const AArch64FrameLowering *TFI,
3541  bool TryMergeSPUpdate) {
3542  if (TagStores.empty())
3543  return;
3544  TagStoreInstr &FirstTagStore = TagStores[0];
3545  TagStoreInstr &LastTagStore = TagStores[TagStores.size() - 1];
3546  Size = LastTagStore.Offset - FirstTagStore.Offset + LastTagStore.Size;
3547  DL = TagStores[0].MI->getDebugLoc();
3548 
3549  Register Reg;
3550  FrameRegOffset = TFI->resolveFrameOffsetReference(
3551  *MF, FirstTagStore.Offset, false /*isFixed*/, false /*isSVE*/, Reg,
3552  /*PreferFP=*/false, /*ForSimm=*/true);
3553  FrameReg = Reg;
3554  FrameRegUpdate = None;
3555 
3556  mergeMemRefs(TagStores, CombinedMemRefs);
3557 
3558  LLVM_DEBUG(dbgs() << "Replacing adjacent STG instructions:\n";
3559  for (const auto &Instr
3560  : TagStores) { dbgs() << " " << *Instr.MI; });
3561 
3562  // Size threshold where a loop becomes shorter than a linear sequence of
3563  // tagging instructions.
3564  const int kSetTagLoopThreshold = 176;
3565  if (Size < kSetTagLoopThreshold) {
3566  if (TagStores.size() < 2)
3567  return;
3568  emitUnrolled(InsertI);
3569  } else {
3570  MachineInstr *UpdateInstr = nullptr;
3571  int64_t TotalOffset = 0;
3572  if (TryMergeSPUpdate) {
3573  // See if we can merge base register update into the STGloop.
3574  // This is done in AArch64LoadStoreOptimizer for "normal" stores,
3575  // but STGloop is way too unusual for that, and also it only
3576  // realistically happens in function epilogue. Also, STGloop is expanded
3577  // before that pass.
3578  if (InsertI != MBB->end() &&
3579  canMergeRegUpdate(InsertI, FrameReg, FrameRegOffset.getFixed() + Size,
3580  &TotalOffset)) {
3581  UpdateInstr = &*InsertI++;
3582  LLVM_DEBUG(dbgs() << "Folding SP update into loop:\n "
3583  << *UpdateInstr);
3584  }
3585  }
3586 
3587  if (!UpdateInstr && TagStores.size() < 2)
3588  return;
3589 
3590  if (UpdateInstr) {
3591  FrameRegUpdate = TotalOffset;
3592  FrameRegUpdateFlags = UpdateInstr->getFlags();
3593  }
3594  emitLoop(InsertI);
3595  if (UpdateInstr)
3596  UpdateInstr->eraseFromParent();
3597  }
3598 
3599  for (auto &TS : TagStores)
3600  TS.MI->eraseFromParent();
3601 }
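// [Editor's note] Decision summary for emitCode above: below the 176-byte
// threshold the unrolled STG/ST2G run is shorter, and is emitted only when it
// replaces at least two stores; at or above it the loop form is used, which
// pays off even for a single tag store when canMergeRegUpdate lets a
// following SP update be folded in as the loop's write-back.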
3602 
3603 bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
3604  int64_t &Size, bool &ZeroData) {
3605  MachineFunction &MF = *MI.getParent()->getParent();
3606  const MachineFrameInfo &MFI = MF.getFrameInfo();
3607 
3608  unsigned Opcode = MI.getOpcode();
3609  ZeroData = (Opcode == AArch64::STZGloop || Opcode == AArch64::STZGOffset ||
3610  Opcode == AArch64::STZ2GOffset);
3611 
3612  if (Opcode == AArch64::STGloop || Opcode == AArch64::STZGloop) {
3613  if (!MI.getOperand(0).isDead() || !MI.getOperand(1).isDead())
3614  return false;
3615  if (!MI.getOperand(2).isImm() || !MI.getOperand(3).isFI())
3616  return false;
3617  Offset = MFI.getObjectOffset(MI.getOperand(3).getIndex());
3618  Size = MI.getOperand(2).getImm();
3619  return true;
3620  }
3621 
3622  if (Opcode == AArch64::STGOffset || Opcode == AArch64::STZGOffset)
3623  Size = 16;
3624  else if (Opcode == AArch64::ST2GOffset || Opcode == AArch64::STZ2GOffset)
3625  Size = 32;
3626  else
3627  return false;
3628 
3629  if (MI.getOperand(0).getReg() != AArch64::SP || !MI.getOperand(1).isFI())
3630  return false;
3631 
3632  Offset = MFI.getObjectOffset(MI.getOperand(1).getIndex()) +
3633  16 * MI.getOperand(2).getImm();
3634  return true;
3635 }
3636 
3637 // Detect a run of memory tagging instructions for adjacent stack frame slots,
3638 // and replace them with a shorter instruction sequence:
3639 // * replace STG + STG with ST2G
3640 // * replace STGloop + STGloop with STGloop
3641 // This code needs to run when stack slot offsets are already known, but before
3642 // FrameIndex operands in STG instructions are eliminated.
3643 MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
3644  const AArch64FrameLowering *TFI,
3645  RegScavenger *RS) {
3646  bool FirstZeroData;
3647  int64_t Size, Offset;
3648  MachineInstr &MI = *II;
3649  MachineBasicBlock *MBB = MI.getParent();
3650  MachineBasicBlock::iterator NextI = ++II;
3651  if (&MI == &MBB->instr_back())
3652  return II;
3653  if (!isMergeableStackTaggingInstruction(MI, Offset, Size, FirstZeroData))
3654  return II;
3655 
3656  SmallVector<TagStoreInstr, 8> Instrs;
3657  Instrs.emplace_back(&MI, Offset, Size);
3658 
3659  constexpr int kScanLimit = 10;
3660  int Count = 0;
3661  for (MachineBasicBlock::iterator E = MBB->end();
3662  NextI != E && Count < kScanLimit; ++NextI) {
3663  MachineInstr &MI = *NextI;
3664  bool ZeroData;
3665  int64_t Size, Offset;
3666  // Collect instructions that update memory tags with a FrameIndex operand
3667  // and (when applicable) constant size, and whose output registers are dead
3668  // (the latter is almost always the case in practice). Since these
3669  // instructions effectively have no inputs or outputs, we are free to skip
3670  // any non-aliasing instructions in between without tracking used registers.
3671  if (isMergeableStackTaggingInstruction(MI, Offset, Size, ZeroData)) {
3672  if (ZeroData != FirstZeroData)
3673  break;
3674  Instrs.emplace_back(&MI, Offset, Size);
3675  continue;
3676  }
3677 
3678  // Only count non-transient, non-tagging instructions toward the scan
3679  // limit.
3680  if (!MI.isTransient())
3681  ++Count;
3682 
3683  // Just in case, stop before the epilogue code starts.
3684  if (MI.getFlag(MachineInstr::FrameSetup) ||
3685  MI.getFlag(MachineInstr::FrameDestroy))
3686  break;
3687 
3688  // Reject anything that may alias the collected instructions.
3689  if (MI.mayLoadOrStore() || MI.hasUnmodeledSideEffects())
3690  break;
3691  }
3692 
3693  // New code will be inserted after the last tagging instruction we've found.
3694  MachineBasicBlock::iterator InsertI = Instrs.back().MI;
3695  InsertI++;
3696 
3697  llvm::stable_sort(Instrs,
3698  [](const TagStoreInstr &Left, const TagStoreInstr &Right) {
3699  return Left.Offset < Right.Offset;
3700  });
3701 
3702  // Make sure that we don't have any overlapping stores.
3703  int64_t CurOffset = Instrs[0].Offset;
3704  for (auto &Instr : Instrs) {
3705  if (CurOffset > Instr.Offset)
3706  return NextI;
3707  CurOffset = Instr.Offset + Instr.Size;
3708  }
3709 
3710  // Find contiguous runs of tagged memory and emit shorter instruction
3711  // sequences for them when possible.
3712  TagStoreEdit TSE(MBB, FirstZeroData);
3713  Optional<int64_t> EndOffset;
3714  for (auto &Instr : Instrs) {
3715  if (EndOffset && *EndOffset != Instr.Offset) {
3716  // Found a gap.
3717  TSE.emitCode(InsertI, TFI, /*TryMergeSPUpdate = */ false);
3718  TSE.clear();
3719  }
3720 
3721  TSE.addInstruction(Instr);
3722  EndOffset = Instr.Offset + Instr.Size;
3723  }
3724 
3725  // Multiple FP/SP updates in a loop cannot be described by CFI instructions.
3726  TSE.emitCode(InsertI, TFI, /*TryMergeSPUpdate = */
3727  !MBB->getParent()
3728  ->getInfo<AArch64FunctionInfo>()
3729  ->needsAsyncDwarfUnwindInfo());
3730 
3731  return InsertI;
3732 }
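// [Editor's note] Net effect, on a hypothetical frame with two adjacent
// 16-byte tagged slots (FrameIndex operands not yet eliminated):
//   before:  STG  SP, [FI0, #0]
//            STG  SP, [FI1, #0]
//   after:   ST2G SP, [FI0, #0]    ; one instruction tags all 32 bytes
// Overlapping stores abort the merge entirely, and a gap between runs flushes
// the current run so each contiguous run is rewritten separately.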
3733 } // namespace
3734 
3735 void AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced(
3736  MachineFunction &MF, RegScavenger *RS = nullptr) const {
3737  if (StackTaggingMergeSetTag)
3738  for (auto &BB : MF)
3739  for (MachineBasicBlock::iterator II = BB.begin(); II != BB.end();)
3740  II = tryMergeAdjacentSTG(II, this, RS);
3741 }
3742 
3743 /// For Win64 AArch64 EH, the offset to the Unwind object is from the SP
3744 /// before the update. This is easily retrieved as it is exactly the offset
3745 /// that is set in processFunctionBeforeFrameFinalized.
3746 StackOffset AArch64FrameLowering::getFrameIndexReferencePreferSP(
3747  const MachineFunction &MF, int FI, Register &FrameReg,
3748  bool IgnoreSPUpdates) const {
3749  const MachineFrameInfo &MFI = MF.getFrameInfo();
3750  if (IgnoreSPUpdates) {
3751  LLVM_DEBUG(dbgs() << "Offset from the SP for " << FI << " is "
3752  << MFI.getObjectOffset(FI) << "\n");
3753  FrameReg = AArch64::SP;
3754  return StackOffset::getFixed(MFI.getObjectOffset(FI));
3755  }
3756 
3757  // Go to common code if we cannot provide sp + offset.
3758  if (MFI.hasVarSizedObjects() ||
3759  MF.getInfo<AArch64FunctionInfo>()->getStackSizeSVE() ||
3760  MF.getSubtarget().getRegisterInfo()->hasStackRealignment(MF))
3761  return getFrameIndexReference(MF, FI, FrameReg);
3762 
3763  FrameReg = AArch64::SP;
3764  return getStackOffset(MF, MFI.getObjectOffset(FI));
3765 }
3766 
3767 /// The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve
3768 /// the parent's frame pointer
3769 unsigned AArch64FrameLowering::getWinEHParentFrameOffset(
3770  const MachineFunction &MF) const {
3771  return 0;
3772 }
3773 
3774 /// Funclets only need to account for space for the callee saved registers,
3775 /// as the locals are accounted for in the parent's stack frame.
3776 unsigned AArch64FrameLowering::getWinEHFuncletFrameSize(
3777  const MachineFunction &MF) const {
3778  // This is the size of the pushed CSRs.
3779  unsigned CSSize =
3780  MF.getInfo<AArch64FunctionInfo>()->getCalleeSavedStackSize();
3781  // This is the amount of stack a funclet needs to allocate.
3782  return alignTo(CSSize + MF.getFrameInfo().getMaxCallFrameSize(),
3783  getStackAlign());
3784 }
3785 
3786 namespace {
3787 struct FrameObject {
3788  bool IsValid = false;
3789  // Index of the object in MFI.
3790  int ObjectIndex = 0;
3791  // Group ID this object belongs to.
3792  int GroupIndex = -1;
3793  // This object should be placed first (closest to SP).
3794  bool ObjectFirst = false;
3795  // This object's group (which always contains the object with
3796  // ObjectFirst==true) should be placed first.
3797  bool GroupFirst = false;
3798 };
3799 
3800 class GroupBuilder {
3801  SmallVector<int, 8> CurrentMembers;
3802  int NextGroupIndex = 0;
3803  std::vector<FrameObject> &Objects;
3804 
3805 public:
3806  GroupBuilder(std::vector<FrameObject> &Objects) : Objects(Objects) {}
3807  void AddMember(int Index) { CurrentMembers.push_back(Index); }
3808  void EndCurrentGroup() {
3809  if (CurrentMembers.size() > 1) {
3810  // Create a new group with the current member list. This might remove them
3811  // from their pre-existing groups. That's OK, dealing with overlapping
3812  // groups is too hard and unlikely to make a difference.
3813  LLVM_DEBUG(dbgs() << "group:");
3814  for (int Index : CurrentMembers) {
3815  Objects[Index].GroupIndex = NextGroupIndex;
3816  LLVM_DEBUG(dbgs() << " " << Index);
3817  }
3818  LLVM_DEBUG(dbgs() << "\n");
3819  NextGroupIndex++;
3820  }
3821  CurrentMembers.clear();
3822  }
3823 };
3824 
3825 bool FrameObjectCompare(const FrameObject &A, const FrameObject &B) {
3826  // Objects at a lower index are closer to FP; objects at a higher index are
3827  // closer to SP.
3828  //
3829  // For consistency in our comparison, all invalid objects are placed
3830  // at the end. This also allows us to stop walking when we hit the
3831  // first invalid item after it's all sorted.
3832  //
3833  // The "first" object goes first (closest to SP), followed by the members of
3834  // the "first" group.
3835  //
3836  // The rest are sorted by the group index to keep the groups together.
3837  // Higher numbered groups are more likely to be around longer (i.e. untagged
3838  // in the function epilogue and not at some earlier point). Place them closer
3839  // to SP.
3840  //
3841  // If all else equal, sort by the object index to keep the objects in the
3842  // original order.
3843  return std::make_tuple(!A.IsValid, A.ObjectFirst, A.GroupFirst, A.GroupIndex,
3844  A.ObjectIndex) <
3845  std::make_tuple(!B.IsValid, B.ObjectFirst, B.GroupFirst, B.GroupIndex,
3846  B.ObjectIndex);
3847 }
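// [Editor's note] Standalone sketch (not LLVM code) of the comparator above:
// std::make_tuple builds a lexicographic sort key, so valid objects come
// first; the two boolean flags sort flagged objects after unflagged ones,
// i.e. toward the SP end of the frame, which the surrounding comments call
// "first"; the group index keeps groups contiguous; the object index breaks
// ties to preserve the original order.
#if 0
#include <tuple>

struct Obj {
  bool IsValid;
  bool ObjectFirst, GroupFirst;
  int GroupIndex, ObjectIndex;
};

static bool less(const Obj &A, const Obj &B) {
  return std::make_tuple(!A.IsValid, A.ObjectFirst, A.GroupFirst, A.GroupIndex,
                         A.ObjectIndex) <
         std::make_tuple(!B.IsValid, B.ObjectFirst, B.GroupFirst, B.GroupIndex,
                         B.ObjectIndex);
}
#endif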
3848 } // namespace
3849 
3850 void AArch64FrameLowering::orderFrameObjects(
3851  const MachineFunction &MF, SmallVectorImpl<int> &ObjectsToAllocate) const {
3852  if (!OrderFrameObjects || ObjectsToAllocate.empty())
3853  return;
3854 
3855  const MachineFrameInfo &MFI = MF.getFrameInfo();
3856  std::vector<FrameObject> FrameObjects(MFI.getObjectIndexEnd());
3857  for (auto &Obj : ObjectsToAllocate) {
3858  FrameObjects[Obj].IsValid = true;
3859  FrameObjects[Obj].ObjectIndex = Obj;
3860  }
3861 
3862  // Identify stack slots that are tagged at the same time.
3863  GroupBuilder GB(FrameObjects);
3864  for (auto &MBB : MF) {
3865  for (auto &MI : MBB) {
3866  if (MI.isDebugInstr())
3867  continue;
3868  int OpIndex;
3869  switch (MI.getOpcode()) {
3870  case AArch64::STGloop:
3871  case AArch64::STZGloop:
3872  OpIndex = 3;
3873  break;
3874  case AArch64::STGOffset:
3875  case AArch64::STZGOffset:
3876  case AArch64::ST2GOffset:
3877  case AArch64::STZ2GOffset:
3878  OpIndex = 1;
3879  break;
3880  default:
3881  OpIndex = -1;
3882  }
3883 
3884  int TaggedFI = -1;
3885  if (OpIndex >= 0) {
3886  const MachineOperand &MO = MI.getOperand(OpIndex);
3887  if (MO.isFI()) {
3888  int FI = MO.getIndex();
3889  if (FI >= 0 && FI < MFI.getObjectIndexEnd() &&
3890  FrameObjects[FI].IsValid)
3891  TaggedFI = FI;
3892  }
3893  }
3894 
3895  // If this is a stack tagging instruction for a slot that is not part of a
3896  // group yet, either start a new group or add it to the current one.
3897  if (TaggedFI >= 0)
3898  GB.AddMember(TaggedFI);
3899  else
3900  GB.EndCurrentGroup();
3901  }
3902  // Groups should never span multiple basic blocks.
3903  GB.EndCurrentGroup();
3904  }
3905 
3906  // If the function's tagged base pointer is pinned to a stack slot, we want to
3907  // put that slot first when possible. This will likely place it at SP + 0,
3908  // and save one instruction when generating the base pointer because IRG does
3909  // not allow an immediate offset.
3910  const AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
3911  Optional<int> TBPI = AFI.getTaggedBasePointerIndex();
3912  if (TBPI) {
3913  FrameObjects[*TBPI].ObjectFirst = true;
3914  FrameObjects[*TBPI].GroupFirst = true;
3915  int FirstGroupIndex = FrameObjects[*TBPI].GroupIndex;
3916  if (FirstGroupIndex >= 0)
3917  for (FrameObject &Object : FrameObjects)
3918  if (Object.GroupIndex == FirstGroupIndex)
3919  Object.GroupFirst = true;
3920  }
3921 
3922  llvm::stable_sort(FrameObjects, FrameObjectCompare);
3923 
3924  int i = 0;
3925  for (auto &Obj : FrameObjects) {
3926  // All invalid items are sorted at the end, so it's safe to stop.
3927  if (!Obj.IsValid)
3928  break;
3929  ObjectsToAllocate[i++] = Obj.ObjectIndex;
3930  }
3931 
3932  LLVM_DEBUG(dbgs() << "Final frame order:\n"; for (auto &Obj
3933  : FrameObjects) {
3934  if (!Obj.IsValid)
3935  break;
3936  dbgs() << " " << Obj.ObjectIndex << ": group " << Obj.GroupIndex;
3937  if (Obj.ObjectFirst)
3938  dbgs() << ", first";
3939  if (Obj.GroupFirst)
3940  dbgs() << ", group-first";
3941  dbgs() << "\n";
3942  });
3943 }
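// [Editor's note] Why pinning the tagged-base slot first pays off: it tends to
// land at SP + 0, and since IRG takes no immediate offset, the tagged base can
// then be formed directly off SP. Illustrative instruction sequences:
//   irg  x0, sp            ; slot at SP + 0: one instruction
// versus, if the slot sat at SP + 32:
//   add  x0, sp, #32
//   irg  x0, x0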
llvm::Check::Size
@ Size
Definition: FileCheck.h:77
llvm::MachineFunction::hasWinCFI
bool hasWinCFI() const
Definition: MachineFunction.h:738
i
i
Definition: README.txt:29
llvm::alignTo
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition: Alignment.h:156
llvm::RegState::InternalRead
@ InternalRead
Register reads a value that is defined inside the same instruction or bundle.
Definition: MachineInstrBuilder.h:59
llvm::isAsynchronousEHPersonality
bool isAsynchronousEHPersonality(EHPersonality Pers)
Returns true if this personality function catches asynchronous exceptions.
Definition: EHPersonalities.h:49
llvm::AArch64ISD::LOADgot
@ LOADgot
Definition: AArch64ISelLowering.h:70
llvm::MachineFrameInfo::isMaxCallFrameSizeComputed
bool isMaxCallFrameSizeComputed() const
Definition: MachineFrameInfo.h:653
llvm::MachineFrameInfo::hasVarSizedObjects
bool hasVarSizedObjects() const
This method may be called any time after instruction selection is complete to determine if the stack ...
Definition: MachineFrameInfo.h:354
llvm::AArch64Subtarget::isTargetWindows
bool isTargetWindows() const
Definition: AArch64Subtarget.h:254
AArch64RegisterInfo.h
Attrs
Function Attrs
Definition: README_ALTIVEC.txt:215
MCDwarf.h
MI
IRTranslator LLVM IR MI
Definition: IRTranslator.cpp:108
MachineInstr.h
MathExtras.h
llvm::MachineInstrBuilder::addImm
const MachineInstrBuilder & addImm(int64_t Val) const
Add a new immediate operand.
Definition: MachineInstrBuilder.h:131
llvm::MachineFrameInfo::estimateStackSize
uint64_t estimateStackSize(const MachineFunction &MF) const
Estimate and return the size of the stack frame.
Definition: MachineFrameInfo.cpp:137
llvm
This is an optimization pass for GlobalISel generic memory operations.
Definition: AddressRanges.h:18
llvm::MachineInstrBuilder::copyImplicitOps
const MachineInstrBuilder & copyImplicitOps(const MachineInstr &OtherMI) const
Copy all the implicit operands from OtherMI onto this one.
Definition: MachineInstrBuilder.h:321
AArch64MachineFunctionInfo.h
llvm::MachineRegisterInfo::isPhysRegUsed
bool isPhysRegUsed(MCRegister PhysReg, bool SkipRegMaskTest=false) const
Return true if the specified register is modified or read in this function.
Definition: MachineRegisterInfo.cpp:587
llvm::CallingConv::Win64
@ Win64
The C convention as implemented on Windows/x86-64 and AArch64.
Definition: CallingConv.h:156
llvm::MCSymbol
MCSymbol - Instances of this class represent a symbol name in the MC file, and MCSymbols are created ...
Definition: MCSymbol.h:41
llvm::AArch64Subtarget::swiftAsyncContextIsDynamicallySet
bool swiftAsyncContextIsDynamicallySet() const
Return whether FrameLowering should always set the "extended frame present" bit in FP,...
Definition: AArch64Subtarget.h:322
getFixedObjectSize
static unsigned getFixedObjectSize(const MachineFunction &MF, const AArch64FunctionInfo *AFI, bool IsWin64, bool IsFunclet)
Returns the size of the fixed object area (allocated next to sp on entry) On Win64 this may include a...
Definition: AArch64FrameLowering.cpp:382
llvm::LivePhysRegs::addReg
void addReg(MCPhysReg Reg)
Adds a physical register and all its sub-registers to the set.
Definition: LivePhysRegs.h:81
DefaultSafeSPDisplacement
static const unsigned DefaultSafeSPDisplacement
This is the biggest offset to the stack pointer we can encode in aarch64 instructions (without using ...
Definition: AArch64FrameLowering.cpp:345
llvm::AArch64_AM::LSL
@ LSL
Definition: AArch64AddressingModes.h:35
llvm::AArch64FrameLowering::emitCalleeSavedFrameMoves
void emitCalleeSavedFrameMoves(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const
Definition: AArch64FrameLowering.cpp:579
llvm::MachineRegisterInfo::createVirtualRegister
Register createVirtualRegister(const TargetRegisterClass *RegClass, StringRef Name="")
createVirtualRegister - Create and return a new virtual register in the function with the specified r...
Definition: MachineRegisterInfo.cpp:156
llvm::MachineModuleInfo::getContext
const MCContext & getContext() const
Definition: MachineModuleInfo.h:143
llvm::TargetRegisterInfo::isGeneralPurposeRegister
virtual bool isGeneralPurposeRegister(const MachineFunction &MF, MCRegister PhysReg) const
Returns true if PhysReg is a general purpose register.
Definition: TargetRegisterInfo.h:589
llvm::MachineRegisterInfo
MachineRegisterInfo - Keep track of information for virtual and physical registers,...
Definition: MachineRegisterInfo.h:50
produceCompactUnwindFrame
static bool produceCompactUnwindFrame(MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:2424
RegSize
unsigned RegSize
Definition: AArch64MIPeepholeOpt.cpp:124
llvm::MachineInstrBuilder::add
const MachineInstrBuilder & add(const MachineOperand &MO) const
Definition: MachineInstrBuilder.h:224
llvm::Function
Definition: Function.h:60
llvm::BitVector::set
BitVector & set()
Definition: BitVector.h:344
llvm::TargetSubtargetInfo::getInstrInfo
virtual const TargetInstrInfo * getInstrInfo() const
Definition: TargetSubtargetInfo.h:93
llvm::MachineInstrBuilder::addCFIIndex
const MachineInstrBuilder & addCFIIndex(unsigned CFIIndex) const
Definition: MachineInstrBuilder.h:247
llvm::MachineBasicBlock::isEHFuncletEntry
bool isEHFuncletEntry() const
Returns true if this is the entry block of an EH funclet.
Definition: MachineBasicBlock.h:602
contains
return AArch64::GPR64RegClass contains(Reg)
llvm::CodeModel::Medium
@ Medium
Definition: CodeGen.h:28
llvm::SmallVector
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
Definition: SmallVector.h:1182
Statistic.h
llvm::MachineFunction::getMachineMemOperand
MachineMemOperand * getMachineMemOperand(MachinePointerInfo PtrInfo, MachineMemOperand::Flags f, uint64_t s, Align base_alignment, const AAMDNodes &AAInfo=AAMDNodes(), const MDNode *Ranges=nullptr, SyncScope::ID SSID=SyncScope::System, AtomicOrdering Ordering=AtomicOrdering::NotAtomic, AtomicOrdering FailureOrdering=AtomicOrdering::NotAtomic)
getMachineMemOperand - Allocate a new MachineMemOperand.
Definition: MachineFunction.cpp:454
llvm::HexagonISD::PFALSE
@ PFALSE
Definition: HexagonISelLowering.h:73
llvm::CallingConv::PreserveMost
@ PreserveMost
Used for runtime calls that preserves most registers.
Definition: CallingConv.h:63
ErrorHandling.h
llvm::MCRegisterInfo::getDwarfRegNum
int getDwarfRegNum(MCRegister RegNum, bool isEH) const
Map a target register to an equivalent dwarf register number.
Definition: MCRegisterInfo.cpp:68
llvm::NVPTX::PTXCvtMode::RPI
@ RPI
Definition: NVPTX.h:135
llvm::MCRegisterInfo::getNumRegs
unsigned getNumRegs() const
Return the number of registers this target has (useful for sizing arrays holding per register informa...
Definition: MCRegisterInfo.h:491
llvm::getBLRCallOpcode
unsigned getBLRCallOpcode(const MachineFunction &MF)
Return opcode to be used for indirect calls.
Definition: AArch64InstrInfo.cpp:8149
llvm::AArch64RegisterInfo::getBaseRegister
unsigned getBaseRegister() const
Definition: AArch64RegisterInfo.cpp:487
llvm::X86Disassembler::Reg
Reg
All possible values of the reg field in the ModR/M byte.
Definition: X86DisassemblerDecoder.h:462
llvm::AArch64FunctionInfo::getTaggedBasePointerIndex
Optional< int > getTaggedBasePointerIndex() const
Definition: AArch64MachineFunctionInfo.h:403
llvm::BitVector::set_bits
iterator_range< const_set_bits_iterator > set_bits() const
Definition: BitVector.h:133
llvm::AArch64FunctionInfo::setSwiftAsyncContextFrameIdx
void setSwiftAsyncContextFrameIdx(int FI)
Definition: AArch64MachineFunctionInfo.h:435
MachineBasicBlock.h
llvm::LivePhysRegs
A set of physical registers with utility functions to track liveness when walking backward/forward th...
Definition: LivePhysRegs.h:50
Right
Vector Shift Left Right
Definition: README_P9.txt:118
llvm::TargetSubtargetInfo::getRegisterInfo
virtual const TargetRegisterInfo * getRegisterInfo() const
getRegisterInfo - If register information is available, return it.
Definition: TargetSubtargetInfo.h:125
llvm::cl::Hidden
@ Hidden
Definition: CommandLine.h:139
llvm::MachineOperand::setImm
void setImm(int64_t immVal)
Definition: MachineOperand.h:664
llvm::MachineBasicBlock::findDebugLoc
DebugLoc findDebugLoc(instr_iterator MBBI)
Find the next valid DebugLoc starting at MBBI, skipping any DBG_VALUE and DBG_LABEL instructions.
Definition: MachineBasicBlock.cpp:1387
llvm::TargetRegisterInfo
TargetRegisterInfo base class - We assume that the target defines a static array of TargetRegisterDes...
Definition: TargetRegisterInfo.h:237
Shift
bool Shift
Definition: README.txt:468
llvm::AArch64FrameLowering::enableStackSlotScavenging
bool enableStackSlotScavenging(const MachineFunction &MF) const override
Returns true if the stack slot holes in the fixed and callee-save stack area should be used when allo...
Definition: AArch64FrameLowering.cpp:3175
llvm::AArch64Subtarget::getInstrInfo
const AArch64InstrInfo * getInstrInfo() const override
Definition: AArch64Subtarget.h:174
llvm::StackOffset::getFixed
ScalarTy getFixed() const
Definition: TypeSize.h:149
llvm::Type
The instances of the Type class are immutable: once they are created, they are never changed.
Definition: Type.h:45
llvm::AttributeList
Definition: Attributes.h:425
TargetInstrInfo.h
fixupSEHOpcode
static void fixupSEHOpcode(MachineBasicBlock::iterator MBBI, unsigned LocalStackSize)
Definition: AArch64FrameLowering.cpp:1098
getPrologueDeath
static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg)
Definition: AArch64FrameLowering.cpp:2414
llvm::Optional< int >
llvm::AArch64FrameLowering::assignCalleeSavedSpillSlots
bool assignCalleeSavedSpillSlots(MachineFunction &MF, const TargetRegisterInfo *TRI, std::vector< CalleeSavedInfo > &CSI, unsigned &MinCSFrameIndex, unsigned &MaxCSFrameIndex) const override
assignCalleeSavedSpillSlots - Allows target to override spill slot assignment logic.
Definition: AArch64FrameLowering.cpp:3122
llvm::max
Expected< ExpressionValue > max(const ExpressionValue &Lhs, const ExpressionValue &Rhs)
Definition: FileCheck.cpp:337
llvm::AArch64FrameLowering::hasFP
bool hasFP(const MachineFunction &MF) const override
hasFP - Return true if the specified function should have a dedicated frame pointer register.
Definition: AArch64FrameLowering.cpp:426
llvm::MachineFrameInfo::getObjectIndexEnd
int getObjectIndexEnd() const
Return one past the maximum frame object index.
Definition: MachineFrameInfo.h:409
determineSVEStackObjectOffsets
static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex, bool AssignOffsets)
Definition: AArch64FrameLowering.cpp:3210
llvm::CodeModel::Kernel
@ Kernel
Definition: CodeGen.h:28
llvm::MachineOperand::isFI
bool isFI() const
isFI - Tests if this is a MO_FrameIndex operand.
Definition: MachineOperand.h:330
llvm::AArch64FunctionInfo::setTaggedBasePointerOffset
void setTaggedBasePointerOffset(unsigned Offset)
Definition: AArch64MachineFunctionInfo.h:411
llvm::MCCFIInstruction::createSameValue
static MCCFIInstruction createSameValue(MCSymbol *L, unsigned Register)
.cfi_same_value Current value of Register is the same as in the previous frame.
Definition: MCDwarf.h:615
TRI
unsigned const TargetRegisterInfo * TRI
Definition: MachineSink.cpp:1628
llvm::RegScavenger::FindUnusedReg
Register FindUnusedReg(const TargetRegisterClass *RC) const
Find an unused register of the specified register class.
Definition: RegisterScavenging.cpp:266
llvm::MachineFrameInfo::getMaxCallFrameSize
unsigned getMaxCallFrameSize() const
Return the maximum size of a call frame that must be allocated for an outgoing function call.
Definition: MachineFrameInfo.h:646
OpIndex
unsigned OpIndex
Definition: SPIRVModuleAnalysis.cpp:46
llvm::ARCISD::BL
@ BL
Definition: ARCISelLowering.h:34
llvm::ArrayRef::empty
bool empty() const
empty - Check if the array is empty.
Definition: ArrayRef.h:159
LLVM_DEBUG
#define LLVM_DEBUG(X)
Definition: Debug.h:101
llvm::AArch64Subtarget::getTargetLowering
const AArch64TargetLowering * getTargetLowering() const override
Definition: AArch64Subtarget.h:171
F
#define F(x, y, z)
Definition: MD5.cpp:55
llvm::MCCFIInstruction::cfiDefCfaOffset
static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int Offset)
.cfi_def_cfa_offset modifies a rule for computing CFA.
Definition: MCDwarf.h:546
MachineRegisterInfo.h
llvm::MachineInstr::FrameDestroy
@ FrameDestroy
Definition: MachineInstr.h:86
isTargetWindows
static bool isTargetWindows(const MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:1283
llvm::MachineBasicBlock::erase
instr_iterator erase(instr_iterator I)
Remove an instruction from the instruction list and delete it.
Definition: MachineBasicBlock.cpp:1313
llvm::SwiftAsyncFramePointerMode::Never
@ Never
Never set the bit.
llvm::dbgs
raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition: Debug.cpp:163
llvm::AArch64RegisterInfo::cannotEliminateFrame
bool cannotEliminateFrame(const MachineFunction &MF) const
Definition: AArch64RegisterInfo.cpp:628
clear
static void clear(coro::Shape &Shape)
Definition: Coroutines.cpp:149
llvm::classifyEHPersonality
EHPersonality classifyEHPersonality(const Value *Pers)
See if the given exception handling personality function is one that we understand.
Definition: EHPersonalities.cpp:22
llvm::AArch64FrameLowering
Definition: AArch64FrameLowering.h:21
llvm::AArch64FrameOffsetCannotUpdate
@ AArch64FrameOffsetCannotUpdate
Offset cannot apply.
Definition: AArch64InstrInfo.h:435
llvm::MachineInstr::getFlags
uint16_t getFlags() const
Return the MI flags bitvector.
Definition: MachineInstr.h:352
llvm::AlignStyle::Left
@ Left
ReverseCSRRestoreSeq
static cl::opt< bool > ReverseCSRRestoreSeq("reverse-csr-restore-seq", cl::desc("reverse the CSR restore sequence"), cl::init(false), cl::Hidden)
llvm::VFISAKind::SVE
@ SVE
CommandLine.h
llvm::AArch64FrameLowering::processFunctionBeforeFrameFinalized
void processFunctionBeforeFrameFinalized(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameFinalized - This method is called immediately before the specified function...
Definition: AArch64FrameLowering.cpp:3298
llvm::MachineInstrBuilder::addDef
const MachineInstrBuilder & addDef(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register definition operand.
Definition: MachineInstrBuilder.h:116
llvm::getDefRegState
unsigned getDefRegState(bool B)
Definition: MachineInstrBuilder.h:540
llvm::MachineFunction::front
const MachineBasicBlock & front() const
Definition: MachineFunction.h:865
llvm::MachineFunction::getRegInfo
MachineRegisterInfo & getRegInfo()
getRegInfo - Return information about the registers currently in use.
Definition: MachineFunction.h:666
OrderFrameObjects
static cl::opt< bool > OrderFrameObjects("aarch64-order-frame-objects", cl::desc("sort stack allocations"), cl::init(true), cl::Hidden)
llvm::MachineBasicBlock::insertAfter
iterator insertAfter(iterator I, MachineInstr *MI)
Insert MI into the instruction list after I.
Definition: MachineBasicBlock.h:930
llvm::TargetInstrInfo
TargetInstrInfo - Interface to description of machine instruction set.
Definition: TargetInstrInfo.h:98
AArch64TargetMachine.h
AArch64InstrInfo.h
llvm::AArch64FrameLowering::resolveFrameOffsetReference
StackOffset resolveFrameOffsetReference(const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE, Register &FrameReg, bool PreferFP, bool ForSimm) const
Definition: AArch64FrameLowering.cpp:2280
llvm::TargetFrameLowering::getOffsetOfLocalArea
int getOffsetOfLocalArea() const
getOffsetOfLocalArea - This method returns the offset of the local area from the stack pointer on ent...
Definition: TargetFrameLowering.h:140
TargetMachine.h
llvm::MutableArrayRef
MutableArrayRef - Represent a mutable reference to an array (0 or more elements consecutively in memo...
Definition: ArrayRef.h:28
llvm::AArch64FunctionInfo::isStackRealigned
bool isStackRealigned() const
Definition: AArch64MachineFunctionInfo.h:231
llvm::MachineOperand::CreateImm
static MachineOperand CreateImm(int64_t Val)
Definition: MachineOperand.h:782
llvm::AArch64InstrInfo
Definition: AArch64InstrInfo.h:37
P2
This might compile to this xmm1 xorps xmm0 movss xmm0 ret Now consider if the code caused xmm1 to get spilled This might produce this xmm1 movaps xmm0 movaps xmm1 movss xmm0 ret since the reload is only used by these we could fold it into the producing something like xmm1 movaps xmm0 ret saving two instructions The basic idea is that a reload from a spill if only one byte chunk is bring in zeros the one element instead of elements This can be used to simplify a variety of shuffle where the elements are fixed zeros This code generates ugly probably due to costs being off or< 4 x float > * P2
Definition: README-SSE.txt:278
E
static GCRegistry::Add< CoreCLRGC > E("coreclr", "CoreCLR-compatible GC")
CASE
#define CASE(n)
llvm::MachineOperand::getImm
int64_t getImm() const
Definition: MachineOperand.h:546
llvm::AArch64FrameLowering::eliminateCallFramePseudoInstr
MachineBasicBlock::iterator eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator I) const override
This method is called during prolog/epilog code insertion to eliminate call frame setup and destroy p...
Definition: AArch64FrameLowering.cpp:466
llvm::MachineFunction::getInfo
Ty * getInfo()
getInfo - Keep track of various per-function pieces of information for backends that would like to do...
Definition: MachineFunction.h:754
invalidateWindowsRegisterPairing
static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2, bool NeedsWinCFI, bool IsFirst)
Definition: AArch64FrameLowering.cpp:2433
llvm::LivePhysRegs::addLiveIns
void addLiveIns(const MachineBasicBlock &MBB)
Adds all live-in registers of basic block MBB.
Definition: LivePhysRegs.cpp:238
llvm::ISD::CATCHRET
@ CATCHRET
CATCHRET - Represents a return from a catch block funclet.
Definition: ISDOpcodes.h:1044
llvm::AArch64Subtarget::isTargetILP32
bool isTargetILP32() const
Definition: AArch64Subtarget.h:263
llvm::TargetFrameLowering::getStackAlign
Align getStackAlign() const
getStackAlignment - This method returns the number of bytes to which the stack pointer must be aligne...
Definition: TargetFrameLowering.h:100
llvm::ARM_PROC::A
@ A
Definition: ARMBaseInfo.h:34
llvm::MachineRegisterInfo::isReserved
bool isReserved(MCRegister PhysReg) const
isReserved - Returns true when PhysReg is a reserved register.
Definition: MachineRegisterInfo.h:930
llvm::BitVector::count
size_type count() const
count - Returns the number of bits which are set.
Definition: BitVector.h:155
llvm::AArch64TargetLowering::supportSwiftError
bool supportSwiftError() const override
Return true if the target supports swifterror attribute.
Definition: AArch64ISelLowering.h:820
llvm::Log2
unsigned Log2(Align A)
Returns the log2 of the alignment.
Definition: Alignment.h:209
int
Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align Which gets codegen d xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp rbp It would be better to have movq s of instead of the movaps s LLVM produces ret int
Definition: README.txt:536
EnableHomogeneousPrologEpilog
cl::opt< bool > EnableHomogeneousPrologEpilog("homogeneous-prolog-epilog", cl::Hidden, cl::desc("Emit homogeneous prologue and epilogue for the size " "optimization (default = off)"))
llvm::AArch64FunctionInfo::setHasRedZone
void setHasRedZone(bool s)
Definition: AArch64MachineFunctionInfo.h:330
getStackOffset
static StackOffset getStackOffset(const MachineFunction &MF, int64_t ObjectOffset)
Definition: AArch64FrameLowering.cpp:2252
llvm::TargetRegisterClass
Definition: TargetRegisterInfo.h:46
TII
const HexagonInstrInfo * TII
Definition: HexagonCopyToCombine.cpp:125
llvm::dwarf::Index
Index
Definition: Dwarf.h:472
llvm::TypeSize::Fixed
static TypeSize Fixed(ScalarTy MinVal)
Definition: TypeSize.h:441
llvm::MCInstrDesc
Describe properties that are true of each instruction in the target description file.
Definition: MCInstrDesc.h:197
B
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
First
into llvm powi allowing the code generator to produce balanced multiplication trees First
Definition: README.txt:54
llvm::MachineOperand
MachineOperand class - Representation of each machine instruction operand.
Definition: MachineOperand.h:48
llvm::CodeModel::Small
@ Small
Definition: CodeGen.h:28
llvm::MachineInstr::FrameSetup
@ FrameSetup
Definition: MachineInstr.h:84
llvm::MCID::Flag
Flag
These should be considered private to the implementation of the MCInstrDesc class.
Definition: MCInstrDesc.h:147
llvm::MachineModuleInfo
This class contains meta information specific to a module.
Definition: MachineModuleInfo.h:74
getSVEStackSize
static StackOffset getSVEStackSize(const MachineFunction &MF)
Returns the size of the entire SVE stackframe (calleesaves + spills).
Definition: AArch64FrameLowering.cpp:399
findScratchNonCalleeSaveRegister
static unsigned findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB)
Definition: AArch64FrameLowering.cpp:827
llvm::AArch64_AM::getShifterImm
static unsigned getShifterImm(AArch64_AM::ShiftExtendType ST, unsigned Imm)
getShifterImm - Encode the shift type and amount: imm: 6-bit shift amount shifter: 000 ==> lsl 001 ==...
Definition: AArch64AddressingModes.h:99
llvm::RegScavenger::backward
void backward()
Update internal register state and move MBB iterator backwards.
Definition: RegisterScavenging.cpp:239
llvm::report_fatal_error
void report_fatal_error(Error Err, bool gen_crash_diag=true)
Report a serious error, calling any installed error handler.
Definition: Error.cpp:145
llvm::STATISTIC
STATISTIC(NumFunctions, "Total number of functions")
llvm::MachineFrameInfo::getStackID
uint8_t getStackID(int ObjectIdx) const
Definition: MachineFrameInfo.h:723
llvm::RegScavenger::enterBasicBlockEnd
void enterBasicBlockEnd(MachineBasicBlock &MBB)
Start tracking liveness from the end of basic block MBB.
Definition: RegisterScavenging.cpp:87
llvm::MachineBasicBlock::instr_back
MachineInstr & instr_back()
Definition: MachineBasicBlock.h:280
llvm::AArch64FunctionInfo::needsAsyncDwarfUnwindInfo
bool needsAsyncDwarfUnwindInfo() const
Definition: AArch64MachineFunctionInfo.cpp:140
llvm::AArch64_AM::getShiftValue
static unsigned getShiftValue(unsigned Imm)
getShiftValue - Extract the shift value.
Definition: AArch64AddressingModes.h:86
llvm::MachineFunction::setHasWinCFI
void setHasWinCFI(bool v)
Definition: MachineFunction.h:741
llvm::MachineFrameInfo::getStackSize
uint64_t getStackSize() const
Return the number of bytes that must be allocated to hold all of the fixed size frame objects.
Definition: MachineFrameInfo.h:577
DebugLoc.h
emitShadowCallStackPrologue
static void emitShadowCallStackPrologue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, bool NeedsWinCFI, bool NeedsUnwindInfo)
Definition: AArch64FrameLowering.cpp:1314
llvm::MachineFrameInfo::getObjectOffset
int64_t getObjectOffset(int ObjectIdx) const
Return the assigned stack offset of the specified object from the incoming stack pointer.
Definition: MachineFrameInfo.h:518
llvm::AArch64FrameLowering::getFrameIndexReference
StackOffset getFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg) const override
getFrameIndexReference - Provide a base+offset reference to an FI slot for debug info.
Definition: AArch64FrameLowering.cpp:2223
Info
Analysis containing CSE Info
Definition: CSEInfo.cpp:27
llvm::BitVector
Definition: BitVector.h:75
llvm::SmallVectorImpl::append
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
Definition: SmallVector.h:670
Align
uint64_t Align
Definition: ELFObjHandler.cpp:81
llvm::StackOffset::getScalable
ScalarTy getScalable() const
Definition: TypeSize.h:150
llvm::MCCFIInstruction::createEscape
static MCCFIInstruction createEscape(MCSymbol *L, StringRef Vals, StringRef Comment="")
.cfi_escape Allows the user to add arbitrary bytes to the unwind info.
Definition: MCDwarf.h:631
llvm::AArch64FunctionInfo::hasStackFrame
bool hasStackFrame() const
Definition: AArch64MachineFunctionInfo.h:228
llvm::Align
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition: Alignment.h:39
llvm::MachineFrameInfo::getObjectIndexBegin
int getObjectIndexBegin() const
Return the minimum frame object index.
Definition: MachineFrameInfo.h:406
llvm::MachineInstrBuilder::addExternalSymbol
const MachineInstrBuilder & addExternalSymbol(const char *FnName, unsigned TargetFlags=0) const
Definition: MachineInstrBuilder.h:184
convertCalleeSaveRestoreToSPPrePostIncDec
static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc, bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI, MachineInstr::MIFlag FrameFlag=MachineInstr::FrameSetup, int CFAOffset=0)
Definition: AArch64FrameLowering.cpp:1120
llvm::None
const NoneType None
Definition: None.h:24
llvm::MachineFrameInfo::isCalleeSavedInfoValid
bool isCalleeSavedInfoValid() const
Has the callee saved info been calculated yet?
Definition: MachineFrameInfo.h:792
llvm::CallingConv::ID
unsigned ID
LLVM IR allows to use arbitrary numbers as calling convention identifiers.
Definition: CallingConv.h:24
fixupCalleeSaveRestoreStackOffset
static void fixupCalleeSaveRestoreStackOffset(MachineInstr &MI, uint64_t LocalStackSize, bool NeedsWinCFI, bool *HasWinCFI)
Definition: AArch64FrameLowering.cpp:1234
llvm::MachineBasicBlock
Definition: MachineBasicBlock.h:94
llvm::MachineFrameInfo::isDeadObjectIndex
bool isDeadObjectIndex(int ObjectIdx) const
Returns true if the specified index corresponds to a dead object.
Definition: MachineFrameInfo.h:737
llvm::Function::getAttributes
AttributeList getAttributes() const
Return the attribute list for this Function.
Definition: Function.h:314
computeCalleeSaveRegisterPairs
static void computeCalleeSaveRegisterPairs(MachineFunction &MF, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI, SmallVectorImpl< RegPairInfo > &RegPairs, bool NeedsFrameRecord)
Definition: AArch64FrameLowering.cpp:2509
AArch64AddressingModes.h
llvm::OutputFileType::Object
@ Object
StackTaggingMergeSetTag
static cl::opt< bool > StackTaggingMergeSetTag("stack-tagging-merge-settag", cl::desc("merge settag instruction in function epilog"), cl::init(true), cl::Hidden)
llvm::TargetOptions::DisableFramePointerElim
bool DisableFramePointerElim(const MachineFunction &MF) const
DisableFramePointerElim - This returns true if frame pointer elimination optimization should be disab...
Definition: TargetOptionsImpl.cpp:23
llvm::MachineFunction::getMMI
MachineModuleInfo & getMMI() const
Definition: MachineFunction.h:607
llvm::TargetRegisterInfo::getSpillAlign
Align getSpillAlign(const TargetRegisterClass &RC) const
Return the minimum required alignment in bytes for a spill slot for a register of this class.
Definition: TargetRegisterInfo.h:292
llvm::AArch64FrameLowering::resolveFrameIndexReference
StackOffset resolveFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP, bool ForSimm) const
Definition: AArch64FrameLowering.cpp:2269
llvm::MachineFunction::getSubtarget
const TargetSubtargetInfo & getSubtarget() const
getSubtarget - Return the subtarget for which this machine code is being compiled.
Definition: MachineFunction.h:656
llvm::MachineInstrBuilder::addFrameIndex
const MachineInstrBuilder & addFrameIndex(int Idx) const
Definition: MachineInstrBuilder.h:152
llvm::MachineInstrBuilder::setMIFlag
const MachineInstrBuilder & setMIFlag(MachineInstr::MIFlag Flag) const
Definition: MachineInstrBuilder.h:278
llvm::isAArch64FrameOffsetLegal
int isAArch64FrameOffsetLegal(const MachineInstr &MI, StackOffset &Offset, bool *OutUseUnscaledOp=nullptr, unsigned *OutUnscaledOp=nullptr, int64_t *EmittableOffset=nullptr)
Check if the Offset is a valid frame offset for MI.
Definition: AArch64InstrInfo.cpp:4596
llvm::AArch64FunctionInfo::getCalleeSaveBaseToFrameRecordOffset
int getCalleeSaveBaseToFrameRecordOffset() const
Definition: AArch64MachineFunctionInfo.h:415
llvm::Function::hasFnAttribute
bool hasFnAttribute(Attribute::AttrKind Kind) const
Return true if the function has the attribute.
Definition: Function.cpp:628
llvm::cl::opt< bool >
llvm::TargetRegisterInfo::getSpillSize
unsigned getSpillSize(const TargetRegisterClass &RC) const
Return the size in bytes of the stack slot allocated to hold a spilled copy of a register from class ...
Definition: TargetRegisterInfo.h:286
llvm::WinEHFuncInfo
Definition: WinEHFuncInfo.h:90
llvm::AMDGPU::Hwreg::Offset
Offset
Definition: SIDefines.h:416
llvm::AArch64FrameLowering::resetCFIToInitialState
void resetCFIToInitialState(MachineBasicBlock &MBB) const override
Emit CFI instructions that recreate the state of the unwind information upon fucntion entry.
Definition: AArch64FrameLowering.cpp:594
getSVECalleeSaveSlotRange
static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI, int &Min, int &Max)
returns true if there are any SVE callee saves.
Definition: AArch64FrameLowering.cpp:3182
Index
uint32_t Index
Definition: ELFObjHandler.cpp:82
llvm::MachineInstr
Representation of each machine instruction.
Definition: MachineInstr.h:66
llvm::MachineInstrBuilder
Definition: MachineInstrBuilder.h:69
uint64_t
llvm::MachineFrameInfo::getObjectSize
int64_t getObjectSize(int ObjectIdx) const
Return the size of the specified object.
Definition: MachineFrameInfo.h:469
AArch64FrameLowering.h
llvm::Function::getCallingConv
CallingConv::ID getCallingConv() const
getCallingConv()/setCallingConv(CC) - These method get and set the calling convention of this functio...
Definition: Function.h:238
llvm::CallingConv::CXX_FAST_TLS
@ CXX_FAST_TLS
Used for access functions.
Definition: CallingConv.h:72
llvm::MachineFrameInfo::hasStackProtectorIndex
bool hasStackProtectorIndex() const
Definition: MachineFrameInfo.h:359
llvm::AArch64FrameLowering::spillCalleeSavedRegisters
bool spillCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
spillCalleeSavedRegisters - Issues instruction(s) to spill all callee saved registers and returns tru...
Definition: AArch64FrameLowering.cpp:2700
llvm::AArch64Subtarget::getChkStkName
const char * getChkStkName() const
Definition: AArch64Subtarget.h:366
llvm::TargetOptions::SwiftAsyncFramePointer
SwiftAsyncFramePointerMode SwiftAsyncFramePointer
Control when and how the Swift async frame pointer bit should be set.
Definition: TargetOptions.h:243
llvm::LivePhysRegs::available
bool available(const MachineRegisterInfo &MRI, MCPhysReg Reg) const
Returns true if register Reg and no aliasing register is in the set.
Definition: LivePhysRegs.cpp:141
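Illustrative scratch-register probe (a sketch; the register choice is assumed):
LivePhysRegs LiveRegs(*TRI);
LiveRegs.addLiveIns(MBB);   // seed with the block's live-ins
if (LiveRegs.available(MF.getRegInfo(), AArch64::X9)) {
  // X9 and all of its aliases are free here, so X9 can serve as a scratch register
}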
llvm::MCCFIInstruction::createNegateRAState
static MCCFIInstruction createNegateRAState(MCSymbol *L)
.cfi_negate_ra_state AArch64 negate RA state.
Definition: MCDwarf.h:596
llvm::AArch64FunctionInfo::needsDwarfUnwindInfo
bool needsDwarfUnwindInfo() const
Definition: AArch64MachineFunctionInfo.cpp:132
llvm::MachineRegisterInfo::getCalleeSavedRegs
const MCPhysReg * getCalleeSavedRegs() const
Returns list of callee saved registers.
Definition: MachineRegisterInfo.cpp:623
llvm::AArch64FunctionInfo::setCalleeSaveBaseToFrameRecordOffset
void setCalleeSaveBaseToFrameRecordOffset(int Offset)
Definition: AArch64MachineFunctionInfo.h:418
llvm::ISD::CLEANUPRET
@ CLEANUPRET
CLEANUPRET - Represents a return from a cleanup block funclet.
Definition: ISDOpcodes.h:1048
llvm::AArch64FunctionInfo
AArch64FunctionInfo - This class is derived from MachineFunctionInfo and contains private AArch64-spe...
Definition: AArch64MachineFunctionInfo.h:38
I
#define I(x, y, z)
Definition: MD5.cpp:58
llvm::MCPhysReg
uint16_t MCPhysReg
An unsigned integer type large enough to represent all physical registers, but not necessarily virtua...
Definition: MCRegister.h:21
llvm::RegScavenger
Definition: RegisterScavenging.h:34
llvm::MachineFrameInfo::getObjectAlign
Align getObjectAlign(int ObjectIdx) const
Return the alignment of the specified stack object.
Definition: MachineFrameInfo.h:483
llvm::cl::init
initializer< Ty > init(const Ty &Val)
Definition: CommandLine.h:439
llvm::MachineFrameInfo::setObjectOffset
void setObjectOffset(int ObjectIdx, int64_t SPOffset)
Set the stack frame offset of the specified object.
Definition: MachineFrameInfo.h:552
llvm::TargetStackID::ScalableVector
@ ScalableVector
Definition: TargetFrameLowering.h:30
llvm::StackOffset::getScalable
static StackOffset getScalable(ScalarTy Scalable)
Definition: TypeSize.h:144
windowsRequiresStackProbe
static bool windowsRequiresStackProbe(MachineFunction &MF, uint64_t StackSizeInBytes)
Definition: AArch64FrameLowering.cpp:871
llvm::MachineBasicBlock::getLastNonDebugInstr
iterator getLastNonDebugInstr(bool SkipPseudoOp=true)
Returns an iterator to the last non-debug instruction in the basic block, or end().
Definition: MachineBasicBlock.cpp:263
llvm::RegState::Define
@ Define
Register definition.
Definition: MachineInstrBuilder.h:44
llvm::AArch64FrameLowering::getStackIDForScalableVectors
TargetStackID::Value getStackIDForScalableVectors() const override
Returns the StackID that scalable vectors should be associated with.
Definition: AArch64FrameLowering.cpp:376
llvm::TargetMachine::Options
TargetOptions Options
Definition: TargetMachine.h:118
llvm::MachineInstr::setFlags
void setFlags(unsigned flags)
Definition: MachineInstr.h:366
llvm::AArch64FunctionInfo::getLocalStackSize
uint64_t getLocalStackSize() const
Definition: AArch64MachineFunctionInfo.h:244
InsertSEH
static MachineBasicBlock::iterator InsertSEH(MachineBasicBlock::iterator MBBI, const TargetInstrInfo &TII, MachineInstr::MIFlag Flag)
Definition: AArch64FrameLowering.cpp:978
llvm::MachineFrameInfo::CreateFixedObject
int CreateFixedObject(uint64_t Size, int64_t SPOffset, bool IsImmutable, bool isAliased=false)
Create a new object at a fixed location on the stack.
Definition: MachineFrameInfo.cpp:83
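A minimal sketch (values assumed): fixed objects model slots whose distance from the incoming SP is already known, e.g. an argument stored by the caller:
int FI = MFI.CreateFixedObject(/*Size=*/8, /*SPOffset=*/0,
                               /*IsImmutable=*/true);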
llvm::AArch64FunctionInfo::setStackRealigned
void setStackRealigned(bool s)
Definition: AArch64MachineFunctionInfo.h:232
std::swap
void swap(llvm::BitVector &LHS, llvm::BitVector &RHS)
Implement std::swap in terms of BitVector swap.
Definition: BitVector.h:853
llvm::MachineFunction::getFrameInfo
MachineFrameInfo & getFrameInfo()
getFrameInfo - Return the frame info object for the current function.
Definition: MachineFunction.h:672
llvm::MachineBasicBlock::getParent
const MachineFunction * getParent() const
Return the MachineFunction containing this basic block.
Definition: MachineBasicBlock.h:261
llvm::RegScavenger::addScavengingFrameIndex
void addScavengingFrameIndex(int FI)
Add a scavenging frame index.
Definition: RegisterScavenging.h:143
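Illustrative sketch of reserving an emergency spill slot for the scavenger (the register class is assumed):
const TargetRegisterClass &RC = AArch64::GPR64RegClass;
int FI = MFI.CreateStackObject(TRI->getSpillSize(RC),
                               TRI->getSpillAlign(RC), /*isSpillSlot=*/false);
RS->addScavengingFrameIndex(FI);   // the scavenger may spill into this slot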
llvm::MachineInstrBuilder::addMemOperand
const MachineInstrBuilder & addMemOperand(MachineMemOperand *MMO) const
Definition: MachineInstrBuilder.h:202
llvm::TargetFrameLowering::getStackGrowthDirection
StackDirection getStackGrowthDirection() const
getStackGrowthDirection - Return the direction the stack grows
Definition: TargetFrameLowering.h:89
llvm::AArch64FunctionInfo::getVarArgsGPRSize
unsigned getVarArgsGPRSize() const
Definition: AArch64MachineFunctionInfo.h:341
llvm::AArch64Subtarget::isCallingConvWin64
bool isCallingConvWin64(CallingConv::ID CC) const
Definition: AArch64Subtarget.h:307
llvm::MCCFIInstruction::cfiDefCfa
static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, int Offset)
.cfi_def_cfa defines a rule for computing CFA as: take address from Register and add Offset to it.
Definition: MCDwarf.h:532
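Illustrative emission of a CFI directive (a sketch; the FP register and offset are assumed): the instruction is recorded on the function and referenced from the block by index:
unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
    nullptr, TRI->getDwarfRegNum(AArch64::FP, true), 16));
BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
    .addCFIIndex(CFIIndex)
    .setMIFlag(MachineInstr::FrameSetup);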
llvm::emitFrameOffset
void emitFrameOffset(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, unsigned DestReg, unsigned SrcReg, StackOffset Offset, const TargetInstrInfo *TII, MachineInstr::MIFlag=MachineInstr::NoFlags, bool SetNZCV=false, bool NeedsWinCFI=false, bool *HasWinCFI=nullptr, bool EmitCFAOffset=false, StackOffset InitialOffset={}, unsigned FrameReg=AArch64::SP)
emitFrameOffset - Emit instructions as needed to set DestReg to SrcReg plus Offset.
Definition: AArch64InstrInfo.cpp:4363
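Illustrative call (a sketch; the 16-byte adjustment is assumed): materialize SP := SP - 16 in a prologue, letting the helper split the update if the immediate does not fit one instruction:
emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
                StackOffset::getFixed(-16), TII,
                MachineInstr::FrameSetup);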
MachineModuleInfo.h
llvm::WinEHFuncInfo::UnwindHelpFrameIdx
int UnwindHelpFrameIdx
Definition: WinEHFuncInfo.h:99
llvm::AArch64FrameLowering::getNonLocalFrameIndexReference
StackOffset getNonLocalFrameIndexReference(const MachineFunction &MF, int FI) const override
getNonLocalFrameIndexReference - This method returns the offset used to reference a frame index locat...
Definition: AArch64FrameLowering.cpp:2233
llvm::MachineInstrBuilder::addReg
const MachineInstrBuilder & addReg(Register RegNo, unsigned flags=0, unsigned SubReg=0) const
Add a new virtual register operand.
Definition: MachineInstrBuilder.h:97
llvm::MachineInstrBuilder::addUse
const MachineInstrBuilder & addUse(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register use operand.
Definition: MachineInstrBuilder.h:123
llvm::AArch64FrameLowering::canUseRedZone
bool canUseRedZone(const MachineFunction &MF) const
Can this function use the red zone for local allocations.
Definition: AArch64FrameLowering.cpp:404
llvm::MachineInstr::MIFlag
MIFlag
Definition: MachineInstr.h:82
Cleanup
static const HTTPClientCleanup Cleanup
Definition: HTTPClient.cpp:42
llvm::MachineFunction
Definition: MachineFunction.h:257
TargetOptions.h
llvm::MachineFrameInfo::getCalleeSavedInfo
const std::vector< CalleeSavedInfo > & getCalleeSavedInfo() const
Returns a reference to the callee-saved info vector for the current function.
Definition: MachineFrameInfo.h:779
llvm::TargetMachine::getMCAsmInfo
const MCAsmInfo * getMCAsmInfo() const
Return target specific asm information.
Definition: TargetMachine.h:205
llvm::MachineBasicBlock::getFirstTerminator
iterator getFirstTerminator()
Returns an iterator to the first terminator instruction of this basic block.
Definition: MachineBasicBlock.cpp:238
EnableRedZone
static cl::opt< bool > EnableRedZone("aarch64-redzone", cl::desc("enable use of redzone on AArch64"), cl::init(false), cl::Hidden)
llvm::ArrayRef
ArrayRef - Represent a constant reference to an array (0 or more elements consecutively in memory),...
Definition: APInt.h:32
llvm::AArch64Subtarget::isXRegisterReserved
bool isXRegisterReserved(size_t i) const
Definition: AArch64Subtarget.h:201
llvm::MachineFrameInfo::setStackID
void setStackID(int ObjectIdx, uint8_t ID)
Definition: MachineFrameInfo.h:728
llvm::MachineFrameInfo::hasPatchPoint
bool hasPatchPoint() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:388
llvm::RegState::Implicit
@ Implicit
Not emitted register (e.g. carry, or temporary result).
Definition: MachineInstrBuilder.h:46
llvm::min
Expected< ExpressionValue > min(const ExpressionValue &Lhs, const ExpressionValue &Rhs)
Definition: FileCheck.cpp:357
MCAsmInfo.h
llvm::any_of
bool any_of(R &&range, UnaryPredicate P)
Provide wrappers to std::any_of which take ranges instead of having to pass begin/end explicitly.
Definition: STLExtras.h:1597
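Illustrative range form (a sketch; the predicate is assumed):
bool HasSVESave = any_of(MFI.getCalleeSavedInfo(),
                         [](const CalleeSavedInfo &CS) {
                           return AArch64::ZPRRegClass.contains(CS.getReg());
                         });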
DataLayout.h
llvm::MachineFrameInfo::CreateStackObject
int CreateStackObject(uint64_t Size, Align Alignment, bool isSpillSlot, const AllocaInst *Alloca=nullptr, uint8_t ID=0)
Create a new statically sized stack object, returning a nonnegative identifier to represent it.
Definition: MachineFrameInfo.cpp:51
llvm::StringRef
StringRef - Represent a constant reference to a string, i.e.
Definition: StringRef.h:50
llvm::MachineBasicBlock::splice
void splice(iterator Where, MachineBasicBlock *Other, iterator From)
Take an instruction from MBB 'Other' at the position From, and insert it into this MBB right before '...
Definition: MachineBasicBlock.h:1009
MBBI
MachineBasicBlock MachineBasicBlock::iterator MBBI
Definition: AArch64SLSHardening.cpp:75
llvm_unreachable
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
Definition: ErrorHandling.h:143
getRegisterOrZero
static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE)
Definition: AArch64FrameLowering.cpp:679
uint32_t
llvm::StackOffset
StackOffset is a class to represent an offset with 2 dimensions, named fixed and scalable,...
Definition: TypeSize.h:134
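Illustrative arithmetic (a sketch): the two dimensions combine independently, with the scalable part scaled by the runtime vector length:
StackOffset Off = StackOffset::get(16, 0);   // 16 fixed bytes
Off += StackOffset::getScalable(32);         // plus 32 * vscale bytes
int64_t Fixed = Off.getFixed();              // 16
int64_t Scalable = Off.getScalable();        // 32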
llvm::ilist_node_impl::getIterator
self_iterator getIterator()
Definition: ilist_node.h:82
TargetSubtargetInfo.h
DL
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
Definition: AArch64SLSHardening.cpp:76
llvm::SwiftAsyncFramePointerMode::DeploymentBased
@ DeploymentBased
Determine whether to set the bit statically or dynamically based on the deployment target.
llvm::AArch64FunctionInfo::getTailCallReservedStack
unsigned getTailCallReservedStack() const
Definition: AArch64MachineFunctionInfo.h:214
CC
auto CC
Definition: RISCVRedundantCopyElimination.cpp:79
llvm::AArch64RegisterInfo
Definition: AArch64RegisterInfo.h:26
llvm::AArch64FunctionInfo::setMinMaxSVECSFrameIndex
void setMinMaxSVECSFrameIndex(int Min, int Max)
Definition: AArch64MachineFunctionInfo.h:316
llvm::MCContext::createTempSymbol
MCSymbol * createTempSymbol()
Create a temporary symbol with a unique name.
Definition: MCContext.cpp:324
llvm::CodeModel::Tiny
@ Tiny
Definition: CodeGen.h:28
llvm::EHPersonality
EHPersonality
Definition: EHPersonalities.h:21
llvm::StackOffset::getFixed
static StackOffset getFixed(ScalarTy Fixed)
Definition: TypeSize.h:143
llvm::TargetSubtargetInfo
TargetSubtargetInfo - Generic base class for all target subtargets.
Definition: TargetSubtargetInfo.h:60
Prolog
@ Prolog
Definition: AArch64LowerHomogeneousPrologEpilog.cpp:125
llvm::MachineMemOperand::MOLoad
@ MOLoad
The memory access reads data.
Definition: MachineMemOperand.h:134
MRI
unsigned const MachineRegisterInfo * MRI
Definition: AArch64AdvSIMDScalarPass.cpp:105
llvm::MachineFrameInfo::getMaxAlign
Align getMaxAlign() const
Return the alignment in bytes that this function must be aligned to, which is greater than the defaul...
Definition: MachineFrameInfo.h:593
llvm::Register
Wrapper class representing virtual and physical registers.
Definition: Register.h:19
llvm::MachineFunction::addFrameInst
unsigned addFrameInst(const MCCFIInstruction &Inst)
Definition: MachineFunction.cpp:310
llvm::make_scope_exit
detail::scope_exit< std::decay_t< Callable > > make_scope_exit(Callable &&F)
Definition: ScopeExit.h:59
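Illustrative use (a sketch; the state flag is hypothetical): the callback runs on every exit path from the scope, including early returns:
bool InEpilogue = true;   // hypothetical state flag
auto Restore = make_scope_exit([&] { InEpilogue = false; });
if (Bail)
  return;                 // Restore still fires here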
llvm::Function::hasOptSize
bool hasOptSize() const
Optimize this function for size (-Os) or minimum size (-Oz).
Definition: Function.h:664
llvm::MachineBasicBlock::addLiveIn
void addLiveIn(MCRegister PhysReg, LaneBitmask LaneMask=LaneBitmask::getAll())
Adds the specified register as a live in.
Definition: MachineBasicBlock.h:404
llvm::MachineFrameInfo::isFrameAddressTaken
bool isFrameAddressTaken() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:370
llvm::AArch64_AM::getArithExtendImm
static unsigned getArithExtendImm(AArch64_AM::ShiftExtendType ET, unsigned Imm)
getArithExtendImm - Encode the extend type and shift amount for an arithmetic instruction: imm: 3-bit...
Definition: AArch64AddressingModes.h:171
llvm::MachineFrameInfo::hasCalls
bool hasCalls() const
Return true if the current function has any function calls.
Definition: MachineFrameInfo.h:605
llvm::AArch64FrameLowering::hasReservedCallFrame
bool hasReservedCallFrame(const MachineFunction &MF) const override
hasReservedCallFrame - Under normal circumstances, when a frame pointer is not required,...
Definition: AArch64FrameLowering.cpp:462
emitShadowCallStackEpilogue
static void emitShadowCallStackEpilogue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL)
Definition: AArch64FrameLowering.cpp:1353
CallingConv.h
MBB
MachineBasicBlock & MBB
Definition: AArch64SLSHardening.cpp:74
llvm::AArch64Subtarget::getRegisterInfo
const AArch64RegisterInfo * getRegisterInfo() const override
Definition: AArch64Subtarget.h:175
Attributes.h
emitCalleeSavedRestores
static void emitCalleeSavedRestores(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, bool SVE)
Definition: AArch64FrameLowering.cpp:636
isFuncletReturnInstr
static bool isFuncletReturnInstr(const MachineInstr &MI)
Definition: AArch64FrameLowering.cpp:1887
llvm::BitVector::test
bool test(unsigned Idx) const
Definition: BitVector.h:454
llvm::stable_sort
void stable_sort(R &&Range)
Definition: STLExtras.h:1752
llvm::AArch64_AM::UXTX
@ UXTX
Definition: AArch64AddressingModes.h:44
llvm::AArch64FrameLowering::getWinEHParentFrameOffset
unsigned getWinEHParentFrameOffset(const MachineFunction &MF) const override
The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve the parent's frame pointer...
Definition: AArch64FrameLowering.cpp:3769
llvm::MachineRegisterInfo::isLiveIn
bool isLiveIn(Register Reg) const
Definition: MachineRegisterInfo.cpp:438
kSetTagLoopThreshold
static const int kSetTagLoopThreshold
Definition: AArch64SelectionDAGInfo.cpp:119
llvm::AArch64FunctionInfo::hasCalleeSaveStackFreeSpace
bool hasCalleeSaveStackFreeSpace() const
Definition: AArch64MachineFunctionInfo.h:234
llvm::MachineFunction::getFunction
Function & getFunction()
Return the LLVM function that this machine code represents.
Definition: MachineFunction.h:622
llvm::TargetRegisterInfo::getRegSizeInBits
unsigned getRegSizeInBits(const TargetRegisterClass &RC) const
Return the size in bits of a register from class RC.
Definition: TargetRegisterInfo.h:280
uint16_t
llvm::AArch64FunctionInfo::getStackSizeSVE
uint64_t getStackSizeSVE() const
Definition: AArch64MachineFunctionInfo.h:226
llvm::MachineFunction::getTarget
const LLVMTargetMachine & getTarget() const
getTarget - Return the target machine this machine code is compiled with
Definition: MachineFunction.h:652
MachineFrameInfo.h
llvm::MachineFunction::getWinEHFuncInfo
const WinEHFuncInfo * getWinEHFuncInfo() const
getWinEHFuncInfo - Return information about how the current function uses Windows exception handling.
Definition: MachineFunction.h:700
llvm::AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced
void processFunctionBeforeFrameIndicesReplaced(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameIndicesReplaced - This method is called immediately before MO_FrameIndex op...
Definition: AArch64FrameLowering.cpp:3735
llvm::MachineOperand::getIndex
int getIndex() const
Definition: MachineOperand.h:566
llvm::CallingConv::SwiftTail
@ SwiftTail
This follows the Swift calling convention in how arguments are passed but guarantees tail calls will ...
Definition: CallingConv.h:87
Success
#define Success
Definition: AArch64Disassembler.cpp:280
llvm::TypeSize
Definition: TypeSize.h:435
Function.h
needsShadowCallStackPrologueEpilogue
static bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:1301
llvm::TargetStackID::Value
Value
Definition: TargetFrameLowering.h:27
insertCFISameValue
static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator InsertPt, unsigned DwarfReg)
Definition: AArch64FrameLowering.cpp:585
llvm::AArch64FunctionInfo::setLocalStackSize
void setLocalStackSize(uint64_t Size)
Definition: AArch64MachineFunctionInfo.h:243
llvm::MachineInstrBuilder::setMemRefs
const MachineInstrBuilder & setMemRefs(ArrayRef< MachineMemOperand * > MMOs) const
Definition: MachineInstrBuilder.h:208
AArch64MCTargetDesc.h
llvm::AArch64FunctionInfo::getSVECalleeSavedStackSize
unsigned getSVECalleeSavedStackSize() const
Definition: AArch64MachineFunctionInfo.h:312
llvm::SmallVectorImpl::clear
void clear()
Definition: SmallVector.h:597
llvm::AArch64FrameLowering::getSEHFrameIndexOffset
int getSEHFrameIndexOffset(const MachineFunction &MF, int FI) const
Definition: AArch64FrameLowering.cpp:2259
llvm::AArch64FrameLowering::getFrameIndexReferencePreferSP
StackOffset getFrameIndexReferencePreferSP(const MachineFunction &MF, int FI, Register &FrameReg, bool IgnoreSPUpdates) const override
For Win64 AArch64 EH, the offset to the Unwind object is from the SP before the update.
Definition: AArch64FrameLowering.cpp:3746
llvm::MachineFunction::hasEHFunclets
bool hasEHFunclets() const
Definition: MachineFunction.h:1099
llvm::MachineMemOperand::MOStore
@ MOStore
The memory access writes data.
Definition: MachineMemOperand.h:136
llvm::AMDGPU::Hwreg::Width
Width
Definition: SIDefines.h:439
WinEHFuncInfo.h
llvm::AArch64FrameLowering::emitPrologue
void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override
emitProlog/emitEpilog - These methods insert prolog and epilog code into the function.
Definition: AArch64FrameLowering.cpp:1375
llvm::RISCVMatInt::Imm
@ Imm
Definition: RISCVMatInt.h:23
llvm::TargetRegisterInfo::hasStackRealignment
bool hasStackRealignment(const MachineFunction &MF) const
True if stack realignment is required and still possible.
Definition: TargetRegisterInfo.h:968
llvm::AArch64Subtarget::isTargetMachO
bool isTargetMachO() const
Definition: AArch64Subtarget.h:261
llvm::AArch64FunctionInfo::getArgumentStackToRestore
unsigned getArgumentStackToRestore() const
Definition: AArch64MachineFunctionInfo.h:209
llvm::CodeModel::Large
@ Large
Definition: CodeGen.h:28
InsertReturnAddressAuth
static void InsertReturnAddressAuth(MachineFunction &MF, MachineBasicBlock &MBB)
Definition: AArch64FrameLowering.cpp:1849
invalidateRegisterPairing
static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2, bool UsesWinAAPCS, bool NeedsWinCFI, bool NeedsFrameRecord, bool IsFirst)
Returns true if Reg1 and Reg2 cannot be paired using an ldp/stp instruction.
Definition: AArch64FrameLowering.cpp:2463
llvm::getKillRegState
unsigned getKillRegState(bool B)
Definition: MachineInstrBuilder.h:546
llvm::AArch64TargetLowering::getRedZoneSize
unsigned getRedZoneSize(const Function &F) const
Definition: AArch64ISelLowering.h:864
llvm::MachineFrameInfo
The MachineFrameInfo class represents an abstract stack frame until prolog/epilog code is inserted.
Definition: MachineFrameInfo.h:105
AArch64Subtarget.h
llvm::BuildMI
MachineInstrBuilder BuildMI(MachineFunction &MF, const MIMetadata &MIMD, const MCInstrDesc &MCID)
Builder interface. Specify how to create the initial instruction itself.
Definition: MachineInstrBuilder.h:357
SmallVector.h
estimateRSStackSizeLimit
static unsigned estimateRSStackSizeLimit(MachineFunction &MF)
Look at each instruction that references stack frames and return the stack size limit beyond which so...
Definition: AArch64FrameLowering.cpp:350
llvm::MachinePointerInfo::getFixedStack
static MachinePointerInfo getFixedStack(MachineFunction &MF, int FI, int64_t Offset=0)
Return a MachinePointerInfo record that refers to the specified FrameIndex.
Definition: MachineOperand.cpp:1018
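Illustrative pairing with addMemOperand above (a sketch; the access size and alignment are assumed):
MachineMemOperand *MMO = MF.getMachineMemOperand(
    MachinePointerInfo::getFixedStack(MF, FI),
    MachineMemOperand::MOStore, 8, Align(8));
MIB.addMemOperand(MMO);   // attach the store's memory info to the instruction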
llvm::MachineBasicBlock::begin
iterator begin()
Definition: MachineBasicBlock.h:305
MachineInstrBuilder.h
llvm::AArch64FrameLowering::emitEpilogue
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override
Definition: AArch64FrameLowering.cpp:1897
llvm::MachineInstrBuilder::setMIFlags
const MachineInstrBuilder & setMIFlags(unsigned Flags) const
Definition: MachineInstrBuilder.h:273
llvm::AArch64FunctionInfo::hasSwiftAsyncContext
bool hasSwiftAsyncContext() const
Definition: AArch64MachineFunctionInfo.h:433
getFPOffset
static StackOffset getFPOffset(const MachineFunction &MF, int64_t ObjectOffset)
Definition: AArch64FrameLowering.cpp:2238
llvm::AArch64FrameLowering::canUseAsPrologue
bool canUseAsPrologue(const MachineBasicBlock &MBB) const override
Check whether or not the given MBB can be used as a prologue for the target.
Definition: AArch64FrameLowering.cpp:856
llvm::ArrayRef::size
size_t size() const
size - Get the array size.
Definition: ArrayRef.h:164
llvm::AArch64FrameLowering::getWinEHFuncletFrameSize
unsigned getWinEHFuncletFrameSize(const MachineFunction &MF) const
Funclets only need to account for space for the callee saved registers, as the locals are accounted f...
Definition: AArch64FrameLowering.cpp:3776
llvm::MachineBasicBlock::empty
bool empty() const
Definition: MachineBasicBlock.h:277
llvm::CallingConv::GHC
@ GHC
Used by the Glasgow Haskell Compiler (GHC).
Definition: CallingConv.h:50
ScopeExit.h
llvm::AArch64FunctionInfo::getCalleeSavedStackSize
unsigned getCalleeSavedStackSize(const MachineFrameInfo &MFI) const
Definition: AArch64MachineFunctionInfo.h:259
MachineMemOperand.h
llvm::SmallVectorImpl
This class consists of common code factored out of the SmallVector class to reduce code duplication b...
Definition: APFloat.h:42
llvm::reverse
auto reverse(ContainerTy &&C)
Definition: STLExtras.h:365
MachineOperand.h
llvm::TargetFrameLowering::StackGrowsDown
@ StackGrowsDown
Definition: TargetFrameLowering.h:47
llvm::Function::hasMinSize
bool hasMinSize() const
Optimize this function for minimum size (-Oz).
Definition: Function.h:661
llvm::MCCFIInstruction::createOffset
static MCCFIInstruction createOffset(MCSymbol *L, unsigned Register, int Offset)
.cfi_offset Previous value of Register is saved at offset Offset from CFA.
Definition: MCDwarf.h:570
llvm::AArch64FrameLowering::determineCalleeSaves
void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS) const override
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
Definition: AArch64FrameLowering.cpp:2949
llvm::RegState::Kill
@ Kill
The last use of a register.
Definition: MachineInstrBuilder.h:48
llvm::RegState::Dead
@ Dead
Unused definition.
Definition: MachineInstrBuilder.h:50
llvm::AArch64FrameLowering::restoreCalleeSavedRegisters
bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, MutableArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
restoreCalleeSavedRegisters - Issues instruction(s) to restore all callee saved registers and returns...
Definition: AArch64FrameLowering.cpp:2820
llvm::MachineFrameInfo::getStackProtectorIndex
int getStackProtectorIndex() const
Return the index for the stack protector object.
Definition: MachineFrameInfo.h:357
llvm::TargetMachine::getCodeModel
CodeModel::Model getCodeModel() const
Returns the code model.
Definition: TargetMachine.h:225
llvm::TargetFrameLowering::determineCalleeSaves
virtual void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS=nullptr) const
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
Definition: TargetFrameLoweringImpl.cpp:83
llvm::MachineFrameInfo::hasStackMap
bool hasStackMap() const
This method may be called any time after instruction selection is complete to determine if there is a...
Definition: MachineFrameInfo.h:382
llvm::DebugLoc
A debug info location.
Definition: DebugLoc.h:33
llvm::cl::desc
Definition: CommandLine.h:412
llvm::TargetRegisterInfo::getMinimalPhysRegClass
const TargetRegisterClass * getMinimalPhysRegClass(MCRegister Reg, MVT VT=MVT::Other) const
Returns the Register Class of a physical register of the given type, picking the most sub register cl...
Definition: TargetRegisterInfo.cpp:212
RegisterScavenging.h
llvm::AArch64Subtarget
Definition: AArch64Subtarget.h:38
raw_ostream.h
MachineFunction.h
llvm::printReg
Printable printReg(Register Reg, const TargetRegisterInfo *TRI=nullptr, unsigned SubIdx=0, const MachineRegisterInfo *MRI=nullptr)
Prints virtual and physical registers with or without a TRI instance.
Definition: TargetRegisterInfo.cpp:111
llvm::AArch64FrameLowering::orderFrameObjects
void orderFrameObjects(const MachineFunction &MF, SmallVectorImpl< int > &ObjectsToAllocate) const override
Order the symbols in the local stack frame.
Definition: AArch64FrameLowering.cpp:3850
llvm::AArch64RegisterInfo::hasBasePointer
bool hasBasePointer(const MachineFunction &MF) const
Definition: AArch64RegisterInfo.cpp:489
llvm::createCFAOffset
MCCFIInstruction createCFAOffset(const TargetRegisterInfo &MRI, unsigned Reg, const StackOffset &OffsetFromDefCFA)
Definition: AArch64InstrInfo.cpp:4200
llvm::MachineInstr::eraseFromParent
void eraseFromParent()
Unlink 'this' from the containing basic block and delete it.
Definition: MachineInstr.cpp:684
llvm::MachineInstrBundleIterator< MachineInstr >
llvm::MCAsmInfo::usesWindowsCFI
bool usesWindowsCFI() const
Definition: MCAsmInfo.h:793
llvm::abs
APFloat abs(APFloat X)
Returns the absolute value of the argument.
Definition: APFloat.h:1282
llvm::HexagonInstrInfo::copyPhysReg
void copyPhysReg(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, const DebugLoc &DL, MCRegister DestReg, MCRegister SrcReg, bool KillSrc) const override
Emit instructions to copy a pair of physical registers.
Definition: HexagonInstrInfo.cpp:854
llvm::AArch64II::MO_GOT
@ MO_GOT
MO_GOT - This flag indicates that a symbol operand represents the address of the GOT entry for the sy...
Definition: AArch64BaseInfo.h:753
TargetRegisterInfo.h
Debug.h
needsWinCFI
static bool needsWinCFI(const MachineFunction &MF)
Definition: AArch64FrameLowering.cpp:888
llvm::MachineBasicBlock::end
iterator end()
Definition: MachineBasicBlock.h:307
llvm::SwiftAsyncFramePointerMode::Always
@ Always
Always set the bit.
llvm::AArch64RegisterInfo::isReservedReg
bool isReservedReg(const MachineFunction &MF, MCRegister Reg) const
Definition: AArch64RegisterInfo.cpp:439
llvm::AArch64InstrInfo::isSEHInstruction
static bool isSEHInstruction(const MachineInstr &MI)
Return true if the instruction is an SEH instruction used for unwinding on Windows.
Definition: AArch64InstrInfo.cpp:997
llvm::AArch64FunctionInfo::setStackSizeSVE
void setStackSizeSVE(uint64_t S)
Definition: AArch64MachineFunctionInfo.h:221
llvm::SmallVectorImpl::emplace_back
reference emplace_back(ArgTypes &&... Args)
Definition: SmallVector.h:924
getArgumentStackToRestore
static int64_t getArgumentStackToRestore(MachineFunction &MF, MachineBasicBlock &MBB)
Returns how much of the incoming argument stack area (in bytes) we should clean up in an epilogue.
Definition: AArch64FrameLowering.cpp:268
llvm::MCRegister
Wrapper class representing physical registers. Should be passed by value.
Definition: MCRegister.h:24
IsSVECalleeSave
static bool IsSVECalleeSave(MachineBasicBlock::iterator I)
Definition: AArch64FrameLowering.cpp:1288
LivePhysRegs.h
llvm::MCCFIInstruction::createRestore
static MCCFIInstruction createRestore(MCSymbol *L, unsigned Register)
.cfi_restore says that the rule for Register is now the same as it was at the beginning of the functi...
Definition: MCDwarf.h:603
llvm::StackOffset::get
static StackOffset get(ScalarTy Fixed, ScalarTy Scalable)
Definition: TypeSize.h:145