LLVM  14.0.0git
IRSimilarityIdentifier.h
Go to the documentation of this file.
1 //===- IRSimilarityIdentifier.h - Find similarity in a module --------------==//
2 //
3 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4 // See https://llvm.org/LICENSE.txt for license information.
5 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6 //
7 //===----------------------------------------------------------------------===//
8 //
9 // \file
10 // Interface file for the IRSimilarityIdentifier for identifying similarities in
11 // IR including the IRInstructionMapper, which maps an Instruction to unsigned
12 // integers.
13 //
14 // Two sequences of instructions are called "similar" if they perform the same
15 // series of operations for all inputs.
16 //
17 // \code
18 // %1 = add i32 %a, 10
19 // %2 = add i32 %a, %1
20 // %3 = icmp slt icmp %1, %2
21 // \endcode
22 //
23 // and
24 //
25 // \code
26 // %1 = add i32 11, %a
27 // %2 = sub i32 %a, %1
28 // %3 = icmp sgt icmp %2, %1
29 // \endcode
30 //
31 // ultimately have the same result, even if the inputs, and structure are
32 // slightly different.
33 //
34 // For instructions, we do not worry about operands that do not have fixed
35 // semantic meaning to the program. We consider the opcode that the instruction
36 // has, the types, parameters, and extra information such as the function name,
37 // or comparison predicate. These are used to create a hash to map instructions
38 // to integers to be used in similarity matching in sequences of instructions
39 //
40 // Terminology:
41 // An IRSimilarityCandidate is a region of IRInstructionData (wrapped
42 // Instructions), usually used to denote a region of similarity has been found.
43 //
44 // A SimilarityGroup is a set of IRSimilarityCandidates that are structurally
45 // similar to one another.
46 //
47 //===----------------------------------------------------------------------===//
48 
49 #ifndef LLVM_ANALYSIS_IRSIMILARITYIDENTIFIER_H
50 #define LLVM_ANALYSIS_IRSIMILARITYIDENTIFIER_H
51 
52 #include "llvm/IR/InstVisitor.h"
53 #include "llvm/IR/Instructions.h"
54 #include "llvm/IR/Module.h"
55 #include "llvm/IR/PassManager.h"
56 #include "llvm/Pass.h"
57 #include "llvm/Support/Allocator.h"
58 
59 namespace llvm {
60 namespace IRSimilarity {
61 
63 
64 /// This represents what is and is not supported when finding similarity in
65 /// Instructions.
66 ///
67 /// Legal Instructions are considered when looking at similarity between
68 /// Instructions.
69 ///
70 /// Illegal Instructions cannot be considered when looking for similarity
71 /// between Instructions. They act as boundaries between similarity regions.
72 ///
73 /// Invisible Instructions are skipped over during analysis.
74 // TODO: Shared with MachineOutliner
76 
77 /// This provides the utilities for hashing an Instruction to an unsigned
78 /// integer. Two IRInstructionDatas produce the same hash value when their
79 /// underlying Instructions perform the same operation (even if they don't have
80 /// the same input operands.)
81 /// As a more concrete example, consider the following:
82 ///
83 /// \code
84 /// %add1 = add i32 %a, %b
85 /// %add2 = add i32 %c, %d
86 /// %add3 = add i64 %e, %f
87 /// \endcode
88 ///
89 // Then the IRInstructionData wrappers for these Instructions may be hashed like
90 /// so:
91 ///
92 /// \code
93 /// ; These two adds have the same types and operand types, so they hash to the
94 /// ; same number.
95 /// %add1 = add i32 %a, %b ; Hash: 1
96 /// %add2 = add i32 %c, %d ; Hash: 1
97 /// ; This add produces an i64. This differentiates it from %add1 and %add2. So,
98 /// ; it hashes to a different number.
99 /// %add3 = add i64 %e, %f; Hash: 2
100 /// \endcode
101 ///
102 ///
103 /// This hashing scheme will be used to represent the program as a very long
104 /// string. This string can then be placed in a data structure which can be used
105 /// for similarity queries.
106 ///
107 /// TODO: Handle types of Instructions which can be equal even with different
108 /// operands. (E.g. comparisons with swapped predicates.)
109 /// TODO: Handle CallInsts, which are only checked for function type
110 /// by \ref isSameOperationAs.
111 /// TODO: Handle GetElementPtrInsts, as some of the operands have to be the
112 /// exact same, and some do not.
114  : ilist_node<IRInstructionData, ilist_sentinel_tracking<true>> {
115 
116  /// The source Instruction that is being wrapped.
117  Instruction *Inst = nullptr;
118  /// The values of the operands in the Instruction.
120  /// The legality of the wrapped instruction. This is informed by InstrType,
121  /// and is used when checking when two instructions are considered similar.
122  /// If either instruction is not legal, the instructions are automatically not
123  /// considered similar.
124  bool Legal;
125 
126  /// This is only relevant if we are wrapping a CmpInst where we needed to
127  /// change the predicate of a compare instruction from a greater than form
128  /// to a less than form. It is None otherwise.
130 
131  /// This structure holds the distances of how far "ahead of" or "behind" the
132  /// target blocks of a branch, or the incoming blocks of a phi nodes are.
133  /// If the value is negative, it means that the block was registered before
134  /// the block of this instruction in terms of blocks in the function.
135  /// Code Example:
136  /// \code
137  /// block_1:
138  /// br i1 %0, label %block_2, label %block_3
139  /// block_2:
140  /// br i1 %1, label %block_1, label %block_2
141  /// block_3:
142  /// br i1 %2, label %block_2, label %block_1
143  /// ; Replacing the labels with relative values, this becomes:
144  /// block_1:
145  /// br i1 %0, distance 1, distance 2
146  /// block_2:
147  /// br i1 %1, distance -1, distance 0
148  /// block_3:
149  /// br i1 %2, distance -1, distance -2
150  /// \endcode
151  /// Taking block_2 as our example, block_1 is "behind" block_2, and block_2 is
152  /// "ahead" of block_2.
154 
155  /// Gather the information that is difficult to gather for an Instruction, or
156  /// is changed. i.e. the operands of an Instruction and the Types of those
157  /// operands. This extra information allows for similarity matching to make
158  /// assertions that allow for more flexibility when checking for whether an
159  /// Instruction performs the same operation.
162 
163  /// Fills data stuctures for IRInstructionData when it is constructed from a
164  // reference or a pointer.
165  void initializeInstruction();
166 
167  /// Get the predicate that the compare instruction is using for hashing the
168  /// instruction. the IRInstructionData must be wrapping a CmpInst.
170 
171  /// A function that swaps the predicates to their less than form if they are
172  /// in a greater than form. Otherwise, the predicate is unchanged.
173  ///
174  /// \param CI - The comparison operation to find a consistent preidcate for.
175  /// \return the consistent comparison predicate.
177 
178  /// For an IRInstructionData containing a branch, finds the
179  /// relative distances from the source basic block to the target by taking
180  /// the difference of the number assigned to the current basic block and the
181  /// target basic block of the branch.
182  ///
183  /// \param BasicBlockToInteger - The mapping of basic blocks to their location
184  /// in the module.
185  void
187 
188  /// Hashes \p Value based on its opcode, types, and operand types.
189  /// Two IRInstructionData instances produce the same hash when they perform
190  /// the same operation.
191  ///
192  /// As a simple example, consider the following instructions.
193  ///
194  /// \code
195  /// %add1 = add i32 %x1, %y1
196  /// %add2 = add i32 %x2, %y2
197  ///
198  /// %sub = sub i32 %x1, %y1
199  ///
200  /// %add_i64 = add i64 %x2, %y2
201  /// \endcode
202  ///
203  /// Because the first two adds operate the same types, and are performing the
204  /// same action, they will be hashed to the same value.
205  ///
206  /// However, the subtraction instruction is not the same as an addition, and
207  /// will be hashed to a different value.
208  ///
209  /// Finally, the last add has a different type compared to the first two add
210  /// instructions, so it will also be hashed to a different value that any of
211  /// the previous instructions.
212  ///
213  /// \param [in] ID - The IRInstructionData instance to be hashed.
214  /// \returns A hash_value of the IRInstructionData.
216  SmallVector<Type *, 4> OperTypes;
217  for (Value *V : ID.OperVals)
218  OperTypes.push_back(V->getType());
219 
220  if (isa<CmpInst>(ID.Inst))
221  return llvm::hash_combine(
222  llvm::hash_value(ID.Inst->getOpcode()),
223  llvm::hash_value(ID.Inst->getType()),
224  llvm::hash_value(ID.getPredicate()),
225  llvm::hash_combine_range(OperTypes.begin(), OperTypes.end()));
226  else if (CallInst *CI = dyn_cast<CallInst>(ID.Inst))
227  return llvm::hash_combine(
228  llvm::hash_value(ID.Inst->getOpcode()),
229  llvm::hash_value(ID.Inst->getType()),
230  llvm::hash_value(CI->getCalledFunction()->getName().str()),
231  llvm::hash_combine_range(OperTypes.begin(), OperTypes.end()));
232  return llvm::hash_combine(
233  llvm::hash_value(ID.Inst->getOpcode()),
234  llvm::hash_value(ID.Inst->getType()),
235  llvm::hash_combine_range(OperTypes.begin(), OperTypes.end()));
236  }
237 
239 };
240 
242  : simple_ilist<IRInstructionData, ilist_sentinel_tracking<true>> {};
243 
244 /// Compare one IRInstructionData class to another IRInstructionData class for
245 /// whether they are performing a the same operation, and can mapped to the
246 /// same value. For regular instructions if the hash value is the same, then
247 /// they will also be close.
248 ///
249 /// \param A - The first IRInstructionData class to compare
250 /// \param B - The second IRInstructionData class to compare
251 /// \returns true if \p A and \p B are similar enough to be mapped to the same
252 /// value.
253 bool isClose(const IRInstructionData &A, const IRInstructionData &B);
254 
255 struct IRInstructionDataTraits : DenseMapInfo<IRInstructionData *> {
256  static inline IRInstructionData *getEmptyKey() { return nullptr; }
258  return reinterpret_cast<IRInstructionData *>(-1);
259  }
260 
261  static unsigned getHashValue(const IRInstructionData *E) {
262  using llvm::hash_value;
263  assert(E && "IRInstructionData is a nullptr?");
264  return hash_value(*E);
265  }
266 
267  static bool isEqual(const IRInstructionData *LHS,
268  const IRInstructionData *RHS) {
269  if (RHS == getEmptyKey() || RHS == getTombstoneKey() ||
270  LHS == getEmptyKey() || LHS == getTombstoneKey())
271  return LHS == RHS;
272 
273  assert(LHS && RHS && "nullptr should have been caught by getEmptyKey?");
274  return isClose(*LHS, *RHS);
275  }
276 };
277 
278 /// Helper struct for converting the Instructions in a Module into a vector of
279 /// unsigned integers. This vector of unsigned integers can be thought of as a
280 /// "numeric string". This numeric string can then be queried by, for example,
281 /// data structures that find repeated substrings.
282 ///
283 /// This hashing is done per BasicBlock in the module. To hash Instructions
284 /// based off of their operations, each Instruction is wrapped in an
285 /// IRInstructionData struct. The unsigned integer for an IRInstructionData
286 /// depends on:
287 /// - The hash provided by the IRInstructionData.
288 /// - Which member of InstrType the IRInstructionData is classified as.
289 // See InstrType for more details on the possible classifications, and how they
290 // manifest in the numeric string.
291 ///
292 /// The numeric string for an individual BasicBlock is terminated by an unique
293 /// unsigned integer. This prevents data structures which rely on repetition
294 /// from matching across BasicBlocks. (For example, the SuffixTree.)
295 /// As a concrete example, if we have the following two BasicBlocks:
296 /// \code
297 /// bb0:
298 /// %add1 = add i32 %a, %b
299 /// %add2 = add i32 %c, %d
300 /// %add3 = add i64 %e, %f
301 /// bb1:
302 /// %sub = sub i32 %c, %d
303 /// \endcode
304 /// We may hash the Instructions like this (via IRInstructionData):
305 /// \code
306 /// bb0:
307 /// %add1 = add i32 %a, %b ; Hash: 1
308 /// %add2 = add i32 %c, %d; Hash: 1
309 /// %add3 = add i64 %e, %f; Hash: 2
310 /// bb1:
311 /// %sub = sub i32 %c, %d; Hash: 3
312 /// %add4 = add i32 %c, %d ; Hash: 1
313 /// \endcode
314 /// And produce a "numeric string representation" like so:
315 /// 1, 1, 2, unique_integer_1, 3, 1, unique_integer_2
316 ///
317 /// TODO: This is very similar to the MachineOutliner, and should be
318 /// consolidated into the same interface.
320  /// The starting illegal instruction number to map to.
321  ///
322  /// Set to -3 for compatibility with DenseMapInfo<unsigned>.
323  unsigned IllegalInstrNumber = static_cast<unsigned>(-3);
324 
325  /// The next available integer to assign to a legal Instruction to.
326  unsigned LegalInstrNumber = 0;
327 
328  /// Correspondence from IRInstructionData to unsigned integers.
331 
332  /// A mapping for a basic block in a module to its assigned number/location
333  /// in the module.
335 
336  /// Set if we added an illegal number in the previous step.
337  /// Since each illegal number is unique, we only need one of them between
338  /// each range of legal numbers. This lets us make sure we don't add more
339  /// than one illegal number per range.
340  bool AddedIllegalLastTime = false;
341 
342  /// Marks whether we found a illegal instruction in the previous step.
344 
345  /// Marks whether we have found a set of instructions that is long enough
346  /// to be considered for similarity.
347  bool HaveLegalRange = false;
348 
349  /// This allocator pointer is in charge of holding on to the IRInstructionData
350  /// so it is not deallocated until whatever external tool is using it is done
351  /// with the information.
353 
354  /// This allocator pointer is in charge of creating the IRInstructionDataList
355  /// so it is not deallocated until whatever external tool is using it is done
356  /// with the information.
358 
359  /// Get an allocated IRInstructionData struct using the InstDataAllocator.
360  ///
361  /// \param I - The Instruction to wrap with IRInstructionData.
362  /// \param Legality - A boolean value that is true if the instruction is to
363  /// be considered for similarity, and false if not.
364  /// \param IDL - The InstructionDataList that the IRInstructionData is
365  /// inserted into.
366  /// \returns An allocated IRInstructionData struct.
369 
370  /// Get an empty allocated IRInstructionData struct using the
371  /// InstDataAllocator.
372  ///
373  /// \param IDL - The InstructionDataList that the IRInstructionData is
374  /// inserted into.
375  /// \returns An allocated IRInstructionData struct.
377 
378  /// Get an allocated IRInstructionDataList object using the IDLAllocator.
379  ///
380  /// \returns An allocated IRInstructionDataList object.
382 
384 
385  /// Assigns values to all the basic blocks in function \p F starting from
386  /// integer \p BBNumber.
387  ///
388  /// \param F - The function containing the basic blocks to assign numbers to.
389  /// \param BBNumber - The number to start from.
390  void initializeForBBs(Function &F, unsigned &BBNumber) {
391  for (BasicBlock &BB : F)
392  BasicBlockToInteger.insert(std::make_pair(&BB, BBNumber++));
393  }
394 
395  /// Assigns values to all the basic blocks in Module \p M.
396  /// \param M - The module containing the basic blocks to assign numbers to.
398  unsigned BBNumber = 0;
399  for (Function &F : M)
400  initializeForBBs(F, BBNumber);
401  }
402 
403  /// Maps the Instructions in a BasicBlock \p BB to legal or illegal integers
404  /// determined by \p InstrType. Two Instructions are mapped to the same value
405  /// if they are close as defined by the InstructionData class above.
406  ///
407  /// \param [in] BB - The BasicBlock to be mapped to integers.
408  /// \param [in,out] InstrList - Vector of IRInstructionData to append to.
409  /// \param [in,out] IntegerMapping - Vector of unsigned integers to append to.
411  std::vector<IRInstructionData *> &InstrList,
412  std::vector<unsigned> &IntegerMapping);
413 
414  /// Maps an Instruction to a legal integer.
415  ///
416  /// \param [in] It - The Instruction to be mapped to an integer.
417  /// \param [in,out] IntegerMappingForBB - Vector of unsigned integers to
418  /// append to.
419  /// \param [in,out] InstrListForBB - Vector of InstructionData to append to.
420  /// \returns The integer \p It was mapped to.
422  std::vector<unsigned> &IntegerMappingForBB,
423  std::vector<IRInstructionData *> &InstrListForBB);
424 
425  /// Maps an Instruction to an illegal integer.
426  ///
427  /// \param [in] It - The \p Instruction to be mapped to an integer.
428  /// \param [in,out] IntegerMappingForBB - Vector of unsigned integers to
429  /// append to.
430  /// \param [in,out] InstrListForBB - Vector of IRInstructionData to append to.
431  /// \param End - true if creating a dummy IRInstructionData at the end of a
432  /// basic block.
433  /// \returns The integer \p It was mapped to.
434  unsigned mapToIllegalUnsigned(
435  BasicBlock::iterator &It, std::vector<unsigned> &IntegerMappingForBB,
436  std::vector<IRInstructionData *> &InstrListForBB, bool End = false);
437 
440  : InstDataAllocator(IDA), IDLAllocator(IDLA) {
441  // Make sure that the implementation of DenseMapInfo<unsigned> hasn't
442  // changed.
443  assert(DenseMapInfo<unsigned>::getEmptyKey() == static_cast<unsigned>(-1) &&
444  "DenseMapInfo<unsigned>'s empty key isn't -1!");
446  static_cast<unsigned>(-2) &&
447  "DenseMapInfo<unsigned>'s tombstone key isn't -2!");
448 
449  IDL = new (IDLAllocator->Allocate())
451  }
452 
453  /// Custom InstVisitor to classify different instructions for whether it can
454  /// be analyzed for similarity.
456  : public InstVisitor<InstructionClassification, InstrType> {
458 
459  // TODO: Determine a scheme to resolve when the label is similar enough.
461  if (EnableBranches)
462  return Legal;
463  return Illegal;
464  }
465  // TODO: Determine a scheme to resolve when the labels are similar enough.
467  // TODO: Handle allocas.
469  // We exclude variable argument instructions since variable arguments
470  // requires extra checking of the argument list.
472  // We exclude all exception handling cases since they are so context
473  // dependent.
476  // DebugInfo should be included in the regions, but should not be
477  // analyzed for similarity as it has no bearing on the outcome of the
478  // program.
480  // TODO: Handle specific intrinsics.
482  // We only allow call instructions where the function has a name and
483  // is not an indirect call.
485  Function *F = CI.getCalledFunction();
486  if (!F || CI.isIndirectCall() || !F->hasName())
487  return Illegal;
488  return Legal;
489  }
490  // TODO: We do not current handle similarity that changes the control flow.
492  // TODO: We do not current handle similarity that changes the control flow.
494  // TODO: Handle interblock similarity.
497 
498  // The flag variable that lets the classifier know whether we should
499  // allow branches to be checked for similarity.
500  bool EnableBranches = false;
501  };
502 
503  /// Maps an Instruction to a member of InstrType.
505 };
506 
507 /// This is a class that wraps a range of IRInstructionData from one point to
508 /// another in the vector of IRInstructionData, which is a region of the
509 /// program. It is also responsible for defining the structure within this
510 /// region of instructions.
511 ///
512 /// The structure of a region is defined through a value numbering system
513 /// assigned to each unique value in a region at the creation of the
514 /// IRSimilarityCandidate.
515 ///
516 /// For example, for each Instruction we add a mapping for each new
517 /// value seen in that Instruction.
518 /// IR: Mapping Added:
519 /// %add1 = add i32 %a, c1 %add1 -> 3, %a -> 1, c1 -> 2
520 /// %add2 = add i32 %a, %1 %add2 -> 4
521 /// %add3 = add i32 c2, c1 %add3 -> 6, c2 -> 5
522 ///
523 /// We can compare IRSimilarityCandidates against one another.
524 /// The \ref isSimilar function compares each IRInstructionData against one
525 /// another and if we have the same sequences of IRInstructionData that would
526 /// create the same hash, we have similar IRSimilarityCandidates.
527 ///
528 /// We can also compare the structure of IRSimilarityCandidates. If we can
529 /// create a mapping of registers in the region contained by one
530 /// IRSimilarityCandidate to the region contained by different
531 /// IRSimilarityCandidate, they can be considered structurally similar.
532 ///
533 /// IRSimilarityCandidate1: IRSimilarityCandidate2:
534 /// %add1 = add i32 %a, %b %add1 = add i32 %d, %e
535 /// %add2 = add i32 %a, %c %add2 = add i32 %d, %f
536 /// %add3 = add i32 c1, c2 %add3 = add i32 c3, c4
537 ///
538 /// Can have the following mapping from candidate to candidate of:
539 /// %a -> %d, %b -> %e, %c -> %f, c1 -> c3, c2 -> c4
540 /// and can be considered similar.
541 ///
542 /// IRSimilarityCandidate1: IRSimilarityCandidate2:
543 /// %add1 = add i32 %a, %b %add1 = add i32 %d, c4
544 /// %add2 = add i32 %a, %c %add2 = add i32 %d, %f
545 /// %add3 = add i32 c1, c2 %add3 = add i32 c3, c4
546 ///
547 /// We cannot create the same mapping since the use of c4 is not used in the
548 /// same way as %b or c2.
550 private:
551  /// The start index of this IRSimilarityCandidate in the instruction list.
552  unsigned StartIdx = 0;
553 
554  /// The number of instructions in this IRSimilarityCandidate.
555  unsigned Len = 0;
556 
557  /// The first instruction in this IRSimilarityCandidate.
558  IRInstructionData *FirstInst = nullptr;
559 
560  /// The last instruction in this IRSimilarityCandidate.
561  IRInstructionData *LastInst = nullptr;
562 
563  /// Global Value Numbering structures
564  /// @{
565  /// Stores the mapping of the value to the number assigned to it in the
566  /// IRSimilarityCandidate.
567  DenseMap<Value *, unsigned> ValueToNumber;
568  /// Stores the mapping of the number to the value assigned this number.
569  DenseMap<unsigned, Value *> NumberToValue;
570  /// Stores the mapping of a value's number to canonical numbering in the
571  /// candidate's respective similarity group.
572  DenseMap<unsigned, unsigned> NumberToCanonNum;
573  /// Stores the mapping of canonical number in the candidate's respective
574  /// similarity group to a value number.
575  DenseMap<unsigned, unsigned> CanonNumToNumber;
576  /// @}
577 
578 public:
579  /// \param StartIdx - The starting location of the region.
580  /// \param Len - The length of the region.
581  /// \param FirstInstIt - The starting IRInstructionData of the region.
582  /// \param LastInstIt - The ending IRInstructionData of the region.
583  IRSimilarityCandidate(unsigned StartIdx, unsigned Len,
584  IRInstructionData *FirstInstIt,
585  IRInstructionData *LastInstIt);
586 
587  /// \param A - The first IRInstructionCandidate to compare.
588  /// \param B - The second IRInstructionCandidate to compare.
589  /// \returns True when every IRInstructionData in \p A is similar to every
590  /// IRInstructionData in \p B.
591  static bool isSimilar(const IRSimilarityCandidate &A,
592  const IRSimilarityCandidate &B);
593 
594  /// \param [in] A - The first IRInstructionCandidate to compare.
595  /// \param [in] B - The second IRInstructionCandidate to compare.
596  /// \returns True when every IRInstructionData in \p A is structurally similar
597  /// to \p B.
598  static bool compareStructure(const IRSimilarityCandidate &A,
599  const IRSimilarityCandidate &B);
600 
601  /// \param [in] A - The first IRInstructionCandidate to compare.
602  /// \param [in] B - The second IRInstructionCandidate to compare.
603  /// \param [in,out] ValueNumberMappingA - A mapping of value numbers from
604  /// candidate \p A to candidate \B.
605  /// \param [in,out] ValueNumberMappingB - A mapping of value numbers from
606  /// candidate \p B to candidate \A.
607  /// \returns True when every IRInstructionData in \p A is structurally similar
608  /// to \p B.
609  static bool
611  const IRSimilarityCandidate &B,
612  DenseMap<unsigned, DenseSet<unsigned>> &ValueNumberMappingA,
613  DenseMap<unsigned, DenseSet<unsigned>> &ValueNumberMappingB);
614 
615  struct OperandMapping {
616  /// The IRSimilarityCandidate that holds the instruction the OperVals were
617  /// pulled from.
619 
620  /// The operand values to be analyzed.
622 
623  /// The current mapping of global value numbers from one IRSimilarityCandidate
624  /// to another IRSimilarityCandidate.
626  };
627 
628  /// A helper struct to hold the candidate, for a branch instruction, the
629  /// relative location of a label, and the label itself. This is mostly to
630  /// group the values together before passing them as a bundle to a function.
632  /// The IRSimilarityCandidate that holds the instruction the relative
633  /// location was pulled from.
635 
636  /// The relative location to be analyzed.
638 
639  /// The corresponding value.
641  };
642 
643  /// Compare the operands in \p A and \p B and check that the current mapping
644  /// of global value numbers from \p A to \p B and \p B to \A is consistent.
645  ///
646  /// \param A - The first IRInstructionCandidate, operand values, and current
647  /// operand mappings to compare.
648  /// \param B - The second IRInstructionCandidate, operand values, and current
649  /// operand mappings to compare.
650  /// \returns true if the IRSimilarityCandidates operands are compatible.
652  OperandMapping B);
653 
654  /// Compare the operands in \p A and \p B and check that the current mapping
655  /// of global value numbers from \p A to \p B and \p B to \A is consistent
656  /// given that the operands are commutative.
657  ///
658  /// \param A - The first IRInstructionCandidate, operand values, and current
659  /// operand mappings to compare.
660  /// \param B - The second IRInstructionCandidate, operand values, and current
661  /// operand mappings to compare.
662  /// \returns true if the IRSimilarityCandidates operands are compatible.
664  OperandMapping B);
665 
666  /// Compare the relative locations in \p A and \p B and check that the
667  /// distances match if both locations are contained in the region, and that
668  /// the branches both point outside the region if they do not.
669  /// Example Region:
670  /// \code
671  /// entry:
672  /// br i1 %0, label %block_1, label %block_3
673  /// block_0:
674  /// br i1 %0, label %block_1, label %block_2
675  /// block_1:
676  /// br i1 %0, label %block_2, label %block_3
677  /// block_2:
678  /// br i1 %1, label %block_1, label %block_4
679  /// block_3:
680  /// br i1 %2, label %block_2, label %block_5
681  /// \endcode
682  /// If we compare the branches in block_0 and block_1 the relative values are
683  /// 1 and 2 for both, so we consider this a match.
684  ///
685  /// If we compare the branches in entry and block_0 the relative values are
686  /// 2 and 3, and 1 and 2 respectively. Since these are not the same we do not
687  /// consider them a match.
688  ///
689  /// If we compare the branches in block_1 and block_2 the relative values are
690  /// 1 and 2, and -1 and None respectively. As a result we do not consider
691  /// these to be the same
692  ///
693  /// If we compare the branches in block_2 and block_3 the relative values are
694  /// -1 and None for both. We do consider these to be a match.
695  ///
696  /// \param A - The first IRInstructionCandidate, relative location value,
697  /// and incoming block.
698  /// \param B - The second IRInstructionCandidate, relative location value,
699  /// and incoming block.
700  /// \returns true if the relative locations match.
703 
704  /// Create a mapping from the value numbering to a different separate set of
705  /// numbers. This will serve as a guide for relating one candidate to another.
706  /// The canonical number gives use the ability identify which global value
707  /// number in one candidate relates to the global value number in the other.
708  ///
709  /// \param [in, out] CurrCand - The IRSimilarityCandidate to create a
710  /// canonical numbering for.
711  static void createCanonicalMappingFor(IRSimilarityCandidate &CurrCand);
712 
713  /// Create a mapping for the value numbering of the calling
714  /// IRSimilarityCandidate, to a different separate set of numbers, based on
715  /// the canonical ordering in \p SourceCand. These are defined based on the
716  /// found mappings in \p ToSourceMapping and \p FromSourceMapping. Both of
717  /// these relationships should have the same information, just in opposite
718  /// directions.
719  ///
720  /// \param [in, out] SourceCand - The IRSimilarityCandidate to create a
721  /// canonical numbering from.
722  /// \param ToSourceMapping - The mapping of value numbers from this candidate
723  /// to \p SourceCand.
724  /// \param FromSourceMapping - The mapping of value numbers from \p SoureCand
725  /// to this candidate.
727  IRSimilarityCandidate &SourceCand,
728  DenseMap<unsigned, DenseSet<unsigned>> &ToSourceMapping,
729  DenseMap<unsigned, DenseSet<unsigned>> &FromSourceMapping);
730 
731  /// \param [in,out] BBSet - The set to track the basic blocks.
733  for (IRInstructionData &ID : *this) {
734  BasicBlock *BB = ID.Inst->getParent();
735  if (BBSet.contains(BB))
736  continue;
737  BBSet.insert(BB);
738  }
739  }
740 
741  /// \param [in,out] BBSet - The set to track the basic blocks.
742  /// \param [in,out] BBList - A list in order of use to track the basic blocks.
744  SmallVector<BasicBlock *> &BBList) const {
745  for (IRInstructionData &ID : *this) {
746  BasicBlock *BB = ID.Inst->getParent();
747  if (BBSet.contains(BB))
748  continue;
749  BBSet.insert(BB);
750  BBList.push_back(BB);
751  }
752  }
753 
754  /// Compare the start and end indices of the two IRSimilarityCandidates for
755  /// whether they overlap. If the start instruction of one
756  /// IRSimilarityCandidate is less than the end instruction of the other, and
757  /// the start instruction of one is greater than the start instruction of the
758  /// other, they overlap.
759  ///
760  /// \returns true if the IRSimilarityCandidates do not have overlapping
761  /// instructions.
762  static bool overlap(const IRSimilarityCandidate &A,
763  const IRSimilarityCandidate &B);
764 
765  /// \returns the number of instructions in this Candidate.
766  unsigned getLength() const { return Len; }
767 
768  /// \returns the start index of this IRSimilarityCandidate.
769  unsigned getStartIdx() const { return StartIdx; }
770 
771  /// \returns the end index of this IRSimilarityCandidate.
772  unsigned getEndIdx() const { return StartIdx + Len - 1; }
773 
774  /// \returns The first IRInstructionData.
775  IRInstructionData *front() const { return FirstInst; }
776  /// \returns The last IRInstructionData.
777  IRInstructionData *back() const { return LastInst; }
778 
779  /// \returns The first Instruction.
780  Instruction *frontInstruction() { return FirstInst->Inst; }
781  /// \returns The last Instruction
782  Instruction *backInstruction() { return LastInst->Inst; }
783 
784  /// \returns The BasicBlock the IRSimilarityCandidate starts in.
785  BasicBlock *getStartBB() { return FirstInst->Inst->getParent(); }
786  /// \returns The BasicBlock the IRSimilarityCandidate ends in.
787  BasicBlock *getEndBB() { return LastInst->Inst->getParent(); }
788 
789  /// \returns The Function that the IRSimilarityCandidate is located in.
791 
792  /// Finds the positive number associated with \p V if it has been mapped.
793  /// \param [in] V - the Value to find.
794  /// \returns The positive number corresponding to the value.
795  /// \returns None if not present.
797  assert(V != nullptr && "Value is a nullptr?");
798  DenseMap<Value *, unsigned>::iterator VNIt = ValueToNumber.find(V);
799  if (VNIt == ValueToNumber.end())
800  return None;
801  return VNIt->second;
802  }
803 
804  /// Finds the Value associate with \p Num if it exists.
805  /// \param [in] Num - the number to find.
806  /// \returns The Value associated with the number.
807  /// \returns None if not present.
808  Optional<Value *> fromGVN(unsigned Num) {
809  DenseMap<unsigned, Value *>::iterator VNIt = NumberToValue.find(Num);
810  if (VNIt == NumberToValue.end())
811  return None;
812  assert(VNIt->second != nullptr && "Found value is a nullptr!");
813  return VNIt->second;
814  }
815 
816  /// Find the canonical number from the global value number \p N stored in the
817  /// candidate.
818  ///
819  /// \param N - The global value number to find the canonical number for.
820  /// \returns An optional containing the value, and None if it could not be
821  /// found.
823  DenseMap<unsigned, unsigned>::iterator NCIt = NumberToCanonNum.find(N);
824  if (NCIt == NumberToCanonNum.end())
825  return None;
826  return NCIt->second;
827  }
828 
829  /// Find the global value number from the canonical number \p N stored in the
830  /// candidate.
831  ///
832  /// \param N - The canonical number to find the global vlaue number for.
833  /// \returns An optional containing the value, and None if it could not be
834  /// found.
836  DenseMap<unsigned, unsigned>::iterator CNIt = CanonNumToNumber.find(N);
837  if (CNIt == CanonNumToNumber.end())
838  return None;
839  return CNIt->second;
840  }
841 
842  /// \param RHS -The IRSimilarityCandidate to compare against
843  /// \returns true if the IRSimilarityCandidate is occurs after the
844  /// IRSimilarityCandidate in the program.
845  bool operator<(const IRSimilarityCandidate &RHS) const {
846  return getStartIdx() > RHS.getStartIdx();
847  }
848 
850  iterator begin() const { return iterator(front()); }
851  iterator end() const { return std::next(iterator(back())); }
852 };
853 
857 typedef std::vector<IRSimilarityCandidate> SimilarityGroup;
858 typedef std::vector<SimilarityGroup> SimilarityGroupList;
859 
860 /// This class puts all the pieces of the IRInstructionData,
861 /// IRInstructionMapper, IRSimilarityCandidate together.
862 ///
863 /// It first feeds the Module or vector of Modules into the IRInstructionMapper,
864 /// and puts all the mapped instructions into a single long list of
865 /// IRInstructionData.
866 ///
867 /// The list of unsigned integers is given to the Suffix Tree or similar data
868 /// structure to find repeated subsequences. We construct an
869 /// IRSimilarityCandidate for each instance of the subsequence. We compare them
870 /// against one another since These repeated subsequences can have different
871 /// structure. For each different kind of structure found, we create a
872 /// similarity group.
873 ///
874 /// If we had four IRSimilarityCandidates A, B, C, and D where A, B and D are
875 /// structurally similar to one another, while C is different we would have two
876 /// SimilarityGroups:
877 ///
878 /// SimilarityGroup 1: SimilarityGroup 2
879 /// A, B, D C
880 ///
881 /// A list of the different similarity groups is then returned after
882 /// analyzing the module.
884 public:
885  IRSimilarityIdentifier(bool MatchBranches = true)
886  : Mapper(&InstDataAllocator, &InstDataListAllocator),
887  EnableBranches(MatchBranches) {}
888 
889 private:
890  /// Map the instructions in the module to unsigned integers, using mapping
891  /// already present in the Mapper if possible.
892  ///
893  /// \param [in] M Module - To map to integers.
894  /// \param [in,out] InstrList - The vector to append IRInstructionData to.
895  /// \param [in,out] IntegerMapping - The vector to append integers to.
896  void populateMapper(Module &M, std::vector<IRInstructionData *> &InstrList,
897  std::vector<unsigned> &IntegerMapping);
898 
899  /// Map the instructions in the modules vector to unsigned integers, using
900  /// mapping already present in the mapper if possible.
901  ///
902  /// \param [in] Modules - The list of modules to use to populate the mapper
903  /// \param [in,out] InstrList - The vector to append IRInstructionData to.
904  /// \param [in,out] IntegerMapping - The vector to append integers to.
905  void populateMapper(ArrayRef<std::unique_ptr<Module>> &Modules,
906  std::vector<IRInstructionData *> &InstrList,
907  std::vector<unsigned> &IntegerMapping);
908 
909  /// Find the similarity candidates in \p InstrList and corresponding
910  /// \p UnsignedVec
911  ///
912  /// \param [in,out] InstrList - The vector to append IRInstructionData to.
913  /// \param [in,out] IntegerMapping - The vector to append integers to.
914  /// candidates found in the program.
915  void findCandidates(std::vector<IRInstructionData *> &InstrList,
916  std::vector<unsigned> &IntegerMapping);
917 
918 public:
919  // Find the IRSimilarityCandidates in the \p Modules and group by structural
920  // similarity in a SimilarityGroup, each group is returned in a
921  // SimilarityGroupList.
922  //
923  // \param [in] Modules - the modules to analyze.
924  // \returns The groups of similarity ranges found in the modules.
926  findSimilarity(ArrayRef<std::unique_ptr<Module>> Modules);
927 
928  // Find the IRSimilarityCandidates in the given Module grouped by structural
929  // similarity in a SimilarityGroup, contained inside a SimilarityGroupList.
930  //
931  // \param [in] M - the module to analyze.
932  // \returns The groups of similarity ranges found in the module.
934 
935  // Clears \ref SimilarityCandidates if it is already filled by a previous run.
937  // If we've already analyzed a Module or set of Modules, so we must clear
938  // the SimilarityCandidates to make sure we do not have only old values
939  // hanging around.
940  if (SimilarityCandidates.hasValue())
941  SimilarityCandidates->clear();
942  else
943  SimilarityCandidates = SimilarityGroupList();
944  }
945 
946  // \returns The groups of similarity ranges found in the most recently passed
947  // set of modules.
949  return SimilarityCandidates;
950  }
951 
952 private:
953  /// The allocator for IRInstructionData.
955 
956  /// The allocator for IRInstructionDataLists.
958 
959  /// Map Instructions to unsigned integers and wraps the Instruction in an
960  /// instance of IRInstructionData.
961  IRInstructionMapper Mapper;
962 
963  /// The flag variable that marks whether we should check branches for
964  /// similarity, or only look within basic blocks.
965  bool EnableBranches = true;
966 
967  /// The SimilarityGroups found with the most recent run of \ref
968  /// findSimilarity. None if there is no recent run.
969  Optional<SimilarityGroupList> SimilarityCandidates;
970 };
971 
972 } // end namespace IRSimilarity
973 
974 /// An analysis pass based on legacy pass manager that runs and returns
975 /// IRSimilarityIdentifier run on the Module.
977  std::unique_ptr<IRSimilarity::IRSimilarityIdentifier> IRSI;
978 
979 public:
980  static char ID;
982 
984  const IRSimilarity::IRSimilarityIdentifier &getIRSI() const { return *IRSI; }
985 
986  bool doInitialization(Module &M) override;
987  bool doFinalization(Module &M) override;
988  bool runOnModule(Module &M) override;
989  void getAnalysisUsage(AnalysisUsage &AU) const override {
990  AU.setPreservesAll();
991  }
992 };
993 
994 /// An analysis pass that runs and returns the IRSimilarityIdentifier run on the
995 /// Module.
996 class IRSimilarityAnalysis : public AnalysisInfoMixin<IRSimilarityAnalysis> {
997 public:
999 
1001 
1002 private:
1004  static AnalysisKey Key;
1005 };
1006 
1007 /// Printer pass that uses \c IRSimilarityAnalysis.
1009  : public PassInfoMixin<IRSimilarityAnalysisPrinterPass> {
1010  raw_ostream &OS;
1011 
1012 public:
1015 };
1016 
1017 } // end namespace llvm
1018 
1019 #endif // LLVM_ANALYSIS_IRSIMILARITYIDENTIFIER_H
llvm::IRSimilarity::IRSimilarityCandidate::OperandMapping::ValueNumberMapping
DenseMap< unsigned, DenseSet< unsigned > > & ValueNumberMapping
The current mapping of global value numbers from one IRSimilarityCandidate to another IRSimilarityCan...
Definition: IRSimilarityIdentifier.h:625
llvm::PreservedAnalyses
A set of analyses that are preserved following a run of a transformation pass.
Definition: PassManager.h:155
llvm::IRSimilarity::IRInstructionDataTraits::isEqual
static bool isEqual(const IRInstructionData *LHS, const IRInstructionData *RHS)
Definition: IRSimilarityIdentifier.h:267
llvm::IRSimilarity::IRSimilarityCandidate::backInstruction
Instruction * backInstruction()
Definition: IRSimilarityIdentifier.h:782
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::InstructionClassification
InstructionClassification()
Definition: IRSimilarityIdentifier.h:457
llvm::IRSimilarity::IRSimilarityCandidate::getCanonicalNum
Optional< unsigned > getCanonicalNum(unsigned N)
Find the canonical number from the global value number N stored in the candidate.
Definition: IRSimilarityIdentifier.h:822
llvm::IRSimilarity::CandidateGVNMapping
DenseMap< IRSimilarityCandidate *, DenseMap< unsigned, DenseSet< unsigned > > > CandidateGVNMapping
Definition: IRSimilarityIdentifier.h:856
llvm::IRSimilarity::IRSimilarityCandidate::compareCommutativeOperandMapping
static bool compareCommutativeOperandMapping(OperandMapping A, OperandMapping B)
Compare the operands in A and B and check that the current mapping of global value numbers from A to ...
Definition: IRSimilarityIdentifier.cpp:579
llvm::IRSimilarity::IRSimilarityIdentifier::findSimilarity
SimilarityGroupList & findSimilarity(ArrayRef< std::unique_ptr< Module >> Modules)
Definition: IRSimilarityIdentifier.cpp:1071
llvm
---------------------— PointerInfo ------------------------------------—
Definition: AllocatorList.h:23
llvm::IRSimilarityAnalysis::Result
IRSimilarity::IRSimilarityIdentifier Result
Definition: IRSimilarityIdentifier.h:998
M
We currently emits eax Perhaps this is what we really should generate is Is imull three or four cycles eax eax The current instruction priority is based on pattern complexity The former is more complex because it folds a load so the latter will not be emitted Perhaps we should use AddedComplexity to give LEA32r a higher priority We should always try to match LEA first since the LEA matching code does some estimate to determine whether the match is profitable if we care more about code then imull is better It s two bytes shorter than movl leal On a Pentium M
Definition: README.txt:252
llvm::IRSimilarity::IRSimilarityCandidate::getBasicBlocks
void getBasicBlocks(DenseSet< BasicBlock * > &BBSet, SmallVector< BasicBlock * > &BBList) const
Definition: IRSimilarityIdentifier.h:743
llvm::VAArgInst
This class represents the va_arg llvm instruction, which returns an argument of the specified type gi...
Definition: Instructions.h:1833
llvm::IRSimilarity::IRInstructionData::setBranchSuccessors
void setBranchSuccessors(DenseMap< BasicBlock *, unsigned > &BasicBlockToInteger)
For an IRInstructionData containing a branch, finds the relative distances from the source basic bloc...
Definition: IRSimilarityIdentifier.cpp:65
llvm::IRSimilarity::isClose
bool isClose(const IRInstructionData &A, const IRInstructionData &B)
Compare one IRInstructionData class to another IRInstructionData class for whether they are performin...
Definition: IRSimilarityIdentifier.cpp:121
llvm::IRSimilarity::IRInstructionDataList
Definition: IRSimilarityIdentifier.h:241
llvm::IRSimilarity::IRInstructionMapper::BasicBlockToInteger
DenseMap< BasicBlock *, unsigned > BasicBlockToInteger
A mapping for a basic block in a module to its assigned number/location in the module.
Definition: IRSimilarityIdentifier.h:334
llvm::CmpInst::Predicate
Predicate
This enumeration lists the possible predicates for CmpInst subclasses.
Definition: InstrTypes.h:720
llvm::BasicBlock::iterator
InstListType::iterator iterator
Instruction iterators...
Definition: BasicBlock.h:90
llvm::ModulePass
ModulePass class - This class is used to implement unstructured interprocedural optimizations and ana...
Definition: Pass.h:238
llvm::BasicBlock::getParent
const Function * getParent() const
Return the enclosing method, or null if none.
Definition: BasicBlock.h:107
llvm::IRSimilarity::IRInstructionMapper::LegalInstrNumber
unsigned LegalInstrNumber
The next available integer to assign to a legal Instruction to.
Definition: IRSimilarityIdentifier.h:326
llvm::PassInfoMixin
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition: PassManager.h:374
llvm::Function
Definition: Function.h:61
llvm::IRSimilarity::IRSimilarityCandidate::compareStructure
static bool compareStructure(const IRSimilarityCandidate &A, const IRSimilarityCandidate &B)
Definition: IRSimilarityIdentifier.cpp:642
Pass.h
llvm::IRSimilarityAnalysis
An analysis pass that runs and returns the IRSimilarityIdentifier run on the Module.
Definition: IRSimilarityIdentifier.h:996
llvm::SmallVector
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
Definition: SmallVector.h:1168
llvm::LandingPadInst
The landingpad instruction holds all of the information necessary to generate correct exception handl...
Definition: Instructions.h:2885
llvm::IRSimilarity::IRSimilarityCandidate::compareNonCommutativeOperandMapping
static bool compareNonCommutativeOperandMapping(OperandMapping A, OperandMapping B)
Compare the operands in A and B and check that the current mapping of global value numbers from A to ...
Definition: IRSimilarityIdentifier.cpp:545
llvm::IRSimilarity::IRSimilarityCandidate::getBasicBlocks
void getBasicBlocks(DenseSet< BasicBlock * > &BBSet) const
Definition: IRSimilarityIdentifier.h:732
Allocator.h
llvm::IRSimilarityIdentifierWrapperPass::getIRSI
const IRSimilarity::IRSimilarityIdentifier & getIRSI() const
Definition: IRSimilarityIdentifier.h:984
llvm::IRSimilarity::Invisible
@ Invisible
Definition: IRSimilarityIdentifier.h:75
llvm::SpecificBumpPtrAllocator
A BumpPtrAllocator that allows only elements of a specific type to be allocated.
Definition: Allocator.h:376
llvm::DenseMapIterator
Definition: DenseMap.h:56
llvm::IRSimilarity::IRSimilarityIdentifier::getSimilarity
Optional< SimilarityGroupList > & getSimilarity()
Definition: IRSimilarityIdentifier.h:948
llvm::IRSimilarity::IRInstructionData::RevisedPredicate
Optional< CmpInst::Predicate > RevisedPredicate
This is only relevant if we are wrapping a CmpInst where we needed to change the predicate of a compa...
Definition: IRSimilarityIdentifier.h:129
llvm::IRSimilarity::IRSimilarityCandidate::RelativeLocMapping::RelativeLocation
int RelativeLocation
The relative location to be analyzed.
Definition: IRSimilarityIdentifier.h:637
Module.h
llvm::IRSimilarity::SimilarityGroup
std::vector< IRSimilarityCandidate > SimilarityGroup
Definition: IRSimilarityIdentifier.h:857
llvm::Optional< CmpInst::Predicate >
llvm::IRSimilarity::IRSimilarityIdentifier::resetSimilarityCandidates
void resetSimilarityCandidates()
Definition: IRSimilarityIdentifier.h:936
llvm::IRSimilarity::IRSimilarityCandidate::fromCanonicalNum
Optional< unsigned > fromCanonicalNum(unsigned N)
Find the global value number from the canonical number N stored in the candidate.
Definition: IRSimilarityIdentifier.h:835
llvm::IRSimilarity::SimilarityGroupList
std::vector< SimilarityGroup > SimilarityGroupList
Definition: IRSimilarityIdentifier.h:858
llvm::hash_value
hash_code hash_value(const APFloat &Arg)
See friend declarations above.
Definition: APFloat.cpp:4827
llvm::detail::DenseSetImpl< ValueT, DenseMap< ValueT, detail::DenseSetEmpty, DenseMapInfo< ValueT >, detail::DenseSetPair< ValueT > >, DenseMapInfo< ValueT > >::insert
std::pair< iterator, bool > insert(const ValueT &V)
Definition: DenseSet.h:206
llvm::IRSimilarity::IRInstructionMapper::IDLAllocator
SpecificBumpPtrAllocator< IRInstructionDataList > * IDLAllocator
This allocator pointer is in charge of creating the IRInstructionDataList so it is not deallocated un...
Definition: IRSimilarityIdentifier.h:357
F
#define F(x, y, z)
Definition: MD5.cpp:56
llvm::IRSimilarity::IRInstructionMapper::IDL
IRInstructionDataList * IDL
Definition: IRSimilarityIdentifier.h:383
llvm::BasicBlock
LLVM Basic Block Representation.
Definition: BasicBlock.h:58
llvm::IRSimilarity::IRSimilarityCandidate::fromGVN
Optional< Value * > fromGVN(unsigned Num)
Finds the Value associate with Num if it exists.
Definition: IRSimilarityIdentifier.h:808
llvm::Optional::hasValue
constexpr bool hasValue() const
Definition: Optional.h:288
llvm::IRSimilarity::IRSimilarityCandidate::IRSimilarityCandidate
IRSimilarityCandidate(unsigned StartIdx, unsigned Len, IRInstructionData *FirstInstIt, IRInstructionData *LastInstIt)
Definition: IRSimilarityIdentifier.cpp:325
llvm::DenseMapInfo
Definition: APInt.h:34
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::EnableBranches
bool EnableBranches
Definition: IRSimilarityIdentifier.h:500
llvm::IRSimilarityIdentifierWrapperPass
An analysis pass based on legacy pass manager that runs and returns IRSimilarityIdentifier run on the...
Definition: IRSimilarityIdentifier.h:976
E
static GCRegistry::Add< CoreCLRGC > E("coreclr", "CoreCLR-compatible GC")
llvm::IRSimilarity::Illegal
@ Illegal
Definition: IRSimilarityIdentifier.h:75
llvm::CallBase::getCalledFunction
Function * getCalledFunction() const
Returns the function called, or null if this is an indirect function invocation.
Definition: InstrTypes.h:1393
llvm::IRSimilarity::IRSimilarityIdentifier::IRSimilarityIdentifier
IRSimilarityIdentifier(bool MatchBranches=true)
Definition: IRSimilarityIdentifier.h:885
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitAllocaInst
InstrType visitAllocaInst(AllocaInst &AI)
Definition: IRSimilarityIdentifier.h:468
llvm::AnalysisUsage
Represent the analysis usage information of a pass.
Definition: PassAnalysisSupport.h:47
llvm::IRSimilarity::IRInstructionData::IDL
IRInstructionDataList * IDL
Definition: IRSimilarityIdentifier.h:238
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitCallBrInst
InstrType visitCallBrInst(CallBrInst &CBI)
Definition: IRSimilarityIdentifier.h:493
llvm::IRSimilarity::IRInstructionMapper
Helper struct for converting the Instructions in a Module into a vector of unsigned integers.
Definition: IRSimilarityIdentifier.h:319
llvm::IRSimilarity::IRSimilarityCandidate::OperandMapping
Definition: IRSimilarityIdentifier.h:615
B
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
llvm::IRSimilarity::IRInstructionMapper::HaveLegalRange
bool HaveLegalRange
Marks whether we have found a set of instructions that is long enough to be considered for similarity...
Definition: IRSimilarityIdentifier.h:347
llvm::Instruction
Definition: Instruction.h:45
llvm::IRSimilarity::IRSimilarityCandidate::isSimilar
static bool isSimilar(const IRSimilarityCandidate &A, const IRSimilarityCandidate &B)
Definition: IRSimilarityIdentifier.cpp:382
llvm::raw_ostream
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition: raw_ostream.h:53
llvm::IRSimilarity::IRSimilarityCandidate::begin
iterator begin() const
Definition: IRSimilarityIdentifier.h:850
llvm::DbgInfoIntrinsic
This is the common base class for debug info intrinsics.
Definition: IntrinsicInst.h:134
llvm::IRSimilarity::IRSimilarityCandidate::iterator
IRInstructionDataList::iterator iterator
Definition: IRSimilarityIdentifier.h:849
llvm::IRSimilarity::IRInstructionMapper::InstDataAllocator
SpecificBumpPtrAllocator< IRInstructionData > * InstDataAllocator
This allocator pointer is in charge of holding on to the IRInstructionData so it is not deallocated u...
Definition: IRSimilarityIdentifier.h:352
llvm::IRSimilarity::IRSimilarityCandidate::RelativeLocMapping::IRSC
const IRSimilarityCandidate & IRSC
The IRSimilarityCandidate that holds the instruction the relative location was pulled from.
Definition: IRSimilarityIdentifier.h:634
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitInstruction
InstrType visitInstruction(Instruction &I)
Definition: IRSimilarityIdentifier.h:496
llvm::IRSimilarity::IRInstructionMapper::InstClassifier
InstructionClassification InstClassifier
Maps an Instruction to a member of InstrType.
Definition: IRSimilarityIdentifier.h:504
llvm::None
const NoneType None
Definition: None.h:23
llvm::IRSimilarity::IRSimilarityCandidate::getEndIdx
unsigned getEndIdx() const
Definition: IRSimilarityIdentifier.h:772
llvm::InvokeInst
Invoke instruction.
Definition: Instructions.h:3749
llvm::CmpInst
This class is the base class for the comparison instructions.
Definition: InstrTypes.h:710
llvm::DenseSet< unsigned >
llvm::IRSimilarity::IRSimilarityCandidate::getFunction
Function * getFunction()
Definition: IRSimilarityIdentifier.h:790
llvm::IRSimilarityIdentifierWrapperPass::runOnModule
bool runOnModule(Module &M) override
runOnModule - Virtual method overriden by subclasses to process the module being operated on.
Definition: IRSimilarityIdentifier.cpp:1117
llvm::IRSimilarity::IRSimilarityCandidate::getStartBB
BasicBlock * getStartBB()
Definition: IRSimilarityIdentifier.h:785
llvm::IRSimilarity::IRInstructionMapper::InstructionIntegerMap
DenseMap< IRInstructionData *, unsigned, IRInstructionDataTraits > InstructionIntegerMap
Correspondence from IRInstructionData to unsigned integers.
Definition: IRSimilarityIdentifier.h:330
VI
@ VI
Definition: SIInstrInfo.cpp:7679
llvm::IRSimilarity::IRSimilarityCandidate::RelativeLocMapping::OperVal
Value * OperVal
The corresponding value.
Definition: IRSimilarityIdentifier.h:640
llvm::IRSimilarity::IRInstructionMapper::initializeForBBs
void initializeForBBs(Module &M)
Assigns values to all the basic blocks in Module M.
Definition: IRSimilarityIdentifier.h:397
llvm::IRSimilarity::Legal
@ Legal
Definition: IRSimilarityIdentifier.h:75
llvm::simple_ilist< IRInstructionData, ilist_sentinel_tracking< true > >::iterator
ilist_iterator< OptionsT, false, false > iterator
Definition: simple_ilist.h:95
llvm::IRSimilarity::IRSimilarityCandidate::overlap
static bool overlap(const IRSimilarityCandidate &A, const IRSimilarityCandidate &B)
Compare the start and end indices of the two IRSimilarityCandidates for whether they overlap.
Definition: IRSimilarityIdentifier.cpp:758
llvm::IRSimilarityIdentifierWrapperPass::getAnalysisUsage
void getAnalysisUsage(AnalysisUsage &AU) const override
getAnalysisUsage - This function should be overriden by passes that need analysis information to do t...
Definition: IRSimilarityIdentifier.h:989
llvm::IRSimilarity::IRInstructionDataTraits::getHashValue
static unsigned getHashValue(const IRInstructionData *E)
Definition: IRSimilarityIdentifier.h:261
llvm::CallBrInst
CallBr instruction, tracking function calls that may not return control but instead transfer it to a ...
Definition: Instructions.h:3958
llvm::IRSimilarityIdentifierWrapperPass::getIRSI
IRSimilarity::IRSimilarityIdentifier & getIRSI()
Definition: IRSimilarityIdentifier.h:983
llvm::IRSimilarityAnalysis::run
Result run(Module &M, ModuleAnalysisManager &)
Definition: IRSimilarityIdentifier.cpp:1123
llvm::IRSimilarity::IRInstructionMapper::allocateIRInstructionData
IRInstructionData * allocateIRInstructionData(Instruction &I, bool Legality, IRInstructionDataList &IDL)
Get an allocated IRInstructionData struct using the InstDataAllocator.
Definition: IRSimilarityIdentifier.cpp:274
llvm::DenseMap< BasicBlock *, unsigned >
llvm::AnalysisKey
A special type used by analysis passes to provide an address that identifies that particular analysis...
Definition: PassManager.h:72
llvm::IRSimilarity::IRSimilarityCandidate::operator<
bool operator<(const IRSimilarityCandidate &RHS) const
Definition: IRSimilarityIdentifier.h:845
I
#define I(x, y, z)
Definition: MD5.cpp:59
llvm::IRSimilarity::InstrType
InstrType
This represents what is and is not supported when finding similarity in Instructions.
Definition: IRSimilarityIdentifier.h:75
llvm::IRSimilarity::IRInstructionMapper::convertToUnsignedVec
void convertToUnsignedVec(BasicBlock &BB, std::vector< IRInstructionData * > &InstrList, std::vector< unsigned > &IntegerMapping)
Maps the Instructions in a BasicBlock BB to legal or illegal integers determined by InstrType.
Definition: IRSimilarityIdentifier.cpp:192
llvm::IRSimilarityIdentifierWrapperPass::doFinalization
bool doFinalization(Module &M) override
doFinalization - Virtual method overriden by subclasses to do any necessary clean up after all passes...
Definition: IRSimilarityIdentifier.cpp:1112
llvm::IRSimilarity::IRSimilarityCandidate::back
IRInstructionData * back() const
Definition: IRSimilarityIdentifier.h:777
llvm::DenseMapBase::find
iterator find(const_arg_type_t< KeyT > Val)
Definition: DenseMap.h:150
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitPHINode
InstrType visitPHINode(PHINode &PN)
Definition: IRSimilarityIdentifier.h:466
llvm::IRSimilarity::IRInstructionData::predicateForConsistency
static CmpInst::Predicate predicateForConsistency(CmpInst *CI)
A function that swaps the predicates to their less than form if they are in a greater than form.
Definition: IRSimilarityIdentifier.cpp:89
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification
Custom InstVisitor to classify different instructions for whether it can be analyzed for similarity.
Definition: IRSimilarityIdentifier.h:455
assert
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
llvm::IRSimilarity::IRInstructionData
This provides the utilities for hashing an Instruction to an unsigned integer.
Definition: IRSimilarityIdentifier.h:113
llvm::IRSimilarity::IRSimilarityCandidate::OperandMapping::OperVals
ArrayRef< Value * > & OperVals
The operand values to be analyzed.
Definition: IRSimilarityIdentifier.h:621
llvm::IRSimilarity::IRInstructionData::IRInstructionData
IRInstructionData(Instruction &I, bool Legality, IRInstructionDataList &IDL)
Gather the information that is difficult to gather for an Instruction, or is changed.
Definition: IRSimilarityIdentifier.cpp:32
llvm::IRSimilarity::IRSimilarityCandidate::OperandMapping::IRSC
const IRSimilarityCandidate & IRSC
The IRSimilarityCandidate that holds the instruction the OperVals were pulled from.
Definition: IRSimilarityIdentifier.h:618
llvm::IRSimilarity::IRSimilarityCandidate::getGVN
Optional< unsigned > getGVN(Value *V)
Finds the positive number associated with V if it has been mapped.
Definition: IRSimilarityIdentifier.h:796
llvm::IRSimilarity::IRInstructionMapper::mapToIllegalUnsigned
unsigned mapToIllegalUnsigned(BasicBlock::iterator &It, std::vector< unsigned > &IntegerMappingForBB, std::vector< IRInstructionData * > &InstrListForBB, bool End=false)
Maps an Instruction to an illegal integer.
Definition: IRSimilarityIdentifier.cpp:291
llvm::IRSimilarityAnalysisPrinterPass::run
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
Definition: IRSimilarityIdentifier.cpp:1132
llvm::Module
A Module instance is used to store all the information related to an LLVM module.
Definition: Module.h:67
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitFuncletPadInst
InstrType visitFuncletPadInst(FuncletPadInst &FPI)
Definition: IRSimilarityIdentifier.h:475
llvm::IRSimilarity::IRSimilarityCandidate::createCanonicalMappingFor
static void createCanonicalMappingFor(IRSimilarityCandidate &CurrCand)
Create a mapping from the value numbering to a different separate set of numbers.
Definition: IRSimilarityIdentifier.cpp:929
llvm::AnalysisInfoMixin
A CRTP mix-in that provides informational APIs needed for analysis passes.
Definition: PassManager.h:397
llvm::IRSimilarity::IRSimilarityCandidate::end
iterator end() const
Definition: IRSimilarityIdentifier.h:851
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitLandingPadInst
InstrType visitLandingPadInst(LandingPadInst &LPI)
Definition: IRSimilarityIdentifier.h:474
llvm::IRSimilarityIdentifierWrapperPass::IRSimilarityIdentifierWrapperPass
IRSimilarityIdentifierWrapperPass()
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitVAArgInst
InstrType visitVAArgInst(VAArgInst &VI)
Definition: IRSimilarityIdentifier.h:471
llvm::IRSimilarity::IRSimilarityCandidate::front
IRInstructionData * front() const
Definition: IRSimilarityIdentifier.h:775
llvm::IRSimilarity::IRInstructionData::Legal
bool Legal
The legality of the wrapped instruction.
Definition: IRSimilarityIdentifier.h:124
llvm::ArrayRef
ArrayRef - Represent a constant reference to an array (0 or more elements consecutively in memory),...
Definition: APInt.h:32
llvm::IRSimilarity::IRInstructionData::RelativeBlockLocations
SmallVector< int, 4 > RelativeBlockLocations
This structure holds the distances of how far "ahead of" or "behind" the target blocks of a branch,...
Definition: IRSimilarityIdentifier.h:153
llvm::IRSimilarity::IRInstructionData::Inst
Instruction * Inst
The source Instruction that is being wrapped.
Definition: IRSimilarityIdentifier.h:117
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitIntrinsicInst
InstrType visitIntrinsicInst(IntrinsicInst &II)
Definition: IRSimilarityIdentifier.h:481
InstVisitor.h
llvm::IRSimilarityIdentifierWrapperPass::ID
static char ID
Definition: IRSimilarityIdentifier.h:980
llvm::Value::getType
Type * getType() const
All values are typed, get the type of this value.
Definition: Value.h:256
llvm::detail::DenseSetImpl< ValueT, DenseMap< ValueT, detail::DenseSetEmpty, DenseMapInfo< ValueT >, detail::DenseSetPair< ValueT > >, DenseMapInfo< ValueT > >::contains
bool contains(const_arg_type_t< ValueT > V) const
Check if the set contains the given element.
Definition: DenseSet.h:185
llvm::IRSimilarity::IRInstructionMapper::mapToLegalUnsigned
unsigned mapToLegalUnsigned(BasicBlock::iterator &It, std::vector< unsigned > &IntegerMappingForBB, std::vector< IRInstructionData * > &InstrListForBB)
Maps an Instruction to a legal integer.
Definition: IRSimilarityIdentifier.cpp:226
llvm::InstVisitor
Base class for instruction visitors.
Definition: InstVisitor.h:79
llvm::DenseMapBase::insert
std::pair< iterator, bool > insert(const std::pair< KeyT, ValueT > &KV)
Definition: DenseMap.h:207
llvm::ilist_node
Definition: ilist_node.h:148
llvm::IRSimilarity::IRSimilarityCandidate::createCanonicalRelationFrom
void createCanonicalRelationFrom(IRSimilarityCandidate &SourceCand, DenseMap< unsigned, DenseSet< unsigned >> &ToSourceMapping, DenseMap< unsigned, DenseSet< unsigned >> &FromSourceMapping)
Create a mapping for the value numbering of the calling IRSimilarityCandidate, to a different separat...
Definition: IRSimilarityIdentifier.cpp:868
llvm::IRSimilarityIdentifierWrapperPass::doInitialization
bool doInitialization(Module &M) override
doInitialization - Virtual method overridden by subclasses to do any necessary initialization before ...
Definition: IRSimilarityIdentifier.cpp:1107
llvm::AnalysisUsage::setPreservesAll
void setPreservesAll()
Set by analyses that do not transform their input at all.
Definition: PassAnalysisSupport.h:130
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitInvokeInst
InstrType visitInvokeInst(InvokeInst &II)
Definition: IRSimilarityIdentifier.h:491
llvm::DenseMapBase::end
iterator end()
Definition: DenseMap.h:83
llvm::IRSimilarity::IRInstructionMapper::IllegalInstrNumber
unsigned IllegalInstrNumber
The starting illegal instruction number to map to.
Definition: IRSimilarityIdentifier.h:323
llvm::ilist_iterator
Iterator for intrusive lists based on ilist_node.
Definition: ilist_iterator.h:57
llvm::IRSimilarity::IRInstructionMapper::allocateIRInstructionDataList
IRInstructionDataList * allocateIRInstructionDataList()
Get an allocated IRInstructionDataList object using the IDLAllocator.
Definition: IRSimilarityIdentifier.cpp:285
llvm::IRSimilarity::IRInstructionMapper::IRInstructionMapper
IRInstructionMapper(SpecificBumpPtrAllocator< IRInstructionData > *IDA, SpecificBumpPtrAllocator< IRInstructionDataList > *IDLA)
Definition: IRSimilarityIdentifier.h:438
llvm::IRSimilarity::IRSimilarityCandidate::checkRelativeLocations
static bool checkRelativeLocations(RelativeLocMapping A, RelativeLocMapping B)
Compare the relative locations in A and B and check that the distances match if both locations are co...
Definition: IRSimilarityIdentifier.cpp:614
llvm::IRSimilarity::IRSimilarityCandidate::getStartIdx
unsigned getStartIdx() const
Definition: IRSimilarityIdentifier.h:769
llvm::IRSimilarity::IRSimilarityCandidate::frontInstruction
Instruction * frontInstruction()
Definition: IRSimilarityIdentifier.h:780
llvm::CallBase::isIndirectCall
bool isIndirectCall() const
Return true if the callsite is an indirect call.
Definition: Instructions.cpp:289
PassManager.h
llvm::IRSimilarity::IRSimilarityCandidate
This is a class that wraps a range of IRInstructionData from one point to another in the vector of IR...
Definition: IRSimilarityIdentifier.h:549
llvm::IntrinsicInst
A wrapper class for inspecting calls to intrinsic functions.
Definition: IntrinsicInst.h:45
llvm::IRSimilarity::IRInstructionDataTraits::getTombstoneKey
static IRInstructionData * getTombstoneKey()
Definition: IRSimilarityIdentifier.h:257
llvm::IRSimilarity::IRInstructionMapper::initializeForBBs
void initializeForBBs(Function &F, unsigned &BBNumber)
Assigns values to all the basic blocks in function F starting from integer BBNumber.
Definition: IRSimilarityIdentifier.h:390
llvm::IRSimilarity::IRInstructionData::initializeInstruction
void initializeInstruction()
Fills data stuctures for IRInstructionData when it is constructed from a.
Definition: IRSimilarityIdentifier.cpp:38
Instructions.h
llvm::IRSimilarity::IRSimilarityCandidate::getEndBB
BasicBlock * getEndBB()
Definition: IRSimilarityIdentifier.h:787
llvm::FuncletPadInst
Definition: InstrTypes.h:2304
llvm::hash_combine
hash_code hash_combine(const Ts &...args)
Combine values into a single hash_code.
Definition: Hashing.h:604
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitCallInst
InstrType visitCallInst(CallInst &CI)
Definition: IRSimilarityIdentifier.h:484
llvm::IRSimilarity::IRInstructionData::getPredicate
CmpInst::Predicate getPredicate() const
Get the predicate that the compare instruction is using for hashing the instruction.
Definition: IRSimilarityIdentifier.cpp:105
llvm::IRSimilarity::IRInstructionDataTraits::getEmptyKey
static IRInstructionData * getEmptyKey()
Definition: IRSimilarityIdentifier.h:256
N
#define N
llvm::IRSimilarity::IRSimilarityCandidate::RelativeLocMapping
A helper struct to hold the candidate, for a branch instruction, the relative location of a label,...
Definition: IRSimilarityIdentifier.h:631
llvm::hash_combine_range
hash_code hash_combine_range(InputIteratorT first, InputIteratorT last)
Compute a hash_code for a sequence of values.
Definition: Hashing.h:482
llvm::Instruction::getParent
const BasicBlock * getParent() const
Definition: Instruction.h:94
llvm::IRSimilarity::IRInstructionDataTraits
Definition: IRSimilarityIdentifier.h:255
llvm::IRSimilarity::IRSimilarityCandidate::getLength
unsigned getLength() const
Definition: IRSimilarityIdentifier.h:766
llvm::IRSimilarity::IRInstructionMapper::AddedIllegalLastTime
bool AddedIllegalLastTime
Set if we added an illegal number in the previous step.
Definition: IRSimilarityIdentifier.h:340
llvm::PHINode
Definition: Instructions.h:2633
llvm::simple_ilist
A simple intrusive list implementation.
Definition: simple_ilist.h:78
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitBranchInst
InstrType visitBranchInst(BranchInst &BI)
Definition: IRSimilarityIdentifier.h:460
llvm::AnalysisManager
A container for analyses that lazily runs them and caches their results.
Definition: InstructionSimplify.h:44
llvm::IRSimilarity::IRInstructionMapper::CanCombineWithPrevInstr
bool CanCombineWithPrevInstr
Marks whether we found a illegal instruction in the previous step.
Definition: IRSimilarityIdentifier.h:343
llvm::CallInst
This class represents a function call, abstracting a target machine's calling convention.
Definition: Instructions.h:1475
llvm::IRSimilarity::IRInstructionData::OperVals
SmallVector< Value *, 4 > OperVals
The values of the operands in the Instruction.
Definition: IRSimilarityIdentifier.h:119
BB
Common register allocation spilling lr str ldr sxth r3 ldr mla r4 can lr mov lr str ldr sxth r3 mla r4 and then merge mul and lr str ldr sxth r3 mla r4 It also increase the likelihood the store may become dead bb27 Successors according to LLVM BB
Definition: README.txt:39
llvm::IRSimilarity::IRSimilarityIdentifier
This class puts all the pieces of the IRInstructionData, IRInstructionMapper, IRSimilarityCandidate t...
Definition: IRSimilarityIdentifier.h:883
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitTerminator
InstrType visitTerminator(Instruction &I)
Definition: IRSimilarityIdentifier.h:495
llvm::AllocaInst
an instruction to allocate memory on the stack
Definition: Instructions.h:62
llvm::IRSimilarityAnalysisPrinterPass
Printer pass that uses IRSimilarityAnalysis.
Definition: IRSimilarityIdentifier.h:1008
llvm::IRSimilarity::IRInstructionMapper::InstructionClassification::visitDbgInfoIntrinsic
InstrType visitDbgInfoIntrinsic(DbgInfoIntrinsic &DII)
Definition: IRSimilarityIdentifier.h:479
llvm::BranchInst
Conditional or Unconditional Branch instruction.
Definition: Instructions.h:3068
llvm::Value
LLVM Value Representation.
Definition: Value.h:75
llvm::IRSimilarity::IRInstructionData::hash_value
friend hash_code hash_value(const IRInstructionData &ID)
Hashes Value based on its opcode, types, and operand types.
Definition: IRSimilarityIdentifier.h:215
llvm::IRSimilarityAnalysisPrinterPass::IRSimilarityAnalysisPrinterPass
IRSimilarityAnalysisPrinterPass(raw_ostream &OS)
Definition: IRSimilarityIdentifier.h:1013
llvm::Intrinsic::ID
unsigned ID
Definition: TargetTransformInfo.h:37
llvm::hash_code
An opaque object representing a hash code.
Definition: Hashing.h:72