MemorySanitizer.cpp
1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
15/// We associate a few shadow bits with every byte of the application memory,
16/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow
17/// bits on every memory read, propagate the shadow bits through some of the
18/// arithmetic instructions (including MOV), store the shadow bits on every
19/// memory write, report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
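///
/// As an illustrative sketch (not the exact IR this pass emits), the
/// propagation for a simple statement looks roughly like:
///
///   c = a + b;            // application code
///   s(c) = s(a) | s(b);   // shadow propagation: c is poisoned if either
///                         // a or b is poisoned
///   if (c) ...            // branching on c checks s(c) and calls
///                         // __msan_warning*() if it is non-zero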
21///
22/// But there are differences too. The first and major one is
23/// compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But it also brings the major issue:
26/// msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
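///
/// Illustrative sketch of a call `y = f(x)` (simplified; real shadows are
/// stored at computed offsets within the TLS arrays):
///
///   __msan_param_tls[0] = s(x);   // caller publishes the argument shadow
///   y = f(x);                     // f writes s(return value) into
///                                 // __msan_retval_tls before returning
///   s(y) = __msan_retval_tls[0];  // caller reads the return-value shadow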
44///
45/// Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
56///
57/// Every 4 aligned, consecutive bytes of application memory have one origin
58/// value associated with them. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
65/// This way it avoids needlessly overwriting the origin of a 4-byte region
66/// on a short (i.e. 1-byte) clean store, and it is also good for performance.
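///
/// For example (illustrative): storing one uninitialized byte to address
/// 0x1002 overwrites the single origin slot covering bytes 0x1000-0x1003,
/// while a later 1-byte store of a fully initialized value to 0x1001 leaves
/// that origin slot untouched.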
67///
68/// Atomic handling.
69///
70/// Ideally, every atomic store of application value should update the
71/// corresponding shadow location in an atomic way. Unfortunately, an atomic
72/// store to two disjoint locations cannot be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
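///
/// Illustrative interleaving (not actual emitted code):
///
///   Thread A (atomic store):         Thread B (atomic load):
///     store 0 to shadow(p)             v = atomic load p (acquire)
///     atomic store v to p (release)    sv = load shadow(p)
///
/// If B's load observes A's store, the happens-before edge guarantees that
/// B's shadow load sees the clean shadow written by A (or a later value),
/// never a stale poisoned one.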
83///
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
87/// after the app operation. Computers don't work this way. Current
88/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
91///
92/// Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
95/// become initialized depending on the arguments. It may be possible to figure
96/// out which arguments are meant to point to inputs and outputs, but the
97/// actual semantics may only be visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate an offset from a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
100/// pointers. We take a conservative approach, generating calls to
101/// __msan_instrument_asm_store(ptr, size),
102/// which defer the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
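///
/// Conservative-mode sketch (assuming a pointer operand %p of type T*; the
/// exact arguments depend on the operand):
///
///   call void @__msan_instrument_asm_store(ptr %p, i64 sizeof(T))
///   <inline asm that may write through %p>
///
/// i.e. only the first sizeof(T) bytes behind each pointer operand are
/// unpoisoned, as described for -msan-handle-asm-conservative below.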
108///
109/// KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112/// - KMSAN always tracks the origins and implies msan-keep-going=true;
113/// - KMSAN allocates shadow and origin memory for each page separately, so
114/// there are no explicit accesses to shadow and origin in the
115/// instrumentation.
116/// Shadow and origin values for a particular X-byte memory location
117/// (X=1,2,4,8) are accessed through pointers obtained via the
118/// __msan_metadata_ptr_for_load_X(ptr)
119/// __msan_metadata_ptr_for_store_X(ptr)
120/// functions. The corresponding functions check that the X-byte accesses are
121/// possible and return the shadow/origin pointers (see the sketch below).
122/// Arbitrary sized accesses are handled with:
123/// __msan_metadata_ptr_for_load_n(ptr, size)
124/// __msan_metadata_ptr_for_store_n(ptr, size);
125/// Note that the sanitizer code has to deal with how shadow/origin pairs
126/// returned by these functions are represented in different ABIs. In
127/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
128/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
129/// pointed to by a hidden parameter.
130/// - TLS variables are stored in a single per-task struct. A call to a
131/// function __msan_get_context_state() returning a pointer to that struct
132/// is inserted into every instrumented function before the entry block;
133/// - __msan_warning() takes a 32-bit origin parameter;
134/// - local variables are poisoned with __msan_poison_alloca() upon function
135/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
136/// function;
137/// - the pass doesn't declare any global variables or add global constructors
138/// to the translation unit.
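///
/// Sketch of a KMSAN shadow store for a 4-byte write to %p (simplified
/// x86_64 form; other ABIs return the pair differently, as noted above):
///
///   { %shadow_ptr, %origin_ptr } = call __msan_metadata_ptr_for_store_4(%p)
///   store %shadow_value, %shadow_ptr
///   store %origin_value, %origin_ptr   ; KMSAN always tracks origins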
139///
140/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
141/// calls, making sure we're on the safe side wrt. possible false positives.
142///
143/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
144/// moment.
145///
146//
147// FIXME: This sanitizer does not yet handle scalable vectors
148//
149//===----------------------------------------------------------------------===//
150
152#include "llvm/ADT/APInt.h"
153#include "llvm/ADT/ArrayRef.h"
154#include "llvm/ADT/DenseMap.h"
156#include "llvm/ADT/SetVector.h"
157#include "llvm/ADT/SmallPtrSet.h"
158#include "llvm/ADT/SmallVector.h"
160#include "llvm/ADT/StringRef.h"
164#include "llvm/IR/Argument.h"
166#include "llvm/IR/Attributes.h"
167#include "llvm/IR/BasicBlock.h"
168#include "llvm/IR/CallingConv.h"
169#include "llvm/IR/Constant.h"
170#include "llvm/IR/Constants.h"
171#include "llvm/IR/DataLayout.h"
172#include "llvm/IR/DerivedTypes.h"
173#include "llvm/IR/Function.h"
174#include "llvm/IR/GlobalValue.h"
176#include "llvm/IR/IRBuilder.h"
177#include "llvm/IR/InlineAsm.h"
178#include "llvm/IR/InstVisitor.h"
179#include "llvm/IR/InstrTypes.h"
180#include "llvm/IR/Instruction.h"
181#include "llvm/IR/Instructions.h"
183#include "llvm/IR/Intrinsics.h"
184#include "llvm/IR/IntrinsicsAArch64.h"
185#include "llvm/IR/IntrinsicsX86.h"
186#include "llvm/IR/MDBuilder.h"
187#include "llvm/IR/Module.h"
188#include "llvm/IR/Type.h"
189#include "llvm/IR/Value.h"
190#include "llvm/IR/ValueMap.h"
193#include "llvm/Support/Casting.h"
195#include "llvm/Support/Debug.h"
205#include <algorithm>
206#include <cassert>
207#include <cstddef>
208#include <cstdint>
209#include <memory>
210#include <numeric>
211#include <string>
212#include <tuple>
213
214using namespace llvm;
215
216#define DEBUG_TYPE "msan"
217
218DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
219 "Controls which checks to insert");
220
221DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
222 "Controls which instruction to instrument");
223
224static const unsigned kOriginSize = 4;
225static const Align kMinOriginAlignment = Align(4);
226static const Align kShadowTLSAlignment = Align(8);
227
228// These constants must be kept in sync with the ones in msan.h.
229// TODO: increase size to match SVE/SVE2/SME/SME2 limits
230static const unsigned kParamTLSSize = 800;
231static const unsigned kRetvalTLSSize = 800;
232
233// Access sizes are powers of two: 1, 2, 4, 8.
234static const size_t kNumberOfAccessSizes = 4;
235
236/// Track origins of uninitialized values.
237///
238/// Adds a section to MemorySanitizer report that points to the allocation
239/// (stack or heap) the uninitialized bits came from originally.
241 "msan-track-origins",
242 cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
243 cl::init(0));
244
245static cl::opt<bool> ClKeepGoing("msan-keep-going",
246 cl::desc("keep going after reporting a UMR"),
247 cl::Hidden, cl::init(false));
248
249static cl::opt<bool>
250 ClPoisonStack("msan-poison-stack",
251 cl::desc("poison uninitialized stack variables"), cl::Hidden,
252 cl::init(true));
253
255 "msan-poison-stack-with-call",
256 cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
257 cl::init(false));
258
260 "msan-poison-stack-pattern",
261 cl::desc("poison uninitialized stack variables with the given pattern"),
262 cl::Hidden, cl::init(0xff));
263
264static cl::opt<bool>
265 ClPrintStackNames("msan-print-stack-names",
266 cl::desc("Print name of local stack variable"),
267 cl::Hidden, cl::init(true));
268
269static cl::opt<bool>
270 ClPoisonUndef("msan-poison-undef",
271 cl::desc("Poison fully undef temporary values. "
272 "Partially undefined constant vectors "
273 "are unaffected by this flag (see "
274 "-msan-poison-undef-vectors)."),
275 cl::Hidden, cl::init(true));
276
278 "msan-poison-undef-vectors",
279 cl::desc("Precisely poison partially undefined constant vectors. "
280 "If false (legacy behavior), the entire vector is "
281 "considered fully initialized, which may lead to false "
282 "negatives. Fully undefined constant vectors are "
283 "unaffected by this flag (see -msan-poison-undef)."),
284 cl::Hidden, cl::init(false));
285
287 "msan-precise-disjoint-or",
288 cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
289 "disjointedness is ignored (i.e., 1|1 is initialized)."),
290 cl::Hidden, cl::init(false));
291
292static cl::opt<bool>
293 ClHandleICmp("msan-handle-icmp",
294 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
295 cl::Hidden, cl::init(true));
296
297static cl::opt<bool>
298 ClHandleICmpExact("msan-handle-icmp-exact",
299 cl::desc("exact handling of relational integer ICmp"),
300 cl::Hidden, cl::init(true));
301
303 "msan-handle-lifetime-intrinsics",
304 cl::desc(
305 "when possible, poison scoped variables at the beginning of the scope "
306 "(slower, but more precise)"),
307 cl::Hidden, cl::init(true));
308
309// When compiling the Linux kernel, we sometimes see false positives related to
310// MSan being unable to understand that inline assembly calls may initialize
311// local variables.
312// This flag makes the compiler conservatively unpoison every memory location
313// passed into an assembly call. Note that this may cause false positives.
314// Because it's impossible to figure out the array sizes, we can only unpoison
315// the first sizeof(type) bytes for each type* pointer.
317 "msan-handle-asm-conservative",
318 cl::desc("conservative handling of inline assembly"), cl::Hidden,
319 cl::init(true));
320
321// This flag controls whether we check the shadow of the address
322// operand of a load or store. Such bugs are very rare, since a load from
323// a garbage address typically results in SEGV, but they still happen
324// (e.g. only the lower bits of the address are garbage, or the access happens
325// early at program startup where malloc-ed memory is more likely to
326// be zeroed). As of 2012-08-28 this flag adds a 20% slowdown.
328 "msan-check-access-address",
329 cl::desc("report accesses through a pointer which has poisoned shadow"),
330 cl::Hidden, cl::init(true));
331
333 "msan-eager-checks",
334 cl::desc("check arguments and return values at function call boundaries"),
335 cl::Hidden, cl::init(false));
336
338 "msan-dump-strict-instructions",
339 cl::desc("print out instructions with default strict semantics i.e.,"
340 "check that all the inputs are fully initialized, and mark "
341 "the output as fully initialized. These semantics are applied "
342 "to instructions that could not be handled explicitly nor "
343 "heuristically."),
344 cl::Hidden, cl::init(false));
345
346// Currently, all the heuristically handled instructions are specifically
347// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
348// to parallel 'msan-dump-strict-instructions', and to keep the door open to
349// handling non-intrinsic instructions heuristically.
351 "msan-dump-heuristic-instructions",
352 cl::desc("Prints 'unknown' instructions that were handled heuristically. "
353 "Use -msan-dump-strict-instructions to print instructions that "
354 "could not be handled explicitly nor heuristically."),
355 cl::Hidden, cl::init(false));
356
358 "msan-instrumentation-with-call-threshold",
359 cl::desc(
360 "If the function being instrumented requires more than "
361 "this number of checks and origin stores, use callbacks instead of "
362 "inline checks (-1 means never use callbacks)."),
363 cl::Hidden, cl::init(3500));
364
365static cl::opt<bool>
366 ClEnableKmsan("msan-kernel",
367 cl::desc("Enable KernelMemorySanitizer instrumentation"),
368 cl::Hidden, cl::init(false));
369
370static cl::opt<bool>
371 ClDisableChecks("msan-disable-checks",
372 cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
373 cl::init(false));
374
375static cl::opt<bool>
376 ClCheckConstantShadow("msan-check-constant-shadow",
377 cl::desc("Insert checks for constant shadow values"),
378 cl::Hidden, cl::init(true));
379
380// This is off by default because of a bug in gold:
381// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
382static cl::opt<bool>
383 ClWithComdat("msan-with-comdat",
384 cl::desc("Place MSan constructors in comdat sections"),
385 cl::Hidden, cl::init(false));
386
387// These options allow specifying custom memory map parameters.
388// See MemoryMapParams for details.
389static cl::opt<uint64_t> ClAndMask("msan-and-mask",
390 cl::desc("Define custom MSan AndMask"),
391 cl::Hidden, cl::init(0));
392
393static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
394 cl::desc("Define custom MSan XorMask"),
395 cl::Hidden, cl::init(0));
396
397static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
398 cl::desc("Define custom MSan ShadowBase"),
399 cl::Hidden, cl::init(0));
400
401static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
402 cl::desc("Define custom MSan OriginBase"),
403 cl::Hidden, cl::init(0));
404
405static cl::opt<int>
406 ClDisambiguateWarning("msan-disambiguate-warning-threshold",
407 cl::desc("Define threshold for number of checks per "
408 "debug location to force origin update."),
409 cl::Hidden, cl::init(3));
410
411const char kMsanModuleCtorName[] = "msan.module_ctor";
412const char kMsanInitName[] = "__msan_init";
413
414namespace {
415
416// Memory map parameters used in application-to-shadow address calculation.
417// Offset = (Addr & ~AndMask) ^ XorMask
418// Shadow = ShadowBase + Offset
419// Origin = OriginBase + Offset
420struct MemoryMapParams {
421 uint64_t AndMask;
422 uint64_t XorMask;
423 uint64_t ShadowBase;
424 uint64_t OriginBase;
425};
426
427struct PlatformMemoryMapParams {
428 const MemoryMapParams *bits32;
429 const MemoryMapParams *bits64;
430};
431
432} // end anonymous namespace
433
434// i386 Linux
435static const MemoryMapParams Linux_I386_MemoryMapParams = {
436 0x000080000000, // AndMask
437 0, // XorMask (not used)
438 0, // ShadowBase (not used)
439 0x000040000000, // OriginBase
440};
441
442// x86_64 Linux
443static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
444 0, // AndMask (not used)
445 0x500000000000, // XorMask
446 0, // ShadowBase (not used)
447 0x100000000000, // OriginBase
448};
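// Worked example of the mapping above, using these x86_64 Linux parameters:
// for an application address Addr = 0x700000001000,
//   Offset = (Addr & ~AndMask) ^ XorMask = Addr ^ 0x500000000000
//          = 0x200000001000
//   Shadow = ShadowBase + Offset = 0x200000001000
//   Origin = OriginBase + Offset = 0x300000001000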
449
450// mips32 Linux
451// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
452// after picking good constants
453
454// mips64 Linux
455static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
456 0, // AndMask (not used)
457 0x008000000000, // XorMask
458 0, // ShadowBase (not used)
459 0x002000000000, // OriginBase
460};
461
462// ppc32 Linux
463// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
464// after picking good constants
465
466// ppc64 Linux
467static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
468 0xE00000000000, // AndMask
469 0x100000000000, // XorMask
470 0x080000000000, // ShadowBase
471 0x1C0000000000, // OriginBase
472};
473
474// s390x Linux
475static const MemoryMapParams Linux_S390X_MemoryMapParams = {
476 0xC00000000000, // AndMask
477 0, // XorMask (not used)
478 0x080000000000, // ShadowBase
479 0x1C0000000000, // OriginBase
480};
481
482// arm32 Linux
483// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
484// after picking good constants
485
486// aarch64 Linux
487static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
488 0, // AndMask (not used)
489 0x0B00000000000, // XorMask
490 0, // ShadowBase (not used)
491 0x0200000000000, // OriginBase
492};
493
494// loongarch64 Linux
495static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
496 0, // AndMask (not used)
497 0x500000000000, // XorMask
498 0, // ShadowBase (not used)
499 0x100000000000, // OriginBase
500};
501
502// riscv32 Linux
503// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
504// after picking good constants
505
506// aarch64 FreeBSD
507static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
508 0x1800000000000, // AndMask
509 0x0400000000000, // XorMask
510 0x0200000000000, // ShadowBase
511 0x0700000000000, // OriginBase
512};
513
514// i386 FreeBSD
515static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
516 0x000180000000, // AndMask
517 0x000040000000, // XorMask
518 0x000020000000, // ShadowBase
519 0x000700000000, // OriginBase
520};
521
522// x86_64 FreeBSD
523static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
524 0xc00000000000, // AndMask
525 0x200000000000, // XorMask
526 0x100000000000, // ShadowBase
527 0x380000000000, // OriginBase
528};
529
530// x86_64 NetBSD
531static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
532 0, // AndMask
533 0x500000000000, // XorMask
534 0, // ShadowBase
535 0x100000000000, // OriginBase
536};
537
538static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
539 &Linux_I386_MemoryMapParams,
540 &Linux_X86_64_MemoryMapParams,
541};
542
543static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
544 nullptr,
545 &Linux_MIPS64_MemoryMapParams,
546};
547
548static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
549 nullptr,
550 &Linux_PowerPC64_MemoryMapParams,
551};
552
553static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
554 nullptr,
555 &Linux_S390X_MemoryMapParams,
556};
557
558static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
559 nullptr,
560 &Linux_AArch64_MemoryMapParams,
561};
562
563static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
564 nullptr,
565 &Linux_LoongArch64_MemoryMapParams,
566};
567
568static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
569 nullptr,
570 &FreeBSD_AArch64_MemoryMapParams,
571};
572
573static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
574 &FreeBSD_I386_MemoryMapParams,
575 &FreeBSD_X86_64_MemoryMapParams,
576};
577
578static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
579 nullptr,
580 &NetBSD_X86_64_MemoryMapParams,
581};
582
583namespace {
584
585/// Instrument functions of a module to detect uninitialized reads.
586///
587/// Instantiating MemorySanitizer inserts the msan runtime library API function
588/// declarations into the module if they don't exist already. Instantiating
589/// ensures the __msan_init function is in the list of global constructors for
590/// the module.
591class MemorySanitizer {
592public:
593 MemorySanitizer(Module &M, MemorySanitizerOptions Options)
594 : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
595 Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
596 initializeModule(M);
597 }
598
599 // MSan cannot be moved or copied because of MapParams.
600 MemorySanitizer(MemorySanitizer &&) = delete;
601 MemorySanitizer &operator=(MemorySanitizer &&) = delete;
602 MemorySanitizer(const MemorySanitizer &) = delete;
603 MemorySanitizer &operator=(const MemorySanitizer &) = delete;
604
605 bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
606
607private:
608 friend struct MemorySanitizerVisitor;
609 friend struct VarArgHelperBase;
610 friend struct VarArgAMD64Helper;
611 friend struct VarArgAArch64Helper;
612 friend struct VarArgPowerPC64Helper;
613 friend struct VarArgPowerPC32Helper;
614 friend struct VarArgSystemZHelper;
615 friend struct VarArgI386Helper;
616 friend struct VarArgGenericHelper;
617
618 void initializeModule(Module &M);
619 void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
620 void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
621 void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
622
623 template <typename... ArgsTy>
624 FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
625 ArgsTy... Args);
626
627 /// True if we're compiling the Linux kernel.
628 bool CompileKernel;
629 /// Track origins (allocation points) of uninitialized values.
630 int TrackOrigins;
631 bool Recover;
632 bool EagerChecks;
633
634 Triple TargetTriple;
635 LLVMContext *C;
636 Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
637 Type *OriginTy;
638 PointerType *PtrTy; ///< Pointer type in the default address space.
639
640 // XxxTLS variables represent the per-thread state in MSan and per-task state
641 // in KMSAN.
642 // For the userspace these point to thread-local globals. In the kernel land
643 // they point to the members of a per-task struct obtained via a call to
644 // __msan_get_context_state().
645
646 /// Thread-local shadow storage for function parameters.
647 Value *ParamTLS;
648
649 /// Thread-local origin storage for function parameters.
650 Value *ParamOriginTLS;
651
652 /// Thread-local shadow storage for function return value.
653 Value *RetvalTLS;
654
655 /// Thread-local origin storage for function return value.
656 Value *RetvalOriginTLS;
657
658 /// Thread-local shadow storage for in-register va_arg function.
659 Value *VAArgTLS;
660
661 /// Thread-local origin storage for in-register va_arg function.
662 Value *VAArgOriginTLS;
663
664 /// Thread-local storage for the size of the va_arg overflow area.
665 Value *VAArgOverflowSizeTLS;
666
667 /// Are the instrumentation callbacks set up?
668 bool CallbacksInitialized = false;
669
670 /// The run-time callback to print a warning.
671 FunctionCallee WarningFn;
672
673 // These arrays are indexed by log2(AccessSize).
674 FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
675 FunctionCallee MaybeWarningVarSizeFn;
676 FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
677
678 /// Run-time helper that generates a new origin value for a stack
679 /// allocation.
680 FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
681 // No description version
682 FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
683
684 /// Run-time helper that poisons stack on function entry.
685 FunctionCallee MsanPoisonStackFn;
686
687 /// Run-time helper that records a store (or any event) of an
688 /// uninitialized value and returns an updated origin id encoding this info.
689 FunctionCallee MsanChainOriginFn;
690
691 /// Run-time helper that paints an origin over a region.
692 FunctionCallee MsanSetOriginFn;
693
694 /// MSan runtime replacements for memmove, memcpy and memset.
695 FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
696
697 /// KMSAN callback for task-local function argument shadow.
698 StructType *MsanContextStateTy;
699 FunctionCallee MsanGetContextStateFn;
700
701 /// Functions for poisoning/unpoisoning local variables
702 FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
703
704 /// Pair of shadow/origin pointers.
705 Type *MsanMetadata;
706
707 /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
708 FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
709 FunctionCallee MsanMetadataPtrForLoad_1_8[4];
710 FunctionCallee MsanMetadataPtrForStore_1_8[4];
711 FunctionCallee MsanInstrumentAsmStoreFn;
712
713 /// Storage for return values of the MsanMetadataPtrXxx functions.
714 Value *MsanMetadataAlloca;
715
716 /// Helper to choose between different MsanMetadataPtrXxx().
717 FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
718
719 /// Memory map parameters used in application-to-shadow calculation.
720 const MemoryMapParams *MapParams;
721
722 /// Custom memory map parameters used when -msan-shadow-base or
723 /// -msan-origin-base is provided.
724 MemoryMapParams CustomMapParams;
725
726 MDNode *ColdCallWeights;
727
728 /// Branch weights for origin store.
729 MDNode *OriginStoreWeights;
730};
731
732void insertModuleCtor(Module &M) {
733 getOrCreateSanitizerCtorAndInitFunctions(
734 M, kMsanModuleCtorName, kMsanInitName,
735 /*InitArgTypes=*/{},
736 /*InitArgs=*/{},
737 // This callback is invoked when the functions are created the first
738 // time. Hook them into the global ctors list in that case:
739 [&](Function *Ctor, FunctionCallee) {
740 if (!ClWithComdat) {
741 appendToGlobalCtors(M, Ctor, 0);
742 return;
743 }
744 Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
745 Ctor->setComdat(MsanCtorComdat);
746 appendToGlobalCtors(M, Ctor, 0, Ctor);
747 });
748}
749
750template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
751 return (Opt.getNumOccurrences() > 0) ? Opt : Default;
752}
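// For example, getOptOrDefault(ClEnableKmsan, K) in the constructor below
// yields the value of -msan-kernel if the flag was given explicitly on the
// command line, and the pass parameter K otherwise.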
753
754} // end anonymous namespace
755
756MemorySanitizerOptions::MemorySanitizerOptions(int TO, bool R, bool K,
757 bool EagerChecks)
758 : Kernel(getOptOrDefault(ClEnableKmsan, K)),
759 TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
760 Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
761 EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
762
763PreservedAnalyses MemorySanitizerPass::run(Module &M,
764 ModuleAnalysisManager &AM) {
765 // Return early if nosanitize_memory module flag is present for the module.
766 if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
767 return PreservedAnalyses::all();
768 bool Modified = false;
769 if (!Options.Kernel) {
770 insertModuleCtor(M);
771 Modified = true;
772 }
773
774 auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
775 for (Function &F : M) {
776 if (F.empty())
777 continue;
778 MemorySanitizer Msan(*F.getParent(), Options);
779 Modified |=
780 Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
781 }
782
783 if (!Modified)
784 return PreservedAnalyses::all();
785
786 PreservedAnalyses PA = PreservedAnalyses::none();
787 // GlobalsAA is considered stateless and does not get invalidated unless
788 // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
789 // make changes that require GlobalsAA to be invalidated.
790 PA.abandon<GlobalsAA>();
791 return PA;
792}
793
794void MemorySanitizerPass::printPipeline(
795 raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
797 OS, MapClassName2PassName);
798 OS << '<';
799 if (Options.Recover)
800 OS << "recover;";
801 if (Options.Kernel)
802 OS << "kernel;";
803 if (Options.EagerChecks)
804 OS << "eager-checks;";
805 OS << "track-origins=" << Options.TrackOrigins;
806 OS << '>';
807}
808
809/// Create a non-const global initialized with the given string.
810///
811/// Creates a writable global for Str so that we can pass it to the
812/// run-time lib. Runtime uses first 4 bytes of the string to store the
813/// frame ID, so the string needs to be mutable.
814static GlobalVariable *createPrivateConstGlobalForString(Module &M,
815 StringRef Str) {
816 Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
817 return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
818 GlobalValue::PrivateLinkage, StrConst, "");
819}
820
821template <typename... ArgsTy>
823MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
824 ArgsTy... Args) {
825 if (TargetTriple.getArch() == Triple::systemz) {
826 // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
827 return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
828 std::forward<ArgsTy>(Args)...);
829 }
830
831 return M.getOrInsertFunction(Name, MsanMetadata,
832 std::forward<ArgsTy>(Args)...);
833}
834
835/// Create KMSAN API callbacks.
836void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
837 IRBuilder<> IRB(*C);
838
839 // These will be initialized in insertKmsanPrologue().
840 RetvalTLS = nullptr;
841 RetvalOriginTLS = nullptr;
842 ParamTLS = nullptr;
843 ParamOriginTLS = nullptr;
844 VAArgTLS = nullptr;
845 VAArgOriginTLS = nullptr;
846 VAArgOverflowSizeTLS = nullptr;
847
848 WarningFn = M.getOrInsertFunction("__msan_warning",
849 TLI.getAttrList(C, {0}, /*Signed=*/false),
850 IRB.getVoidTy(), IRB.getInt32Ty());
851
852 // Requests the per-task context state (kmsan_context_state*) from the
853 // runtime library.
854 MsanContextStateTy = StructType::get(
855 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
856 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
857 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
858 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
859 IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
860 OriginTy);
861 MsanGetContextStateFn =
862 M.getOrInsertFunction("__msan_get_context_state", PtrTy);
863
864 MsanMetadata = StructType::get(PtrTy, PtrTy);
865
866 for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
867 std::string name_load =
868 "__msan_metadata_ptr_for_load_" + std::to_string(size);
869 std::string name_store =
870 "__msan_metadata_ptr_for_store_" + std::to_string(size);
871 MsanMetadataPtrForLoad_1_8[ind] =
872 getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
873 MsanMetadataPtrForStore_1_8[ind] =
874 getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
875 }
876
877 MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
878 M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
879 MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
880 M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
881
882 // Functions for poisoning and unpoisoning memory.
883 MsanPoisonAllocaFn = M.getOrInsertFunction(
884 "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
885 MsanUnpoisonAllocaFn = M.getOrInsertFunction(
886 "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
887}
888
889static Constant *getOrInsertGlobal(Module &M, StringRef Name, Type *Ty) {
890 return M.getOrInsertGlobal(Name, Ty, [&] {
891 return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
892 nullptr, Name, nullptr,
893 GlobalVariable::InitialExecTLSModel);
894 });
895}
896
897/// Insert declarations for userspace-specific functions and globals.
898void MemorySanitizer::createUserspaceApi(Module &M,
899 const TargetLibraryInfo &TLI) {
900 IRBuilder<> IRB(*C);
901
902 // Create the callback.
903 // FIXME: this function should have "Cold" calling conv,
904 // which is not yet implemented.
905 if (TrackOrigins) {
906 StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
907 : "__msan_warning_with_origin_noreturn";
908 WarningFn = M.getOrInsertFunction(WarningFnName,
909 TLI.getAttrList(C, {0}, /*Signed=*/false),
910 IRB.getVoidTy(), IRB.getInt32Ty());
911 } else {
912 StringRef WarningFnName =
913 Recover ? "__msan_warning" : "__msan_warning_noreturn";
914 WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
915 }
916
917 // Create the global TLS variables.
918 RetvalTLS =
919 getOrInsertGlobal(M, "__msan_retval_tls",
920 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
921
922 RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
923
924 ParamTLS =
925 getOrInsertGlobal(M, "__msan_param_tls",
926 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
927
928 ParamOriginTLS =
929 getOrInsertGlobal(M, "__msan_param_origin_tls",
930 ArrayType::get(OriginTy, kParamTLSSize / 4));
931
932 VAArgTLS =
933 getOrInsertGlobal(M, "__msan_va_arg_tls",
934 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
935
936 VAArgOriginTLS =
937 getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
938 ArrayType::get(OriginTy, kParamTLSSize / 4));
939
940 VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
941 IRB.getIntPtrTy(M.getDataLayout()));
942
943 for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
944 AccessSizeIndex++) {
945 unsigned AccessSize = 1 << AccessSizeIndex;
946 std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
947 MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
948 FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
949 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
950 MaybeWarningVarSizeFn = M.getOrInsertFunction(
951 "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
952 IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
953 FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
954 MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
955 FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
956 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
957 IRB.getInt32Ty());
958 }
959
960 MsanSetAllocaOriginWithDescriptionFn =
961 M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
962 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
963 MsanSetAllocaOriginNoDescriptionFn =
964 M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
965 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
966 MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
967 IRB.getVoidTy(), PtrTy, IntptrTy);
968}
969
970/// Insert extern declarations of runtime-provided functions and globals.
971void MemorySanitizer::initializeCallbacks(Module &M,
972 const TargetLibraryInfo &TLI) {
973 // Only do this once.
974 if (CallbacksInitialized)
975 return;
976
977 IRBuilder<> IRB(*C);
978 // Initialize callbacks that are common for kernel and userspace
979 // instrumentation.
980 MsanChainOriginFn = M.getOrInsertFunction(
981 "__msan_chain_origin",
982 TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
983 IRB.getInt32Ty());
984 MsanSetOriginFn = M.getOrInsertFunction(
985 "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
986 IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
987 MemmoveFn =
988 M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
989 MemcpyFn =
990 M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
991 MemsetFn = M.getOrInsertFunction("__msan_memset",
992 TLI.getAttrList(C, {1}, /*Signed=*/true),
993 PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
994
995 MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
996 "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
997
998 if (CompileKernel) {
999 createKernelApi(M, TLI);
1000 } else {
1001 createUserspaceApi(M, TLI);
1002 }
1003 CallbacksInitialized = true;
1004}
1005
1006FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
1007 int size) {
1008 FunctionCallee *Fns =
1009 isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
1010 switch (size) {
1011 case 1:
1012 return Fns[0];
1013 case 2:
1014 return Fns[1];
1015 case 4:
1016 return Fns[2];
1017 case 8:
1018 return Fns[3];
1019 default:
1020 return nullptr;
1021 }
1022}
1023
1024/// Module-level initialization.
1025///
1026/// Inserts a call to __msan_init into the module's constructor list.
1027void MemorySanitizer::initializeModule(Module &M) {
1028 auto &DL = M.getDataLayout();
1029
1030 TargetTriple = M.getTargetTriple();
1031
1032 bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
1033 bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
1034 // Check the overrides first
1035 if (ShadowPassed || OriginPassed) {
1036 CustomMapParams.AndMask = ClAndMask;
1037 CustomMapParams.XorMask = ClXorMask;
1038 CustomMapParams.ShadowBase = ClShadowBase;
1039 CustomMapParams.OriginBase = ClOriginBase;
1040 MapParams = &CustomMapParams;
1041 } else {
1042 switch (TargetTriple.getOS()) {
1043 case Triple::FreeBSD:
1044 switch (TargetTriple.getArch()) {
1045 case Triple::aarch64:
1046 MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
1047 break;
1048 case Triple::x86_64:
1049 MapParams = FreeBSD_X86_MemoryMapParams.bits64;
1050 break;
1051 case Triple::x86:
1052 MapParams = FreeBSD_X86_MemoryMapParams.bits32;
1053 break;
1054 default:
1055 report_fatal_error("unsupported architecture");
1056 }
1057 break;
1058 case Triple::NetBSD:
1059 switch (TargetTriple.getArch()) {
1060 case Triple::x86_64:
1061 MapParams = NetBSD_X86_MemoryMapParams.bits64;
1062 break;
1063 default:
1064 report_fatal_error("unsupported architecture");
1065 }
1066 break;
1067 case Triple::Linux:
1068 switch (TargetTriple.getArch()) {
1069 case Triple::x86_64:
1070 MapParams = Linux_X86_MemoryMapParams.bits64;
1071 break;
1072 case Triple::x86:
1073 MapParams = Linux_X86_MemoryMapParams.bits32;
1074 break;
1075 case Triple::mips64:
1076 case Triple::mips64el:
1077 MapParams = Linux_MIPS_MemoryMapParams.bits64;
1078 break;
1079 case Triple::ppc64:
1080 case Triple::ppc64le:
1081 MapParams = Linux_PowerPC_MemoryMapParams.bits64;
1082 break;
1083 case Triple::systemz:
1084 MapParams = Linux_S390_MemoryMapParams.bits64;
1085 break;
1086 case Triple::aarch64:
1087 case Triple::aarch64_be:
1088 MapParams = Linux_ARM_MemoryMapParams.bits64;
1089 break;
1090 case Triple::loongarch64:
1091 MapParams = Linux_LoongArch_MemoryMapParams.bits64;
1092 break;
1093 default:
1094 report_fatal_error("unsupported architecture");
1095 }
1096 break;
1097 default:
1098 report_fatal_error("unsupported operating system");
1099 }
1100 }
1101
1102 C = &(M.getContext());
1103 IRBuilder<> IRB(*C);
1104 IntptrTy = IRB.getIntPtrTy(DL);
1105 OriginTy = IRB.getInt32Ty();
1106 PtrTy = IRB.getPtrTy();
1107
1108 ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1109 OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1110
1111 if (!CompileKernel) {
1112 if (TrackOrigins)
1113 M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1114 return new GlobalVariable(
1115 M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1116 IRB.getInt32(TrackOrigins), "__msan_track_origins");
1117 });
1118
1119 if (Recover)
1120 M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1121 return new GlobalVariable(M, IRB.getInt32Ty(), true,
1122 GlobalValue::WeakODRLinkage,
1123 IRB.getInt32(Recover), "__msan_keep_going");
1124 });
1125 }
1126}
1127
1128namespace {
1129
1130/// A helper class that handles instrumentation of VarArg
1131/// functions on a particular platform.
1132///
1133/// Implementations are expected to insert the instrumentation
1134/// necessary to propagate argument shadow through VarArg function
1135/// calls. Visit* methods are called during an InstVisitor pass over
1136/// the function, and should avoid creating new basic blocks. A new
1137/// instance of this class is created for each instrumented function.
1138struct VarArgHelper {
1139 virtual ~VarArgHelper() = default;
1140
1141 /// Visit a CallBase.
1142 virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1143
1144 /// Visit a va_start call.
1145 virtual void visitVAStartInst(VAStartInst &I) = 0;
1146
1147 /// Visit a va_copy call.
1148 virtual void visitVACopyInst(VACopyInst &I) = 0;
1149
1150 /// Finalize function instrumentation.
1151 ///
1152 /// This method is called after visiting all interesting (see above)
1153 /// instructions in a function.
1154 virtual void finalizeInstrumentation() = 0;
1155};
1156
1157struct MemorySanitizerVisitor;
1158
1159} // end anonymous namespace
1160
1161static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1162 MemorySanitizerVisitor &Visitor);
1163
1164static unsigned TypeSizeToSizeIndex(TypeSize TS) {
1165 if (TS.isScalable())
1166 // Scalable types unconditionally take slowpaths.
1167 return kNumberOfAccessSizes;
1168 unsigned TypeSizeFixed = TS.getFixedValue();
1169 if (TypeSizeFixed <= 8)
1170 return 0;
1171 return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
1172}
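// For example: type sizes of 1-8 bits map to index 0, 16 bits to 1, 32 bits
// to 2 and 64 bits to 3, while 128 bits (and scalable types) yield
// kNumberOfAccessSizes, signalling that no fixed-size __msan_maybe_* callback
// fits and callers must use a fallback path.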
1173
1174namespace {
1175
1176/// Helper class to attach the debug information of the given instruction to
1177/// new instructions inserted after it.
1178class NextNodeIRBuilder : public IRBuilder<> {
1179public:
1180 explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1181 SetCurrentDebugLocation(IP->getDebugLoc());
1182 }
1183};
1184
1185/// This class does all the work for a given function. Store and Load
1186/// instructions store and load corresponding shadow and origin
1187/// values. Most instructions propagate shadow from arguments to their
1188/// return values. Certain instructions (most importantly, BranchInst)
1189/// test their argument shadow and print reports (with a runtime call) if it's
1190/// non-zero.
1191struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1192 Function &F;
1193 MemorySanitizer &MS;
1194 SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1195 ValueMap<Value *, Value *> ShadowMap, OriginMap;
1196 std::unique_ptr<VarArgHelper> VAHelper;
1197 const TargetLibraryInfo *TLI;
1198 Instruction *FnPrologueEnd;
1199 SmallVector<Instruction *, 16> Instructions;
1200
1201 // The following flags disable parts of MSan instrumentation based on
1202 // exclusion list contents and command-line options.
1203 bool InsertChecks;
1204 bool PropagateShadow;
1205 bool PoisonStack;
1206 bool PoisonUndef;
1207 bool PoisonUndefVectors;
1208
1209 struct ShadowOriginAndInsertPoint {
1210 Value *Shadow;
1211 Value *Origin;
1212 Instruction *OrigIns;
1213
1214 ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1215 : Shadow(S), Origin(O), OrigIns(I) {}
1216 };
1217 SmallVector<ShadowOriginAndInsertPoint, 16> InstrumentationList;
1218 DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1219 SmallSetVector<AllocaInst *, 16> AllocaSet;
1222 int64_t SplittableBlocksCount = 0;
1223
1224 MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1225 const TargetLibraryInfo &TLI)
1226 : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1227 bool SanitizeFunction =
1228 F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1229 InsertChecks = SanitizeFunction;
1230 PropagateShadow = SanitizeFunction;
1231 PoisonStack = SanitizeFunction && ClPoisonStack;
1232 PoisonUndef = SanitizeFunction && ClPoisonUndef;
1233 PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
1234
1235 // In the presence of unreachable blocks, we may see Phi nodes with
1236 // incoming nodes from such blocks. Since InstVisitor skips unreachable
1237 // blocks, such nodes will not have any shadow value associated with them.
1238 // It's easier to remove unreachable blocks than deal with missing shadow.
1239 removeUnreachableBlocks(F);
1240
1241 MS.initializeCallbacks(*F.getParent(), TLI);
1242 FnPrologueEnd =
1243 IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
1244 .CreateIntrinsic(Intrinsic::donothing, {});
1245
1246 if (MS.CompileKernel) {
1247 IRBuilder<> IRB(FnPrologueEnd);
1248 insertKmsanPrologue(IRB);
1249 }
1250
1251 LLVM_DEBUG(if (!InsertChecks) dbgs()
1252 << "MemorySanitizer is not inserting checks into '"
1253 << F.getName() << "'\n");
1254 }
1255
1256 bool instrumentWithCalls(Value *V) {
1257 // Constants likely will be eliminated by follow-up passes.
1258 if (isa<Constant>(V))
1259 return false;
1260 ++SplittableBlocksCount;
1261 return ClInstrumentationWithCallThreshold >= 0 &&
1262 SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1263 }
1264
1265 bool isInPrologue(Instruction &I) {
1266 return I.getParent() == FnPrologueEnd->getParent() &&
1267 (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1268 }
1269
1270 // Creates a new origin and records the stack trace. In general we can call
1271 // this function for any origin manipulation we like. However, it costs
1272 // runtime resources, so use it wisely, only where it can provide additional
1273 // information helpful to the user.
1274 Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1275 if (MS.TrackOrigins <= 1)
1276 return V;
1277 return IRB.CreateCall(MS.MsanChainOriginFn, V);
1278 }
1279
1280 Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1281 const DataLayout &DL = F.getDataLayout();
1282 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1283 if (IntptrSize == kOriginSize)
1284 return Origin;
1285 assert(IntptrSize == kOriginSize * 2);
1286 Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
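    // Replicate the 4-byte origin into both halves of the intptr (e.g.
    // 0xAABBCCDD becomes 0xAABBCCDDAABBCCDD), so a single intptr-sized store
    // below paints two adjacent origin slots at once.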
1287 return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1288 }
1289
1290 /// Fill memory range with the given origin value.
1291 void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1292 TypeSize TS, Align Alignment) {
1293 const DataLayout &DL = F.getDataLayout();
1294 const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1295 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1296 assert(IntptrAlignment >= kMinOriginAlignment);
1297 assert(IntptrSize >= kOriginSize);
1298
1299 // Note: The loop-based form below works for fixed-length vectors too;
1300 // however, we prefer to unroll and specialize the alignment in that case.
1301 if (TS.isScalable()) {
1302 Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
1303 Value *RoundUp =
1304 IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
1305 Value *End =
1306 IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
1307 auto [InsertPt, Index] =
1309 IRB.SetInsertPoint(InsertPt);
1310
1311 Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
1313 return;
1314 }
1315
1316 unsigned Size = TS.getFixedValue();
1317
1318 unsigned Ofs = 0;
1319 Align CurrentAlignment = Alignment;
1320 if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1321 Value *IntptrOrigin = originToIntptr(IRB, Origin);
1322 Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
1323 for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1324 Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1325 : IntptrOriginPtr;
1326 IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1327 Ofs += IntptrSize / kOriginSize;
1328 CurrentAlignment = IntptrAlignment;
1329 }
1330 }
1331
1332 for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1333 Value *GEP =
1334 i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1335 IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1336 CurrentAlignment = kMinOriginAlignment;
1337 }
1338 }
1339
1340 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1341 Value *OriginPtr, Align Alignment) {
1342 const DataLayout &DL = F.getDataLayout();
1343 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1344 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
1345 // ZExt cannot convert between vector and scalar
1346 Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1347 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1348 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1349 // Origin is not needed: value is initialized or const shadow is
1350 // ignored.
1351 return;
1352 }
1353 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1354 // Copy origin as the value is definitely uninitialized.
1355 paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1356 OriginAlignment);
1357 return;
1358 }
1359 // Fallback to runtime check, which still can be optimized out later.
1360 }
1361
1362 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1363 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1364 if (instrumentWithCalls(ConvertedShadow) &&
1365 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1366 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1367 Value *ConvertedShadow2 =
1368 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1369 CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
1370 CB->addParamAttr(0, Attribute::ZExt);
1371 CB->addParamAttr(2, Attribute::ZExt);
1372 } else {
1373 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1374 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1375 Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1376 IRBuilder<> IRBNew(CheckTerm);
1377 paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1378 OriginAlignment);
1379 }
1380 }
1381
1382 void materializeStores() {
1383 for (StoreInst *SI : StoreList) {
1384 IRBuilder<> IRB(SI);
1385 Value *Val = SI->getValueOperand();
1386 Value *Addr = SI->getPointerOperand();
1387 Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1388 Value *ShadowPtr, *OriginPtr;
1389 Type *ShadowTy = Shadow->getType();
1390 const Align Alignment = SI->getAlign();
1391 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1392 std::tie(ShadowPtr, OriginPtr) =
1393 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1394
1395 [[maybe_unused]] StoreInst *NewSI =
1396 IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1397 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1398
1399 if (SI->isAtomic())
1400 SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1401
1402 if (MS.TrackOrigins && !SI->isAtomic())
1403 storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1404 OriginAlignment);
1405 }
1406 }
1407
1408 // Returns true if Debug Location corresponds to multiple warnings.
1409 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1410 if (MS.TrackOrigins < 2)
1411 return false;
1412
1413 if (LazyWarningDebugLocationCount.empty())
1414 for (const auto &I : InstrumentationList)
1415 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1416
1417 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1418 }
1419
1420 /// Helper function to insert a warning at IRB's current insert point.
1421 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1422 if (!Origin)
1423 Origin = (Value *)IRB.getInt32(0);
1424 assert(Origin->getType()->isIntegerTy());
1425
1426 if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1427 // Try to create additional origin with debug info of the last origin
1428 // instruction. It may provide additional information to the user.
1429 if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1430 assert(MS.TrackOrigins);
1431 auto NewDebugLoc = OI->getDebugLoc();
1432 // Origin update with missing or the same debug location provides no
1433 // additional value.
1434 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1435 // Insert update just before the check, so we call runtime only just
1436 // before the report.
1437 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1438 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1439 Origin = updateOrigin(Origin, IRBOrigin);
1440 }
1441 }
1442 }
1443
1444 if (MS.CompileKernel || MS.TrackOrigins)
1445 IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1446 else
1447 IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1448 // FIXME: Insert UnreachableInst if !MS.Recover?
1449 // This may invalidate some of the following checks and needs to be done
1450 // at the very end.
1451 }
1452
1453 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1454 Value *Origin) {
1455 const DataLayout &DL = F.getDataLayout();
1456 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1457 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1458 if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
1459 // ZExt cannot convert between vector and scalar
1460 ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
1461 Value *ConvertedShadow2 =
1462 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1463
1464 if (SizeIndex < kNumberOfAccessSizes) {
1465 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1466 CallBase *CB = IRB.CreateCall(
1467 Fn,
1468 {ConvertedShadow2,
1469 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1470 CB->addParamAttr(0, Attribute::ZExt);
1471 CB->addParamAttr(1, Attribute::ZExt);
1472 } else {
1473 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1474 Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
1475 IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
1476 unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
1477 CallBase *CB = IRB.CreateCall(
1478 Fn,
1479 {ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
1480 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1481 CB->addParamAttr(1, Attribute::ZExt);
1482 CB->addParamAttr(2, Attribute::ZExt);
1483 }
1484 } else {
1485 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1486 Instruction *CheckTerm = SplitBlockAndInsertIfThen(
1487 Cmp, &*IRB.GetInsertPoint(),
1488 /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1489
1490 IRB.SetInsertPoint(CheckTerm);
1491 insertWarningFn(IRB, Origin);
1492 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1493 }
1494 }
1495
1496 void materializeInstructionChecks(
1497 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1498 const DataLayout &DL = F.getDataLayout();
1499 // Disable combining in some cases. TrackOrigins checks each shadow to pick
1500 // the correct origin.
1501 bool Combine = !MS.TrackOrigins;
1502 Instruction *Instruction = InstructionChecks.front().OrigIns;
1503 Value *Shadow = nullptr;
1504 for (const auto &ShadowData : InstructionChecks) {
1505 assert(ShadowData.OrigIns == Instruction);
1506 IRBuilder<> IRB(Instruction);
1507
1508 Value *ConvertedShadow = ShadowData.Shadow;
1509
1510 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1511 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1512 // Skip, value is initialized or const shadow is ignored.
1513 continue;
1514 }
1515 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1516 // Report as the value is definitely uninitialized.
1517 insertWarningFn(IRB, ShadowData.Origin);
1518 if (!MS.Recover)
1519 return; // Always fail and stop here, no need to check the rest.
1520 // Skip the entire instruction.
1521 continue;
1522 }
1523 // Fallback to runtime check, which still can be optimized out later.
1524 }
1525
1526 if (!Combine) {
1527 materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1528 continue;
1529 }
1530
1531 if (!Shadow) {
1532 Shadow = ConvertedShadow;
1533 continue;
1534 }
1535
1536 Shadow = convertToBool(Shadow, IRB, "_mscmp");
1537 ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1538 Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1539 }
1540
1541 if (Shadow) {
1542 assert(Combine);
1543 IRBuilder<> IRB(Instruction);
1544 materializeOneCheck(IRB, Shadow, nullptr);
1545 }
1546 }
1547
1548 static bool isAArch64SVCount(Type *Ty) {
1549 if (TargetExtType *TTy = dyn_cast<TargetExtType>(Ty))
1550 return TTy->getName() == "aarch64.svcount";
1551 return false;
1552 }
1553
1554 // This is intended to match the "AArch64 Predicate-as-Counter Type" (aka
1555 // 'target("aarch64.svcount")'), but not e.g., <vscale x 4 x i32>.
1556 static bool isScalableNonVectorType(Type *Ty) {
1557 if (!isAArch64SVCount(Ty))
1558 LLVM_DEBUG(dbgs() << "isScalableNonVectorType: Unexpected type " << *Ty
1559 << "\n");
1560
1561 return Ty->isScalableTy() && !isa<VectorType>(Ty);
1562 }
1563
1564 void materializeChecks() {
1565#ifndef NDEBUG
1566 // For assert below.
1567 SmallPtrSet<Instruction *, 16> Done;
1568#endif
1569
1570 for (auto I = InstrumentationList.begin();
1571 I != InstrumentationList.end();) {
1572 auto OrigIns = I->OrigIns;
1573 // Checks are grouped by the original instruction. We handle all
1574 // `insertShadowCheck` requests for an instruction at once.
1575 assert(Done.insert(OrigIns).second);
1576 auto J = std::find_if(I + 1, InstrumentationList.end(),
1577 [OrigIns](const ShadowOriginAndInsertPoint &R) {
1578 return OrigIns != R.OrigIns;
1579 });
1580 // Process all checks of instruction at once.
1581 materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1582 I = J;
1583 }
1584
1585 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1586 }
1587
1588 // Insert the KMSAN prologue that loads the per-task context state.
1589 void insertKmsanPrologue(IRBuilder<> &IRB) {
1590 Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1591 Constant *Zero = IRB.getInt32(0);
1592 MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1593 {Zero, IRB.getInt32(0)}, "param_shadow");
1594 MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1595 {Zero, IRB.getInt32(1)}, "retval_shadow");
1596 MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1597 {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1598 MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1599 {Zero, IRB.getInt32(3)}, "va_arg_origin");
1600 MS.VAArgOverflowSizeTLS =
1601 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1602 {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1603 MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1604 {Zero, IRB.getInt32(5)}, "param_origin");
1605 MS.RetvalOriginTLS =
1606 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1607 {Zero, IRB.getInt32(6)}, "retval_origin");
1608 if (MS.TargetTriple.getArch() == Triple::systemz)
1609 MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
1610 }
1611
1612 /// Add MemorySanitizer instrumentation to a function.
1613 bool runOnFunction() {
1614 // Iterate all BBs in depth-first order and create shadow instructions
1615 // for all instructions (where applicable).
1616 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1617 for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1618 visit(*BB);
1619
1620 // `visit` above only collects instructions. Process them after iterating
1621 // over the CFG so that instrumentation does not depend on CFG transformations.
1622 for (Instruction *I : Instructions)
1623 InstVisitor<MemorySanitizerVisitor>::visit(*I);
1624
1625 // Finalize PHI nodes.
1626 for (PHINode *PN : ShadowPHINodes) {
1627 PHINode *PNS = cast<PHINode>(getShadow(PN));
1628 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1629 size_t NumValues = PN->getNumIncomingValues();
1630 for (size_t v = 0; v < NumValues; v++) {
1631 PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1632 if (PNO)
1633 PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1634 }
1635 }
1636
1637 VAHelper->finalizeInstrumentation();
1638
1639 // Poison llvm.lifetime.start intrinsics, if we haven't fallen back to
1640 // instrumenting only allocas.
1641 if (InstrumentLifetimeStart) {
1642 for (auto Item : LifetimeStartList) {
1643 instrumentAlloca(*Item.second, Item.first);
1644 AllocaSet.remove(Item.second);
1645 }
1646 }
1647 // Poison the allocas for which we didn't instrument the corresponding
1648 // lifetime intrinsics.
1649 for (AllocaInst *AI : AllocaSet)
1650 instrumentAlloca(*AI);
1651
1652 // Insert shadow value checks.
1653 materializeChecks();
1654
1655 // Delayed instrumentation of StoreInst.
1656 // This may not add new address checks.
1657 materializeStores();
1658
1659 return true;
1660 }
1661
1662 /// Compute the shadow type that corresponds to a given Value.
1663 Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1664
1665 /// Compute the shadow type that corresponds to a given Type.
1666 Type *getShadowTy(Type *OrigTy) {
1667 if (!OrigTy->isSized()) {
1668 return nullptr;
1669 }
1670 // For integer type, shadow is the same as the original type.
1671 // This may return weird-sized types like i1.
1672 if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1673 return IT;
1674 const DataLayout &DL = F.getDataLayout();
1675 if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1676 uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1677 return VectorType::get(IntegerType::get(*MS.C, EltSize),
1678 VT->getElementCount());
1679 }
1680 if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1681 return ArrayType::get(getShadowTy(AT->getElementType()),
1682 AT->getNumElements());
1683 }
1684 if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1685 SmallVector<Type *, 4> Elements;
1686 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1687 Elements.push_back(getShadowTy(ST->getElementType(i)));
1688 StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1689 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1690 return Res;
1691 }
1692 if (isScalableNonVectorType(OrigTy)) {
1693 LLVM_DEBUG(dbgs() << "getShadowTy: Scalable non-vector type: " << *OrigTy
1694 << "\n");
1695 return OrigTy;
1696 }
1697
1698 uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1699 return IntegerType::get(*MS.C, TypeSize);
1700 }
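// Illustrative examples of the mapping above (assuming a typical 64-bit data
// layout; not from the source):
//   i32            ==> i32
//   <4 x float>    ==> <4 x i32>
//   [8 x i16]      ==> [8 x i16]
//   { i8, double } ==> { i8, i64 }
//   ptr            ==> i64   (falls through to the final sized-integer case)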
1701
1702 /// Extract combined shadow of struct elements as a bool
1703 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1704 IRBuilder<> &IRB) {
1705 Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1706 Value *Aggregator = FalseVal;
1707
1708 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1709 // Combine by ORing together each element's bool shadow
1710 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1711 Value *ShadowBool = convertToBool(ShadowItem, IRB);
1712
1713 if (Aggregator != FalseVal)
1714 Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1715 else
1716 Aggregator = ShadowBool;
1717 }
1718
1719 return Aggregator;
1720 }
1721
1722 // Extract combined shadow of array elements
1723 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1724 IRBuilder<> &IRB) {
1725 if (!Array->getNumElements())
1726 return IRB.getIntN(/* width */ 1, /* value */ 0);
1727
1728 Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1729 Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1730
1731 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1732 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1733 Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1734 Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1735 }
1736 return Aggregator;
1737 }
1738
1739 /// Convert a shadow value to its flattened variant. The resulting
1740 /// shadow may not necessarily have the same bit width as the input
1741 /// value, but it will always be comparable to zero.
1742 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1743 if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1744 return collapseStructShadow(Struct, V, IRB);
1745 if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1746 return collapseArrayShadow(Array, V, IRB);
1747 if (isa<VectorType>(V->getType())) {
1748 if (isa<ScalableVectorType>(V->getType()))
1749 return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
1750 unsigned BitWidth =
1751 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1752 return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
1753 }
1754 return V;
1755 }
1756
1757 // Convert a scalar value to an i1 by comparing with 0
1758 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1759 Type *VTy = V->getType();
1760 if (!VTy->isIntegerTy())
1761 return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1762 if (VTy->getIntegerBitWidth() == 1)
1763 // Just converting a bool to a bool, so do nothing.
1764 return V;
1765 return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1766 }
1767
1768 Type *ptrToIntPtrType(Type *PtrTy) const {
1769 if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
1770 return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1771 VectTy->getElementCount());
1772 }
1773 assert(PtrTy->isIntOrPtrTy());
1774 return MS.IntptrTy;
1775 }
1776
1777 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1778 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1779 return VectorType::get(
1780 getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1781 VectTy->getElementCount());
1782 }
1783 assert(IntPtrTy == MS.IntptrTy);
1784 return MS.PtrTy;
1785 }
1786
1787 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1788 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1789 return ConstantVector::getSplat(
1790 VectTy->getElementCount(),
1791 constToIntPtr(VectTy->getElementType(), C));
1792 }
1793 assert(IntPtrTy == MS.IntptrTy);
1794 return ConstantInt::get(MS.IntptrTy, C);
1795 }
1796
1797 /// Returns the integer shadow offset that corresponds to a given
1798 /// application address, whereby:
1799 ///
1800 /// Offset = (Addr & ~AndMask) ^ XorMask
1801 /// Shadow = ShadowBase + Offset
1802 /// Origin = (OriginBase + Offset) & ~Alignment
1803 ///
1804 /// Note: for efficiency, many shadow mappings only require the XorMask
1805 /// and OriginBase; the AndMask and ShadowBase are often zero.
1806 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1807 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1808 Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1809
1810 if (uint64_t AndMask = MS.MapParams->AndMask)
1811 OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1812
1813 if (uint64_t XorMask = MS.MapParams->XorMask)
1814 OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1815 return OffsetLong;
1816 }
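// Worked example (hypothetical mapping parameters, resembling a common 64-bit
// layout where AndMask == ShadowBase == 0):
//   XorMask = 0x500000000000
//   Addr    = 0x7ff012345678
//   Offset  = Addr ^ XorMask = 0x2ff012345678
//   Shadow  = ShadowBase + Offset = 0x2ff012345678
//   Origin  = (OriginBase + Offset) & ~3
// With AndMask == 0 the CreateAnd above is skipped, so the whole offset
// computation folds to a single xor.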
1817
1818 /// Compute the shadow and origin addresses corresponding to a given
1819 /// application address.
1820 ///
1821 /// Shadow = ShadowBase + Offset
1822 /// Origin = (OriginBase + Offset) & ~3ULL
1823 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1824 /// a single pointee.
1825 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1826 std::pair<Value *, Value *>
1827 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1828 MaybeAlign Alignment) {
1829 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1830 if (!VectTy) {
1831 assert(Addr->getType()->isPointerTy());
1832 } else {
1833 assert(VectTy->getElementType()->isPointerTy());
1834 }
1835 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1836 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1837 Value *ShadowLong = ShadowOffset;
1838 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1839 ShadowLong =
1840 IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1841 }
1842 Value *ShadowPtr = IRB.CreateIntToPtr(
1843 ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1844
1845 Value *OriginPtr = nullptr;
1846 if (MS.TrackOrigins) {
1847 Value *OriginLong = ShadowOffset;
1848 uint64_t OriginBase = MS.MapParams->OriginBase;
1849 if (OriginBase != 0)
1850 OriginLong =
1851 IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1852 if (!Alignment || *Alignment < kMinOriginAlignment) {
1853 uint64_t Mask = kMinOriginAlignment.value() - 1;
1854 OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1855 }
1856 OriginPtr = IRB.CreateIntToPtr(
1857 OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1858 }
1859 return std::make_pair(ShadowPtr, OriginPtr);
1860 }
1861
1862 template <typename... ArgsTy>
1863 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1864 ArgsTy... Args) {
1865 if (MS.TargetTriple.getArch() == Triple::systemz) {
1866 IRB.CreateCall(Callee,
1867 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1868 return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
1869 }
1870
1871 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1872 }
1873
1874 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1875 IRBuilder<> &IRB,
1876 Type *ShadowTy,
1877 bool isStore) {
1878 Value *ShadowOriginPtrs;
1879 const DataLayout &DL = F.getDataLayout();
1880 TypeSize Size = DL.getTypeStoreSize(ShadowTy);
1881
1882 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1883 Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
1884 if (Getter) {
1885 ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
1886 } else {
1887 Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1888 ShadowOriginPtrs = createMetadataCall(
1889 IRB,
1890 isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1891 AddrCast, SizeVal);
1892 }
1893 Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1894 ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
1895 Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1896
1897 return std::make_pair(ShadowPtr, OriginPtr);
1898 }
1899
1900 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy is the shadow type of
1901 /// a single pointee.
1902 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1903 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1904 IRBuilder<> &IRB,
1905 Type *ShadowTy,
1906 bool isStore) {
1907 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1908 if (!VectTy) {
1909 assert(Addr->getType()->isPointerTy());
1910 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1911 }
1912
1913 // TODO: Support callbacks with vectors of addresses.
1914 unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
1915 Value *ShadowPtrs = ConstantInt::getNullValue(
1916 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1917 Value *OriginPtrs = nullptr;
1918 if (MS.TrackOrigins)
1919 OriginPtrs = ConstantInt::getNullValue(
1920 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1921 for (unsigned i = 0; i < NumElements; ++i) {
1922 Value *OneAddr =
1923 IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1924 auto [ShadowPtr, OriginPtr] =
1925 getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1926
1927 ShadowPtrs = IRB.CreateInsertElement(
1928 ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1929 if (MS.TrackOrigins)
1930 OriginPtrs = IRB.CreateInsertElement(
1931 OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1932 }
1933 return {ShadowPtrs, OriginPtrs};
1934 }
1935
1936 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1937 Type *ShadowTy,
1938 MaybeAlign Alignment,
1939 bool isStore) {
1940 if (MS.CompileKernel)
1941 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1942 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1943 }
1944
1945 /// Compute the shadow address for a given function argument.
1946 ///
1947 /// Shadow = ParamTLS+ArgOffset.
1948 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1949 return IRB.CreatePtrAdd(MS.ParamTLS,
1950 ConstantInt::get(MS.IntptrTy, ArgOffset), "_msarg");
1951 }
1952
1953 /// Compute the origin address for a given function argument.
1954 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1955 if (!MS.TrackOrigins)
1956 return nullptr;
1957 return IRB.CreatePtrAdd(MS.ParamOriginTLS,
1958 ConstantInt::get(MS.IntptrTy, ArgOffset),
1959 "_msarg_o");
1960 }
1961
1962 /// Compute the shadow address for a retval.
1963 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1964 return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
1965 }
1966
1967 /// Compute the origin address for a retval.
1968 Value *getOriginPtrForRetval() {
1969 // We keep a single origin for the entire retval. Might be too optimistic.
1970 return MS.RetvalOriginTLS;
1971 }
1972
1973 /// Set SV to be the shadow value for V.
1974 void setShadow(Value *V, Value *SV) {
1975 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1976 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1977 }
1978
1979 /// Set Origin to be the origin value for V.
1980 void setOrigin(Value *V, Value *Origin) {
1981 if (!MS.TrackOrigins)
1982 return;
1983 assert(!OriginMap.count(V) && "Values may only have one origin");
1984 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1985 OriginMap[V] = Origin;
1986 }
1987
1988 Constant *getCleanShadow(Type *OrigTy) {
1989 Type *ShadowTy = getShadowTy(OrigTy);
1990 if (!ShadowTy)
1991 return nullptr;
1992 return Constant::getNullValue(ShadowTy);
1993 }
1994
1995 /// Create a clean shadow value for a given value.
1996 ///
1997 /// Clean shadow (all zeroes) means all bits of the value are defined
1998 /// (initialized).
1999 Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
2000
2001 /// Create a dirty shadow of a given shadow type.
2002 Constant *getPoisonedShadow(Type *ShadowTy) {
2003 assert(ShadowTy);
2004 if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
2005 return Constant::getAllOnesValue(ShadowTy);
2006 if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
2007 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
2008 getPoisonedShadow(AT->getElementType()));
2009 return ConstantArray::get(AT, Vals);
2010 }
2011 if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
2012 SmallVector<Constant *, 4> Vals;
2013 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
2014 Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
2015 return ConstantStruct::get(ST, Vals);
2016 }
2017 llvm_unreachable("Unexpected shadow type");
2018 }
2019
2020 /// Create a dirty shadow for a given value.
2021 Constant *getPoisonedShadow(Value *V) {
2022 Type *ShadowTy = getShadowTy(V);
2023 if (!ShadowTy)
2024 return nullptr;
2025 return getPoisonedShadow(ShadowTy);
2026 }
2027
2028 /// Create a clean (zero) origin.
2029 Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
2030
2031 /// Get the shadow value for a given Value.
2032 ///
2033 /// This function either returns the value set earlier with setShadow,
2034 /// or extracts it from ParamTLS (for function arguments).
2035 Value *getShadow(Value *V) {
2036 if (Instruction *I = dyn_cast<Instruction>(V)) {
2037 if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
2038 return getCleanShadow(V);
2039 // For instructions the shadow is already stored in the map.
2040 Value *Shadow = ShadowMap[V];
2041 if (!Shadow) {
2042 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2043 assert(Shadow && "No shadow for a value");
2044 }
2045 return Shadow;
2046 }
2047 // Handle fully undefined values
2048 // (partially undefined constant vectors are handled later)
2049 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
2050 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2051 : getCleanShadow(V);
2052 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2053 return AllOnes;
2054 }
2055 if (Argument *A = dyn_cast<Argument>(V)) {
2056 // For arguments we compute the shadow on demand and store it in the map.
2057 Value *&ShadowPtr = ShadowMap[V];
2058 if (ShadowPtr)
2059 return ShadowPtr;
2060 Function *F = A->getParent();
2061 IRBuilder<> EntryIRB(FnPrologueEnd);
2062 unsigned ArgOffset = 0;
2063 const DataLayout &DL = F->getDataLayout();
2064 for (auto &FArg : F->args()) {
2065 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2066 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2067 ? "vscale not fully supported\n"
2068 : "Arg is not sized\n"));
2069 if (A == &FArg) {
2070 ShadowPtr = getCleanShadow(V);
2071 setOrigin(A, getCleanOrigin());
2072 break;
2073 }
2074 continue;
2075 }
2076
2077 unsigned Size = FArg.hasByValAttr()
2078 ? DL.getTypeAllocSize(FArg.getParamByValType())
2079 : DL.getTypeAllocSize(FArg.getType());
2080
2081 if (A == &FArg) {
2082 bool Overflow = ArgOffset + Size > kParamTLSSize;
2083 if (FArg.hasByValAttr()) {
2084 // ByVal pointer itself has clean shadow. We copy the actual
2085 // argument shadow to the underlying memory.
2086 // Figure out maximal valid memcpy alignment.
2087 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2088 FArg.getParamAlign(), FArg.getParamByValType());
2089 Value *CpShadowPtr, *CpOriginPtr;
2090 std::tie(CpShadowPtr, CpOriginPtr) =
2091 getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
2092 /*isStore*/ true);
2093 if (!PropagateShadow || Overflow) {
2094 // ParamTLS overflow.
2095 EntryIRB.CreateMemSet(
2096 CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
2097 Size, ArgAlign);
2098 } else {
2099 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2100 const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
2101 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2102 CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
2103 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2104
2105 if (MS.TrackOrigins) {
2106 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2107 // FIXME: OriginSize should be:
2108 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2109 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
2110 EntryIRB.CreateMemCpy(
2111 CpOriginPtr,
2112 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
2113 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
2114 OriginSize);
2115 }
2116 }
2117 }
2118
2119 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2120 (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
2121 ShadowPtr = getCleanShadow(V);
2122 setOrigin(A, getCleanOrigin());
2123 } else {
2124 // Shadow over TLS
2125 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2126 ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
2127 kShadowTLSAlignment);
2128 if (MS.TrackOrigins) {
2129 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2130 setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
2131 }
2132 }
2133 LLVM_DEBUG(dbgs()
2134 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2135 break;
2136 }
2137
2138 ArgOffset += alignTo(Size, kShadowTLSAlignment);
2139 }
2140 assert(ShadowPtr && "Could not find shadow for an argument");
2141 return ShadowPtr;
2142 }
2143
2144 // Check for partially-undefined constant vectors
2145 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2146 if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
2147 cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
2148 PoisonUndefVectors) {
2149 unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
2150 SmallVector<Constant *, 32> ShadowVector(NumElems);
2151 for (unsigned i = 0; i != NumElems; ++i) {
2152 Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
2153 ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
2154 : getCleanShadow(Elem);
2155 }
2156
2157 Value *ShadowConstant = ConstantVector::get(ShadowVector);
2158 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2159 << *ShadowConstant << "\n");
2160
2161 return ShadowConstant;
2162 }
2163
2164 // TODO: partially-undefined constant arrays, structures, and nested types
2165
2166 // For everything else the shadow is zero.
2167 return getCleanShadow(V);
2168 }
2169
2170 /// Get the shadow for i-th argument of the instruction I.
2171 Value *getShadow(Instruction *I, int i) {
2172 return getShadow(I->getOperand(i));
2173 }
2174
2175 /// Get the origin for a value.
2176 Value *getOrigin(Value *V) {
2177 if (!MS.TrackOrigins)
2178 return nullptr;
2179 if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
2180 return getCleanOrigin();
2181 assert((isa<Instruction>(V) || isa<Argument>(V)) &&
2182 "Unexpected value type in getOrigin()");
2183 if (Instruction *I = dyn_cast<Instruction>(V)) {
2184 if (I->getMetadata(LLVMContext::MD_nosanitize))
2185 return getCleanOrigin();
2186 }
2187 Value *Origin = OriginMap[V];
2188 assert(Origin && "Missing origin");
2189 return Origin;
2190 }
2191
2192 /// Get the origin for i-th argument of the instruction I.
2193 Value *getOrigin(Instruction *I, int i) {
2194 return getOrigin(I->getOperand(i));
2195 }
2196
2197 /// Remember the place where a shadow check should be inserted.
2198 ///
2199 /// This location will be later instrumented with a check that will print a
2200 /// UMR warning in runtime if the shadow value is not 0.
2201 void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2202 assert(Shadow);
2203 if (!InsertChecks)
2204 return;
2205
2206 if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
2207 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2208 << *OrigIns << "\n");
2209 return;
2210 }
2211
2212 Type *ShadowTy = Shadow->getType();
2213 if (isScalableNonVectorType(ShadowTy)) {
2214 LLVM_DEBUG(dbgs() << "Skipping check of scalable non-vector " << *Shadow
2215 << " before " << *OrigIns << "\n");
2216 return;
2217 }
2218#ifndef NDEBUG
2219 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2220 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2221 "Can only insert checks for integer, vector, and aggregate shadow "
2222 "types");
2223#endif
2224 InstrumentationList.push_back(
2225 ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2226 }
2227
2228 /// Get shadow for value, and remember the place where a shadow check should
2229 /// be inserted.
2230 ///
2231 /// This location will be later instrumented with a check that will print a
2232 /// UMR warning in runtime if the value is not fully defined.
2233 void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
2234 assert(Val);
2235 Value *Shadow, *Origin;
2236 if (ClCheckConstantShadow) {
2237 Shadow = getShadow(Val);
2238 if (!Shadow)
2239 return;
2240 Origin = getOrigin(Val);
2241 } else {
2242 Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2243 if (!Shadow)
2244 return;
2245 Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2246 }
2247 insertCheckShadow(Shadow, Origin, OrigIns);
2248 }
2249
2250 AtomicOrdering addReleaseOrdering(AtomicOrdering a) {
2251 switch (a) {
2252 case AtomicOrdering::NotAtomic:
2253 return AtomicOrdering::NotAtomic;
2254 case AtomicOrdering::Unordered:
2255 case AtomicOrdering::Monotonic:
2256 case AtomicOrdering::Release:
2257 return AtomicOrdering::Release;
2258 case AtomicOrdering::Acquire:
2259 case AtomicOrdering::AcquireRelease:
2260 return AtomicOrdering::AcquireRelease;
2261 case AtomicOrdering::SequentiallyConsistent:
2262 return AtomicOrdering::SequentiallyConsistent;
2263 }
2264 llvm_unreachable("Unknown ordering");
2265 }
2266
2267 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2268 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2269 uint32_t OrderingTable[NumOrderings] = {};
2270
2271 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2272 OrderingTable[(int)AtomicOrderingCABI::release] =
2273 (int)AtomicOrderingCABI::release;
2274 OrderingTable[(int)AtomicOrderingCABI::consume] =
2275 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2276 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2277 (int)AtomicOrderingCABI::acq_rel;
2278 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2279 (int)AtomicOrderingCABI::seq_cst;
2280
2281 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2282 }
2283
2284 AtomicOrdering addAcquireOrdering(AtomicOrdering a) {
2285 switch (a) {
2286 case AtomicOrdering::NotAtomic:
2287 return AtomicOrdering::NotAtomic;
2288 case AtomicOrdering::Unordered:
2289 case AtomicOrdering::Monotonic:
2290 case AtomicOrdering::Acquire:
2291 return AtomicOrdering::Acquire;
2292 case AtomicOrdering::Release:
2293 case AtomicOrdering::AcquireRelease:
2294 return AtomicOrdering::AcquireRelease;
2295 case AtomicOrdering::SequentiallyConsistent:
2296 return AtomicOrdering::SequentiallyConsistent;
2297 }
2298 llvm_unreachable("Unknown ordering");
2299 }
2300
2301 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2302 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2303 uint32_t OrderingTable[NumOrderings] = {};
2304
2305 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2306 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2307 OrderingTable[(int)AtomicOrderingCABI::consume] =
2308 (int)AtomicOrderingCABI::acquire;
2309 OrderingTable[(int)AtomicOrderingCABI::release] =
2310 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2311 (int)AtomicOrderingCABI::acq_rel;
2312 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2313 (int)AtomicOrderingCABI::seq_cst;
2314
2315 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2316 }
2317
2318 // ------------------- Visitors.
2319 using InstVisitor<MemorySanitizerVisitor>::visit;
2320 void visit(Instruction &I) {
2321 if (I.getMetadata(LLVMContext::MD_nosanitize))
2322 return;
2323 // Don't want to visit if we're in the prologue
2324 if (isInPrologue(I))
2325 return;
2326 if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
2327 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2328 // We still need to set the shadow and origin to clean values.
2329 setShadow(&I, getCleanShadow(&I));
2330 setOrigin(&I, getCleanOrigin());
2331 return;
2332 }
2333
2334 Instructions.push_back(&I);
2335 }
2336
2337 /// Instrument LoadInst
2338 ///
2339 /// Loads the corresponding shadow and (optionally) origin.
2340 /// Optionally, checks that the load address is fully defined.
2341 void visitLoadInst(LoadInst &I) {
2342 assert(I.getType()->isSized() && "Load type must have size");
2343 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2344 NextNodeIRBuilder IRB(&I);
2345 Type *ShadowTy = getShadowTy(&I);
2346 Value *Addr = I.getPointerOperand();
2347 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2348 const Align Alignment = I.getAlign();
2349 if (PropagateShadow) {
2350 std::tie(ShadowPtr, OriginPtr) =
2351 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2352 setShadow(&I,
2353 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2354 } else {
2355 setShadow(&I, getCleanShadow(&I));
2356 }
2357
2358 if (ClCheckAccessAddress)
2359 insertCheckShadowOf(I.getPointerOperand(), &I);
2360
2361 if (I.isAtomic())
2362 I.setOrdering(addAcquireOrdering(I.getOrdering()));
2363
2364 if (MS.TrackOrigins) {
2365 if (PropagateShadow) {
2366 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2367 setOrigin(
2368 &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2369 } else {
2370 setOrigin(&I, getCleanOrigin());
2371 }
2372 }
2373 }
2374
2375 /// Instrument StoreInst
2376 ///
2377 /// Stores the corresponding shadow and (optionally) origin.
2378 /// Optionally, checks that the store address is fully defined.
2379 void visitStoreInst(StoreInst &I) {
2380 StoreList.push_back(&I);
2381 if (ClCheckAccessAddress)
2382 insertCheckShadowOf(I.getPointerOperand(), &I);
2383 }
2384
2385 void handleCASOrRMW(Instruction &I) {
2386 assert(isa<AtomicRMWInst>(I) || isa<AtomicCmpXchgInst>(I));
2387
2388 IRBuilder<> IRB(&I);
2389 Value *Addr = I.getOperand(0);
2390 Value *Val = I.getOperand(1);
2391 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2392 /*isStore*/ true)
2393 .first;
2394
2395 if (ClCheckAccessAddress)
2396 insertCheckShadowOf(Addr, &I);
2397
2398 // Only test the conditional argument of cmpxchg instruction.
2399 // The other argument can potentially be uninitialized, but we cannot
2400 // detect this situation reliably without possible false positives.
2401 if (isa<AtomicCmpXchgInst>(I))
2402 insertCheckShadowOf(Val, &I);
2403
2404 IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2405
2406 setShadow(&I, getCleanShadow(&I));
2407 setOrigin(&I, getCleanOrigin());
2408 }
2409
2410 void visitAtomicRMWInst(AtomicRMWInst &I) {
2411 handleCASOrRMW(I);
2412 I.setOrdering(addReleaseOrdering(I.getOrdering()));
2413 }
2414
2415 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2416 handleCASOrRMW(I);
2417 I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2418 }
2419
2420 // Vector manipulation.
2421 void visitExtractElementInst(ExtractElementInst &I) {
2422 insertCheckShadowOf(I.getOperand(1), &I);
2423 IRBuilder<> IRB(&I);
2424 setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2425 "_msprop"));
2426 setOrigin(&I, getOrigin(&I, 0));
2427 }
2428
2429 void visitInsertElementInst(InsertElementInst &I) {
2430 insertCheckShadowOf(I.getOperand(2), &I);
2431 IRBuilder<> IRB(&I);
2432 auto *Shadow0 = getShadow(&I, 0);
2433 auto *Shadow1 = getShadow(&I, 1);
2434 setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2435 "_msprop"));
2436 setOriginForNaryOp(I);
2437 }
2438
2439 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2440 IRBuilder<> IRB(&I);
2441 auto *Shadow0 = getShadow(&I, 0);
2442 auto *Shadow1 = getShadow(&I, 1);
2443 setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2444 "_msprop"));
2445 setOriginForNaryOp(I);
2446 }
2447
2448 // Casts.
2449 void visitSExtInst(SExtInst &I) {
2450 IRBuilder<> IRB(&I);
2451 setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2452 setOrigin(&I, getOrigin(&I, 0));
2453 }
2454
2455 void visitZExtInst(ZExtInst &I) {
2456 IRBuilder<> IRB(&I);
2457 setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2458 setOrigin(&I, getOrigin(&I, 0));
2459 }
2460
2461 void visitTruncInst(TruncInst &I) {
2462 IRBuilder<> IRB(&I);
2463 setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2464 setOrigin(&I, getOrigin(&I, 0));
2465 }
2466
2467 void visitBitCastInst(BitCastInst &I) {
2468 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2469 // a musttail call and a ret, don't instrument. New instructions are not
2470 // allowed after a musttail call.
2471 if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2472 if (CI->isMustTailCall())
2473 return;
2474 IRBuilder<> IRB(&I);
2475 setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2476 setOrigin(&I, getOrigin(&I, 0));
2477 }
2478
2479 void visitPtrToIntInst(PtrToIntInst &I) {
2480 IRBuilder<> IRB(&I);
2481 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2482 "_msprop_ptrtoint"));
2483 setOrigin(&I, getOrigin(&I, 0));
2484 }
2485
2486 void visitIntToPtrInst(IntToPtrInst &I) {
2487 IRBuilder<> IRB(&I);
2488 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2489 "_msprop_inttoptr"));
2490 setOrigin(&I, getOrigin(&I, 0));
2491 }
2492
2493 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2494 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2495 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2496 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2497 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2498 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2499
2500 /// Propagate shadow for bitwise AND.
2501 ///
2502 /// This code is exact, i.e. if, for example, a bit in the left argument
2503 /// is defined and 0, then neither the value nor the definedness of the
2504 /// corresponding bit in B affects the resulting shadow.
2505 void visitAnd(BinaryOperator &I) {
2506 IRBuilder<> IRB(&I);
2507 // "And" of 0 and a poisoned value results in unpoisoned value.
2508 // 1&1 => 1; 0&1 => 0; p&1 => p;
2509 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2510 // 1&p => p; 0&p => 0; p&p => p;
2511 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
2512 Value *S1 = getShadow(&I, 0);
2513 Value *S2 = getShadow(&I, 1);
2514 Value *V1 = I.getOperand(0);
2515 Value *V2 = I.getOperand(1);
2516 if (V1->getType() != S1->getType()) {
2517 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2518 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2519 }
2520 Value *S1S2 = IRB.CreateAnd(S1, S2);
2521 Value *V1S2 = IRB.CreateAnd(V1, S2);
2522 Value *S1V2 = IRB.CreateAnd(S1, V2);
2523 setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2524 setOriginForNaryOp(I);
2525 }
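// Worked example of the formula above (4-bit values, shadow bit 1 == poisoned;
// illustrative only):
//   V1 = 0b0100, S1 = 0b0000   (A fully initialized)
//   V2 = 0b1?1?, S2 = 0b0101   (bits 0 and 2 of B are uninitialized)
//   S  = (S1 & S2) | (V1 & S2) | (S1 & V2)
//      =  0b0000   |  0b0100   |  0b0000    = 0b0100
// Only bit 2 of the result is poisoned: bit 0 of A is a known 0, and
// 0 & <anything> is a defined 0 regardless of B's shadow there.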
2526
2527 void visitOr(BinaryOperator &I) {
2528 IRBuilder<> IRB(&I);
2529 // "Or" of 1 and a poisoned value results in unpoisoned value:
2530 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2531 // 1|0 => 1; 0|0 => 0; p|0 => p;
2532 // 1|p => 1; 0|p => p; p|p => p;
2533 //
2534 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2535 //
2536 // If the "disjoint OR" property is violated, the result is poison, and
2537 // hence the entire shadow is uninitialized:
2538 // S = S | SignExt(V1 & V2 != 0)
2539 Value *S1 = getShadow(&I, 0);
2540 Value *S2 = getShadow(&I, 1);
2541 Value *V1 = I.getOperand(0);
2542 Value *V2 = I.getOperand(1);
2543 if (V1->getType() != S1->getType()) {
2544 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2545 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2546 }
2547
2548 Value *NotV1 = IRB.CreateNot(V1);
2549 Value *NotV2 = IRB.CreateNot(V2);
2550
2551 Value *S1S2 = IRB.CreateAnd(S1, S2);
2552 Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
2553 Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
2554
2555 Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
2556
2557 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
2558 Value *V1V2 = IRB.CreateAnd(V1, V2);
2559 Value *DisjointOrShadow = IRB.CreateSExt(
2560 IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
2561 S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
2562 }
2563
2564 setShadow(&I, S);
2565 setOriginForNaryOp(I);
2566 }
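// Worked example of the base formula (4-bit values, disjoint-OR handling
// omitted; illustrative only):
//   V1 = 0b1010, S1 = 0b0000   (A fully initialized)
//   V2 = 0b??00, S2 = 0b1100   (bits 2 and 3 of B are uninitialized)
//   S  = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
//      =  0b0000   |  0b0100    |  0b0000    = 0b0100
// Bit 3 of A is a known 1, so that result bit is a defined 1; bit 2 of A is a
// known 0, so that result bit inherits B's poison.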
2567
2568 /// Default propagation of shadow and/or origin.
2569 ///
2570 /// This class implements the general case of shadow propagation, used in all
2571 /// cases where we don't know and/or don't care about what the operation
2572 /// actually does. It converts all input shadow values to a common type
2573 /// (extending or truncating as necessary), and bitwise OR's them.
2574 ///
2575 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2576 /// fully initialized), and less prone to false positives.
2577 ///
2578 /// This class also implements the general case of origin propagation. For a
2579 /// Nary operation, result origin is set to the origin of an argument that is
2580 /// not entirely initialized. If there is more than one such arguments, the
2581 /// rightmost of them is picked. It does not matter which one is picked if all
2582 /// arguments are initialized.
2583 template <bool CombineShadow> class Combiner {
2584 Value *Shadow = nullptr;
2585 Value *Origin = nullptr;
2586 IRBuilder<> &IRB;
2587 MemorySanitizerVisitor *MSV;
2588
2589 public:
2590 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2591 : IRB(IRB), MSV(MSV) {}
2592
2593 /// Add a pair of shadow and origin values to the mix.
2594 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2595 if (CombineShadow) {
2596 assert(OpShadow);
2597 if (!Shadow)
2598 Shadow = OpShadow;
2599 else {
2600 OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2601 Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2602 }
2603 }
2604
2605 if (MSV->MS.TrackOrigins) {
2606 assert(OpOrigin);
2607 if (!Origin) {
2608 Origin = OpOrigin;
2609 } else {
2610 Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2611 // No point in adding something that might result in 0 origin value.
2612 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2613 Value *Cond = MSV->convertToBool(OpShadow, IRB);
2614 Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2615 }
2616 }
2617 }
2618 return *this;
2619 }
2620
2621 /// Add an application value to the mix.
2622 Combiner &Add(Value *V) {
2623 Value *OpShadow = MSV->getShadow(V);
2624 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2625 return Add(OpShadow, OpOrigin);
2626 }
2627
2628 /// Set the current combined values as the given instruction's shadow
2629 /// and origin.
2630 void Done(Instruction *I) {
2631 if (CombineShadow) {
2632 assert(Shadow);
2633 Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2634 MSV->setShadow(I, Shadow);
2635 }
2636 if (MSV->MS.TrackOrigins) {
2637 assert(Origin);
2638 MSV->setOrigin(I, Origin);
2639 }
2640 }
2641
2642 /// Store the current combined value at the specified origin
2643 /// location.
2644 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2645 if (MSV->MS.TrackOrigins) {
2646 assert(Origin);
2647 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
2648 }
2649 }
2650 };
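// Illustrative use of the combiner (not from the source): for an operation
// such as "fadd double %a, %b", handleShadowOr() below effectively computes
//   Sresult = cast(Sa) | cast(Sb)
//   Oresult = (Sb != 0) ? Ob : Oa   // rightmost not-fully-defined operand wins
// where the casts bring both shadows to the result's shadow type.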
2651
2652 using ShadowAndOriginCombiner = Combiner<true>;
2653 using OriginCombiner = Combiner<false>;
2654
2655 /// Propagate origin for arbitrary operation.
2656 void setOriginForNaryOp(Instruction &I) {
2657 if (!MS.TrackOrigins)
2658 return;
2659 IRBuilder<> IRB(&I);
2660 OriginCombiner OC(this, IRB);
2661 for (Use &Op : I.operands())
2662 OC.Add(Op.get());
2663 OC.Done(&I);
2664 }
2665
2666 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2667 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2668 "Vector of pointers is not a valid shadow type");
2669 return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2670 Ty->getScalarSizeInBits()
2671 : Ty->getPrimitiveSizeInBits();
2672 }
2673
2674 /// Cast between two shadow types, extending or truncating as
2675 /// necessary.
2676 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2677 bool Signed = false) {
2678 Type *srcTy = V->getType();
2679 if (srcTy == dstTy)
2680 return V;
2681 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2682 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2683 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2684 return IRB.CreateICmpNE(V, getCleanShadow(V));
2685
2686 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2687 return IRB.CreateIntCast(V, dstTy, Signed);
2688 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2689 cast<VectorType>(dstTy)->getElementCount() ==
2690 cast<VectorType>(srcTy)->getElementCount())
2691 return IRB.CreateIntCast(V, dstTy, Signed);
2692 Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2693 Value *V2 =
2694 IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2695 return IRB.CreateBitCast(V2, dstTy);
2696 // TODO: handle struct types.
2697 }
2698
2699 /// Cast an application value to the type of its own shadow.
2700 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2701 Type *ShadowTy = getShadowTy(V);
2702 if (V->getType() == ShadowTy)
2703 return V;
2704 if (V->getType()->isPtrOrPtrVectorTy())
2705 return IRB.CreatePtrToInt(V, ShadowTy);
2706 else
2707 return IRB.CreateBitCast(V, ShadowTy);
2708 }
2709
2710 /// Propagate shadow for arbitrary operation.
2711 void handleShadowOr(Instruction &I) {
2712 IRBuilder<> IRB(&I);
2713 ShadowAndOriginCombiner SC(this, IRB);
2714 for (Use &Op : I.operands())
2715 SC.Add(Op.get());
2716 SC.Done(&I);
2717 }
2718
2719 // Perform a bitwise OR on the horizontal pairs (or other specified grouping)
2720 // of elements.
2721 //
2722 // For example, suppose we have:
2723 // VectorA: <a1, a2, a3, a4, a5, a6>
2724 // VectorB: <b1, b2, b3, b4, b5, b6>
2725 // ReductionFactor: 3.
2726 // The output would be:
2727 // <a1|a2|a3, a4|a5|a6, b1|b2|b3, b4|b5|b6>
2728 //
2729 // This is convenient for instrumenting horizontal add/sub.
2730 // For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
2731 Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
2732 Value *VectorA, Value *VectorB) {
2733 assert(isa<FixedVectorType>(VectorA->getType()));
2734 unsigned TotalNumElems =
2735 cast<FixedVectorType>(VectorA->getType())->getNumElements();
2736
2737 if (VectorB) {
2738 assert(VectorA->getType() == VectorB->getType());
2739 TotalNumElems = TotalNumElems * 2;
2740 }
2741
2742 assert(TotalNumElems % ReductionFactor == 0);
2743
2744 Value *Or = nullptr;
2745
2746 IRBuilder<> IRB(&I);
2747 for (unsigned i = 0; i < ReductionFactor; i++) {
2748 SmallVector<int, 16> Mask;
2749 for (unsigned X = 0; X < TotalNumElems; X += ReductionFactor)
2750 Mask.push_back(X + i);
2751
2752 Value *Masked;
2753 if (VectorB)
2754 Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
2755 else
2756 Masked = IRB.CreateShuffleVector(VectorA, Mask);
2757
2758 if (Or)
2759 Or = IRB.CreateOr(Or, Masked);
2760 else
2761 Or = Masked;
2762 }
2763
2764 return Or;
2765 }
2766
2767 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2768 /// fields.
2769 ///
2770 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2771 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
2772 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I) {
2773 assert(I.arg_size() == 1 || I.arg_size() == 2);
2774
2775 assert(I.getType()->isVectorTy());
2776 assert(I.getArgOperand(0)->getType()->isVectorTy());
2777
2778 [[maybe_unused]] FixedVectorType *ParamType =
2779 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2780 assert((I.arg_size() != 2) ||
2781 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2782 [[maybe_unused]] FixedVectorType *ReturnType =
2783 cast<FixedVectorType>(I.getType());
2784 assert(ParamType->getNumElements() * I.arg_size() ==
2785 2 * ReturnType->getNumElements());
2786
2787 IRBuilder<> IRB(&I);
2788
2789 // Horizontal OR of shadow
2790 Value *FirstArgShadow = getShadow(&I, 0);
2791 Value *SecondArgShadow = nullptr;
2792 if (I.arg_size() == 2)
2793 SecondArgShadow = getShadow(&I, 1);
2794
2795 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2796 SecondArgShadow);
2797
2798 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2799
2800 setShadow(&I, OrShadow);
2801 setOriginForNaryOp(I);
2802 }
2803
2804 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2805 /// fields, with the parameters reinterpreted to have elements of a specified
2806 /// width. For example:
2807 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2808 /// conceptually operates on
2809 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2810 /// and can be handled with ReinterpretElemWidth == 16.
2811 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I,
2812 int ReinterpretElemWidth) {
2813 assert(I.arg_size() == 1 || I.arg_size() == 2);
2814
2815 assert(I.getType()->isVectorTy());
2816 assert(I.getArgOperand(0)->getType()->isVectorTy());
2817
2818 FixedVectorType *ParamType =
2819 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2820 assert((I.arg_size() != 2) ||
2821 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2822
2823 [[maybe_unused]] FixedVectorType *ReturnType =
2824 cast<FixedVectorType>(I.getType());
2825 assert(ParamType->getNumElements() * I.arg_size() ==
2826 2 * ReturnType->getNumElements());
2827
2828 IRBuilder<> IRB(&I);
2829
2830 FixedVectorType *ReinterpretShadowTy = nullptr;
2831 assert(isAligned(Align(ReinterpretElemWidth),
2832 ParamType->getPrimitiveSizeInBits()));
2833 ReinterpretShadowTy = FixedVectorType::get(
2834 IRB.getIntNTy(ReinterpretElemWidth),
2835 ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2836
2837 // Horizontal OR of shadow
2838 Value *FirstArgShadow = getShadow(&I, 0);
2839 FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
2840
2841 // If we had two parameters each with an odd number of elements, the total
2842 // number of elements is even, but we have never seen this in extant
2843 // instruction sets, so we enforce that each parameter must have an even
2844 // number of elements.
2845 assert(isAligned(
2846 Align(2),
2847 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2848
2849 Value *SecondArgShadow = nullptr;
2850 if (I.arg_size() == 2) {
2851 SecondArgShadow = getShadow(&I, 1);
2852 SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
2853 }
2854
2855 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, FirstArgShadow,
2856 SecondArgShadow);
2857
2858 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2859
2860 setShadow(&I, OrShadow);
2861 setOriginForNaryOp(I);
2862 }
2863
2864 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2865
2866 // Handle multiplication by constant.
2867 //
2868 // Handle a special case of multiplication by constant that may have one or
2869 // more zeros in the lower bits. This makes corresponding number of lower bits
2870 // of the result zero as well. We model it by shifting the other operand
2871 // shadow left by the required number of bits. Effectively, we transform
2872 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2873 // We use multiplication by 2**N instead of shift to cover the case of
2874 // multiplication by 0, which may occur in some elements of a vector operand.
2875 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2876 Value *OtherArg) {
2877 Constant *ShadowMul;
2878 Type *Ty = ConstArg->getType();
2879 if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2880 unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2881 Type *EltTy = VTy->getElementType();
2882 SmallVector<Constant *, 16> Elements;
2883 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2884 if (ConstantInt *Elt =
2885 dyn_cast<ConstantInt>(ConstArg->getAggregateElement(Idx))) {
2886 const APInt &V = Elt->getValue();
2887 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2888 Elements.push_back(ConstantInt::get(EltTy, V2));
2889 } else {
2890 Elements.push_back(ConstantInt::get(EltTy, 1));
2891 }
2892 }
2893 ShadowMul = ConstantVector::get(Elements);
2894 } else {
2895 if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2896 const APInt &V = Elt->getValue();
2897 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2898 ShadowMul = ConstantInt::get(Ty, V2);
2899 } else {
2900 ShadowMul = ConstantInt::get(Ty, 1);
2901 }
2902 }
2903
2904 IRBuilder<> IRB(&I);
2905 setShadow(&I,
2906 IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2907 setOrigin(&I, getOrigin(OtherArg));
2908 }
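// Worked example (illustrative only): for X * 24 the constant factors as
// 3 * 2**3, so countr_zero() == 3 and ShadowMul == 8. The propagated shadow is
// Sx * 8 == Sx << 3: the three low bits of the product are always defined
// zeros, whatever the shadow of X, and the remaining shadow bits are shifted
// into the positions of the product they actually influence.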
2909
2910 void visitMul(BinaryOperator &I) {
2911 Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2912 Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2913 if (constOp0 && !constOp1)
2914 handleMulByConstant(I, constOp0, I.getOperand(1));
2915 else if (constOp1 && !constOp0)
2916 handleMulByConstant(I, constOp1, I.getOperand(0));
2917 else
2918 handleShadowOr(I);
2919 }
2920
2921 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2922 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2923 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2924 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2925 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2926 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2927
2928 void handleIntegerDiv(Instruction &I) {
2929 IRBuilder<> IRB(&I);
2930 // Strict on the second argument.
2931 insertCheckShadowOf(I.getOperand(1), &I);
2932 setShadow(&I, getShadow(&I, 0));
2933 setOrigin(&I, getOrigin(&I, 0));
2934 }
2935
2936 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2937 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2938 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2939 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2940
2941 // Floating point division is side-effect free. We cannot require that the
2942 // divisor is fully initialized, so we must propagate shadow. See PR37523.
2943 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2944 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2945
2946 /// Instrument == and != comparisons.
2947 ///
2948 /// Sometimes the comparison result is known even if some of the bits of the
2949 /// arguments are not.
2950 void handleEqualityComparison(ICmpInst &I) {
2951 IRBuilder<> IRB(&I);
2952 Value *A = I.getOperand(0);
2953 Value *B = I.getOperand(1);
2954 Value *Sa = getShadow(A);
2955 Value *Sb = getShadow(B);
2956
2957 // Get rid of pointers and vectors of pointers.
2958 // For ints (and vectors of ints), types of A and Sa match,
2959 // and this is a no-op.
2960 A = IRB.CreatePointerCast(A, Sa->getType());
2961 B = IRB.CreatePointerCast(B, Sb->getType());
2962
2963 // A == B <==> (C = A^B) == 0
2964 // A != B <==> (C = A^B) != 0
2965 // Sc = Sa | Sb
2966 Value *C = IRB.CreateXor(A, B);
2967 Value *Sc = IRB.CreateOr(Sa, Sb);
2968 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2969 // Result is defined if one of the following is true
2970 // * there is a defined 1 bit in C
2971 // * C is fully defined
2972 // Si = !(C & ~Sc) && Sc
2973 Value *Zero = Constant::getNullValue(Sc->getType());
2974 Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2975 Value *LHS = IRB.CreateICmpNE(Sc, Zero);
2976 Value *RHS =
2977 IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
2978 Value *Si = IRB.CreateAnd(LHS, RHS);
2979 Si->setName("_msprop_icmp");
2980 setShadow(&I, Si);
2981 setOriginForNaryOp(I);
2982 }
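// Worked example (4-bit values; illustrative only). Let A = 0b10?? with
// Sa = 0b0011, and let B = 0b0100 be fully defined (Sb = 0):
//   C  = A ^ B  has a defined 1 in bit 3, whatever A's low bits turn out to be
//   Sc = Sa | Sb = 0b0011
//   C & ~Sc, the defined bits of C, is therefore nonzero
// so !(C & ~Sc) is false and Si = 0: "A == B" is known false (and "A != B"
// known true) despite A's two undefined low bits.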
2983
2984 /// Instrument relational comparisons.
2985 ///
2986 /// This function does exact shadow propagation for all relational
2987 /// comparisons of integers, pointers and vectors of those.
2988 /// FIXME: output seems suboptimal when one of the operands is a constant
2989 void handleRelationalComparisonExact(ICmpInst &I) {
2990 IRBuilder<> IRB(&I);
2991 Value *A = I.getOperand(0);
2992 Value *B = I.getOperand(1);
2993 Value *Sa = getShadow(A);
2994 Value *Sb = getShadow(B);
2995
2996 // Get rid of pointers and vectors of pointers.
2997 // For ints (and vectors of ints), types of A and Sa match,
2998 // and this is a no-op.
2999 A = IRB.CreatePointerCast(A, Sa->getType());
3000 B = IRB.CreatePointerCast(B, Sb->getType());
3001
3002 // Let [a0, a1] be the interval of possible values of A, taking into account
3003 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
3004 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
3005 bool IsSigned = I.isSigned();
3006
3007 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
3008 if (IsSigned) {
3009 // Sign-flip to map from signed range to unsigned range. Relation A vs B
3010 // should be preserved, if checked with `getUnsignedPredicate()`.
3011 // The relationship between Amin, Amax, Bmin and Bmax will also not be
3012 // affected, as they are created by effectively adding to / subtracting from
3013 // A (or B) a value derived from the shadow, with no overflow, either
3014 // before or after the sign flip.
3015 APInt MinVal =
3016 APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
3017 V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
3018 }
3019 // Minimize undefined bits.
3020 Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
3021 Value *Max = IRB.CreateOr(V, S);
3022 return std::make_pair(Min, Max);
3023 };
3024
3025 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
3026 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
3027 Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
3028 Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
3029
3030 Value *Si = IRB.CreateXor(S1, S2);
3031 setShadow(&I, Si);
3032 setOriginForNaryOp(I);
3033 }
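// Worked example (unsigned 4-bit compare; illustrative only). Let A = 0b01?0
// with Sa = 0b0010, and let B = 3 be fully defined:
//   [Amin, Amax] = [0b0100, 0b0110] = [4, 6],   [Bmin, Bmax] = [3, 3]
// For "A > B", (Amin > Bmax) and (Amax > Bmin) are both true, so S1 == S2 and
// the shadow S1 ^ S2 is 0: the result is defined even though bit 1 of A is not.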
3034
3035 /// Instrument signed relational comparisons.
3036 ///
3037 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
3038 /// bit of the shadow. Everything else is delegated to handleShadowOr().
3039 void handleSignedRelationalComparison(ICmpInst &I) {
3040 Constant *constOp;
3041 Value *op = nullptr;
3042 CmpInst::Predicate pre;
3043 if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
3044 op = I.getOperand(0);
3045 pre = I.getPredicate();
3046 } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
3047 op = I.getOperand(1);
3048 pre = I.getSwappedPredicate();
3049 } else {
3050 handleShadowOr(I);
3051 return;
3052 }
3053
3054 if ((constOp->isNullValue() &&
3055 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3056 (constOp->isAllOnesValue() &&
3057 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3058 IRBuilder<> IRB(&I);
3059 Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
3060 "_msprop_icmp_s");
3061 setShadow(&I, Shadow);
3062 setOrigin(&I, getOrigin(op));
3063 } else {
3064 handleShadowOr(I);
3065 }
3066 }
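// Worked example (illustrative only): for "x < 0" the result depends only on
// the sign bit of x, so it is defined exactly when that bit is. The emitted
// shadow "icmp slt Sx, 0" is true iff the top bit of Sx is set, i.e. iff the
// sign bit of x is poisoned.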
3067
3068 void visitICmpInst(ICmpInst &I) {
3069 if (!ClHandleICmp) {
3070 handleShadowOr(I);
3071 return;
3072 }
3073 if (I.isEquality()) {
3074 handleEqualityComparison(I);
3075 return;
3076 }
3077
3078 assert(I.isRelational());
3079 if (ClHandleICmpExact) {
3080 handleRelationalComparisonExact(I);
3081 return;
3082 }
3083 if (I.isSigned()) {
3084 handleSignedRelationalComparison(I);
3085 return;
3086 }
3087
3088 assert(I.isUnsigned());
3089 if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
3090 handleRelationalComparisonExact(I);
3091 return;
3092 }
3093
3094 handleShadowOr(I);
3095 }
3096
3097 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3098
3099 void handleShift(BinaryOperator &I) {
3100 IRBuilder<> IRB(&I);
3101 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3102 // Otherwise perform the same shift on S1.
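// As an illustration: for "shl i8 %a, 2" with S1 = 0b11110000, the shifted
// shadow is 0b11000000 (poison moves with the data and defined zero bits are
// shifted in); if the shift amount's shadow S2 is non-zero anywhere, the
// sign-extended mask below poisons the entire result.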
3103 Value *S1 = getShadow(&I, 0);
3104 Value *S2 = getShadow(&I, 1);
3105 Value *S2Conv =
3106 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3107 Value *V2 = I.getOperand(1);
3108 Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
3109 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3110 setOriginForNaryOp(I);
3111 }
3112
3113 void visitShl(BinaryOperator &I) { handleShift(I); }
3114 void visitAShr(BinaryOperator &I) { handleShift(I); }
3115 void visitLShr(BinaryOperator &I) { handleShift(I); }
3116
3117 void handleFunnelShift(IntrinsicInst &I) {
3118 IRBuilder<> IRB(&I);
3119 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3120 // Otherwise perform the same shift on S0 and S1.
3121 Value *S0 = getShadow(&I, 0);
3122 Value *S1 = getShadow(&I, 1);
3123 Value *S2 = getShadow(&I, 2);
3124 Value *S2Conv =
3125 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3126 Value *V2 = I.getOperand(2);
3127 Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
3128 {S0, S1, V2});
3129 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3130 setOriginForNaryOp(I);
3131 }
3132
3133 /// Instrument llvm.memmove
3134 ///
3135 /// At this point we don't know if llvm.memmove will be inlined or not.
3136 /// If we don't instrument it and it gets inlined,
3137 /// our interceptor will not kick in and we will lose the memmove.
3138 /// If we instrument the call here, but it does not get inlined,
3139 /// we will memmove the shadow twice, which is bad in the case
3140 /// of overlapping regions. So, we simply lower the intrinsic to a call.
3141 ///
3142 /// Similar situation exists for memcpy and memset.
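/// The call emitted below targets __msan_memmove() in the MSan runtime, which
/// is expected to copy the application bytes along with their shadow (and
/// origin, when origin tracking is enabled), so no explicit shadow copy is
/// emitted here.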
3143 void visitMemMoveInst(MemMoveInst &I) {
3144 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3145 IRBuilder<> IRB(&I);
3146 IRB.CreateCall(MS.MemmoveFn,
3147 {I.getArgOperand(0), I.getArgOperand(1),
3148 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3149 I.eraseFromParent();
3150 }
3151
3152 /// Instrument memcpy
3153 ///
3154 /// Similar to memmove: avoid copying shadow twice. This is somewhat
3155 /// unfortunate as it may slow down small constant memcpys.
3156 /// FIXME: consider doing manual inline for small constant sizes and proper
3157 /// alignment.
3158 ///
3159 /// Note: This also handles memcpy.inline, which promises no calls to external
3160 /// functions as an optimization. However, with instrumentation enabled this
3161 /// is difficult to promise; additionally, we know that the MSan runtime
3162 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3163 /// instrumentation it's safe to turn memcpy.inline into a call to
3164 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3165 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3166 void visitMemCpyInst(MemCpyInst &I) {
3167 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3168 IRBuilder<> IRB(&I);
3169 IRB.CreateCall(MS.MemcpyFn,
3170 {I.getArgOperand(0), I.getArgOperand(1),
3171 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3172 I.eraseFromParent();
3173 }
3174
3175 // Same as memcpy.
3176 void visitMemSetInst(MemSetInst &I) {
3177 IRBuilder<> IRB(&I);
3178 IRB.CreateCall(
3179 MS.MemsetFn,
3180 {I.getArgOperand(0),
3181 IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
3182 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3183 I.eraseFromParent();
3184 }
3185
3186 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3187
3188 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3189
3190 /// Handle vector store-like intrinsics.
3191 ///
3192 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3193 /// has 1 pointer argument and 1 vector argument, returns void.
3194 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3195 assert(I.arg_size() == 2);
3196
3197 IRBuilder<> IRB(&I);
3198 Value *Addr = I.getArgOperand(0);
3199 Value *Shadow = getShadow(&I, 1);
3200 Value *ShadowPtr, *OriginPtr;
3201
3202 // We don't know the pointer alignment (could be unaligned SSE store!).
3203 // Have to assume the worst case.
3204 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3205 Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
3206 IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
3207
3208 if (ClCheckAccessAddress)
3209 insertCheckShadowOf(Addr, &I);
3210
3211 // FIXME: factor out common code from materializeStores
3212 if (MS.TrackOrigins)
3213 IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
3214 return true;
3215 }
3216
3217 /// Handle vector load-like intrinsics.
3218 ///
3219 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3220 /// has 1 pointer argument, returns a vector.
3221 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3222 assert(I.arg_size() == 1);
3223
3224 IRBuilder<> IRB(&I);
3225 Value *Addr = I.getArgOperand(0);
3226
3227 Type *ShadowTy = getShadowTy(&I);
3228 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3229 if (PropagateShadow) {
3230 // We don't know the pointer alignment (could be unaligned SSE load!).
3231 // Have to assume the worst case.
3232 const Align Alignment = Align(1);
3233 std::tie(ShadowPtr, OriginPtr) =
3234 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3235 setShadow(&I,
3236 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
3237 } else {
3238 setShadow(&I, getCleanShadow(&I));
3239 }
3240
3241 if (ClCheckAccessAddress)
3242 insertCheckShadowOf(Addr, &I);
3243
3244 if (MS.TrackOrigins) {
3245 if (PropagateShadow)
3246 setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
3247 else
3248 setOrigin(&I, getCleanOrigin());
3249 }
3250 return true;
3251 }
3252
3253 /// Handle (SIMD arithmetic)-like intrinsics.
3254 ///
3255 /// Instrument intrinsics with any number of arguments of the same type [*],
3256 /// equal to the return type, plus a specified number of trailing flags of
3257 /// any type.
3258 ///
3259 /// [*] The type should be simple (no aggregates or pointers; vectors are
3260 /// fine).
3261 ///
3262 /// Caller guarantees that this intrinsic does not access memory.
3263 ///
3264 /// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched
3265 /// by this handler. See horizontalReduce().
3266 ///
3267 /// TODO: permutation intrinsics are also often incorrectly matched.
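/// As an illustration: a nomem intrinsic of the shape
/// <8 x i16>(<8 x i16> %a, <8 x i16> %b) is accepted here and its shadow is
/// simply the OR of the shadows of %a and %b (combined via
/// ShadowAndOriginCombiner below).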
3268 [[maybe_unused]] bool
3269 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3270 unsigned int trailingFlags) {
3271 Type *RetTy = I.getType();
3272 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3273 return false;
3274
3275 unsigned NumArgOperands = I.arg_size();
3276 assert(NumArgOperands >= trailingFlags);
3277 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3278 Type *Ty = I.getArgOperand(i)->getType();
3279 if (Ty != RetTy)
3280 return false;
3281 }
3282
3283 IRBuilder<> IRB(&I);
3284 ShadowAndOriginCombiner SC(this, IRB);
3285 for (unsigned i = 0; i < NumArgOperands; ++i)
3286 SC.Add(I.getArgOperand(i));
3287 SC.Done(&I);
3288
3289 return true;
3290 }
3291
3292 /// Returns whether it was able to heuristically instrument unknown
3293 /// intrinsics.
3294 ///
3295 /// The main purpose of this code is to do something reasonable with all
3296 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3297 /// We recognize several classes of intrinsics by their argument types and
3298 /// ModRefBehaviour and apply special instrumentation when we are reasonably
3299 /// sure that we know what the intrinsic does.
3300 ///
3301 /// We special-case intrinsics where this approach fails. See llvm.bswap
3302 /// handling as an example of that.
3303 bool maybeHandleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3304 unsigned NumArgOperands = I.arg_size();
3305 if (NumArgOperands == 0)
3306 return false;
3307
3308 if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
3309 I.getArgOperand(1)->getType()->isVectorTy() &&
3310 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3311 // This looks like a vector store.
3312 return handleVectorStoreIntrinsic(I);
3313 }
3314
3315 if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
3316 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3317 // This looks like a vector load.
3318 return handleVectorLoadIntrinsic(I);
3319 }
3320
3321 if (I.doesNotAccessMemory())
3322 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3323 return true;
3324
3325 // FIXME: detect and handle SSE maskstore/maskload?
3326 // Some cases are now handled in handleAVXMasked{Load,Store}.
3327 return false;
3328 }
3329
3330 bool maybeHandleUnknownIntrinsic(IntrinsicInst &I) {
3331 if (maybeHandleUnknownIntrinsicUnlogged(I)) {
3332 if (ClDumpStrictIntrinsics)
3333 dumpInst(I);
3334
3335 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3336 << "\n");
3337 return true;
3338 } else
3339 return false;
3340 }
3341
3342 void handleInvariantGroup(IntrinsicInst &I) {
3343 setShadow(&I, getShadow(&I, 0));
3344 setOrigin(&I, getOrigin(&I, 0));
3345 }
3346
3347 void handleLifetimeStart(IntrinsicInst &I) {
3348 if (!PoisonStack)
3349 return;
3350 AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
3351 if (AI)
3352 LifetimeStartList.push_back(std::make_pair(&I, AI));
3353 }
3354
3355 void handleBswap(IntrinsicInst &I) {
3356 IRBuilder<> IRB(&I);
3357 Value *Op = I.getArgOperand(0);
3358 Type *OpType = Op->getType();
3359 setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
3360 getShadow(Op)));
3361 setOrigin(&I, getOrigin(Op));
3362 }
3363
3364 // Uninitialized bits are ok if they appear after the leading/trailing 0's
3365 // and a 1. If the input is all zero, the result is fully initialized iff
3366 // !is_zero_poison.
3367 //
3368 // e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
3369 // concrete value 0/1, and ? is an uninitialized bit:
3370 // - 0001 0??? is fully initialized
3371 // - 000? ???? is fully uninitialized (*)
3372 // - ???? ???? is fully uninitialized
3373 // - 0000 0000 is fully uninitialized if is_zero_poison,
3374 // fully initialized otherwise
3375 //
3376 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3377 // only need to poison 4 bits.
3378 //
3379 // OutputShadow =
3380 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3381 // || (is_zero_poison && AllZeroSrc)
3382 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3383 IRBuilder<> IRB(&I);
3384 Value *Src = I.getArgOperand(0);
3385 Value *SrcShadow = getShadow(Src);
3386
3387 Value *False = IRB.getInt1(false);
3388 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3389 I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
3390 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3391 I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
3392
3393 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3394 ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
3395
3396 Value *NotAllZeroShadow =
3397 IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
3398 Value *OutputShadow =
3399 IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
3400
3401 // If zero poison is requested, mix in with the shadow
3402 Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
3403 if (!IsZeroPoison->isZeroValue()) {
3404 Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
3405 OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
3406 }
3407
3408 OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
3409
3410 setShadow(&I, OutputShadow);
3411 setOriginForNaryOp(I);
3412 }
3413
3414 /// Handle Arm NEON vector convert intrinsics.
3415 ///
3416 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3417 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3418 ///
3419 /// For x86 SSE vector convert intrinsics, see
3420 /// handleSSEVectorConvertIntrinsic().
3421 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3422 assert(I.arg_size() == 1);
3423
3424 IRBuilder<> IRB(&I);
3425 Value *S0 = getShadow(&I, 0);
3426
3427 /// For scalars:
3428 /// Since they are converting from floating-point to integer, the output is
3429 /// - fully uninitialized if *any* bit of the input is uninitialized
3430 /// - fully initialized if all bits of the input are initialized
3431 /// We apply the same principle on a per-field basis for vectors.
3432 Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
3433 getShadowTy(&I));
3434 setShadow(&I, OutShadow);
3435 setOriginForNaryOp(I);
3436 }
3437
3438 /// Some instructions have additional zero-elements in the return type
3439 /// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
3440 ///
3441 /// This function will return a vector type with the same number of elements
3442 /// as the input, but the same per-element width as the return value, e.g.,
3443 /// <8 x i8>.
3444 FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
3445 assert(isa<FixedVectorType>(getShadowTy(&I)));
3446 FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
3447
3448 // TODO: generalize beyond 2x?
3449 if (ShadowType->getElementCount() ==
3450 cast<VectorType>(Src->getType())->getElementCount() * 2)
3451 ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
3452
3453 assert(ShadowType->getElementCount() ==
3454 cast<VectorType>(Src->getType())->getElementCount());
3455
3456 return ShadowType;
3457 }
3458
3459 /// Doubles the length of a vector shadow (extending with zeros) if necessary
3460 /// to match the length of the shadow for the instruction.
3461 /// If scalar types of the vectors are different, it will use the type of the
3462 /// input vector.
3463 /// This is more type-safe than CreateShadowCast().
3464 Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
3465 IRBuilder<> IRB(&I);
3466 assert(isa<FixedVectorType>(Shadow->getType()));
3467 assert(isa<FixedVectorType>(I.getType()));
3468
3469 Value *FullShadow = getCleanShadow(&I);
3470 unsigned ShadowNumElems =
3471 cast<FixedVectorType>(Shadow->getType())->getNumElements();
3472 unsigned FullShadowNumElems =
3473 cast<FixedVectorType>(FullShadow->getType())->getNumElements();
3474
3475 assert((ShadowNumElems == FullShadowNumElems) ||
3476 (ShadowNumElems * 2 == FullShadowNumElems));
3477
3478 if (ShadowNumElems == FullShadowNumElems) {
3479 FullShadow = Shadow;
3480 } else {
3481 // TODO: generalize beyond 2x?
3482 SmallVector<int, 32> ShadowMask(FullShadowNumElems);
3483 std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
3484
3485 // Append zeros
3486 FullShadow =
3487 IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
3488 }
3489
3490 return FullShadow;
3491 }
3492
3493 /// Handle x86 SSE vector conversion.
3494 ///
3495 /// e.g., single-precision to half-precision conversion:
3496 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3497 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3498 ///
3499 /// floating-point to integer:
3500 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3501 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3502 ///
3503 /// Note: if the output has more elements, they are zero-initialized (and
3504 /// therefore the shadow will also be initialized).
3505 ///
3506 /// This differs from handleSSEVectorConvertIntrinsic() because it
3507 /// propagates uninitialized shadow (instead of checking the shadow).
3508 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3509 bool HasRoundingMode) {
3510 if (HasRoundingMode) {
3511 assert(I.arg_size() == 2);
3512 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
3513 assert(RoundingMode->getType()->isIntegerTy());
3514 } else {
3515 assert(I.arg_size() == 1);
3516 }
3517
3518 Value *Src = I.getArgOperand(0);
3519 assert(Src->getType()->isVectorTy());
3520
3521 // The return type might have more elements than the input.
3522 // Temporarily shrink the return type's number of elements.
3523 VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
3524
3525 IRBuilder<> IRB(&I);
3526 Value *S0 = getShadow(&I, 0);
3527
3528 /// For scalars:
3529 /// Since they are converting to and/or from floating-point, the output is:
3530 /// - fully uninitialized if *any* bit of the input is uninitialized
3531 /// - fully initialized if all bits of the input are initialized
3532 /// We apply the same principle on a per-field basis for vectors.
3533 Value *Shadow =
3534 IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
3535
3536 // The return type might have more elements than the input.
3537 // Extend the return type back to its original width if necessary.
3538 Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
3539
3540 setShadow(&I, FullShadow);
3541 setOriginForNaryOp(I);
3542 }
3543
3544 // Instrument x86 SSE vector convert intrinsic.
3545 //
3546 // This function instruments intrinsics like cvtsi2ss:
3547 // %Out = int_xxx_cvtyyy(%ConvertOp)
3548 // or
3549 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
3550 // Intrinsic converts \p NumUsedElements elements of \p ConvertOp to the same
3551 // number of \p Out elements, and (if it has 2 arguments) copies the rest of
3552 // the elements from \p CopyOp.
3553 // In most cases conversion involves a floating-point value which may trigger
3554 // a hardware exception when not fully initialized. For this reason we require
3555 // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
3556 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3557 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3558 // return a fully initialized value.
3559 //
3560 // For Arm NEON vector convert intrinsics, see
3561 // handleNEONVectorConvertIntrinsic().
3562 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3563 bool HasRoundingMode = false) {
3564 IRBuilder<> IRB(&I);
3565 Value *CopyOp, *ConvertOp;
3566
3567 assert((!HasRoundingMode ||
3568 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3569 "Invalid rounding mode");
3570
3571 switch (I.arg_size() - HasRoundingMode) {
3572 case 2:
3573 CopyOp = I.getArgOperand(0);
3574 ConvertOp = I.getArgOperand(1);
3575 break;
3576 case 1:
3577 ConvertOp = I.getArgOperand(0);
3578 CopyOp = nullptr;
3579 break;
3580 default:
3581 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3582 }
3583
3584 // The first *NumUsedElements* elements of ConvertOp are converted to the
3585 // same number of output elements. The rest of the output is copied from
3586 // CopyOp, or (if not available) filled with zeroes.
3587 // Combine shadow for elements of ConvertOp that are used in this operation,
3588 // and insert a check.
3589 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3590 // int->any conversion.
3591 Value *ConvertShadow = getShadow(ConvertOp);
3592 Value *AggShadow = nullptr;
3593 if (ConvertOp->getType()->isVectorTy()) {
3594 AggShadow = IRB.CreateExtractElement(
3595 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3596 for (int i = 1; i < NumUsedElements; ++i) {
3597 Value *MoreShadow = IRB.CreateExtractElement(
3598 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3599 AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3600 }
3601 } else {
3602 AggShadow = ConvertShadow;
3603 }
3604 assert(AggShadow->getType()->isIntegerTy());
3605 insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
3606
3607 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3608 // ConvertOp.
3609 if (CopyOp) {
3610 assert(CopyOp->getType() == I.getType());
3611 assert(CopyOp->getType()->isVectorTy());
3612 Value *ResultShadow = getShadow(CopyOp);
3613 Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3614 for (int i = 0; i < NumUsedElements; ++i) {
3615 ResultShadow = IRB.CreateInsertElement(
3616 ResultShadow, ConstantInt::getNullValue(EltTy),
3617 ConstantInt::get(IRB.getInt32Ty(), i));
3618 }
3619 setShadow(&I, ResultShadow);
3620 setOrigin(&I, getOrigin(CopyOp));
3621 } else {
3622 setShadow(&I, getCleanShadow(&I));
3623 setOrigin(&I, getCleanOrigin());
3624 }
3625 }
3626
3627 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3628 // zeroes if it is zero, and all ones otherwise.
3629 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3630 if (S->getType()->isVectorTy())
3631 S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3632 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3633 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3634 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3635 }
3636
3637 // Given a vector, extract its first element, and return all
3638 // zeroes if it is zero, and all ones otherwise.
3639 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3640 Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3641 Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3642 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3643 }
3644
3645 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3646 Type *T = S->getType();
3647 assert(T->isVectorTy());
3648 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3649 return IRB.CreateSExt(S2, T);
3650 }
3651
3652 // Instrument vector shift intrinsic.
3653 //
3654 // This function instruments intrinsics like int_x86_avx2_psll_w.
3655 // Intrinsic shifts %In by %ShiftSize bits.
3656 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3657 // size, and the rest is ignored. Behavior is defined even if shift size is
3658 // greater than register (or field) width.
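// As an illustration: for @llvm.x86.avx2.psll.w(%In, %Count) the same
// intrinsic is re-applied to %In's shadow using the concrete %Count, and if
// any relevant bit of %Count's shadow is poisoned, the whole result is
// poisoned via the OR with the extended shadow of %Count.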
3659 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3660 assert(I.arg_size() == 2);
3661 IRBuilder<> IRB(&I);
3662 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3663 // Otherwise perform the same shift on S1.
3664 Value *S1 = getShadow(&I, 0);
3665 Value *S2 = getShadow(&I, 1);
3666 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3667 : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3668 Value *V1 = I.getOperand(0);
3669 Value *V2 = I.getOperand(1);
3670 Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3671 {IRB.CreateBitCast(S1, V1->getType()), V2});
3672 Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3673 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3674 setOriginForNaryOp(I);
3675 }
3676
3677 // Get an MMX-sized (64-bit) vector type, or optionally, other sized
3678 // vectors.
3679 Type *getMMXVectorTy(unsigned EltSizeInBits,
3680 unsigned X86_MMXSizeInBits = 64) {
3681 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3682 "Illegal MMX vector element size");
3683 return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3684 X86_MMXSizeInBits / EltSizeInBits);
3685 }
3686
3687 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3688 // intrinsic.
3689 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3690 switch (id) {
3691 case Intrinsic::x86_sse2_packsswb_128:
3692 case Intrinsic::x86_sse2_packuswb_128:
3693 return Intrinsic::x86_sse2_packsswb_128;
3694
3695 case Intrinsic::x86_sse2_packssdw_128:
3696 case Intrinsic::x86_sse41_packusdw:
3697 return Intrinsic::x86_sse2_packssdw_128;
3698
3699 case Intrinsic::x86_avx2_packsswb:
3700 case Intrinsic::x86_avx2_packuswb:
3701 return Intrinsic::x86_avx2_packsswb;
3702
3703 case Intrinsic::x86_avx2_packssdw:
3704 case Intrinsic::x86_avx2_packusdw:
3705 return Intrinsic::x86_avx2_packssdw;
3706
3707 case Intrinsic::x86_mmx_packsswb:
3708 case Intrinsic::x86_mmx_packuswb:
3709 return Intrinsic::x86_mmx_packsswb;
3710
3711 case Intrinsic::x86_mmx_packssdw:
3712 return Intrinsic::x86_mmx_packssdw;
3713
3714 case Intrinsic::x86_avx512_packssdw_512:
3715 case Intrinsic::x86_avx512_packusdw_512:
3716 return Intrinsic::x86_avx512_packssdw_512;
3717
3718 case Intrinsic::x86_avx512_packsswb_512:
3719 case Intrinsic::x86_avx512_packuswb_512:
3720 return Intrinsic::x86_avx512_packsswb_512;
3721
3722 default:
3723 llvm_unreachable("unexpected intrinsic id");
3724 }
3725 }
3726
3727 // Instrument vector pack intrinsic.
3728 //
3729 // This function instruments intrinsics like x86_mmx_packsswb, which pack
3730 // elements of 2 input vectors into half as many bits with saturation.
3731 // Shadow is propagated with the signed variant of the same intrinsic applied
3732 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3733 // MMXEltSizeInBits is used only for x86mmx arguments.
3734 //
3735 // TODO: consider using GetMinMaxUnsigned() to handle saturation precisely
3736 void handleVectorPackIntrinsic(IntrinsicInst &I,
3737 unsigned MMXEltSizeInBits = 0) {
3738 assert(I.arg_size() == 2);
3739 IRBuilder<> IRB(&I);
3740 Value *S1 = getShadow(&I, 0);
3741 Value *S2 = getShadow(&I, 1);
3742 assert(S1->getType()->isVectorTy());
3743
3744 // SExt and ICmpNE below must apply to individual elements of input vectors.
3745 // In case of x86mmx arguments, cast them to appropriate vector types and
3746 // back.
3747 Type *T =
3748 MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
3749 if (MMXEltSizeInBits) {
3750 S1 = IRB.CreateBitCast(S1, T);
3751 S2 = IRB.CreateBitCast(S2, T);
3752 }
3753 Value *S1_ext =
3754 IRB.CreateSExt(IRB.CreateICmpNE(S1, Constant::getNullValue(T)), T);
3755 Value *S2_ext =
3756 IRB.CreateSExt(IRB.CreateICmpNE(S2, Constant::getNullValue(T)), T);
3757 if (MMXEltSizeInBits) {
3758 S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
3759 S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
3760 }
3761
3762 Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
3763 {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3764 "_msprop_vector_pack");
3765 if (MMXEltSizeInBits)
3766 S = IRB.CreateBitCast(S, getShadowTy(&I));
3767 setShadow(&I, S);
3768 setOriginForNaryOp(I);
3769 }
3770
3771 // Convert `Mask` into `<n x i1>`.
3772 Constant *createDppMask(unsigned Width, unsigned Mask) {
3773 SmallVector<Constant *, 8> R(Width);
3774 for (auto &M : R) {
3775 M = ConstantInt::getBool(F.getContext(), Mask & 1);
3776 Mask >>= 1;
3777 }
3778 return ConstantVector::get(R);
3779 }
3780
3781 // Calculate the output shadow as an array of booleans `<n x i1>`, assuming
3782 // that if any arg is poisoned, the entire dot product is poisoned.
3783 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3784 unsigned DstMask) {
3785 const unsigned Width =
3786 cast<FixedVectorType>(S->getType())->getNumElements();
3787
3788 S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
3789 Constant::getNullValue(S->getType()));
3790 Value *SElem = IRB.CreateOrReduce(S);
3791 Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
3792 Value *DstMaskV = createDppMask(Width, DstMask);
3793
3794 return IRB.CreateSelect(
3795 IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
3796 }
3797
3798 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3799 //
3800 // The 2- and 4-element versions produce a single scalar dot product, and then
3801 // put it into the elements of the output vector selected by the 4 lowest bits
3802 // of the mask. The top 4 bits of the mask control which elements of the input
3803 // to use for the dot product.
3804 //
3805 // The 8-element version's mask still has only 4 bits for the input and 4 bits
3806 // for the output mask. According to the spec it just operates as the 4-element
3807 // version on the first 4 elements of the inputs and output, and then on the
3808 // last 4 elements of the inputs and output.
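// As an illustration: a mask of 0xF1 uses SrcMask = 0xF (all four input
// elements participate in the dot product) and DstMask = 0x1 (only element 0
// of the output receives the result); the shadow computation below mirrors
// that selection.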
3809 void handleDppIntrinsic(IntrinsicInst &I) {
3810 IRBuilder<> IRB(&I);
3811
3812 Value *S0 = getShadow(&I, 0);
3813 Value *S1 = getShadow(&I, 1);
3814 Value *S = IRB.CreateOr(S0, S1);
3815
3816 const unsigned Width =
3817 cast<FixedVectorType>(S->getType())->getNumElements();
3818 assert(Width == 2 || Width == 4 || Width == 8);
3819
3820 const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3821 const unsigned SrcMask = Mask >> 4;
3822 const unsigned DstMask = Mask & 0xf;
3823
3824 // Calculate shadow as `<n x i1>`.
3825 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3826 if (Width == 8) {
3827 // First 4 elements of the shadow are already calculated. findDppPoisonedOutput
3828 // operates on 32-bit masks, so we can just shift the masks and repeat.
3829 SI1 = IRB.CreateOr(
3830 SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
3831 }
3832 // Extend to the real size of the shadow, poisoning either all or none of the
3833 // bits of an element.
3834 S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
3835
3836 setShadow(&I, S);
3837 setOriginForNaryOp(I);
3838 }
3839
3840 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3841 C = CreateAppToShadowCast(IRB, C);
3842 FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
3843 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3844 C = IRB.CreateAShr(C, ElSize - 1);
3845 FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
3846 return IRB.CreateTrunc(C, FVT);
3847 }
3848
3849 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3850 void handleBlendvIntrinsic(IntrinsicInst &I) {
3851 Value *C = I.getOperand(2);
3852 Value *T = I.getOperand(1);
3853 Value *F = I.getOperand(0);
3854
3855 Value *Sc = getShadow(&I, 2);
3856 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
3857
3858 {
3859 IRBuilder<> IRB(&I);
3860 // Extract top bit from condition and its shadow.
3861 C = convertBlendvToSelectMask(IRB, C);
3862 Sc = convertBlendvToSelectMask(IRB, Sc);
3863
3864 setShadow(C, Sc);
3865 setOrigin(C, Oc);
3866 }
3867
3868 handleSelectLikeInst(I, C, T, F);
3869 }
3870
3871 // Instrument sum-of-absolute-differences intrinsic.
3872 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3873 const unsigned SignificantBitsPerResultElement = 16;
3874 Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
3875 unsigned ZeroBitsPerResultElement =
3876 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3877
3878 IRBuilder<> IRB(&I);
3879 auto *Shadow0 = getShadow(&I, 0);
3880 auto *Shadow1 = getShadow(&I, 1);
3881 Value *S = IRB.CreateOr(Shadow0, Shadow1);
3882 S = IRB.CreateBitCast(S, ResTy);
3883 S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3884 ResTy);
3885 S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3886 S = IRB.CreateBitCast(S, getShadowTy(&I));
3887 setShadow(&I, S);
3888 setOriginForNaryOp(I);
3889 }
3890
3891 // Instrument multiply-add(-accumulate)? intrinsics.
3892 //
3893 // e.g., Two operands:
3894 // <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
3895 //
3896 // Two operands which require an EltSizeInBits override:
3897 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
3898 //
3899 // Three operands:
3900 // <4 x i32> @llvm.x86.avx512.vpdpbusd.128
3901 // (<4 x i32> %s, <16 x i8> %a, <16 x i8> %b)
3902 // (this is equivalent to multiply-add on %a and %b, followed by
3903 // adding/"accumulating" %s. "Accumulation" stores the result in one
3904 // of the source registers, but this accumulate vs. add distinction
3905 // is lost when dealing with LLVM intrinsics.)
3906 //
3907 // ZeroPurifies means that multiplying a known-zero with an uninitialized
3908 // value results in an initialized value. This is applicable for integer
3909 // multiplication, but not floating-point (counter-example: NaN).
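// As an illustration: for pmadd.wd, if a[0] is an initialized 0 and b[0] is
// uninitialized, the product a[0]*b[0] is treated as an initialized 0, so the
// pair's result is poisoned only if the a[1]*b[1] term is poisoned.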
3910 void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
3911 bool ZeroPurifies,
3912 unsigned EltSizeInBits = 0) {
3913 IRBuilder<> IRB(&I);
3914
3915 [[maybe_unused]] FixedVectorType *ReturnType =
3916 cast<FixedVectorType>(I.getType());
3917 assert(isa<FixedVectorType>(ReturnType));
3918
3919 // Vectors A and B, and shadows
3920 Value *Va = nullptr;
3921 Value *Vb = nullptr;
3922 Value *Sa = nullptr;
3923 Value *Sb = nullptr;
3924
3925 assert(I.arg_size() == 2 || I.arg_size() == 3);
3926 if (I.arg_size() == 2) {
3927 Va = I.getOperand(0);
3928 Vb = I.getOperand(1);
3929
3930 Sa = getShadow(&I, 0);
3931 Sb = getShadow(&I, 1);
3932 } else if (I.arg_size() == 3) {
3933 // Operand 0 is the accumulator. We will deal with that below.
3934 Va = I.getOperand(1);
3935 Vb = I.getOperand(2);
3936
3937 Sa = getShadow(&I, 1);
3938 Sb = getShadow(&I, 2);
3939 }
3940
3941 FixedVectorType *ParamType = cast<FixedVectorType>(Va->getType());
3942 assert(ParamType == Vb->getType());
3943
3944 assert(ParamType->getPrimitiveSizeInBits() ==
3945 ReturnType->getPrimitiveSizeInBits());
3946
3947 if (I.arg_size() == 3) {
3948 [[maybe_unused]] auto *AccumulatorType =
3949 cast<FixedVectorType>(I.getOperand(0)->getType());
3950 assert(AccumulatorType == ReturnType);
3951 }
3952
3953 FixedVectorType *ImplicitReturnType =
3954 cast<FixedVectorType>(getShadowTy(ReturnType));
3955 // Step 1: instrument multiplication of corresponding vector elements
3956 if (EltSizeInBits) {
3957 ImplicitReturnType = cast<FixedVectorType>(
3958 getMMXVectorTy(EltSizeInBits * ReductionFactor,
3959 ParamType->getPrimitiveSizeInBits()));
3960 ParamType = cast<FixedVectorType>(
3961 getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
3962
3963 Va = IRB.CreateBitCast(Va, ParamType);
3964 Vb = IRB.CreateBitCast(Vb, ParamType);
3965
3966 Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
3967 Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
3968 } else {
3969 assert(ParamType->getNumElements() ==
3970 ReturnType->getNumElements() * ReductionFactor);
3971 }
3972
3973 // Each element of the vector is represented by a single bit (poisoned or
3974 // not) e.g., <8 x i1>.
3975 Value *SaNonZero = IRB.CreateIsNotNull(Sa);
3976 Value *SbNonZero = IRB.CreateIsNotNull(Sb);
3977 Value *And;
3978 if (ZeroPurifies) {
3979 // Multiplying an *initialized* zero by an uninitialized element results
3980 // in an initialized zero element.
3981 //
3982 // This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
3983 // results in an unpoisoned value. We can therefore adapt the visitAnd()
3984 // instrumentation:
3985 // OutShadow = (SaNonZero & SbNonZero)
3986 // | (VaNonZero & SbNonZero)
3987 // | (SaNonZero & VbNonZero)
3988 // where non-zero is checked on a per-element basis (not per bit).
3989 Value *VaInt = Va;
3990 Value *VbInt = Vb;
3991 if (!Va->getType()->isIntegerTy()) {
3992 VaInt = CreateAppToShadowCast(IRB, Va);
3993 VbInt = CreateAppToShadowCast(IRB, Vb);
3994 }
3995
3996 Value *VaNonZero = IRB.CreateIsNotNull(VaInt);
3997 Value *VbNonZero = IRB.CreateIsNotNull(VbInt);
3998
3999 Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
4000 Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
4001 Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
4002
4003 And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
4004 } else {
4005 And = IRB.CreateOr({SaNonZero, SbNonZero});
4006 }
4007
4008 // Extend <8 x i1> to <8 x i16>.
4009 // (The real pmadd intrinsic would have computed intermediate values of
4010 // <8 x i32>, but that is irrelevant for our shadow purposes because we
4011 // consider each element to be either fully initialized or fully
4012 // uninitialized.)
4013 And = IRB.CreateSExt(And, Sa->getType());
4014
4015 // Step 2: instrument horizontal add
4016 // We don't need bit-precise horizontalReduce because we only want to check
4017 // if each pair/quad of elements is fully zero.
4018 // Cast to <4 x i32>.
4019 Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
4020
4021 // Compute <4 x i1>, then extend back to <4 x i32>.
4022 Value *OutShadow = IRB.CreateSExt(
4023 IRB.CreateICmpNE(Horizontal,
4024 Constant::getNullValue(Horizontal->getType())),
4025 ImplicitReturnType);
4026
4027 // Cast it back to the required fake return type (if MMX: <1 x i64>; for
4028 // AVX, it is already correct).
4029 if (EltSizeInBits)
4030 OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
4031
4032 // Step 3 (if applicable): instrument accumulator
4033 if (I.arg_size() == 3)
4034 OutShadow = IRB.CreateOr(OutShadow, getShadow(&I, 0));
4035
4036 setShadow(&I, OutShadow);
4037 setOriginForNaryOp(I);
4038 }
4039
4040 // Instrument compare-packed intrinsic.
4041 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
4042 // all-ones shadow.
4043 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
4044 IRBuilder<> IRB(&I);
4045 Type *ResTy = getShadowTy(&I);
4046 auto *Shadow0 = getShadow(&I, 0);
4047 auto *Shadow1 = getShadow(&I, 1);
4048 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4049 Value *S = IRB.CreateSExt(
4050 IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
4051 setShadow(&I, S);
4052 setOriginForNaryOp(I);
4053 }
4054
4055 // Instrument compare-scalar intrinsic.
4056 // This handles both cmp* intrinsics which return the result in the first
4057 // element of a vector, and comi* which return the result as i32.
4058 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
4059 IRBuilder<> IRB(&I);
4060 auto *Shadow0 = getShadow(&I, 0);
4061 auto *Shadow1 = getShadow(&I, 1);
4062 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4063 Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
4064 setShadow(&I, S);
4065 setOriginForNaryOp(I);
4066 }
4067
4068 // Instrument generic vector reduction intrinsics
4069 // by ORing together all their fields.
4070 //
4071 // If AllowShadowCast is true, the return type does not need to be the same
4072 // type as the fields
4073 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
4074 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
4075 assert(I.arg_size() == 1);
4076
4077 IRBuilder<> IRB(&I);
4078 Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
4079 if (AllowShadowCast)
4080 S = CreateShadowCast(IRB, S, getShadowTy(&I));
4081 else
4082 assert(S->getType() == getShadowTy(&I));
4083 setShadow(&I, S);
4084 setOriginForNaryOp(I);
4085 }
4086
4087 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
4088 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
4089 // %a1)
4090 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
4091 //
4092 // The type of the return value, initial starting value, and elements of the
4093 // vector must be identical.
4094 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
4095 assert(I.arg_size() == 2);
4096
4097 IRBuilder<> IRB(&I);
4098 Value *Shadow0 = getShadow(&I, 0);
4099 Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
4100 assert(Shadow0->getType() == Shadow1->getType());
4101 Value *S = IRB.CreateOr(Shadow0, Shadow1);
4102 assert(S->getType() == getShadowTy(&I));
4103 setShadow(&I, S);
4104 setOriginForNaryOp(I);
4105 }
4106
4107 // Instrument vector.reduce.or intrinsic.
4108 // Valid (non-poisoned) set bits in the operand pull low the
4109 // corresponding shadow bits.
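// As an illustration: for <2 x i8> with element 0 = 0x01 (fully initialized)
// and element 1 fully uninitialized, bit 0 of the reduction is definitely 1,
// so its shadow bit is clean, while bits 1-7 of the result remain poisoned.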
4110 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
4111 assert(I.arg_size() == 1);
4112
4113 IRBuilder<> IRB(&I);
4114 Value *OperandShadow = getShadow(&I, 0);
4115 Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
4116 Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
4117 // Bit N is clean if any field's bit N is 1 and unpoisoned
4118 Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
4119 // Otherwise, it is clean if every field's bit N is unpoisoned
4120 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4121 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4122
4123 setShadow(&I, S);
4124 setOrigin(&I, getOrigin(&I, 0));
4125 }
4126
4127 // Instrument vector.reduce.and intrinsic.
4128 // Valid (non-poisoned) unset bits in the operand pull down the
4129 // corresponding shadow bits.
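// As an illustration: if any element has bit N equal to a clean 0, the reduced
// AND's bit N is a definite 0, so the corresponding output shadow bit is clean
// even if bit N is poisoned in other elements.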
4130 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
4131 assert(I.arg_size() == 1);
4132
4133 IRBuilder<> IRB(&I);
4134 Value *OperandShadow = getShadow(&I, 0);
4135 Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
4136 // Bit N is clean if any field's bit N is 0 and unpoisoned
4137 Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
4138 // Otherwise, it is clean if every field's bit N is unpoisoned
4139 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4140 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4141
4142 setShadow(&I, S);
4143 setOrigin(&I, getOrigin(&I, 0));
4144 }
4145
4146 void handleStmxcsr(IntrinsicInst &I) {
4147 IRBuilder<> IRB(&I);
4148 Value *Addr = I.getArgOperand(0);
4149 Type *Ty = IRB.getInt32Ty();
4150 Value *ShadowPtr =
4151 getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
4152
4153 IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
4154
4155 if (ClCheckAccessAddress)
4156 insertCheckShadowOf(Addr, &I);
4157 }
4158
4159 void handleLdmxcsr(IntrinsicInst &I) {
4160 if (!InsertChecks)
4161 return;
4162
4163 IRBuilder<> IRB(&I);
4164 Value *Addr = I.getArgOperand(0);
4165 Type *Ty = IRB.getInt32Ty();
4166 const Align Alignment = Align(1);
4167 Value *ShadowPtr, *OriginPtr;
4168 std::tie(ShadowPtr, OriginPtr) =
4169 getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
4170
4171 if (ClCheckAccessAddress)
4172 insertCheckShadowOf(Addr, &I);
4173
4174 Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
4175 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
4176 : getCleanOrigin();
4177 insertCheckShadow(Shadow, Origin, &I);
4178 }
4179
4180 void handleMaskedExpandLoad(IntrinsicInst &I) {
4181 IRBuilder<> IRB(&I);
4182 Value *Ptr = I.getArgOperand(0);
4183 MaybeAlign Align = I.getParamAlign(0);
4184 Value *Mask = I.getArgOperand(1);
4185 Value *PassThru = I.getArgOperand(2);
4186
4187 if (ClCheckAccessAddress) {
4188 insertCheckShadowOf(Ptr, &I);
4189 insertCheckShadowOf(Mask, &I);
4190 }
4191
4192 if (!PropagateShadow) {
4193 setShadow(&I, getCleanShadow(&I));
4194 setOrigin(&I, getCleanOrigin());
4195 return;
4196 }
4197
4198 Type *ShadowTy = getShadowTy(&I);
4199 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4200 auto [ShadowPtr, OriginPtr] =
4201 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
4202
4203 Value *Shadow =
4204 IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
4205 getShadow(PassThru), "_msmaskedexpload");
4206
4207 setShadow(&I, Shadow);
4208
4209 // TODO: Store origins.
4210 setOrigin(&I, getCleanOrigin());
4211 }
4212
4213 void handleMaskedCompressStore(IntrinsicInst &I) {
4214 IRBuilder<> IRB(&I);
4215 Value *Values = I.getArgOperand(0);
4216 Value *Ptr = I.getArgOperand(1);
4217 MaybeAlign Align = I.getParamAlign(1);
4218 Value *Mask = I.getArgOperand(2);
4219
4220 if (ClCheckAccessAddress) {
4221 insertCheckShadowOf(Ptr, &I);
4222 insertCheckShadowOf(Mask, &I);
4223 }
4224
4225 Value *Shadow = getShadow(Values);
4226 Type *ElementShadowTy =
4227 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4228 auto [ShadowPtr, OriginPtrs] =
4229 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
4230
4231 IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
4232
4233 // TODO: Store origins.
4234 }
4235
4236 void handleMaskedGather(IntrinsicInst &I) {
4237 IRBuilder<> IRB(&I);
4238 Value *Ptrs = I.getArgOperand(0);
4239 const Align Alignment = I.getParamAlign(0).valueOrOne();
4240 Value *Mask = I.getArgOperand(1);
4241 Value *PassThru = I.getArgOperand(2);
4242
4243 Type *PtrsShadowTy = getShadowTy(Ptrs);
4244 if (ClCheckAccessAddress) {
4245 insertCheckShadowOf(Mask, &I);
4246 Value *MaskedPtrShadow = IRB.CreateSelect(
4247 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4248 "_msmaskedptrs");
4249 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4250 }
4251
4252 if (!PropagateShadow) {
4253 setShadow(&I, getCleanShadow(&I));
4254 setOrigin(&I, getCleanOrigin());
4255 return;
4256 }
4257
4258 Type *ShadowTy = getShadowTy(&I);
4259 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4260 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4261 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
4262
4263 Value *Shadow =
4264 IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
4265 getShadow(PassThru), "_msmaskedgather");
4266
4267 setShadow(&I, Shadow);
4268
4269 // TODO: Store origins.
4270 setOrigin(&I, getCleanOrigin());
4271 }
4272
4273 void handleMaskedScatter(IntrinsicInst &I) {
4274 IRBuilder<> IRB(&I);
4275 Value *Values = I.getArgOperand(0);
4276 Value *Ptrs = I.getArgOperand(1);
4277 const Align Alignment = I.getParamAlign(1).valueOrOne();
4278 Value *Mask = I.getArgOperand(2);
4279
4280 Type *PtrsShadowTy = getShadowTy(Ptrs);
4281 if (ClCheckAccessAddress) {
4282 insertCheckShadowOf(Mask, &I);
4283 Value *MaskedPtrShadow = IRB.CreateSelect(
4284 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4285 "_msmaskedptrs");
4286 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4287 }
4288
4289 Value *Shadow = getShadow(Values);
4290 Type *ElementShadowTy =
4291 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4292 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4293 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
4294
4295 IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
4296
4297 // TODO: Store origin.
4298 }
4299
4300 // Intrinsic::masked_store
4301 //
4302 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4303 // stores are lowered to Intrinsic::masked_store.
4304 void handleMaskedStore(IntrinsicInst &I) {
4305 IRBuilder<> IRB(&I);
4306 Value *V = I.getArgOperand(0);
4307 Value *Ptr = I.getArgOperand(1);
4308 const Align Alignment = I.getParamAlign(1).valueOrOne();
4309 Value *Mask = I.getArgOperand(2);
4310 Value *Shadow = getShadow(V);
4311
4312 if (ClCheckAccessAddress) {
4313 insertCheckShadowOf(Ptr, &I);
4314 insertCheckShadowOf(Mask, &I);
4315 }
4316
4317 Value *ShadowPtr;
4318 Value *OriginPtr;
4319 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
4320 Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
4321
4322 IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
4323
4324 if (!MS.TrackOrigins)
4325 return;
4326
4327 auto &DL = F.getDataLayout();
4328 paintOrigin(IRB, getOrigin(V), OriginPtr,
4329 DL.getTypeStoreSize(Shadow->getType()),
4330 std::max(Alignment, kMinOriginAlignment));
4331 }
4332
4333 // Intrinsic::masked_load
4334 //
4335 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4336 // loads are lowered to Intrinsic::masked_load.
4337 void handleMaskedLoad(IntrinsicInst &I) {
4338 IRBuilder<> IRB(&I);
4339 Value *Ptr = I.getArgOperand(0);
4340 const Align Alignment = I.getParamAlign(0).valueOrOne();
4341 Value *Mask = I.getArgOperand(1);
4342 Value *PassThru = I.getArgOperand(2);
4343
4344 if (ClCheckAccessAddress) {
4345 insertCheckShadowOf(Ptr, &I);
4346 insertCheckShadowOf(Mask, &I);
4347 }
4348
4349 if (!PropagateShadow) {
4350 setShadow(&I, getCleanShadow(&I));
4351 setOrigin(&I, getCleanOrigin());
4352 return;
4353 }
4354
4355 Type *ShadowTy = getShadowTy(&I);
4356 Value *ShadowPtr, *OriginPtr;
4357 std::tie(ShadowPtr, OriginPtr) =
4358 getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4359 setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
4360 getShadow(PassThru), "_msmaskedld"));
4361
4362 if (!MS.TrackOrigins)
4363 return;
4364
4365 // Choose between PassThru's and the loaded value's origins.
4366 Value *MaskedPassThruShadow = IRB.CreateAnd(
4367 getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
4368
4369 Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
4370
4371 Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
4372 Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
4373
4374 setOrigin(&I, Origin);
4375 }
4376
4377 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4378 // dst mask src
4379 //
4380 // AVX512 masked stores are lowered to Intrinsic::masked_store and are handled
4381 // by handleMaskedStore.
4382 //
4383 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4384 // vector of integers, unlike the LLVM masked intrinsics, which require a
4385 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4386 // mentions that the x86 backend does not know how to efficiently convert
4387 // from a vector of booleans back into the AVX mask format; therefore, they
4388 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4389 // intrinsics.
4390 void handleAVXMaskedStore(IntrinsicInst &I) {
4391 assert(I.arg_size() == 3);
4392
4393 IRBuilder<> IRB(&I);
4394
4395 Value *Dst = I.getArgOperand(0);
4396 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4397
4398 Value *Mask = I.getArgOperand(1);
4399 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4400
4401 Value *Src = I.getArgOperand(2);
4402 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4403
4404 const Align Alignment = Align(1);
4405
4406 Value *SrcShadow = getShadow(Src);
4407
4408 if (ClCheckAccessAddress) {
4409 insertCheckShadowOf(Dst, &I);
4410 insertCheckShadowOf(Mask, &I);
4411 }
4412
4413 Value *DstShadowPtr;
4414 Value *DstOriginPtr;
4415 std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
4416 Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
4417
4418 SmallVector<Value *, 2> ShadowArgs;
4419 ShadowArgs.append(1, DstShadowPtr);
4420 ShadowArgs.append(1, Mask);
4421 // The intrinsic may require floating-point but shadows can be arbitrary
4422 // bit patterns, of which some would be interpreted as "invalid"
4423 // floating-point values (NaN etc.); we assume the intrinsic will happily
4424 // copy them.
4425 ShadowArgs.append(1, IRB.CreateBitCast(SrcShadow, Src->getType()));
4426
4427 CallInst *CI =
4428 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
4429 setShadow(&I, CI);
4430
4431 if (!MS.TrackOrigins)
4432 return;
4433
4434 // Approximation only
4435 auto &DL = F.getDataLayout();
4436 paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
4437 DL.getTypeStoreSize(SrcShadow->getType()),
4438 std::max(Alignment, kMinOriginAlignment));
4439 }
4440
4441 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4442 // return src mask
4443 //
4444 // Masked-off values are replaced with 0, which conveniently also represents
4445 // initialized memory.
4446 //
4447 // AVX512 masked loads are lowered to Intrinsic::masked_load and are handled
4448 // by handleMaskedLoad.
4449 //
4450 // We do not combine this with handleMaskedLoad; see comment in
4451 // handleAVXMaskedStore for the rationale.
4452 //
4453 // This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
4454 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4455 // parameter.
4456 void handleAVXMaskedLoad(IntrinsicInst &I) {
4457 assert(I.arg_size() == 2);
4458
4459 IRBuilder<> IRB(&I);
4460
4461 Value *Src = I.getArgOperand(0);
4462 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4463
4464 Value *Mask = I.getArgOperand(1);
4465 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4466
4467 const Align Alignment = Align(1);
4468
4469 if (ClCheckAccessAddress) {
4470 insertCheckShadowOf(Mask, &I);
4471 }
4472
4473 Type *SrcShadowTy = getShadowTy(Src);
4474 Value *SrcShadowPtr, *SrcOriginPtr;
4475 std::tie(SrcShadowPtr, SrcOriginPtr) =
4476 getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
4477
4478 SmallVector<Value *, 2> ShadowArgs;
4479 ShadowArgs.append(1, SrcShadowPtr);
4480 ShadowArgs.append(1, Mask);
4481
4482 CallInst *CI =
4483 IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
4484 // The AVX masked load intrinsics do not have integer variants. We use the
4485 // floating-point variants, which will happily copy the shadows even if
4486 // they are interpreted as "invalid" floating-point values (NaN etc.).
4487 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4488
4489 if (!MS.TrackOrigins)
4490 return;
4491
4492 // The "pass-through" value is always zero (initialized). To the extent
4493 // that this results in initialized aligned 4-byte chunks, the origin value
4494 // is ignored. It is therefore correct to simply copy the origin from src.
4495 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
4496 setOrigin(&I, PtrSrcOrigin);
4497 }
4498
4499 // Test whether the mask indices are initialized, only checking the bits that
4500 // are actually used.
4501 //
4502 // e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
4503 // used/checked.
4504 void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
4505 assert(isFixedIntVector(Idx));
4506 auto IdxVectorSize =
4507 cast<FixedVectorType>(Idx->getType())->getNumElements();
4508 assert(isPowerOf2_64(IdxVectorSize));
4509
4510 // Compiler isn't smart enough, let's help it
4511 if (isa<Constant>(Idx))
4512 return;
4513
4514 auto *IdxShadow = getShadow(Idx);
4515 Value *Truncated = IRB.CreateTrunc(
4516 IdxShadow,
4517 FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
4518 IdxVectorSize));
4519 insertCheckShadow(Truncated, getOrigin(Idx), I);
4520 }
4521
4522 // Instrument AVX permutation intrinsic.
4523 // We apply the same permutation (argument index 1) to the shadow.
4524 void handleAVXVpermilvar(IntrinsicInst &I) {
4525 IRBuilder<> IRB(&I);
4526 Value *Shadow = getShadow(&I, 0);
4527 maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
4528
4529 // Shadows are integer-ish types but some intrinsics require a
4530 // different (e.g., floating-point) type.
4531 Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
4532 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4533 {Shadow, I.getArgOperand(1)});
4534
4535 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4536 setOriginForNaryOp(I);
4537 }
4538
4539 // Instrument AVX permutation intrinsic.
4540 // We apply the same permutation (argument index 1) to the shadows.
4541 void handleAVXVpermi2var(IntrinsicInst &I) {
4542 assert(I.arg_size() == 3);
4543 assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
4544 assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
4545 assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
4546 [[maybe_unused]] auto ArgVectorSize =
4547 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4548 assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
4549 ->getNumElements() == ArgVectorSize);
4550 assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
4551 ->getNumElements() == ArgVectorSize);
4552 assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
4553 assert(I.getType() == I.getArgOperand(0)->getType());
4554 assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
4555 IRBuilder<> IRB(&I);
4556 Value *AShadow = getShadow(&I, 0);
4557 Value *Idx = I.getArgOperand(1);
4558 Value *BShadow = getShadow(&I, 2);
4559
4560 maskedCheckAVXIndexShadow(IRB, Idx, &I);
4561
4562 // Shadows are integer-ish types but some intrinsics require a
4563 // different (e.g., floating-point) type.
4564 AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
4565 BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
4566 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4567 {AShadow, Idx, BShadow});
4568 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4569 setOriginForNaryOp(I);
4570 }
4571
4572 [[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
4573 return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
4574 }
4575
4576 [[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
4577 return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
4578 }
4579
4580 [[maybe_unused]] static bool isFixedIntVector(const Value *V) {
4581 return isFixedIntVectorTy(V->getType());
4582 }
4583
4584 [[maybe_unused]] static bool isFixedFPVector(const Value *V) {
4585 return isFixedFPVectorTy(V->getType());
4586 }
4587
4588 // e.g., <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
4589 // (<16 x float> a, <16 x i32> writethru, i16 mask,
4590 // i32 rounding)
4591 //
4592 // Inconveniently, some similar intrinsics have a different operand order:
4593 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
4594 // (<16 x float> a, i32 rounding, <16 x i16> writethru,
4595 // i16 mask)
4596 //
4597 // If the return type has more elements than A, the excess elements are
4598 // zeroed (and the corresponding shadow is initialized).
4599 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
4600 // (<4 x float> a, i32 rounding, <8 x i16> writethru,
4601 // i8 mask)
4602 //
4603 // dst[i] = mask[i] ? convert(a[i]) : writethru[i]
4604 // dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
4605 // where all_or_nothing(x) is fully uninitialized if x has any
4606 // uninitialized bits
4607 void handleAVX512VectorConvertFPToInt(IntrinsicInst &I, bool LastMask) {
4608 IRBuilder<> IRB(&I);
4609
4610 assert(I.arg_size() == 4);
4611 Value *A = I.getOperand(0);
4612 Value *WriteThrough;
4613 Value *Mask;
4614 Value *RoundingMode;
4615 if (LastMask) {
4616 WriteThrough = I.getOperand(2);
4617 Mask = I.getOperand(3);
4618 RoundingMode = I.getOperand(1);
4619 } else {
4620 WriteThrough = I.getOperand(1);
4621 Mask = I.getOperand(2);
4622 RoundingMode = I.getOperand(3);
4623 }
4624
4625 assert(isFixedFPVector(A));
4626 assert(isFixedIntVector(WriteThrough));
4627
4628 unsigned ANumElements =
4629 cast<FixedVectorType>(A->getType())->getNumElements();
4630 [[maybe_unused]] unsigned WriteThruNumElements =
4631 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4632 assert(ANumElements == WriteThruNumElements ||
4633 ANumElements * 2 == WriteThruNumElements);
4634
4635 assert(Mask->getType()->isIntegerTy());
4636 unsigned MaskNumElements = Mask->getType()->getScalarSizeInBits();
4637 assert(ANumElements == MaskNumElements ||
4638 ANumElements * 2 == MaskNumElements);
4639
4640 assert(WriteThruNumElements == MaskNumElements);
4641
4642 // Some bits of the mask may be unused, though it's unusual to have partly
4643 // uninitialized bits.
4644 insertCheckShadowOf(Mask, &I);
4645
4646 assert(RoundingMode->getType()->isIntegerTy());
4647 // Only some bits of the rounding mode are used, though it's very
4648 // unusual to have uninitialized bits there (more commonly, it's a
4649 // constant).
4650 insertCheckShadowOf(RoundingMode, &I);
4651
4652 assert(I.getType() == WriteThrough->getType());
4653
4654 Value *AShadow = getShadow(A);
4655 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4656
4657 if (ANumElements * 2 == MaskNumElements) {
4658 // Ensure that the irrelevant bits of the mask are zero, hence selecting
4659 // from the zeroed shadow instead of the writethrough's shadow.
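// e.g., for @llvm.x86.avx512.mask.vcvtps2ph.128, A is <4 x float> but the
// mask is i8; truncating to i4 and zero-extending back to i8 clears the
// four unused upper mask bits.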
4660 Mask =
4661 IRB.CreateTrunc(Mask, IRB.getIntNTy(ANumElements), "_ms_mask_trunc");
4662 Mask =
4663 IRB.CreateZExt(Mask, IRB.getIntNTy(MaskNumElements), "_ms_mask_zext");
4664 }
4665
4666 // Convert the integer mask to a vector of i1 (e.g., i16 to <16 x i1>)
4667 Mask = IRB.CreateBitCast(
4668 Mask, FixedVectorType::get(IRB.getInt1Ty(), MaskNumElements),
4669 "_ms_mask_bitcast");
4670
4671 /// For floating-point to integer conversion, the output is:
4672 /// - fully uninitialized if *any* bit of the input is uninitialized
4673 /// - fully initialized if all bits of the input are initialized
4674 /// We apply the same principle on a per-element basis for vectors.
4675 ///
4676 /// We use the scalar width of the return type instead of A's.
4677 AShadow = IRB.CreateSExt(
4678 IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow->getType())),
4679 getShadowTy(&I), "_ms_a_shadow");
4680
4681 Value *WriteThroughShadow = getShadow(WriteThrough);
4682 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow,
4683 "_ms_writethru_select");
4684
4685 setShadow(&I, Shadow);
4686 setOriginForNaryOp(I);
4687 }
4688
4689 // Instrument BMI / BMI2 intrinsics.
4690 // All of these intrinsics are Z = I(X, Y)
4691 // where the types of all operands and the result match, and are either i32 or
4692 // i64. The following instrumentation happens to work for all of them:
4693 // Sz = I(Sx, Y) | (sext (Sy != 0))
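// e.g., for %z = call i32 @llvm.x86.bmi.pext.32(i32 %x, i32 %y), the shadow
// is roughly pext.32(Sx, %y) | sext(Sy != 0).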
4694 void handleBmiIntrinsic(IntrinsicInst &I) {
4695 IRBuilder<> IRB(&I);
4696 Type *ShadowTy = getShadowTy(&I);
4697
4698 // If any bit of the mask operand is poisoned, then the whole thing is.
4699 Value *SMask = getShadow(&I, 1);
4700 SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
4701 ShadowTy);
4702 // Apply the same intrinsic to the shadow of the first operand.
4703 Value *S = IRB.CreateCall(I.getCalledFunction(),
4704 {getShadow(&I, 0), I.getOperand(1)});
4705 S = IRB.CreateOr(SMask, S);
4706 setShadow(&I, S);
4707 setOriginForNaryOp(I);
4708 }
4709
4710 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4711 SmallVector<int, 8> Mask;
4712 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4713 Mask.append(2, X);
4714 }
4715 return Mask;
4716 }
4717
4718 // Instrument pclmul intrinsics.
4719 // These intrinsics operate either on odd or on even elements of the input
4720 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4721 // Replace the unused elements with copies of the used ones, ex:
4722 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4723 // or
4724 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4725 // and then apply the usual shadow combining logic.
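// e.g., for @llvm.x86.pclmulqdq (<2 x i64> operands) with Imm = 0x01, the
// first operand's shadow is shuffled with mask {1, 1} (odd element) and the
// second with mask {0, 0} (even element) before the shadows are OR'ed.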
4726 void handlePclmulIntrinsic(IntrinsicInst &I) {
4727 IRBuilder<> IRB(&I);
4728 unsigned Width =
4729 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4730 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4731 "pclmul 3rd operand must be a constant");
4732 unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
4733 Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
4734 getPclmulMask(Width, Imm & 0x01));
4735 Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
4736 getPclmulMask(Width, Imm & 0x10));
4737 ShadowAndOriginCombiner SOC(this, IRB);
4738 SOC.Add(Shuf0, getOrigin(&I, 0));
4739 SOC.Add(Shuf1, getOrigin(&I, 1));
4740 SOC.Done(&I);
4741 }
4742
4743 // Instrument _mm_*_sd|ss intrinsics
4744 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4745 IRBuilder<> IRB(&I);
4746 unsigned Width =
4747 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4748 Value *First = getShadow(&I, 0);
4749 Value *Second = getShadow(&I, 1);
4750 // First element of second operand, remaining elements of first operand
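// e.g., for Width == 4 the mask is {4, 1, 2, 3}: lane 0 is taken from the
// second operand's shadow, lanes 1..3 from the first.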
4751 SmallVector<int, 16> Mask;
4752 Mask.push_back(Width);
4753 for (unsigned i = 1; i < Width; i++)
4754 Mask.push_back(i);
4755 Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
4756
4757 setShadow(&I, Shadow);
4758 setOriginForNaryOp(I);
4759 }
4760
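// Instrument x86 vtest/ptest intrinsics.
// The scalar result (0 or 1) is poisoned if any bit of either vector operand
// is poisoned.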
4761 void handleVtestIntrinsic(IntrinsicInst &I) {
4762 IRBuilder<> IRB(&I);
4763 Value *Shadow0 = getShadow(&I, 0);
4764 Value *Shadow1 = getShadow(&I, 1);
4765 Value *Or = IRB.CreateOr(Shadow0, Shadow1);
4766 Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
4767 Value *Scalar = convertShadowToScalar(NZ, IRB);
4768 Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
4769
4770 setShadow(&I, Shadow);
4771 setOriginForNaryOp(I);
4772 }
4773
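// Instrument _mm_{min,max}_sd|ss intrinsics.
// The lowest element of the result depends on the lowest elements of both
// operands; the remaining elements are copied from the first operand.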
4774 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4775 IRBuilder<> IRB(&I);
4776 unsigned Width =
4777 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4778 Value *First = getShadow(&I, 0);
4779 Value *Second = getShadow(&I, 1);
4780 Value *OrShadow = IRB.CreateOr(First, Second);
4781 // First element of both OR'd together, remaining elements of first operand
4782 SmallVector<int, 16> Mask;
4783 Mask.push_back(Width);
4784 for (unsigned i = 1; i < Width; i++)
4785 Mask.push_back(i);
4786 Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
4787
4788 setShadow(&I, Shadow);
4789 setOriginForNaryOp(I);
4790 }
4791
4792 // _mm_round_pd / _mm_round_ps.
4793 // Similar to maybeHandleSimpleNomemIntrinsic except
4794 // the second argument is guaranteed to be a constant integer.
4795 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4796 assert(I.getArgOperand(0)->getType() == I.getType());
4797 assert(I.arg_size() == 2);
4798 assert(isa<ConstantInt>(I.getArgOperand(1)));
4799
4800 IRBuilder<> IRB(&I);
4801 ShadowAndOriginCombiner SC(this, IRB);
4802 SC.Add(I.getArgOperand(0));
4803 SC.Done(&I);
4804 }
4805
4806 // Instrument @llvm.abs intrinsic.
4807 //
4808 // e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
4809 // <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
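// If <is_int_min_poison> is true and Src equals INT_MIN, the result is a
// poison value and its shadow is fully uninitialized; otherwise the shadow
// is simply Src's shadow.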
4810 void handleAbsIntrinsic(IntrinsicInst &I) {
4811 assert(I.arg_size() == 2);
4812 Value *Src = I.getArgOperand(0);
4813 Value *IsIntMinPoison = I.getArgOperand(1);
4814
4815 assert(I.getType()->isIntOrIntVectorTy());
4816
4817 assert(Src->getType() == I.getType());
4818
4819 assert(IsIntMinPoison->getType()->isIntegerTy());
4820 assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
4821
4822 IRBuilder<> IRB(&I);
4823 Value *SrcShadow = getShadow(Src);
4824
4825 APInt MinVal =
4826 APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
4827 Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
4828 Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
4829
4830 Value *PoisonedShadow = getPoisonedShadow(Src);
4831 Value *PoisonedIfIntMinShadow =
4832 IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
4833 Value *Shadow =
4834 IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
4835
4836 setShadow(&I, Shadow);
4837 setOrigin(&I, getOrigin(&I, 0));
4838 }
4839
4840 void handleIsFpClass(IntrinsicInst &I) {
4841 IRBuilder<> IRB(&I);
4842 Value *Shadow = getShadow(&I, 0);
4843 setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
4844 setOrigin(&I, getOrigin(&I, 0));
4845 }
4846
4847 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4848 IRBuilder<> IRB(&I);
4849 Value *Shadow0 = getShadow(&I, 0);
4850 Value *Shadow1 = getShadow(&I, 1);
4851 Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
4852 Value *ShadowElt1 =
4853 IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
4854
4855 Value *Shadow = PoisonValue::get(getShadowTy(&I));
4856 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
4857 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
4858
4859 setShadow(&I, Shadow);
4860 setOriginForNaryOp(I);
4861 }
4862
4863 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4864 assert(isa<FixedVectorType>(V->getType()));
4865 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4866 Value *Shadow = getShadow(V);
4867 return IRB.CreateExtractElement(Shadow,
4868 ConstantInt::get(IRB.getInt32Ty(), 0));
4869 }
4870
4871 // Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
4872 //
4873 // e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
4874 // (<8 x i64>, <16 x i8>, i8)
4875 // A WriteThru Mask
4876 //
4877 // call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
4878 // (<16 x i32>, <16 x i8>, i16)
4879 //
4880 // Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
4881 // Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
4882 //
4883 // If Dst has more elements than A, the excess elements are zeroed (and the
4884 // corresponding shadow is initialized).
4885 //
4886 // Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
4887 // and is much faster than this handler.
4888 void handleAVX512VectorDownConvert(IntrinsicInst &I) {
4889 IRBuilder<> IRB(&I);
4890
4891 assert(I.arg_size() == 3);
4892 Value *A = I.getOperand(0);
4893 Value *WriteThrough = I.getOperand(1);
4894 Value *Mask = I.getOperand(2);
4895
4896 assert(isFixedIntVector(A));
4897 assert(isFixedIntVector(WriteThrough));
4898
4899 unsigned ANumElements =
4900 cast<FixedVectorType>(A->getType())->getNumElements();
4901 unsigned OutputNumElements =
4902 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4903 assert(ANumElements == OutputNumElements ||
4904 ANumElements * 2 == OutputNumElements);
4905
4906 assert(Mask->getType()->isIntegerTy());
4907 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4908 insertCheckShadowOf(Mask, &I);
4909
4910 assert(I.getType() == WriteThrough->getType());
4911
4912 // Widen the mask, if necessary, to have one bit per element of the output
4913 // vector.
4914 // We want the extra bits to have '1's, so that the CreateSelect will
4915 // select the values from AShadow instead of WriteThroughShadow ("maskless"
4916 // versions of the intrinsics are sometimes implemented using an all-1's
4917 // mask and an undefined value for WriteThroughShadow). We accomplish this
4918 // by using bitwise NOT before and after the ZExt.
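// e.g., widening an i8 mask 0b01010101 to i16 this way yields
// 0b11111111'01010101: the low bits are preserved and the new high bits are
// all ones.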
4919 if (ANumElements != OutputNumElements) {
4920 Mask = IRB.CreateNot(Mask);
4921 Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
4922 "_ms_widen_mask");
4923 Mask = IRB.CreateNot(Mask);
4924 }
4925 Mask = IRB.CreateBitCast(
4926 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4927
4928 Value *AShadow = getShadow(A);
4929
4930 // The return type might have more elements than the input.
4931 // Temporarily shrink the return type's number of elements.
4932 VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
4933
4934 // PMOV truncates; PMOVS/PMOVUS uses signed/unsigned saturation.
4935 // This handler treats them all as truncation, which leads to some rare
4936 // false positives in the cases where the truncated bytes could
4937 // unambiguously saturate the value e.g., if A = ??????10 ????????
4938 // (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
4939 // fully defined, but the truncated byte is ????????.
4940 //
4941 // TODO: use GetMinMaxUnsigned() to handle saturation precisely.
4942 AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
4943 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4944
4945 Value *WriteThroughShadow = getShadow(WriteThrough);
4946
4947 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4948 setShadow(&I, Shadow);
4949 setOriginForNaryOp(I);
4950 }
4951
4952 // Handle llvm.x86.avx512.* instructions that take a vector of floating-point
4953 // values and perform an operation whose shadow propagation should be handled
4954 // as all-or-nothing [*], with masking provided by a vector and a mask
4955 // supplied as an integer.
4956 //
4957 // [*] if all bits of a vector element are initialized, the output is fully
4958 // initialized; otherwise, the output is fully uninitialized
4959 //
4960 // e.g., <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
4961 // (<16 x float>, <16 x float>, i16)
4962 // A WriteThru Mask
4963 //
4964 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
4965 // (<2 x double>, <2 x double>, i8)
4966 //
4967 // <8 x double> @llvm.x86.avx512.mask.rndscale.pd.512
4968 // (<8 x double>, i32, <8 x double>, i8, i32)
4969 // A Imm WriteThru Mask Rounding
4970 //
4971 // All operands other than A and WriteThru (e.g., Mask, Imm, Rounding) must
4972 // be fully initialized.
4973 //
4974 // Dst[i] = Mask[i] ? some_op(A[i]) : WriteThru[i]
4975 // Dst_shadow[i] = Mask[i] ? all_or_nothing(A_shadow[i]) : WriteThru_shadow[i]
4976 void handleAVX512VectorGenericMaskedFP(IntrinsicInst &I, unsigned AIndex,
4977 unsigned WriteThruIndex,
4978 unsigned MaskIndex) {
4979 IRBuilder<> IRB(&I);
4980
4981 unsigned NumArgs = I.arg_size();
4982 assert(AIndex < NumArgs);
4983 assert(WriteThruIndex < NumArgs);
4984 assert(MaskIndex < NumArgs);
4985 assert(AIndex != WriteThruIndex);
4986 assert(AIndex != MaskIndex);
4987 assert(WriteThruIndex != MaskIndex);
4988
4989 Value *A = I.getOperand(AIndex);
4990 Value *WriteThru = I.getOperand(WriteThruIndex);
4991 Value *Mask = I.getOperand(MaskIndex);
4992
4993 assert(isFixedFPVector(A));
4994 assert(isFixedFPVector(WriteThru));
4995
4996 [[maybe_unused]] unsigned ANumElements =
4997 cast<FixedVectorType>(A->getType())->getNumElements();
4998 unsigned OutputNumElements =
4999 cast<FixedVectorType>(WriteThru->getType())->getNumElements();
5000 assert(ANumElements == OutputNumElements);
5001
5002 for (unsigned i = 0; i < NumArgs; ++i) {
5003 if (i != AIndex && i != WriteThruIndex) {
5004 // Imm, Mask, Rounding etc. are "control" data, hence we require that
5005 // they be fully initialized.
5006 assert(I.getOperand(i)->getType()->isIntegerTy());
5007 insertCheckShadowOf(I.getOperand(i), &I);
5008 }
5009 }
5010
5011 // The mask has 1 bit per element of A, but a minimum of 8 bits.
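// e.g., <2 x double> @llvm.x86.avx512.rcp14.pd.128 takes an i8 mask of which
// only the low 2 bits are meaningful.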
5012 if (Mask->getType()->getScalarSizeInBits() == 8 && ANumElements < 8)
5013 Mask = IRB.CreateTrunc(Mask, Type::getIntNTy(*MS.C, ANumElements));
5014 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
5015
5016 assert(I.getType() == WriteThru->getType());
5017
5018 Mask = IRB.CreateBitCast(
5019 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
5020
5021 Value *AShadow = getShadow(A);
5022
5023 // All-or-nothing shadow
5024 AShadow = IRB.CreateSExt(IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow)),
5025 AShadow->getType());
5026
5027 Value *WriteThruShadow = getShadow(WriteThru);
5028
5029 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThruShadow);
5030 setShadow(&I, Shadow);
5031
5032 setOriginForNaryOp(I);
5033 }
5034
5035 // For sh.* compiler intrinsics:
5036 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
5037 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
5038 // A B WriteThru Mask RoundingMode
5039 //
5040 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
5041 // DstShadow[1..7] = AShadow[1..7]
5042 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
5043 IRBuilder<> IRB(&I);
5044
5045 assert(I.arg_size() == 5);
5046 Value *A = I.getOperand(0);
5047 Value *B = I.getOperand(1);
5048 Value *WriteThrough = I.getOperand(2);
5049 Value *Mask = I.getOperand(3);
5050 Value *RoundingMode = I.getOperand(4);
5051
5052 // Technically, we could probably just check whether the LSB is
5053 // initialized, but intuitively it feels like a partly uninitialized mask
5054 // is unintended, and we should warn the user immediately.
5055 insertCheckShadowOf(Mask, &I);
5056 insertCheckShadowOf(RoundingMode, &I);
5057
5058 assert(isa<FixedVectorType>(A->getType()));
5059 unsigned NumElements =
5060 cast<FixedVectorType>(A->getType())->getNumElements();
5061 assert(NumElements == 8);
5062 assert(A->getType() == B->getType());
5063 assert(B->getType() == WriteThrough->getType());
5064 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
5065 assert(RoundingMode->getType()->isIntegerTy());
5066
5067 Value *ALowerShadow = extractLowerShadow(IRB, A);
5068 Value *BLowerShadow = extractLowerShadow(IRB, B);
5069
5070 Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
5071
5072 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
5073
5074 Mask = IRB.CreateBitCast(
5075 Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
5076 Value *MaskLower =
5077 IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
5078
5079 Value *AShadow = getShadow(A);
5080 Value *DstLowerShadow =
5081 IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
5082 Value *DstShadow = IRB.CreateInsertElement(
5083 AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
5084 "_msprop");
5085
5086 setShadow(&I, DstShadow);
5087 setOriginForNaryOp(I);
5088 }
5089
5090 // Approximately handle AVX Galois Field Affine Transformation
5091 //
5092 // e.g.,
5093 // <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
5094 // <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
5095 // <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
5096 // Out A x b
5097 // where A and x are packed matrices, b is a vector,
5098 // Out = A * x + b in GF(2)
5099 //
5100 // Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
5101 // computation also includes a parity calculation.
5102 //
5103 // For the bitwise AND of bits V1 and V2, the exact shadow is:
5104 // Out_Shadow = (V1_Shadow & V2_Shadow)
5105 // | (V1 & V2_Shadow)
5106 // | (V1_Shadow & V2 )
5107 //
5108 // We approximate the shadow of gf2p8affineqb using:
5109 // Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
5110 // | gf2p8affineqb(x, A_shadow, 0)
5111 // | gf2p8affineqb(x_Shadow, A, 0)
5112 // | set1_epi8(b_Shadow)
5113 //
5114 // This approximation has false negatives: if an intermediate dot-product
5115 // contains an even number of 1's, the parity is 0.
5116 // It has no false positives.
5117 void handleAVXGF2P8Affine(IntrinsicInst &I) {
5118 IRBuilder<> IRB(&I);
5119
5120 assert(I.arg_size() == 3);
5121 Value *A = I.getOperand(0);
5122 Value *X = I.getOperand(1);
5123 Value *B = I.getOperand(2);
5124
5125 assert(isFixedIntVector(A));
5126 assert(cast<VectorType>(A->getType())
5127 ->getElementType()
5128 ->getScalarSizeInBits() == 8);
5129
5130 assert(A->getType() == X->getType());
5131
5132 assert(B->getType()->isIntegerTy());
5133 assert(B->getType()->getScalarSizeInBits() == 8);
5134
5135 assert(I.getType() == A->getType());
5136
5137 Value *AShadow = getShadow(A);
5138 Value *XShadow = getShadow(X);
5139 Value *BZeroShadow = getCleanShadow(B);
5140
5141 CallInst *AShadowXShadow = IRB.CreateIntrinsic(
5142 I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
5143 CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5144 {X, AShadow, BZeroShadow});
5145 CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5146 {XShadow, A, BZeroShadow});
5147
5148 unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
5149 Value *BShadow = getShadow(B);
5150 Value *BBroadcastShadow = getCleanShadow(AShadow);
5151 // There is no LLVM IR intrinsic for _mm512_set1_epi8.
5152 // This loop generates a lot of LLVM IR, which we expect CodeGen to lower
5153 // appropriately (e.g., VPBROADCASTB).
5154 // Besides, b is often a constant, in which case it is fully initialized.
5155 for (unsigned i = 0; i < NumElements; i++)
5156 BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
5157
5158 setShadow(&I, IRB.CreateOr(
5159 {AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
5160 setOriginForNaryOp(I);
5161 }
5162
5163 // Handle Arm NEON vector load intrinsics (vld*).
5164 //
5165 // The WithLane instructions (ld[234]lane) are similar to:
5166 // call {<4 x i32>, <4 x i32>, <4 x i32>}
5167 // @llvm.aarch64.neon.ld3lane.v4i32.p0
5168 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
5169 // %A)
5170 //
5171 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
5172 // to:
5173 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
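// e.g., the shadow of the ld2 call above is computed by applying the same
// intrinsic to the shadow address corresponding to %A.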
5174 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
5175 unsigned int numArgs = I.arg_size();
5176
5177 // Return type is a struct of vectors of integers or floating-point
5178 assert(I.getType()->isStructTy());
5179 [[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
5180 assert(RetTy->getNumElements() > 0);
5181 assert(RetTy->getElementType(0)->isIntOrIntVectorTy() ||
5182 RetTy->getElementType(0)->isFPOrFPVectorTy());
5183 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5184 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
5185
5186 if (WithLane) {
5187 // 2, 3 or 4 vectors, plus lane number, plus input pointer
5188 assert(4 <= numArgs && numArgs <= 6);
5189
5190 // Return type is a struct of the input vectors
5191 assert(RetTy->getNumElements() + 2 == numArgs);
5192 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5193 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
5194 } else {
5195 assert(numArgs == 1);
5196 }
5197
5198 IRBuilder<> IRB(&I);
5199
5200 SmallVector<Value *, 6> ShadowArgs;
5201 if (WithLane) {
5202 for (unsigned int i = 0; i < numArgs - 2; i++)
5203 ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
5204
5205 // Lane number, passed verbatim
5206 Value *LaneNumber = I.getArgOperand(numArgs - 2);
5207 ShadowArgs.push_back(LaneNumber);
5208
5209 // TODO: blend shadow of lane number into output shadow?
5210 insertCheckShadowOf(LaneNumber, &I);
5211 }
5212
5213 Value *Src = I.getArgOperand(numArgs - 1);
5214 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
5215
5216 Type *SrcShadowTy = getShadowTy(Src);
5217 auto [SrcShadowPtr, SrcOriginPtr] =
5218 getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
5219 ShadowArgs.push_back(SrcShadowPtr);
5220
5221 // The NEON vector load instructions handled by this function all have
5222 // integer variants. It is easier to use those rather than trying to cast
5223 // a struct of vectors of floats into a struct of vectors of integers.
5224 CallInst *CI =
5225 IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
5226 setShadow(&I, CI);
5227
5228 if (!MS.TrackOrigins)
5229 return;
5230
5231 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
5232 setOrigin(&I, PtrSrcOrigin);
5233 }
5234
5235 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
5236 /// and vst{2,3,4}lane).
5237 ///
5238 /// Arm NEON vector store intrinsics have the output address (pointer) as the
5239 /// last argument, with the initial arguments being the inputs (and lane
5240 /// number for vst{2,3,4}lane). They return void.
5241 ///
5242 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
5243 /// abcdabcdabcdabcd... into *outP
5244 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
5245 /// writes aaaa...bbbb...cccc...dddd... into *outP
5246 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
5247 /// These instructions can all be instrumented with essentially the same
5248 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
5249 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
5250 IRBuilder<> IRB(&I);
5251
5252 // Don't use getNumOperands() because it includes the callee
5253 int numArgOperands = I.arg_size();
5254
5255 // The last arg operand is the output (pointer)
5256 assert(numArgOperands >= 1);
5257 Value *Addr = I.getArgOperand(numArgOperands - 1);
5258 assert(Addr->getType()->isPointerTy());
5259 int skipTrailingOperands = 1;
5260
5261 if (ClCheckAccessAddress)
5262 insertCheckShadowOf(Addr, &I);
5263
5264 // Second-last operand is the lane number (for vst{2,3,4}lane)
5265 if (useLane) {
5266 skipTrailingOperands++;
5267 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
5268 assert(isa<IntegerType>(
5269 I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
5270 }
5271
5272 SmallVector<Value *, 8> ShadowArgs;
5273 // All the initial operands are the inputs
5274 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
5275 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
5276 Value *Shadow = getShadow(&I, i);
5277 ShadowArgs.append(1, Shadow);
5278 }
5279
5280 // MSan's GetShadowTy assumes the LHS is the type we want the shadow for
5281 // e.g., for:
5282 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
5283 // we know the type of the output (and its shadow) is <16 x i8>.
5284 //
5285 // Arm NEON VST is unusual because the last argument is the output address:
5286 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
5287 // call void @llvm.aarch64.neon.st2.v16i8.p0
5288 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
5289 // and we have no type information about P's operand. We must manually
5290 // compute the type (<16 x i8> x 2).
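// e.g., st2 of two <16 x i8> inputs writes 32 bytes, so OutputVectorTy is
// <32 x i8>.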
5291 FixedVectorType *OutputVectorTy = FixedVectorType::get(
5292 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
5293 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
5294 (numArgOperands - skipTrailingOperands));
5295 Type *OutputShadowTy = getShadowTy(OutputVectorTy);
5296
5297 if (useLane)
5298 ShadowArgs.append(1,
5299 I.getArgOperand(numArgOperands - skipTrailingOperands));
5300
5301 Value *OutputShadowPtr, *OutputOriginPtr;
5302 // AArch64 NEON does not need alignment (unless OS requires it)
5303 std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
5304 Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
5305 ShadowArgs.append(1, OutputShadowPtr);
5306
5307 CallInst *CI =
5308 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
5309 setShadow(&I, CI);
5310
5311 if (MS.TrackOrigins) {
5312 // TODO: if we modelled the vst* instruction more precisely, we could
5313 // more accurately track the origins (e.g., if both inputs are
5314 // uninitialized for vst2, we currently blame the second input, even
5315 // though part of the output depends only on the first input).
5316 //
5317 // This is particularly imprecise for vst{2,3,4}lane, since only one
5318 // lane of each input is actually copied to the output.
5319 OriginCombiner OC(this, IRB);
5320 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
5321 OC.Add(I.getArgOperand(i));
5322
5323 const DataLayout &DL = F.getDataLayout();
5324 OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
5325 OutputOriginPtr);
5326 }
5327 }
5328
5329 /// Handle intrinsics by applying the intrinsic to the shadows.
5330 ///
5331 /// The trailing arguments are passed verbatim to the intrinsic, though any
5332 /// uninitialized trailing arguments can also taint the shadow e.g., for an
5333 /// intrinsic with one trailing verbatim argument:
5334 /// out = intrinsic(var1, var2, opType)
5335 /// we compute:
5336 /// shadow[out] =
5337 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
5338 ///
5339 /// Typically, shadowIntrinsicID will be specified by the caller to be
5340 /// I.getIntrinsicID(), but the caller can choose to replace it with another
5341 /// intrinsic of the same type.
5342 ///
5343 /// CAUTION: this assumes that the intrinsic will handle arbitrary
5344 /// bit-patterns (for example, if the intrinsic accepts floats for
5345 /// var1, we require that it doesn't care if inputs are NaNs).
5346 ///
5347 /// For example, this can be applied to the Arm NEON vector table intrinsics
5348 /// (tbl{1,2,3,4}).
5349 ///
5350 /// The origin is approximated using setOriginForNaryOp.
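/// e.g., i32 @llvm.bitreverse.i32 is handled with trailingVerbatimArgs == 0,
/// so its shadow is simply @llvm.bitreverse.i32 applied to the operand's
/// shadow.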
5351 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
5352 Intrinsic::ID shadowIntrinsicID,
5353 unsigned int trailingVerbatimArgs) {
5354 IRBuilder<> IRB(&I);
5355
5356 assert(trailingVerbatimArgs < I.arg_size());
5357
5358 SmallVector<Value *, 8> ShadowArgs;
5359 // Don't use getNumOperands() because it includes the callee
5360 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
5361 Value *Shadow = getShadow(&I, i);
5362
5363 // Shadows are integer-ish types but some intrinsics require a
5364 // different (e.g., floating-point) type.
5365 ShadowArgs.push_back(
5366 IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
5367 }
5368
5369 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5370 i++) {
5371 Value *Arg = I.getArgOperand(i);
5372 ShadowArgs.push_back(Arg);
5373 }
5374
5375 CallInst *CI =
5376 IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
5377 Value *CombinedShadow = CI;
5378
5379 // Combine the computed shadow with the shadow of trailing args
5380 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5381 i++) {
5382 Value *Shadow =
5383 CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
5384 CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
5385 }
5386
5387 setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
5388
5389 setOriginForNaryOp(I);
5390 }
5391
5392 // Approximation only
5393 //
5394 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
5395 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
5396 assert(I.arg_size() == 2);
5397
5398 handleShadowOr(I);
5399 }
5400
5401 bool maybeHandleCrossPlatformIntrinsic(IntrinsicInst &I) {
5402 switch (I.getIntrinsicID()) {
5403 case Intrinsic::uadd_with_overflow:
5404 case Intrinsic::sadd_with_overflow:
5405 case Intrinsic::usub_with_overflow:
5406 case Intrinsic::ssub_with_overflow:
5407 case Intrinsic::umul_with_overflow:
5408 case Intrinsic::smul_with_overflow:
5409 handleArithmeticWithOverflow(I);
5410 break;
5411 case Intrinsic::abs:
5412 handleAbsIntrinsic(I);
5413 break;
5414 case Intrinsic::bitreverse:
5415 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
5416 /*trailingVerbatimArgs*/ 0);
5417 break;
5418 case Intrinsic::is_fpclass:
5419 handleIsFpClass(I);
5420 break;
5421 case Intrinsic::lifetime_start:
5422 handleLifetimeStart(I);
5423 break;
5424 case Intrinsic::launder_invariant_group:
5425 case Intrinsic::strip_invariant_group:
5426 handleInvariantGroup(I);
5427 break;
5428 case Intrinsic::bswap:
5429 handleBswap(I);
5430 break;
5431 case Intrinsic::ctlz:
5432 case Intrinsic::cttz:
5433 handleCountLeadingTrailingZeros(I);
5434 break;
5435 case Intrinsic::masked_compressstore:
5436 handleMaskedCompressStore(I);
5437 break;
5438 case Intrinsic::masked_expandload:
5439 handleMaskedExpandLoad(I);
5440 break;
5441 case Intrinsic::masked_gather:
5442 handleMaskedGather(I);
5443 break;
5444 case Intrinsic::masked_scatter:
5445 handleMaskedScatter(I);
5446 break;
5447 case Intrinsic::masked_store:
5448 handleMaskedStore(I);
5449 break;
5450 case Intrinsic::masked_load:
5451 handleMaskedLoad(I);
5452 break;
5453 case Intrinsic::vector_reduce_and:
5454 handleVectorReduceAndIntrinsic(I);
5455 break;
5456 case Intrinsic::vector_reduce_or:
5457 handleVectorReduceOrIntrinsic(I);
5458 break;
5459
5460 case Intrinsic::vector_reduce_add:
5461 case Intrinsic::vector_reduce_xor:
5462 case Intrinsic::vector_reduce_mul:
5463 // Signed/Unsigned Min/Max
5464 // TODO: handling similarly to AND/OR may be more precise.
5465 case Intrinsic::vector_reduce_smax:
5466 case Intrinsic::vector_reduce_smin:
5467 case Intrinsic::vector_reduce_umax:
5468 case Intrinsic::vector_reduce_umin:
5469 // TODO: this has no false positives, but arguably we should check that all
5470 // the bits are initialized.
5471 case Intrinsic::vector_reduce_fmax:
5472 case Intrinsic::vector_reduce_fmin:
5473 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
5474 break;
5475
5476 case Intrinsic::vector_reduce_fadd:
5477 case Intrinsic::vector_reduce_fmul:
5478 handleVectorReduceWithStarterIntrinsic(I);
5479 break;
5480
5481 case Intrinsic::scmp:
5482 case Intrinsic::ucmp: {
5483 handleShadowOr(I);
5484 break;
5485 }
5486
5487 case Intrinsic::fshl:
5488 case Intrinsic::fshr:
5489 handleFunnelShift(I);
5490 break;
5491
5492 case Intrinsic::is_constant:
5493 // The result of llvm.is.constant() is always defined.
5494 setShadow(&I, getCleanShadow(&I));
5495 setOrigin(&I, getCleanOrigin());
5496 break;
5497
5498 default:
5499 return false;
5500 }
5501
5502 return true;
5503 }
5504
5505 bool maybeHandleX86SIMDIntrinsic(IntrinsicInst &I) {
5506 switch (I.getIntrinsicID()) {
5507 case Intrinsic::x86_sse_stmxcsr:
5508 handleStmxcsr(I);
5509 break;
5510 case Intrinsic::x86_sse_ldmxcsr:
5511 handleLdmxcsr(I);
5512 break;
5513
5514 // Convert Scalar Double Precision Floating-Point Value
5515 // to Unsigned Doubleword Integer
5516 // etc.
5517 case Intrinsic::x86_avx512_vcvtsd2usi64:
5518 case Intrinsic::x86_avx512_vcvtsd2usi32:
5519 case Intrinsic::x86_avx512_vcvtss2usi64:
5520 case Intrinsic::x86_avx512_vcvtss2usi32:
5521 case Intrinsic::x86_avx512_cvttss2usi64:
5522 case Intrinsic::x86_avx512_cvttss2usi:
5523 case Intrinsic::x86_avx512_cvttsd2usi64:
5524 case Intrinsic::x86_avx512_cvttsd2usi:
5525 case Intrinsic::x86_avx512_cvtusi2ss:
5526 case Intrinsic::x86_avx512_cvtusi642sd:
5527 case Intrinsic::x86_avx512_cvtusi642ss:
5528 handleSSEVectorConvertIntrinsic(I, 1, true);
5529 break;
5530 case Intrinsic::x86_sse2_cvtsd2si64:
5531 case Intrinsic::x86_sse2_cvtsd2si:
5532 case Intrinsic::x86_sse2_cvtsd2ss:
5533 case Intrinsic::x86_sse2_cvttsd2si64:
5534 case Intrinsic::x86_sse2_cvttsd2si:
5535 case Intrinsic::x86_sse_cvtss2si64:
5536 case Intrinsic::x86_sse_cvtss2si:
5537 case Intrinsic::x86_sse_cvttss2si64:
5538 case Intrinsic::x86_sse_cvttss2si:
5539 handleSSEVectorConvertIntrinsic(I, 1);
5540 break;
5541 case Intrinsic::x86_sse_cvtps2pi:
5542 case Intrinsic::x86_sse_cvttps2pi:
5543 handleSSEVectorConvertIntrinsic(I, 2);
5544 break;
5545
5546 // TODO:
5547 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
5548 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
5549 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
5550
5551 case Intrinsic::x86_vcvtps2ph_128:
5552 case Intrinsic::x86_vcvtps2ph_256: {
5553 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
5554 break;
5555 }
5556
5557 // Convert Packed Single Precision Floating-Point Values
5558 // to Packed Signed Doubleword Integer Values
5559 //
5560 // <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
5561 // (<16 x float>, <16 x i32>, i16, i32)
5562 case Intrinsic::x86_avx512_mask_cvtps2dq_512:
5563 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/false);
5564 break;
5565
5566 // Convert Packed Double Precision Floating-Point Values
5567 // to Packed Single Precision Floating-Point Values
5568 case Intrinsic::x86_sse2_cvtpd2ps:
5569 case Intrinsic::x86_sse2_cvtps2dq:
5570 case Intrinsic::x86_sse2_cvtpd2dq:
5571 case Intrinsic::x86_sse2_cvttps2dq:
5572 case Intrinsic::x86_sse2_cvttpd2dq:
5573 case Intrinsic::x86_avx_cvt_pd2_ps_256:
5574 case Intrinsic::x86_avx_cvt_ps2dq_256:
5575 case Intrinsic::x86_avx_cvt_pd2dq_256:
5576 case Intrinsic::x86_avx_cvtt_ps2dq_256:
5577 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
5578 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
5579 break;
5580 }
5581
5582 // Convert Single-Precision FP Value to 16-bit FP Value
5583 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
5584 // (<16 x float>, i32, <16 x i16>, i16)
5585 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
5586 // (<4 x float>, i32, <8 x i16>, i8)
5587 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256
5588 // (<8 x float>, i32, <8 x i16>, i8)
5589 case Intrinsic::x86_avx512_mask_vcvtps2ph_512:
5590 case Intrinsic::x86_avx512_mask_vcvtps2ph_256:
5591 case Intrinsic::x86_avx512_mask_vcvtps2ph_128:
5592 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/true);
5593 break;
5594
5595 // Shift Packed Data (Left Logical, Right Arithmetic, Right Logical)
5596 case Intrinsic::x86_avx512_psll_w_512:
5597 case Intrinsic::x86_avx512_psll_d_512:
5598 case Intrinsic::x86_avx512_psll_q_512:
5599 case Intrinsic::x86_avx512_pslli_w_512:
5600 case Intrinsic::x86_avx512_pslli_d_512:
5601 case Intrinsic::x86_avx512_pslli_q_512:
5602 case Intrinsic::x86_avx512_psrl_w_512:
5603 case Intrinsic::x86_avx512_psrl_d_512:
5604 case Intrinsic::x86_avx512_psrl_q_512:
5605 case Intrinsic::x86_avx512_psra_w_512:
5606 case Intrinsic::x86_avx512_psra_d_512:
5607 case Intrinsic::x86_avx512_psra_q_512:
5608 case Intrinsic::x86_avx512_psrli_w_512:
5609 case Intrinsic::x86_avx512_psrli_d_512:
5610 case Intrinsic::x86_avx512_psrli_q_512:
5611 case Intrinsic::x86_avx512_psrai_w_512:
5612 case Intrinsic::x86_avx512_psrai_d_512:
5613 case Intrinsic::x86_avx512_psrai_q_512:
5614 case Intrinsic::x86_avx512_psra_q_256:
5615 case Intrinsic::x86_avx512_psra_q_128:
5616 case Intrinsic::x86_avx512_psrai_q_256:
5617 case Intrinsic::x86_avx512_psrai_q_128:
5618 case Intrinsic::x86_avx2_psll_w:
5619 case Intrinsic::x86_avx2_psll_d:
5620 case Intrinsic::x86_avx2_psll_q:
5621 case Intrinsic::x86_avx2_pslli_w:
5622 case Intrinsic::x86_avx2_pslli_d:
5623 case Intrinsic::x86_avx2_pslli_q:
5624 case Intrinsic::x86_avx2_psrl_w:
5625 case Intrinsic::x86_avx2_psrl_d:
5626 case Intrinsic::x86_avx2_psrl_q:
5627 case Intrinsic::x86_avx2_psra_w:
5628 case Intrinsic::x86_avx2_psra_d:
5629 case Intrinsic::x86_avx2_psrli_w:
5630 case Intrinsic::x86_avx2_psrli_d:
5631 case Intrinsic::x86_avx2_psrli_q:
5632 case Intrinsic::x86_avx2_psrai_w:
5633 case Intrinsic::x86_avx2_psrai_d:
5634 case Intrinsic::x86_sse2_psll_w:
5635 case Intrinsic::x86_sse2_psll_d:
5636 case Intrinsic::x86_sse2_psll_q:
5637 case Intrinsic::x86_sse2_pslli_w:
5638 case Intrinsic::x86_sse2_pslli_d:
5639 case Intrinsic::x86_sse2_pslli_q:
5640 case Intrinsic::x86_sse2_psrl_w:
5641 case Intrinsic::x86_sse2_psrl_d:
5642 case Intrinsic::x86_sse2_psrl_q:
5643 case Intrinsic::x86_sse2_psra_w:
5644 case Intrinsic::x86_sse2_psra_d:
5645 case Intrinsic::x86_sse2_psrli_w:
5646 case Intrinsic::x86_sse2_psrli_d:
5647 case Intrinsic::x86_sse2_psrli_q:
5648 case Intrinsic::x86_sse2_psrai_w:
5649 case Intrinsic::x86_sse2_psrai_d:
5650 case Intrinsic::x86_mmx_psll_w:
5651 case Intrinsic::x86_mmx_psll_d:
5652 case Intrinsic::x86_mmx_psll_q:
5653 case Intrinsic::x86_mmx_pslli_w:
5654 case Intrinsic::x86_mmx_pslli_d:
5655 case Intrinsic::x86_mmx_pslli_q:
5656 case Intrinsic::x86_mmx_psrl_w:
5657 case Intrinsic::x86_mmx_psrl_d:
5658 case Intrinsic::x86_mmx_psrl_q:
5659 case Intrinsic::x86_mmx_psra_w:
5660 case Intrinsic::x86_mmx_psra_d:
5661 case Intrinsic::x86_mmx_psrli_w:
5662 case Intrinsic::x86_mmx_psrli_d:
5663 case Intrinsic::x86_mmx_psrli_q:
5664 case Intrinsic::x86_mmx_psrai_w:
5665 case Intrinsic::x86_mmx_psrai_d:
5666 handleVectorShiftIntrinsic(I, /* Variable */ false);
5667 break;
5668 case Intrinsic::x86_avx2_psllv_d:
5669 case Intrinsic::x86_avx2_psllv_d_256:
5670 case Intrinsic::x86_avx512_psllv_d_512:
5671 case Intrinsic::x86_avx2_psllv_q:
5672 case Intrinsic::x86_avx2_psllv_q_256:
5673 case Intrinsic::x86_avx512_psllv_q_512:
5674 case Intrinsic::x86_avx2_psrlv_d:
5675 case Intrinsic::x86_avx2_psrlv_d_256:
5676 case Intrinsic::x86_avx512_psrlv_d_512:
5677 case Intrinsic::x86_avx2_psrlv_q:
5678 case Intrinsic::x86_avx2_psrlv_q_256:
5679 case Intrinsic::x86_avx512_psrlv_q_512:
5680 case Intrinsic::x86_avx2_psrav_d:
5681 case Intrinsic::x86_avx2_psrav_d_256:
5682 case Intrinsic::x86_avx512_psrav_d_512:
5683 case Intrinsic::x86_avx512_psrav_q_128:
5684 case Intrinsic::x86_avx512_psrav_q_256:
5685 case Intrinsic::x86_avx512_psrav_q_512:
5686 handleVectorShiftIntrinsic(I, /* Variable */ true);
5687 break;
5688
5689 // Pack with Signed/Unsigned Saturation
5690 case Intrinsic::x86_sse2_packsswb_128:
5691 case Intrinsic::x86_sse2_packssdw_128:
5692 case Intrinsic::x86_sse2_packuswb_128:
5693 case Intrinsic::x86_sse41_packusdw:
5694 case Intrinsic::x86_avx2_packsswb:
5695 case Intrinsic::x86_avx2_packssdw:
5696 case Intrinsic::x86_avx2_packuswb:
5697 case Intrinsic::x86_avx2_packusdw:
5698 // e.g., <64 x i8> @llvm.x86.avx512.packsswb.512
5699 // (<32 x i16> %a, <32 x i16> %b)
5700 // <32 x i16> @llvm.x86.avx512.packssdw.512
5701 // (<16 x i32> %a, <16 x i32> %b)
5702 // Note: AVX512 masked variants are auto-upgraded by LLVM.
5703 case Intrinsic::x86_avx512_packsswb_512:
5704 case Intrinsic::x86_avx512_packssdw_512:
5705 case Intrinsic::x86_avx512_packuswb_512:
5706 case Intrinsic::x86_avx512_packusdw_512:
5707 handleVectorPackIntrinsic(I);
5708 break;
5709
5710 case Intrinsic::x86_sse41_pblendvb:
5711 case Intrinsic::x86_sse41_blendvpd:
5712 case Intrinsic::x86_sse41_blendvps:
5713 case Intrinsic::x86_avx_blendv_pd_256:
5714 case Intrinsic::x86_avx_blendv_ps_256:
5715 case Intrinsic::x86_avx2_pblendvb:
5716 handleBlendvIntrinsic(I);
5717 break;
5718
5719 case Intrinsic::x86_avx_dp_ps_256:
5720 case Intrinsic::x86_sse41_dppd:
5721 case Intrinsic::x86_sse41_dpps:
5722 handleDppIntrinsic(I);
5723 break;
5724
5725 case Intrinsic::x86_mmx_packsswb:
5726 case Intrinsic::x86_mmx_packuswb:
5727 handleVectorPackIntrinsic(I, 16);
5728 break;
5729
5730 case Intrinsic::x86_mmx_packssdw:
5731 handleVectorPackIntrinsic(I, 32);
5732 break;
5733
5734 case Intrinsic::x86_mmx_psad_bw:
5735 handleVectorSadIntrinsic(I, true);
5736 break;
5737 case Intrinsic::x86_sse2_psad_bw:
5738 case Intrinsic::x86_avx2_psad_bw:
5739 handleVectorSadIntrinsic(I);
5740 break;
5741
5742 // Multiply and Add Packed Words
5743 // < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
5744 // < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
5745 // <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>)
5746 //
5747 // Multiply and Add Packed Signed and Unsigned Bytes
5748 // < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
5749 // <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
5750 // <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
5751 //
5752 // These intrinsics are auto-upgraded into non-masked forms:
5753 // < 4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128
5754 // (<8 x i16>, <8 x i16>, <4 x i32>, i8)
5755 // < 8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256
5756 // (<16 x i16>, <16 x i16>, <8 x i32>, i8)
5757 // <16 x i32> @llvm.x86.avx512.mask.pmaddw.d.512
5758 // (<32 x i16>, <32 x i16>, <16 x i32>, i16)
5759 // < 8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128
5760 // (<16 x i8>, <16 x i8>, <8 x i16>, i8)
5761 // <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256
5762 // (<32 x i8>, <32 x i8>, <16 x i16>, i16)
5763 // <32 x i16> @llvm.x86.avx512.mask.pmaddubs.w.512
5764 // (<64 x i8>, <64 x i8>, <32 x i16>, i32)
5765 case Intrinsic::x86_sse2_pmadd_wd:
5766 case Intrinsic::x86_avx2_pmadd_wd:
5767 case Intrinsic::x86_avx512_pmaddw_d_512:
5768 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5769 case Intrinsic::x86_avx2_pmadd_ub_sw:
5770 case Intrinsic::x86_avx512_pmaddubs_w_512:
5771 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5772 /*ZeroPurifies=*/true);
5773 break;
5774
5775 // <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
5776 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5777 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5778 /*ZeroPurifies=*/true, /*EltSizeInBits=*/8);
5779 break;
5780
5781 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
5782 case Intrinsic::x86_mmx_pmadd_wd:
5783 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5784 /*ZeroPurifies=*/true, /*EltSizeInBits=*/16);
5785 break;
5786
5787 // AVX Vector Neural Network Instructions: bytes
5788 //
5789 // Multiply and Add Packed Signed and Unsigned Bytes
5790 // < 4 x i32> @llvm.x86.avx512.vpdpbusd.128
5791 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5792 // < 8 x i32> @llvm.x86.avx512.vpdpbusd.256
5793 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5794 // <16 x i32> @llvm.x86.avx512.vpdpbusd.512
5795 // (<16 x i32>, <64 x i8>, <64 x i8>)
5796 //
5797 // Multiply and Add Unsigned and Signed Bytes With Saturation
5798 // < 4 x i32> @llvm.x86.avx512.vpdpbusds.128
5799 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5800 // < 8 x i32> @llvm.x86.avx512.vpdpbusds.256
5801 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5802 // <16 x i32> @llvm.x86.avx512.vpdpbusds.512
5803 // (<16 x i32>, <64 x i8>, <64 x i8>)
5804 //
5805 // < 4 x i32> @llvm.x86.avx2.vpdpbssd.128
5806 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5807 // < 8 x i32> @llvm.x86.avx2.vpdpbssd.256
5808 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5809 //
5810 // < 4 x i32> @llvm.x86.avx2.vpdpbssds.128
5811 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5812 // < 8 x i32> @llvm.x86.avx2.vpdpbssds.256
5813 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5814 //
5815 // <16 x i32> @llvm.x86.avx10.vpdpbssd.512
5816 // (<16 x i32>, <16 x i32>, <16 x i32>)
5817 // <16 x i32> @llvm.x86.avx10.vpdpbssds.512
5818 // (<16 x i32>, <16 x i32>, <16 x i32>)
5819 //
5820 // These intrinsics are auto-upgraded into non-masked forms:
5821 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusd.128
5822 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5823 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusd.128
5824 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5825 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusd.256
5826 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5827 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusd.256
5828 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5829 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusd.512
5830 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5831 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusd.512
5832 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5833 //
5834 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusds.128
5835 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5836 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusds.128
5837 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5838 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusds.256
5839 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5840 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusds.256
5841 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5842 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusds.512
5843 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5844 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusds.512
5845 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5846 case Intrinsic::x86_avx512_vpdpbusd_128:
5847 case Intrinsic::x86_avx512_vpdpbusd_256:
5848 case Intrinsic::x86_avx512_vpdpbusd_512:
5849 case Intrinsic::x86_avx512_vpdpbusds_128:
5850 case Intrinsic::x86_avx512_vpdpbusds_256:
5851 case Intrinsic::x86_avx512_vpdpbusds_512:
5852 case Intrinsic::x86_avx2_vpdpbssd_128:
5853 case Intrinsic::x86_avx2_vpdpbssd_256:
5854 case Intrinsic::x86_avx10_vpdpbssd_512:
5855 case Intrinsic::x86_avx2_vpdpbssds_128:
5856 case Intrinsic::x86_avx2_vpdpbssds_256:
5857 case Intrinsic::x86_avx10_vpdpbssds_512:
5858 case Intrinsic::x86_avx2_vpdpbsud_128:
5859 case Intrinsic::x86_avx2_vpdpbsud_256:
5860 case Intrinsic::x86_avx10_vpdpbsud_512:
5861 case Intrinsic::x86_avx2_vpdpbsuds_128:
5862 case Intrinsic::x86_avx2_vpdpbsuds_256:
5863 case Intrinsic::x86_avx10_vpdpbsuds_512:
5864 case Intrinsic::x86_avx2_vpdpbuud_128:
5865 case Intrinsic::x86_avx2_vpdpbuud_256:
5866 case Intrinsic::x86_avx10_vpdpbuud_512:
5867 case Intrinsic::x86_avx2_vpdpbuuds_128:
5868 case Intrinsic::x86_avx2_vpdpbuuds_256:
5869 case Intrinsic::x86_avx10_vpdpbuuds_512:
5870 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/4,
5871 /*ZeroPurifies=*/true, /*EltSizeInBits=*/8);
5872 break;
5873
5874 // AVX Vector Neural Network Instructions: words
5875 //
5876 // Multiply and Add Signed Word Integers
5877 // < 4 x i32> @llvm.x86.avx512.vpdpwssd.128
5878 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5879 // < 8 x i32> @llvm.x86.avx512.vpdpwssd.256
5880 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5881 // <16 x i32> @llvm.x86.avx512.vpdpwssd.512
5882 // (<16 x i32>, <16 x i32>, <16 x i32>)
5883 //
5884 // Multiply and Add Signed Word Integers With Saturation
5885 // < 4 x i32> @llvm.x86.avx512.vpdpwssds.128
5886 // (< 4 x i32>, < 4 x i32>, < 4 x i32>)
5887 // < 8 x i32> @llvm.x86.avx512.vpdpwssds.256
5888 // (< 8 x i32>, < 8 x i32>, < 8 x i32>)
5889 // <16 x i32> @llvm.x86.avx512.vpdpwssds.512
5890 // (<16 x i32>, <16 x i32>, <16 x i32>)
5891 //
5892 // These intrinsics are auto-upgraded into non-masked forms:
5893 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssd.128
5894 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5895 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssd.128
5896 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5897 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssd.256
5898 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5899 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssd.256
5900 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5901 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssd.512
5902 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5903 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssd.512
5904 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5905 //
5906 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssds.128
5907 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5908 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssds.128
5909 // (<4 x i32>, <4 x i32>, <4 x i32>, i8)
5910 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssds.256
5911 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5912 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssds.256
5913 // (<8 x i32>, <8 x i32>, <8 x i32>, i8)
5914 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssds.512
5915 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5916 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssds.512
5917 // (<16 x i32>, <16 x i32>, <16 x i32>, i16)
5918 case Intrinsic::x86_avx512_vpdpwssd_128:
5919 case Intrinsic::x86_avx512_vpdpwssd_256:
5920 case Intrinsic::x86_avx512_vpdpwssd_512:
5921 case Intrinsic::x86_avx512_vpdpwssds_128:
5922 case Intrinsic::x86_avx512_vpdpwssds_256:
5923 case Intrinsic::x86_avx512_vpdpwssds_512:
5924 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5925 /*ZeroPurifies=*/true, /*EltSizeInBits=*/16);
5926 break;
5927
5928 // TODO: Dot Product of BF16 Pairs Accumulated Into Packed Single
5929 // Precision
5930 // <4 x float> @llvm.x86.avx512bf16.dpbf16ps.128
5931 // (<4 x float>, <8 x bfloat>, <8 x bfloat>)
5932 // <8 x float> @llvm.x86.avx512bf16.dpbf16ps.256
5933 // (<8 x float>, <16 x bfloat>, <16 x bfloat>)
5934 // <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512
5935 // (<16 x float>, <32 x bfloat>, <32 x bfloat>)
5936 // handleVectorPmaddIntrinsic() currently only handles integer types.
5937
5938 case Intrinsic::x86_sse_cmp_ss:
5939 case Intrinsic::x86_sse2_cmp_sd:
5940 case Intrinsic::x86_sse_comieq_ss:
5941 case Intrinsic::x86_sse_comilt_ss:
5942 case Intrinsic::x86_sse_comile_ss:
5943 case Intrinsic::x86_sse_comigt_ss:
5944 case Intrinsic::x86_sse_comige_ss:
5945 case Intrinsic::x86_sse_comineq_ss:
5946 case Intrinsic::x86_sse_ucomieq_ss:
5947 case Intrinsic::x86_sse_ucomilt_ss:
5948 case Intrinsic::x86_sse_ucomile_ss:
5949 case Intrinsic::x86_sse_ucomigt_ss:
5950 case Intrinsic::x86_sse_ucomige_ss:
5951 case Intrinsic::x86_sse_ucomineq_ss:
5952 case Intrinsic::x86_sse2_comieq_sd:
5953 case Intrinsic::x86_sse2_comilt_sd:
5954 case Intrinsic::x86_sse2_comile_sd:
5955 case Intrinsic::x86_sse2_comigt_sd:
5956 case Intrinsic::x86_sse2_comige_sd:
5957 case Intrinsic::x86_sse2_comineq_sd:
5958 case Intrinsic::x86_sse2_ucomieq_sd:
5959 case Intrinsic::x86_sse2_ucomilt_sd:
5960 case Intrinsic::x86_sse2_ucomile_sd:
5961 case Intrinsic::x86_sse2_ucomigt_sd:
5962 case Intrinsic::x86_sse2_ucomige_sd:
5963 case Intrinsic::x86_sse2_ucomineq_sd:
5964 handleVectorCompareScalarIntrinsic(I);
5965 break;
5966
5967 case Intrinsic::x86_avx_cmp_pd_256:
5968 case Intrinsic::x86_avx_cmp_ps_256:
5969 case Intrinsic::x86_sse2_cmp_pd:
5970 case Intrinsic::x86_sse_cmp_ps:
5971 handleVectorComparePackedIntrinsic(I);
5972 break;
5973
5974 case Intrinsic::x86_bmi_bextr_32:
5975 case Intrinsic::x86_bmi_bextr_64:
5976 case Intrinsic::x86_bmi_bzhi_32:
5977 case Intrinsic::x86_bmi_bzhi_64:
5978 case Intrinsic::x86_bmi_pdep_32:
5979 case Intrinsic::x86_bmi_pdep_64:
5980 case Intrinsic::x86_bmi_pext_32:
5981 case Intrinsic::x86_bmi_pext_64:
5982 handleBmiIntrinsic(I);
5983 break;
5984
5985 case Intrinsic::x86_pclmulqdq:
5986 case Intrinsic::x86_pclmulqdq_256:
5987 case Intrinsic::x86_pclmulqdq_512:
5988 handlePclmulIntrinsic(I);
5989 break;
5990
5991 case Intrinsic::x86_avx_round_pd_256:
5992 case Intrinsic::x86_avx_round_ps_256:
5993 case Intrinsic::x86_sse41_round_pd:
5994 case Intrinsic::x86_sse41_round_ps:
5995 handleRoundPdPsIntrinsic(I);
5996 break;
5997
5998 case Intrinsic::x86_sse41_round_sd:
5999 case Intrinsic::x86_sse41_round_ss:
6000 handleUnarySdSsIntrinsic(I);
6001 break;
6002
6003 case Intrinsic::x86_sse2_max_sd:
6004 case Intrinsic::x86_sse_max_ss:
6005 case Intrinsic::x86_sse2_min_sd:
6006 case Intrinsic::x86_sse_min_ss:
6007 handleBinarySdSsIntrinsic(I);
6008 break;
6009
6010 case Intrinsic::x86_avx_vtestc_pd:
6011 case Intrinsic::x86_avx_vtestc_pd_256:
6012 case Intrinsic::x86_avx_vtestc_ps:
6013 case Intrinsic::x86_avx_vtestc_ps_256:
6014 case Intrinsic::x86_avx_vtestnzc_pd:
6015 case Intrinsic::x86_avx_vtestnzc_pd_256:
6016 case Intrinsic::x86_avx_vtestnzc_ps:
6017 case Intrinsic::x86_avx_vtestnzc_ps_256:
6018 case Intrinsic::x86_avx_vtestz_pd:
6019 case Intrinsic::x86_avx_vtestz_pd_256:
6020 case Intrinsic::x86_avx_vtestz_ps:
6021 case Intrinsic::x86_avx_vtestz_ps_256:
6022 case Intrinsic::x86_avx_ptestc_256:
6023 case Intrinsic::x86_avx_ptestnzc_256:
6024 case Intrinsic::x86_avx_ptestz_256:
6025 case Intrinsic::x86_sse41_ptestc:
6026 case Intrinsic::x86_sse41_ptestnzc:
6027 case Intrinsic::x86_sse41_ptestz:
6028 handleVtestIntrinsic(I);
6029 break;
6030
6031 // Packed Horizontal Add/Subtract
6032 case Intrinsic::x86_ssse3_phadd_w:
6033 case Intrinsic::x86_ssse3_phadd_w_128:
6034 case Intrinsic::x86_avx2_phadd_w:
6035 case Intrinsic::x86_ssse3_phsub_w:
6036 case Intrinsic::x86_ssse3_phsub_w_128:
6037 case Intrinsic::x86_avx2_phsub_w: {
6038 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
6039 break;
6040 }
6041
6042 // Packed Horizontal Add/Subtract
6043 case Intrinsic::x86_ssse3_phadd_d:
6044 case Intrinsic::x86_ssse3_phadd_d_128:
6045 case Intrinsic::x86_avx2_phadd_d:
6046 case Intrinsic::x86_ssse3_phsub_d:
6047 case Intrinsic::x86_ssse3_phsub_d_128:
6048 case Intrinsic::x86_avx2_phsub_d: {
6049 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/32);
6050 break;
6051 }
6052
6053 // Packed Horizontal Add/Subtract and Saturate
6054 case Intrinsic::x86_ssse3_phadd_sw:
6055 case Intrinsic::x86_ssse3_phadd_sw_128:
6056 case Intrinsic::x86_avx2_phadd_sw:
6057 case Intrinsic::x86_ssse3_phsub_sw:
6058 case Intrinsic::x86_ssse3_phsub_sw_128:
6059 case Intrinsic::x86_avx2_phsub_sw: {
6060 handlePairwiseShadowOrIntrinsic(I, /*ReinterpretElemWidth=*/16);
6061 break;
6062 }
6063
6064 // Packed Single/Double Precision Floating-Point Horizontal Add
6065 case Intrinsic::x86_sse3_hadd_ps:
6066 case Intrinsic::x86_sse3_hadd_pd:
6067 case Intrinsic::x86_avx_hadd_pd_256:
6068 case Intrinsic::x86_avx_hadd_ps_256:
6069 case Intrinsic::x86_sse3_hsub_ps:
6070 case Intrinsic::x86_sse3_hsub_pd:
6071 case Intrinsic::x86_avx_hsub_pd_256:
6072 case Intrinsic::x86_avx_hsub_ps_256: {
6073 handlePairwiseShadowOrIntrinsic(I);
6074 break;
6075 }
6076
6077 case Intrinsic::x86_avx_maskstore_ps:
6078 case Intrinsic::x86_avx_maskstore_pd:
6079 case Intrinsic::x86_avx_maskstore_ps_256:
6080 case Intrinsic::x86_avx_maskstore_pd_256:
6081 case Intrinsic::x86_avx2_maskstore_d:
6082 case Intrinsic::x86_avx2_maskstore_q:
6083 case Intrinsic::x86_avx2_maskstore_d_256:
6084 case Intrinsic::x86_avx2_maskstore_q_256: {
6085 handleAVXMaskedStore(I);
6086 break;
6087 }
6088
6089 case Intrinsic::x86_avx_maskload_ps:
6090 case Intrinsic::x86_avx_maskload_pd:
6091 case Intrinsic::x86_avx_maskload_ps_256:
6092 case Intrinsic::x86_avx_maskload_pd_256:
6093 case Intrinsic::x86_avx2_maskload_d:
6094 case Intrinsic::x86_avx2_maskload_q:
6095 case Intrinsic::x86_avx2_maskload_d_256:
6096 case Intrinsic::x86_avx2_maskload_q_256: {
6097 handleAVXMaskedLoad(I);
6098 break;
6099 }
6100
6101 // Packed Floating-Point Arithmetic
6102 case Intrinsic::x86_avx512fp16_add_ph_512:
6103 case Intrinsic::x86_avx512fp16_sub_ph_512:
6104 case Intrinsic::x86_avx512fp16_mul_ph_512:
6105 case Intrinsic::x86_avx512fp16_div_ph_512:
6106 case Intrinsic::x86_avx512fp16_max_ph_512:
6107 case Intrinsic::x86_avx512fp16_min_ph_512:
6108 case Intrinsic::x86_avx512_min_ps_512:
6109 case Intrinsic::x86_avx512_min_pd_512:
6110 case Intrinsic::x86_avx512_max_ps_512:
6111 case Intrinsic::x86_avx512_max_pd_512: {
6112 // These AVX512 variants contain the rounding mode as a trailing flag.
6113 // Earlier variants do not have a trailing flag and are already handled
6114 // by maybeHandleSimpleNomemIntrinsic(I, 0) via
6115 // maybeHandleUnknownIntrinsic.
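// Rough sketch of the resulting propagation (names are illustrative):
//   %r = call <16 x float> @llvm.x86.avx512.min.ps.512(
//            <16 x float> %a, <16 x float> %b, i32 4)
// gets approximately
//   %r_shadow = or <16 x i32> %a_shadow, %b_shadow
// while the trailing rounding/SAE immediate contributes no shadow of its own.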
6116 [[maybe_unused]] bool Success =
6117 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
6118 assert(Success);
6119 break;
6120 }
6121
6122 case Intrinsic::x86_avx_vpermilvar_pd:
6123 case Intrinsic::x86_avx_vpermilvar_pd_256:
6124 case Intrinsic::x86_avx512_vpermilvar_pd_512:
6125 case Intrinsic::x86_avx_vpermilvar_ps:
6126 case Intrinsic::x86_avx_vpermilvar_ps_256:
6127 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
6128 handleAVXVpermilvar(I);
6129 break;
6130 }
6131
6132 case Intrinsic::x86_avx512_vpermi2var_d_128:
6133 case Intrinsic::x86_avx512_vpermi2var_d_256:
6134 case Intrinsic::x86_avx512_vpermi2var_d_512:
6135 case Intrinsic::x86_avx512_vpermi2var_hi_128:
6136 case Intrinsic::x86_avx512_vpermi2var_hi_256:
6137 case Intrinsic::x86_avx512_vpermi2var_hi_512:
6138 case Intrinsic::x86_avx512_vpermi2var_pd_128:
6139 case Intrinsic::x86_avx512_vpermi2var_pd_256:
6140 case Intrinsic::x86_avx512_vpermi2var_pd_512:
6141 case Intrinsic::x86_avx512_vpermi2var_ps_128:
6142 case Intrinsic::x86_avx512_vpermi2var_ps_256:
6143 case Intrinsic::x86_avx512_vpermi2var_ps_512:
6144 case Intrinsic::x86_avx512_vpermi2var_q_128:
6145 case Intrinsic::x86_avx512_vpermi2var_q_256:
6146 case Intrinsic::x86_avx512_vpermi2var_q_512:
6147 case Intrinsic::x86_avx512_vpermi2var_qi_128:
6148 case Intrinsic::x86_avx512_vpermi2var_qi_256:
6149 case Intrinsic::x86_avx512_vpermi2var_qi_512:
6150 handleAVXVpermi2var(I);
6151 break;
6152
6153 // Packed Shuffle
6154 // llvm.x86.sse.pshuf.w(<1 x i64>, i8)
6155 // llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
6156 // llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
6157 // llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
6158 // llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
6159 //
6160 // The following intrinsics are auto-upgraded:
6161 // llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
6162 // llvm.x86.sse2.pshufh.w(<8 x i16>, i8)
6163 // llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
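// Sketch of the shadow computation (illustrative, not literal output):
//   %r        = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a,
//                                                          <16 x i8> %mask)
//   %r_shadow = call <16 x i8> @llvm.x86.ssse3.pshuf.b.128(<16 x i8> %a_shadow,
//                                                          <16 x i8> %mask)
// i.e. the shadow is shuffled with the very same (verbatim) control operand,
// so each result lane inherits the shadow of the lane it was taken from.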
6164 case Intrinsic::x86_avx2_pshuf_b:
6165 case Intrinsic::x86_sse_pshuf_w:
6166 case Intrinsic::x86_ssse3_pshuf_b_128:
6167 case Intrinsic::x86_ssse3_pshuf_b:
6168 case Intrinsic::x86_avx512_pshuf_b_512:
6169 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6170 /*trailingVerbatimArgs=*/1);
6171 break;
6172
6173 // AVX512 PMOV: Packed MOV, with truncation
6174 // Precisely handled by applying the same intrinsic to the shadow
6175 case Intrinsic::x86_avx512_mask_pmov_dw_512:
6176 case Intrinsic::x86_avx512_mask_pmov_db_512:
6177 case Intrinsic::x86_avx512_mask_pmov_qb_512:
6178 case Intrinsic::x86_avx512_mask_pmov_qw_512: {
6179 // Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
6180 // f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
6181 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6182 /*trailingVerbatimArgs=*/1);
6183 break;
6184 }
6185
6186 // AVX512 PMOV{S,US}: Packed MOV, with signed/unsigned saturation
6187 // Approximately handled using the corresponding truncation intrinsic
6188 // TODO: improve handleAVX512VectorDownConvert to precisely model saturation
6189 case Intrinsic::x86_avx512_mask_pmovs_dw_512:
6190 case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
6191 handleIntrinsicByApplyingToShadow(I,
6192 Intrinsic::x86_avx512_mask_pmov_dw_512,
6193 /* trailingVerbatimArgs=*/1);
6194 break;
6195 }
6196
6197 case Intrinsic::x86_avx512_mask_pmovs_db_512:
6198 case Intrinsic::x86_avx512_mask_pmovus_db_512: {
6199 handleIntrinsicByApplyingToShadow(I,
6200 Intrinsic::x86_avx512_mask_pmov_db_512,
6201 /* trailingVerbatimArgs=*/1);
6202 break;
6203 }
6204
6205 case Intrinsic::x86_avx512_mask_pmovs_qb_512:
6206 case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
6207 handleIntrinsicByApplyingToShadow(I,
6208 Intrinsic::x86_avx512_mask_pmov_qb_512,
6209 /* trailingVerbatimArgs=*/1);
6210 break;
6211 }
6212
6213 case Intrinsic::x86_avx512_mask_pmovs_qw_512:
6214 case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
6215 handleIntrinsicByApplyingToShadow(I,
6216 Intrinsic::x86_avx512_mask_pmov_qw_512,
6217 /* trailingVerbatimArgs=*/1);
6218 break;
6219 }
6220
6221 case Intrinsic::x86_avx512_mask_pmovs_qd_512:
6222 case Intrinsic::x86_avx512_mask_pmovus_qd_512:
6223 case Intrinsic::x86_avx512_mask_pmovs_wb_512:
6224 case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
6225 // Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
6226 // cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
6227 // slow-path handler.
6228 handleAVX512VectorDownConvert(I);
6229 break;
6230 }
6231
6232 // AVX512/AVX10 Reciprocal Square Root
6233 // <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
6234 // (<16 x float>, <16 x float>, i16)
6235 // <8 x float> @llvm.x86.avx512.rsqrt14.ps.256
6236 // (<8 x float>, <8 x float>, i8)
6237 // <4 x float> @llvm.x86.avx512.rsqrt14.ps.128
6238 // (<4 x float>, <4 x float>, i8)
6239 //
6240 // <8 x double> @llvm.x86.avx512.rsqrt14.pd.512
6241 // (<8 x double>, <8 x double>, i8)
6242 // <4 x double> @llvm.x86.avx512.rsqrt14.pd.256
6243 // (<4 x double>, <4 x double>, i8)
6244 // <2 x double> @llvm.x86.avx512.rsqrt14.pd.128
6245 // (<2 x double>, <2 x double>, i8)
6246 //
6247 // <32 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.512
6248 // (<32 x bfloat>, <32 x bfloat>, i32)
6249 // <16 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.256
6250 // (<16 x bfloat>, <16 x bfloat>, i16)
6251 // <8 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.128
6252 // (<8 x bfloat>, <8 x bfloat>, i8)
6253 //
6254 // <32 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.512
6255 // (<32 x half>, <32 x half>, i32)
6256 // <16 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.256
6257 // (<16 x half>, <16 x half>, i16)
6258 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.128
6259 // (<8 x half>, <8 x half>, i8)
6260 //
6261 // TODO: 3-operand variants are not handled:
6262 // <2 x double> @llvm.x86.avx512.rsqrt14.sd
6263 // (<2 x double>, <2 x double>, <2 x double>, i8)
6264 // <4 x float> @llvm.x86.avx512.rsqrt14.ss
6265 // (<4 x float>, <4 x float>, <4 x float>, i8)
6266 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.sh
6267 // (<8 x half>, <8 x half>, <8 x half>, i8)
6268 case Intrinsic::x86_avx512_rsqrt14_ps_512:
6269 case Intrinsic::x86_avx512_rsqrt14_ps_256:
6270 case Intrinsic::x86_avx512_rsqrt14_ps_128:
6271 case Intrinsic::x86_avx512_rsqrt14_pd_512:
6272 case Intrinsic::x86_avx512_rsqrt14_pd_256:
6273 case Intrinsic::x86_avx512_rsqrt14_pd_128:
6274 case Intrinsic::x86_avx10_mask_rsqrt_bf16_512:
6275 case Intrinsic::x86_avx10_mask_rsqrt_bf16_256:
6276 case Intrinsic::x86_avx10_mask_rsqrt_bf16_128:
6277 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_512:
6278 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_256:
6279 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_128:
6280 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/1,
6281 /*MaskIndex=*/2);
6282 break;
6283
6284 // AVX512/AVX10 Reciprocal
6285 // <16 x float> @llvm.x86.avx512.rcp14.ps.512
6286 // (<16 x float>, <16 x float>, i16)
6287 // <8 x float> @llvm.x86.avx512.rcp14.ps.256
6288 // (<8 x float>, <8 x float>, i8)
6289 // <4 x float> @llvm.x86.avx512.rcp14.ps.128
6290 // (<4 x float>, <4 x float>, i8)
6291 //
6292 // <8 x double> @llvm.x86.avx512.rcp14.pd.512
6293 // (<8 x double>, <8 x double>, i8)
6294 // <4 x double> @llvm.x86.avx512.rcp14.pd.256
6295 // (<4 x double>, <4 x double>, i8)
6296 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
6297 // (<2 x double>, <2 x double>, i8)
6298 //
6299 // <32 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.512
6300 // (<32 x bfloat>, <32 x bfloat>, i32)
6301 // <16 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.256
6302 // (<16 x bfloat>, <16 x bfloat>, i16)
6303 // <8 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.128
6304 // (<8 x bfloat>, <8 x bfloat>, i8)
6305 //
6306 // <32 x half> @llvm.x86.avx512fp16.mask.rcp.ph.512
6307 // (<32 x half>, <32 x half>, i32)
6308 // <16 x half> @llvm.x86.avx512fp16.mask.rcp.ph.256
6309 // (<16 x half>, <16 x half>, i16)
6310 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.ph.128
6311 // (<8 x half>, <8 x half>, i8)
6312 //
6313 // TODO: 3-operand variants are not handled:
6314 // <2 x double> @llvm.x86.avx512.rcp14.sd
6315 // (<2 x double>, <2 x double>, <2 x double>, i8)
6316 // <4 x float> @llvm.x86.avx512.rcp14.ss
6317 // (<4 x float>, <4 x float>, <4 x float>, i8)
6318 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.sh
6319 // (<8 x half>, <8 x half>, <8 x half>, i8)
6320 case Intrinsic::x86_avx512_rcp14_ps_512:
6321 case Intrinsic::x86_avx512_rcp14_ps_256:
6322 case Intrinsic::x86_avx512_rcp14_ps_128:
6323 case Intrinsic::x86_avx512_rcp14_pd_512:
6324 case Intrinsic::x86_avx512_rcp14_pd_256:
6325 case Intrinsic::x86_avx512_rcp14_pd_128:
6326 case Intrinsic::x86_avx10_mask_rcp_bf16_512:
6327 case Intrinsic::x86_avx10_mask_rcp_bf16_256:
6328 case Intrinsic::x86_avx10_mask_rcp_bf16_128:
6329 case Intrinsic::x86_avx512fp16_mask_rcp_ph_512:
6330 case Intrinsic::x86_avx512fp16_mask_rcp_ph_256:
6331 case Intrinsic::x86_avx512fp16_mask_rcp_ph_128:
6332 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/1,
6333 /*MaskIndex=*/2);
6334 break;
6335
6336 // <32 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.512
6337 // (<32 x half>, i32, <32 x half>, i32, i32)
6338 // <16 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.256
6339 // (<16 x half>, i32, <16 x half>, i32, i16)
6340 // <8 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.128
6341 // (<8 x half>, i32, <8 x half>, i32, i8)
6342 //
6343 // <16 x float> @llvm.x86.avx512.mask.rndscale.ps.512
6344 // (<16 x float>, i32, <16 x float>, i16, i32)
6345 // <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256
6346 // (<8 x float>, i32, <8 x float>, i8)
6347 // <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128
6348 // (<4 x float>, i32, <4 x float>, i8)
6349 //
6350 // <8 x double> @llvm.x86.avx512.mask.rndscale.pd.512
6351 // (<8 x double>, i32, <8 x double>, i8, i32)
6352 // A Imm WriteThru Mask Rounding
6353 // <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256
6354 // (<4 x double>, i32, <4 x double>, i8)
6355 // <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128
6356 // (<2 x double>, i32, <2 x double>, i8)
6357 // A Imm WriteThru Mask
6358 //
6359 // <32 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.512
6360 // (<32 x bfloat>, i32, <32 x bfloat>, i32)
6361 // <16 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.256
6362 // (<16 x bfloat>, i32, <16 x bfloat>, i16)
6363 // <8 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.128
6364 // (<8 x bfloat>, i32, <8 x bfloat>, i8)
6365 //
6366 // Not supported: three vectors
6367 // - <8 x half> @llvm.x86.avx512fp16.mask.rndscale.sh
6368 // (<8 x half>, <8 x half>,<8 x half>, i8, i32, i32)
6369 // - <4 x float> @llvm.x86.avx512.mask.rndscale.ss
6370 // (<4 x float>, <4 x float>, <4 x float>, i8, i32, i32)
6371 // - <2 x double> @llvm.x86.avx512.mask.rndscale.sd
6372 // (<2 x double>, <2 x double>, <2 x double>, i8, i32,
6373 // i32)
6374 // A B WriteThru Mask Imm
6375 // Rounding
6376 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_512:
6377 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_256:
6378 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_128:
6379 case Intrinsic::x86_avx512_mask_rndscale_ps_512:
6380 case Intrinsic::x86_avx512_mask_rndscale_ps_256:
6381 case Intrinsic::x86_avx512_mask_rndscale_ps_128:
6382 case Intrinsic::x86_avx512_mask_rndscale_pd_512:
6383 case Intrinsic::x86_avx512_mask_rndscale_pd_256:
6384 case Intrinsic::x86_avx512_mask_rndscale_pd_128:
6385 case Intrinsic::x86_avx10_mask_rndscale_bf16_512:
6386 case Intrinsic::x86_avx10_mask_rndscale_bf16_256:
6387 case Intrinsic::x86_avx10_mask_rndscale_bf16_128:
6388 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/2,
6389 /*MaskIndex=*/3);
6390 break;
6391
6392 // AVX512 FP16 Arithmetic
6393 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
6394 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
6395 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
6396 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
6397 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
6398 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
6399 visitGenericScalarHalfwordInst(I);
6400 break;
6401 }
6402
6403 // AVX Galois Field New Instructions
6404 case Intrinsic::x86_vgf2p8affineqb_128:
6405 case Intrinsic::x86_vgf2p8affineqb_256:
6406 case Intrinsic::x86_vgf2p8affineqb_512:
6407 handleAVXGF2P8Affine(I);
6408 break;
6409
6410 default:
6411 return false;
6412 }
6413
6414 return true;
6415 }
6416
6417 bool maybeHandleArmSIMDIntrinsic(IntrinsicInst &I) {
6418 switch (I.getIntrinsicID()) {
6419 case Intrinsic::aarch64_neon_rshrn:
6420 case Intrinsic::aarch64_neon_sqrshl:
6421 case Intrinsic::aarch64_neon_sqrshrn:
6422 case Intrinsic::aarch64_neon_sqrshrun:
6423 case Intrinsic::aarch64_neon_sqshl:
6424 case Intrinsic::aarch64_neon_sqshlu:
6425 case Intrinsic::aarch64_neon_sqshrn:
6426 case Intrinsic::aarch64_neon_sqshrun:
6427 case Intrinsic::aarch64_neon_srshl:
6428 case Intrinsic::aarch64_neon_sshl:
6429 case Intrinsic::aarch64_neon_uqrshl:
6430 case Intrinsic::aarch64_neon_uqrshrn:
6431 case Intrinsic::aarch64_neon_uqshl:
6432 case Intrinsic::aarch64_neon_uqshrn:
6433 case Intrinsic::aarch64_neon_urshl:
6434 case Intrinsic::aarch64_neon_ushl:
6435 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
6436 handleVectorShiftIntrinsic(I, /* Variable */ false);
6437 break;
6438
6439 // TODO: handling max/min similarly to AND/OR may be more precise
6440 // Floating-Point Maximum/Minimum Pairwise
6441 case Intrinsic::aarch64_neon_fmaxp:
6442 case Intrinsic::aarch64_neon_fminp:
6443 // Floating-Point Maximum/Minimum Number Pairwise
6444 case Intrinsic::aarch64_neon_fmaxnmp:
6445 case Intrinsic::aarch64_neon_fminnmp:
6446 // Signed/Unsigned Maximum/Minimum Pairwise
6447 case Intrinsic::aarch64_neon_smaxp:
6448 case Intrinsic::aarch64_neon_sminp:
6449 case Intrinsic::aarch64_neon_umaxp:
6450 case Intrinsic::aarch64_neon_uminp:
6451 // Add Pairwise
6452 case Intrinsic::aarch64_neon_addp:
6453 // Floating-point Add Pairwise
6454 case Intrinsic::aarch64_neon_faddp:
6455 // Add Long Pairwise
6456 case Intrinsic::aarch64_neon_saddlp:
6457 case Intrinsic::aarch64_neon_uaddlp: {
6458 handlePairwiseShadowOrIntrinsic(I);
6459 break;
6460 }
6461
6462 // Floating-point Convert to integer, rounding to nearest with ties to Away
6463 case Intrinsic::aarch64_neon_fcvtas:
6464 case Intrinsic::aarch64_neon_fcvtau:
6465 // Floating-point convert to integer, rounding toward minus infinity
6466 case Intrinsic::aarch64_neon_fcvtms:
6467 case Intrinsic::aarch64_neon_fcvtmu:
6468 // Floating-point convert to integer, rounding to nearest with ties to even
6469 case Intrinsic::aarch64_neon_fcvtns:
6470 case Intrinsic::aarch64_neon_fcvtnu:
6471 // Floating-point convert to integer, rounding toward plus infinity
6472 case Intrinsic::aarch64_neon_fcvtps:
6473 case Intrinsic::aarch64_neon_fcvtpu:
6474 // Floating-point Convert to integer, rounding toward Zero
6475 case Intrinsic::aarch64_neon_fcvtzs:
6476 case Intrinsic::aarch64_neon_fcvtzu:
6477 // Floating-point convert to lower precision narrow, rounding to odd
6478 case Intrinsic::aarch64_neon_fcvtxn: {
6479 handleNEONVectorConvertIntrinsic(I);
6480 break;
6481 }
6482
6483 // Add reduction to scalar
6484 case Intrinsic::aarch64_neon_faddv:
6485 case Intrinsic::aarch64_neon_saddv:
6486 case Intrinsic::aarch64_neon_uaddv:
6487 // Signed/Unsigned min/max (Vector)
6488 // TODO: handling similarly to AND/OR may be more precise.
6489 case Intrinsic::aarch64_neon_smaxv:
6490 case Intrinsic::aarch64_neon_sminv:
6491 case Intrinsic::aarch64_neon_umaxv:
6492 case Intrinsic::aarch64_neon_uminv:
6493 // Floating-point min/max (vector)
6494 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
6495 // but our shadow propagation is the same.
6496 case Intrinsic::aarch64_neon_fmaxv:
6497 case Intrinsic::aarch64_neon_fminv:
6498 case Intrinsic::aarch64_neon_fmaxnmv:
6499 case Intrinsic::aarch64_neon_fminnmv:
6500 // Sum long across vector
6501 case Intrinsic::aarch64_neon_saddlv:
6502 case Intrinsic::aarch64_neon_uaddlv:
6503 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
6504 break;
6505
6506 case Intrinsic::aarch64_neon_ld1x2:
6507 case Intrinsic::aarch64_neon_ld1x3:
6508 case Intrinsic::aarch64_neon_ld1x4:
6509 case Intrinsic::aarch64_neon_ld2:
6510 case Intrinsic::aarch64_neon_ld3:
6511 case Intrinsic::aarch64_neon_ld4:
6512 case Intrinsic::aarch64_neon_ld2r:
6513 case Intrinsic::aarch64_neon_ld3r:
6514 case Intrinsic::aarch64_neon_ld4r: {
6515 handleNEONVectorLoad(I, /*WithLane=*/false);
6516 break;
6517 }
6518
6519 case Intrinsic::aarch64_neon_ld2lane:
6520 case Intrinsic::aarch64_neon_ld3lane:
6521 case Intrinsic::aarch64_neon_ld4lane: {
6522 handleNEONVectorLoad(I, /*WithLane=*/true);
6523 break;
6524 }
6525
6526 // Saturating extract narrow
6527 case Intrinsic::aarch64_neon_sqxtn:
6528 case Intrinsic::aarch64_neon_sqxtun:
6529 case Intrinsic::aarch64_neon_uqxtn:
6530 // These only have one argument, but we (ab)use handleShadowOr because it
6531 // does work on single argument intrinsics and will typecast the shadow
6532 // (and update the origin).
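// E.g. for %r = call <4 x i16> @llvm.aarch64.neon.sqxtn.v4i16(<4 x i32> %a),
// the shadow of %a is reused, typecast to the <4 x i16> result shadow type
// (a rough sketch; the exact cast is whatever handleShadowOr emits).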
6533 handleShadowOr(I);
6534 break;
6535
6536 case Intrinsic::aarch64_neon_st1x2:
6537 case Intrinsic::aarch64_neon_st1x3:
6538 case Intrinsic::aarch64_neon_st1x4:
6539 case Intrinsic::aarch64_neon_st2:
6540 case Intrinsic::aarch64_neon_st3:
6541 case Intrinsic::aarch64_neon_st4: {
6542 handleNEONVectorStoreIntrinsic(I, false);
6543 break;
6544 }
6545
6546 case Intrinsic::aarch64_neon_st2lane:
6547 case Intrinsic::aarch64_neon_st3lane:
6548 case Intrinsic::aarch64_neon_st4lane: {
6549 handleNEONVectorStoreIntrinsic(I, true);
6550 break;
6551 }
6552
6553 // Arm NEON vector table intrinsics have the source/table register(s) as
6554 // arguments, followed by the index register. They return the output.
6555 //
6556 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
6557 // original value unchanged in the destination register.'
6558 // Conveniently, zero denotes a clean shadow, which means out-of-range
6559 // indices for TBL will initialize the user data with zero and also clean
6560 // the shadow. (For TBX, neither the user data nor the shadow will be
6561 // updated, which is also correct.)
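// Sketch (assumed vector shapes, not literal output):
//   %r        = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %table,
//                                                          <8 x i8> %idx)
//   %r_shadow = call <8 x i8> @llvm.aarch64.neon.tbl1.v8i8(<16 x i8> %table_shadow,
//                                                          <8 x i8> %idx)
// Out-of-range lanes of %r are zero, and the matching lanes of %r_shadow are
// zero as well (clean shadow), so the approximation stays sound.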
6562 case Intrinsic::aarch64_neon_tbl1:
6563 case Intrinsic::aarch64_neon_tbl2:
6564 case Intrinsic::aarch64_neon_tbl3:
6565 case Intrinsic::aarch64_neon_tbl4:
6566 case Intrinsic::aarch64_neon_tbx1:
6567 case Intrinsic::aarch64_neon_tbx2:
6568 case Intrinsic::aarch64_neon_tbx3:
6569 case Intrinsic::aarch64_neon_tbx4: {
6570 // The last trailing argument (index register) should be handled verbatim
6571 handleIntrinsicByApplyingToShadow(
6572 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
6573 /*trailingVerbatimArgs*/ 1);
6574 break;
6575 }
6576
6577 case Intrinsic::aarch64_neon_fmulx:
6578 case Intrinsic::aarch64_neon_pmul:
6579 case Intrinsic::aarch64_neon_pmull:
6580 case Intrinsic::aarch64_neon_smull:
6581 case Intrinsic::aarch64_neon_pmull64:
6582 case Intrinsic::aarch64_neon_umull: {
6583 handleNEONVectorMultiplyIntrinsic(I);
6584 break;
6585 }
6586
6587 default:
6588 return false;
6589 }
6590
6591 return true;
6592 }
6593
6594 void visitIntrinsicInst(IntrinsicInst &I) {
6595 if (maybeHandleCrossPlatformIntrinsic(I))
6596 return;
6597
6598 if (maybeHandleX86SIMDIntrinsic(I))
6599 return;
6600
6601 if (maybeHandleArmSIMDIntrinsic(I))
6602 return;
6603
6604 if (maybeHandleUnknownIntrinsic(I))
6605 return;
6606
6607 visitInstruction(I);
6608 }
6609
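// Roughly, a call such as
//   __atomic_load(sizeof x, &x, &tmp, __ATOMIC_RELAXED);
// is rewritten so that the ordering operand is at least __ATOMIC_ACQUIRE, and
// after the call the shadow of the size bytes at &x is copied to the shadow
// of &tmp (plus origin propagation when origin tracking is enabled).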
6610 void visitLibAtomicLoad(CallBase &CB) {
6611 // Since we use getNextNode here, we can't have CB terminate the BB.
6612 assert(isa<CallInst>(CB));
6613
6614 IRBuilder<> IRB(&CB);
6615 Value *Size = CB.getArgOperand(0);
6616 Value *SrcPtr = CB.getArgOperand(1);
6617 Value *DstPtr = CB.getArgOperand(2);
6618 Value *Ordering = CB.getArgOperand(3);
6619 // Convert the call to have at least Acquire ordering to make sure
6620 // the shadow operations aren't reordered before it.
6621 Value *NewOrdering =
6622 IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
6623 CB.setArgOperand(3, NewOrdering);
6624
6625 NextNodeIRBuilder NextIRB(&CB);
6626 Value *SrcShadowPtr, *SrcOriginPtr;
6627 std::tie(SrcShadowPtr, SrcOriginPtr) =
6628 getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6629 /*isStore*/ false);
6630 Value *DstShadowPtr =
6631 getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6632 /*isStore*/ true)
6633 .first;
6634
6635 NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
6636 if (MS.TrackOrigins) {
6637 Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
6638 kMinOriginAlignment);
6639 Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
6640 NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
6641 }
6642 }
6643
6644 void visitLibAtomicStore(CallBase &CB) {
6645 IRBuilder<> IRB(&CB);
6646 Value *Size = CB.getArgOperand(0);
6647 Value *DstPtr = CB.getArgOperand(2);
6648 Value *Ordering = CB.getArgOperand(3);
6649 // Convert the call to have at least Release ordering to make sure
6650 // the shadow operations aren't reordered after it.
6651 Value *NewOrdering =
6652 IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
6653 CB.setArgOperand(3, NewOrdering);
6654
6655 Value *DstShadowPtr =
6656 getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
6657 /*isStore*/ true)
6658 .first;
6659
6660 // Atomic store always paints clean shadow/origin. See file header.
6661 IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
6662 Align(1));
6663 }
6664
6665 void visitCallBase(CallBase &CB) {
6666 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
6667 if (CB.isInlineAsm()) {
6668 // For inline asm (either a call to asm function, or callbr instruction),
6669 // do the usual thing: check argument shadow and mark all outputs as
6670 // clean. Note that any side effects of the inline asm that are not
6671 // immediately visible in its constraints are not handled.
6672 if (ClHandleAsmConservative)
6673 visitAsmInstruction(CB);
6674 else
6675 visitInstruction(CB);
6676 return;
6677 }
6678 LibFunc LF;
6679 if (TLI->getLibFunc(CB, LF)) {
6680 // libatomic.a functions need to have special handling because there isn't
6681 // a good way to intercept them or compile the library with
6682 // instrumentation.
6683 switch (LF) {
6684 case LibFunc_atomic_load:
6685 if (!isa<CallInst>(CB)) {
6686 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load. "
6687 "Ignoring!\n";
6688 break;
6689 }
6690 visitLibAtomicLoad(CB);
6691 return;
6692 case LibFunc_atomic_store:
6693 visitLibAtomicStore(CB);
6694 return;
6695 default:
6696 break;
6697 }
6698 }
6699
6700 if (auto *Call = dyn_cast<CallInst>(&CB)) {
6701 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
6702
6703 // We are going to insert code that relies on the fact that the callee
6704 // will become a non-readonly function after it is instrumented by us. To
6705 // prevent this code from being optimized out, mark that function
6706 // non-readonly in advance.
6707 // TODO: We can likely do better than dropping memory() completely here.
6708 AttributeMask B;
6709 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
6710 
6711 Call->removeFnAttrs(B);
6712 if (Function *Func = Call->getCalledFunction()) {
6713 Func->removeFnAttrs(B);
6714 }
6715 
6716 maybeMarkSanitizerLibraryCallNoBuiltin(Call, TLI);
6717 }
6718 IRBuilder<> IRB(&CB);
6719 bool MayCheckCall = MS.EagerChecks;
6720 if (Function *Func = CB.getCalledFunction()) {
6721 // __sanitizer_unaligned_{load,store} functions may be called by users
6722 // and always expect shadows in the TLS. So don't check them.
6723 MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
6724 }
6725
6726 unsigned ArgOffset = 0;
6727 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
6728 for (const auto &[i, A] : llvm::enumerate(CB.args())) {
6729 if (!A->getType()->isSized()) {
6730 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
6731 continue;
6732 }
6733
6734 if (A->getType()->isScalableTy()) {
6735 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
6736 // Handle as noundef, but don't reserve tls slots.
6737 insertCheckShadowOf(A, &CB);
6738 continue;
6739 }
6740
6741 unsigned Size = 0;
6742 const DataLayout &DL = F.getDataLayout();
6743
6744 bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
6745 bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
6746 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
6747
6748 if (EagerCheck) {
6749 insertCheckShadowOf(A, &CB);
6750 Size = DL.getTypeAllocSize(A->getType());
6751 } else {
6752 [[maybe_unused]] Value *Store = nullptr;
6753 // Compute the Shadow for arg even if it is ByVal, because
6754 // in that case getShadow() will copy the actual arg shadow to
6755 // __msan_param_tls.
6756 Value *ArgShadow = getShadow(A);
6757 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
6758 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
6759 << " Shadow: " << *ArgShadow << "\n");
6760 if (ByVal) {
6761 // ByVal requires some special handling as it's too big for a single
6762 // load
6763 assert(A->getType()->isPointerTy() &&
6764 "ByVal argument is not a pointer!");
6765 Size = DL.getTypeAllocSize(CB.getParamByValType(i));
6766 if (ArgOffset + Size > kParamTLSSize)
6767 break;
6768 const MaybeAlign ParamAlignment(CB.getParamAlign(i));
6769 MaybeAlign Alignment = std::nullopt;
6770 if (ParamAlignment)
6771 Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
6772 Value *AShadowPtr, *AOriginPtr;
6773 std::tie(AShadowPtr, AOriginPtr) =
6774 getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
6775 /*isStore*/ false);
6776 if (!PropagateShadow) {
6777 Store = IRB.CreateMemSet(ArgShadowBase,
6778 Constant::getNullValue(IRB.getInt8Ty()),
6779 Size, Alignment);
6780 } else {
6781 Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
6782 Alignment, Size);
6783 if (MS.TrackOrigins) {
6784 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
6785 // FIXME: OriginSize should be:
6786 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
6787 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
6788 IRB.CreateMemCpy(
6789 ArgOriginBase,
6790 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
6791 AOriginPtr,
6792 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
6793 }
6794 }
6795 } else {
6796 // Any other parameters mean we need bit-grained tracking of uninit
6797 // data
6798 Size = DL.getTypeAllocSize(A->getType());
6799 if (ArgOffset + Size > kParamTLSSize)
6800 break;
6801 Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
6802 kShadowTLSAlignment);
6803 Constant *Cst = dyn_cast<Constant>(ArgShadow);
6804 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
6805 IRB.CreateStore(getOrigin(A),
6806 getOriginPtrForArgument(IRB, ArgOffset));
6807 }
6808 }
6809 assert(Store != nullptr);
6810 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
6811 }
6812 assert(Size != 0);
6813 ArgOffset += alignTo(Size, kShadowTLSAlignment);
6814 }
6815 LLVM_DEBUG(dbgs() << " done with call args\n");
6816
6817 FunctionType *FT = CB.getFunctionType();
6818 if (FT->isVarArg()) {
6819 VAHelper->visitCallBase(CB, IRB);
6820 }
6821
6822 // Now, get the shadow for the RetVal.
6823 if (!CB.getType()->isSized())
6824 return;
6825 // Don't emit the epilogue for musttail call returns.
6826 if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
6827 return;
6828
6829 if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
6830 setShadow(&CB, getCleanShadow(&CB));
6831 setOrigin(&CB, getCleanOrigin());
6832 return;
6833 }
6834
6835 IRBuilder<> IRBBefore(&CB);
6836 // Until we have full dynamic coverage, make sure the retval shadow is 0.
6837 Value *Base = getShadowPtrForRetval(IRBBefore);
6838 IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
6839 kShadowTLSAlignment);
6840 BasicBlock::iterator NextInsn;
6841 if (isa<CallInst>(CB)) {
6842 NextInsn = ++CB.getIterator();
6843 assert(NextInsn != CB.getParent()->end());
6844 } else {
6845 BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
6846 if (!NormalDest->getSinglePredecessor()) {
6847 // FIXME: this case is tricky, so we are just conservative here.
6848 // Perhaps we need to split the edge between this BB and NormalDest,
6849 // but a naive attempt to use SplitEdge leads to a crash.
6850 setShadow(&CB, getCleanShadow(&CB));
6851 setOrigin(&CB, getCleanOrigin());
6852 return;
6853 }
6854 // FIXME: NextInsn is likely in a basic block that has not been visited
6855 // yet. Anything inserted there will be instrumented by MSan later!
6856 NextInsn = NormalDest->getFirstInsertionPt();
6857 assert(NextInsn != NormalDest->end() &&
6858 "Could not find insertion point for retval shadow load");
6859 }
6860 IRBuilder<> IRBAfter(&*NextInsn);
6861 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
6862 getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
6863 "_msret");
6864 setShadow(&CB, RetvalShadow);
6865 if (MS.TrackOrigins)
6866 setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
6867 }
6868
6869 bool isAMustTailRetVal(Value *RetVal) {
6870 if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
6871 RetVal = I->getOperand(0);
6872 }
6873 if (auto *I = dyn_cast<CallInst>(RetVal)) {
6874 return I->isMustTailCall();
6875 }
6876 return false;
6877 }
6878
6879 void visitReturnInst(ReturnInst &I) {
6880 IRBuilder<> IRB(&I);
6881 Value *RetVal = I.getReturnValue();
6882 if (!RetVal)
6883 return;
6884 // Don't emit the epilogue for musttail call returns.
6885 if (isAMustTailRetVal(RetVal))
6886 return;
6887 Value *ShadowPtr = getShadowPtrForRetval(IRB);
6888 bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
6889 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
6890 // FIXME: Consider using SpecialCaseList to specify a list of functions that
6891 // must always return fully initialized values. For now, we hardcode "main".
6892 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
6893
6894 Value *Shadow = getShadow(RetVal);
6895 bool StoreOrigin = true;
6896 if (EagerCheck) {
6897 insertCheckShadowOf(RetVal, &I);
6898 Shadow = getCleanShadow(RetVal);
6899 StoreOrigin = false;
6900 }
6901
6902 // The caller may still expect information passed over TLS if we pass our
6903 // check
6904 if (StoreShadow) {
6905 IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
6906 if (MS.TrackOrigins && StoreOrigin)
6907 IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
6908 }
6909 }
6910
6911 void visitPHINode(PHINode &I) {
6912 IRBuilder<> IRB(&I);
6913 if (!PropagateShadow) {
6914 setShadow(&I, getCleanShadow(&I));
6915 setOrigin(&I, getCleanOrigin());
6916 return;
6917 }
6918
6919 ShadowPHINodes.push_back(&I);
6920 setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
6921 "_msphi_s"));
6922 if (MS.TrackOrigins)
6923 setOrigin(
6924 &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
6925 }
6926
6927 Value *getLocalVarIdptr(AllocaInst &I) {
6928 ConstantInt *IntConst =
6929 ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
6930 return new GlobalVariable(*F.getParent(), IntConst->getType(),
6931 /*isConstant=*/false, GlobalValue::PrivateLinkage,
6932 IntConst);
6933 }
6934
6935 Value *getLocalVarDescription(AllocaInst &I) {
6936 return createPrivateConstGlobalForString(*F.getParent(), I.getName());
6937 }
6938
6939 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6940 if (PoisonStack && ClPoisonStackWithCall) {
6941 IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
6942 } else {
6943 Value *ShadowBase, *OriginBase;
6944 std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
6945 &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
6946
6947 Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
6948 IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
6949 }
6950
6951 if (PoisonStack && MS.TrackOrigins) {
6952 Value *Idptr = getLocalVarIdptr(I);
6953 if (ClPrintStackNames) {
6954 Value *Descr = getLocalVarDescription(I);
6955 IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
6956 {&I, Len, Idptr, Descr});
6957 } else {
6958 IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
6959 }
6960 }
6961 }
6962
6963 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
6964 Value *Descr = getLocalVarDescription(I);
6965 if (PoisonStack) {
6966 IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
6967 } else {
6968 IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
6969 }
6970 }
6971
6972 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
6973 if (!InsPoint)
6974 InsPoint = &I;
6975 NextNodeIRBuilder IRB(InsPoint);
6976 const DataLayout &DL = F.getDataLayout();
6977 TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
6978 Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
6979 if (I.isArrayAllocation())
6980 Len = IRB.CreateMul(Len,
6981 IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
6982
6983 if (MS.CompileKernel)
6984 poisonAllocaKmsan(I, IRB, Len);
6985 else
6986 poisonAllocaUserspace(I, IRB, Len);
6987 }
6988
6989 void visitAllocaInst(AllocaInst &I) {
6990 setShadow(&I, getCleanShadow(&I));
6991 setOrigin(&I, getCleanOrigin());
6992 // We'll get to this alloca later unless it's poisoned at the corresponding
6993 // llvm.lifetime.start.
6994 AllocaSet.insert(&I);
6995 }
6996
6997 void visitSelectInst(SelectInst &I) {
6998 // a = select b, c, d
6999 Value *B = I.getCondition();
7000 Value *C = I.getTrueValue();
7001 Value *D = I.getFalseValue();
7002
7003 handleSelectLikeInst(I, B, C, D);
7004 }
7005
7006 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
7007 IRBuilder<> IRB(&I);
7008
7009 Value *Sb = getShadow(B);
7010 Value *Sc = getShadow(C);
7011 Value *Sd = getShadow(D);
7012
7013 Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
7014 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
7015 Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
7016
7017 // Result shadow if condition shadow is 0.
7018 Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
7019 Value *Sa1;
7020 if (I.getType()->isAggregateType()) {
7021 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
7022 // an extra "select". This results in much more compact IR.
7023 // Sa = select Sb, poisoned, (select b, Sc, Sd)
7024 Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
7025 } else if (isScalableNonVectorType(I.getType())) {
7026 // This is intended to handle target("aarch64.svcount"), which can't be
7027 // handled in the else branch because of incompatibility with CreateXor
7028 // ("The supported LLVM operations on this type are limited to load,
7029 // store, phi, select and alloca instructions").
7030
7031 // TODO: this currently underapproximates. Use Arm SVE EOR in the else
7032 // branch as needed instead.
7033 Sa1 = getCleanShadow(getShadowTy(I.getType()));
7034 } else {
7035 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
7036 // If Sb (condition is poisoned), look for bits in c and d that are equal
7037 // and both unpoisoned.
7038 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
7039
7040 // Cast arguments to shadow-compatible type.
7041 C = CreateAppToShadowCast(IRB, C);
7042 D = CreateAppToShadowCast(IRB, D);
7043
7044 // Result shadow if condition shadow is 1.
7045 Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
7046 }
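// Worked example (i8, hypothetical values): with Sb = 1 (the condition bit
// is poisoned), c = 0b1010 and d = 0b1110 both fully initialized
// (Sc = Sd = 0):
//   c ^ d = 0b0100, so Sa1 = (c ^ d) | Sc | Sd = 0b0100.
// Only bit 2, where c and d disagree, is reported as poisoned; the bits on
// which both arms agree stay clean even though the condition itself is
// uninitialized.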
7047 Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
7048 setShadow(&I, Sa);
7049 if (MS.TrackOrigins) {
7050 // Origins are always i32, so any vector conditions must be flattened.
7051 // FIXME: consider tracking vector origins for app vectors?
7052 if (B->getType()->isVectorTy()) {
7053 B = convertToBool(B, IRB);
7054 Sb = convertToBool(Sb, IRB);
7055 }
7056 // a = select b, c, d
7057 // Oa = Sb ? Ob : (b ? Oc : Od)
7058 setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
7059 }
7060 }
7061
7062 void visitLandingPadInst(LandingPadInst &I) {
7063 // Do nothing.
7064 // See https://github.com/google/sanitizers/issues/504
7065 setShadow(&I, getCleanShadow(&I));
7066 setOrigin(&I, getCleanOrigin());
7067 }
7068
7069 void visitCatchSwitchInst(CatchSwitchInst &I) {
7070 setShadow(&I, getCleanShadow(&I));
7071 setOrigin(&I, getCleanOrigin());
7072 }
7073
7074 void visitFuncletPadInst(FuncletPadInst &I) {
7075 setShadow(&I, getCleanShadow(&I));
7076 setOrigin(&I, getCleanOrigin());
7077 }
7078
7079 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
7080
7081 void visitExtractValueInst(ExtractValueInst &I) {
7082 IRBuilder<> IRB(&I);
7083 Value *Agg = I.getAggregateOperand();
7084 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
7085 Value *AggShadow = getShadow(Agg);
7086 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
7087 Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
7088 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
7089 setShadow(&I, ResShadow);
7090 setOriginForNaryOp(I);
7091 }
7092
7093 void visitInsertValueInst(InsertValueInst &I) {
7094 IRBuilder<> IRB(&I);
7095 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
7096 Value *AggShadow = getShadow(I.getAggregateOperand());
7097 Value *InsShadow = getShadow(I.getInsertedValueOperand());
7098 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
7099 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
7100 Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
7101 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
7102 setShadow(&I, Res);
7103 setOriginForNaryOp(I);
7104 }
7105
7106 void dumpInst(Instruction &I) {
7107 if (CallInst *CI = dyn_cast<CallInst>(&I)) {
7108 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
7109 } else {
7110 errs() << "ZZZ " << I.getOpcodeName() << "\n";
7111 }
7112 errs() << "QQQ " << I << "\n";
7113 }
7114
7115 void visitResumeInst(ResumeInst &I) {
7116 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
7117 // Nothing to do here.
7118 }
7119
7120 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
7121 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
7122 // Nothing to do here.
7123 }
7124
7125 void visitCatchReturnInst(CatchReturnInst &CRI) {
7126 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
7127 // Nothing to do here.
7128 }
7129
7130 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
7131 IRBuilder<> &IRB, const DataLayout &DL,
7132 bool isOutput) {
7133 // For each assembly argument, we check its value for being initialized.
7134 // If the argument is a pointer, we assume it points to a single element
7135 // of the corresponding type (or to an 8-byte word, if the type is unsized).
7136 // Each such pointer is instrumented with a call to the runtime library.
7137 Type *OpType = Operand->getType();
7138 // Check the operand value itself.
7139 insertCheckShadowOf(Operand, &I);
7140 if (!OpType->isPointerTy() || !isOutput) {
7141 assert(!isOutput);
7142 return;
7143 }
7144 if (!ElemTy->isSized())
7145 return;
7146 auto Size = DL.getTypeStoreSize(ElemTy);
7147 Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
7148 if (MS.CompileKernel) {
7149 IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
7150 } else {
7151 // ElemTy, derived from elementtype(), does not encode the alignment of
7152 // the pointer. Conservatively assume that the shadow memory is unaligned.
7153 // When Size is large, avoid StoreInst as it would expand to many
7154 // instructions.
7155 auto [ShadowPtr, _] =
7156 getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
7157 if (Size <= 32)
7158 IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
7159 else
7160 IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
7161 SizeVal, Align(1));
7162 }
7163 }
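// For example, a pointer passed for an "=m"(v) output where v is an i64 gets
// its shadow cleared, approximately, with
//   store i64 0, ptr %v_shadow, align 1
// since the 8-byte store size is below the 32-byte memset threshold above.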
7164
7165 /// Get the number of output arguments returned by pointers.
7166 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
7167 int NumRetOutputs = 0;
7168 int NumOutputs = 0;
7169 Type *RetTy = cast<Value>(CB)->getType();
7170 if (!RetTy->isVoidTy()) {
7171 // Register outputs are returned via the CallInst return value.
7172 auto *ST = dyn_cast<StructType>(RetTy);
7173 if (ST)
7174 NumRetOutputs = ST->getNumElements();
7175 else
7176 NumRetOutputs = 1;
7177 }
7178 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
7179 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
7180 switch (Info.Type) {
7181 case InlineAsm::isOutput:
7182 NumOutputs++;
7183 break;
7184 default:
7185 break;
7186 }
7187 }
7188 return NumOutputs - NumRetOutputs;
7189 }
7190
7191 void visitAsmInstruction(Instruction &I) {
7192 // Conservative inline assembly handling: check for poisoned shadow of
7193 // asm() arguments, then unpoison the result and all the memory locations
7194 // pointed to by those arguments.
7195 // An inline asm() statement in C++ contains lists of input and output
7196 // arguments used by the assembly code. These are mapped to operands of the
7197 // CallInst as follows:
7198 // - nR register outputs ("=r") are returned by value in a single structure
7199 // (SSA value of the CallInst);
7200 // - nO other outputs ("=m" and others) are returned by pointer as first
7201 // nO operands of the CallInst;
7202 // - nI inputs ("r", "m" and others) are passed to CallInst as the
7203 // remaining nI operands.
7204 // The total number of asm() arguments in the source is nR+nO+nI, and the
7205 // corresponding CallInst has nO+nI+1 operands (the last operand is the
7206 // function to be called).
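// A hypothetical example:
//   int lo, hi; long mem;
//   asm("..." : "=a"(lo), "=d"(hi), "=m"(mem) : "r"(x), "m"(y));
// has nR = 2 (returned as a two-element struct), nO = 1 and nI = 2, so the
// corresponding CallInst carries nO + nI + 1 = 4 operands: the "=m" pointer,
// the two inputs, and the callee.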
7207 const DataLayout &DL = F.getDataLayout();
7208 CallBase *CB = cast<CallBase>(&I);
7209 IRBuilder<> IRB(&I);
7210 InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
7211 int OutputArgs = getNumOutputArgs(IA, CB);
7212 // The last operand of a CallInst is the function itself.
7213 int NumOperands = CB->getNumOperands() - 1;
7214
7215 // Check input arguments. Doing so before unpoisoning output arguments, so
7216 // that we won't overwrite uninit values before checking them.
7217 for (int i = OutputArgs; i < NumOperands; i++) {
7218 Value *Operand = CB->getOperand(i);
7219 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7220 /*isOutput*/ false);
7221 }
7222 // Unpoison output arguments. This must happen before the actual InlineAsm
7223 // call, so that the shadow for memory published in the asm() statement
7224 // remains valid.
7225 for (int i = 0; i < OutputArgs; i++) {
7226 Value *Operand = CB->getOperand(i);
7227 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7228 /*isOutput*/ true);
7229 }
7230
7231 setShadow(&I, getCleanShadow(&I));
7232 setOrigin(&I, getCleanOrigin());
7233 }
7234
7235 void visitFreezeInst(FreezeInst &I) {
7236 // Freeze always returns a fully defined value.
7237 setShadow(&I, getCleanShadow(&I));
7238 setOrigin(&I, getCleanOrigin());
7239 }
7240
7241 void visitInstruction(Instruction &I) {
7242 // Everything else: stop propagating and check for poisoned shadow.
7243 if (ClDumpStrictInstructions)
7244 dumpInst(I);
7245 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
7246 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
7247 Value *Operand = I.getOperand(i);
7248 if (Operand->getType()->isSized())
7249 insertCheckShadowOf(Operand, &I);
7250 }
7251 setShadow(&I, getCleanShadow(&I));
7252 setOrigin(&I, getCleanOrigin());
7253 }
7254};
7255
7256struct VarArgHelperBase : public VarArgHelper {
7257 Function &F;
7258 MemorySanitizer &MS;
7259 MemorySanitizerVisitor &MSV;
7260 SmallVector<CallInst *, 16> VAStartInstrumentationList;
7261 const unsigned VAListTagSize;
7262
7263 VarArgHelperBase(Function &F, MemorySanitizer &MS,
7264 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
7265 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
7266
7267 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7268 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
7269 return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
7270 }
7271
7272 /// Compute the shadow address for a given va_arg.
7273 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7274 return IRB.CreatePtrAdd(
7275 MS.VAArgTLS, ConstantInt::get(MS.IntptrTy, ArgOffset), "_msarg_va_s");
7276 }
7277
7278 /// Compute the shadow address for a given va_arg.
7279 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
7280 unsigned ArgSize) {
7281 // Make sure we don't overflow __msan_va_arg_tls.
7282 if (ArgOffset + ArgSize > kParamTLSSize)
7283 return nullptr;
7284 return getShadowPtrForVAArgument(IRB, ArgOffset);
7285 }
7286
7287 /// Compute the origin address for a given va_arg.
7288 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
7289 // getOriginPtrForVAArgument() is always called after
7290 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
7291 // overflow.
7292 return IRB.CreatePtrAdd(MS.VAArgOriginTLS,
7293 ConstantInt::get(MS.IntptrTy, ArgOffset),
7294 "_msarg_va_o");
7295 }
7296
7297 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
7298 unsigned BaseOffset) {
7299 // The tail of __msan_va_arg_tls is not large enough to fit the full
7300 // value shadow, but it will be copied to the backup anyway. Make it
7301 // clean.
7302 if (BaseOffset >= kParamTLSSize)
7303 return;
7304 Value *TailSize =
7305 ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
7306 IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
7307 TailSize, Align(8));
7308 }
7309
7310 void unpoisonVAListTagForInst(IntrinsicInst &I) {
7311 IRBuilder<> IRB(&I);
7312 Value *VAListTag = I.getArgOperand(0);
7313 const Align Alignment = Align(8);
7314 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
7315 VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
7316 // Unpoison the whole __va_list_tag.
7317 IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
7318 VAListTagSize, Alignment, false);
7319 }
7320
7321 void visitVAStartInst(VAStartInst &I) override {
7322 if (F.getCallingConv() == CallingConv::Win64)
7323 return;
7324 VAStartInstrumentationList.push_back(&I);
7325 unpoisonVAListTagForInst(I);
7326 }
7327
7328 void visitVACopyInst(VACopyInst &I) override {
7329 if (F.getCallingConv() == CallingConv::Win64)
7330 return;
7331 unpoisonVAListTagForInst(I);
7332 }
7333};
7334
7335/// AMD64-specific implementation of VarArgHelper.
7336struct VarArgAMD64Helper : public VarArgHelperBase {
7337 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
7338 // See a comment in visitCallBase for more details.
7339 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
7340 static const unsigned AMD64FpEndOffsetSSE = 176;
7341 // If SSE is disabled, fp_offset in va_list is zero.
7342 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
7343
7344 unsigned AMD64FpEndOffset;
7345 AllocaInst *VAArgTLSCopy = nullptr;
7346 AllocaInst *VAArgTLSOriginCopy = nullptr;
7347 Value *VAArgOverflowSize = nullptr;
7348
7349 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7350
7351 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
7352 MemorySanitizerVisitor &MSV)
7353 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
7354 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
7355 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
7356 if (Attr.isStringAttribute() &&
7357 (Attr.getKindAsString() == "target-features")) {
7358 if (Attr.getValueAsString().contains("-sse"))
7359 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
7360 break;
7361 }
7362 }
7363 }
7364
7365 ArgKind classifyArgument(Value *arg) {
7366 // A very rough approximation of X86_64 argument classification rules.
7367 Type *T = arg->getType();
7368 if (T->isX86_FP80Ty())
7369 return AK_Memory;
7370 if (T->isFPOrFPVectorTy())
7371 return AK_FloatingPoint;
7372 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
7373 return AK_GeneralPurpose;
7374 if (T->isPointerTy())
7375 return AK_GeneralPurpose;
7376 return AK_Memory;
7377 }
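// For example, with the classification above: a double or <4 x float>
// argument maps to AK_FloatingPoint, an i32 or pointer argument to
// AK_GeneralPurpose, and an x86_fp80 (long double) argument to AK_Memory.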
7378
7379 // For VarArg functions, store the argument shadow in an ABI-specific format
7380 // that corresponds to va_list layout.
7381 // We do this because Clang lowers va_arg in the frontend, and this pass
7382 // only sees the low level code that deals with va_list internals.
7383 // A much easier alternative (provided that Clang emits va_arg instructions)
7384 // would have been to associate each live instance of va_list with a copy of
7385 // MSanParamTLS, and extract shadow on va_arg() call in the argument list
7386 // order.
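// The resulting __msan_va_arg_tls layout is roughly (byte offsets, SSE
// enabled):
//   [0, 48)    shadow of arguments passed in general-purpose registers
//   [48, 176)  shadow of arguments passed in SSE registers
//   [176, ...) shadow of arguments passed on the stack (overflow area)
// mirroring the reg_save_area / overflow_arg_area split of the AMD64 va_list.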
7387 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7388 unsigned GpOffset = 0;
7389 unsigned FpOffset = AMD64GpEndOffset;
7390 unsigned OverflowOffset = AMD64FpEndOffset;
7391 const DataLayout &DL = F.getDataLayout();
7392
7393 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7394 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7395 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7396 if (IsByVal) {
7397 // ByVal arguments always go to the overflow area.
7398 // Fixed arguments passed through the overflow area will be stepped
7399 // over by va_start, so don't count them towards the offset.
7400 if (IsFixed)
7401 continue;
7402 assert(A->getType()->isPointerTy());
7403 Type *RealTy = CB.getParamByValType(ArgNo);
7404 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7405 uint64_t AlignedSize = alignTo(ArgSize, 8);
7406 unsigned BaseOffset = OverflowOffset;
7407 Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7408 Value *OriginBase = nullptr;
7409 if (MS.TrackOrigins)
7410 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7411 OverflowOffset += AlignedSize;
7412
7413 if (OverflowOffset > kParamTLSSize) {
7414 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7415 continue; // We have no space to copy shadow there.
7416 }
7417
7418 Value *ShadowPtr, *OriginPtr;
7419 std::tie(ShadowPtr, OriginPtr) =
7420 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
7421 /*isStore*/ false);
7422 IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
7423 kShadowTLSAlignment, ArgSize);
7424 if (MS.TrackOrigins)
7425 IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
7426 kShadowTLSAlignment, ArgSize);
7427 } else {
7428 ArgKind AK = classifyArgument(A);
7429 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
7430 AK = AK_Memory;
7431 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
7432 AK = AK_Memory;
7433 Value *ShadowBase, *OriginBase = nullptr;
7434 switch (AK) {
7435 case AK_GeneralPurpose:
7436 ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
7437 if (MS.TrackOrigins)
7438 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
7439 GpOffset += 8;
7440 assert(GpOffset <= kParamTLSSize);
7441 break;
7442 case AK_FloatingPoint:
7443 ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
7444 if (MS.TrackOrigins)
7445 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7446 FpOffset += 16;
7447 assert(FpOffset <= kParamTLSSize);
7448 break;
7449 case AK_Memory:
7450 if (IsFixed)
7451 continue;
7452 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7453 uint64_t AlignedSize = alignTo(ArgSize, 8);
7454 unsigned BaseOffset = OverflowOffset;
7455 ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7456 if (MS.TrackOrigins) {
7457 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7458 }
7459 OverflowOffset += AlignedSize;
7460 if (OverflowOffset > kParamTLSSize) {
7461 // We have no space to copy shadow there.
7462 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7463 continue;
7464 }
7465 }
7466 // Take fixed arguments into account for GpOffset and FpOffset,
7467 // but don't actually store shadows for them.
7468 // TODO(glider): don't call get*PtrForVAArgument() for them.
7469 if (IsFixed)
7470 continue;
7471 Value *Shadow = MSV.getShadow(A);
7472 IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
7473 if (MS.TrackOrigins) {
7474 Value *Origin = MSV.getOrigin(A);
7475 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7476 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7477 std::max(kShadowTLSAlignment, kMinOriginAlignment));
7478 }
7479 }
7480 }
7481 Constant *OverflowSize =
7482 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
7483 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7484 }
7485
7486 void finalizeInstrumentation() override {
7487 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7488 "finalizeInstrumentation called twice");
7489 if (!VAStartInstrumentationList.empty()) {
7490 // If there is a va_start in this function, make a backup copy of
7491 // va_arg_tls somewhere in the function entry block.
7492 IRBuilder<> IRB(MSV.FnPrologueEnd);
7493 VAArgOverflowSize =
7494 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7495 Value *CopySize = IRB.CreateAdd(
7496 ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
7497 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7498 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7499 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7500 CopySize, kShadowTLSAlignment, false);
7501
7502 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7503 Intrinsic::umin, CopySize,
7504 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7505 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7506 kShadowTLSAlignment, SrcSize);
7507 if (MS.TrackOrigins) {
7508 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7509 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
7510 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
7511 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
7512 }
7513 }
7514
7515 // Instrument va_start.
7516 // Copy va_list shadow from the backup copy of the TLS contents.
7517 for (CallInst *OrigInst : VAStartInstrumentationList) {
7518 NextNodeIRBuilder IRB(OrigInst);
7519 Value *VAListTag = OrigInst->getArgOperand(0);
7520
7521 Value *RegSaveAreaPtrPtr =
7522 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, 16));
7523 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7524 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7525 const Align Alignment = Align(16);
7526 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7527 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7528 Alignment, /*isStore*/ true);
7529 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7530 AMD64FpEndOffset);
7531 if (MS.TrackOrigins)
7532 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
7533 Alignment, AMD64FpEndOffset);
7534 Value *OverflowArgAreaPtrPtr =
7535 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, 8));
7536 Value *OverflowArgAreaPtr =
7537 IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
7538 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
7539 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
7540 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
7541 Alignment, /*isStore*/ true);
7542 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
7543 AMD64FpEndOffset);
7544 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
7545 VAArgOverflowSize);
7546 if (MS.TrackOrigins) {
7547 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
7548 AMD64FpEndOffset);
7549 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
7550 VAArgOverflowSize);
7551 }
7552 }
7553 }
7554};
7555
7556/// AArch64-specific implementation of VarArgHelper.
7557struct VarArgAArch64Helper : public VarArgHelperBase {
7558 static const unsigned kAArch64GrArgSize = 64;
7559 static const unsigned kAArch64VrArgSize = 128;
7560
7561 static const unsigned AArch64GrBegOffset = 0;
7562 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
7563 // Make VR space aligned to 16 bytes.
7564 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
7565 static const unsigned AArch64VrEndOffset =
7566 AArch64VrBegOffset + kAArch64VrArgSize;
7567 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
7568
7569 AllocaInst *VAArgTLSCopy = nullptr;
7570 Value *VAArgOverflowSize = nullptr;
7571
7572 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7573
7574 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
7575 MemorySanitizerVisitor &MSV)
7576 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
7577
7578 // A very rough approximation of aarch64 argument classification rules.
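// For example, an i64 or a pointer classifies as {AK_GeneralPurpose, 1}, a
// double or an fp128 as {AK_FloatingPoint, 1}, and a [4 x float] array as
// {AK_FloatingPoint, 4}; anything unrecognized falls back to AK_Memory.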
7579 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
7580 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
7581 return {AK_GeneralPurpose, 1};
7582 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
7583 return {AK_FloatingPoint, 1};
7584
7585 if (T->isArrayTy()) {
7586 auto R = classifyArgument(T->getArrayElementType());
7587 R.second *= T->getScalarType()->getArrayNumElements();
7588 return R;
7589 }
7590
7591 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
7592 auto R = classifyArgument(FV->getScalarType());
7593 R.second *= FV->getNumElements();
7594 return R;
7595 }
7596
7597 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
7598 return {AK_Memory, 0};
7599 }
7600
7601 // The instrumentation stores the argument shadow in a non-ABI-specific
7602 // format because it does not know which arguments are named (Clang, as in
7603 // the x86_64 case, lowers va_arg in the frontend, and this pass only sees
7604 // the low-level code that deals with va_list internals).
7605 // The first eight GR registers are saved in the first 64 bytes of the
7606 // va_arg TLS array, followed by the first eight FP/SIMD registers, and
7607 // then the remaining arguments.
7608 // Using constant offsets within the va_arg TLS array allows a fast copy
7609 // in the finalize instrumentation.
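// Illustrative layout of the va_arg TLS array used by this helper, derived
// from the offsets above:
//   [0,   64)  shadow of the GR argument registers (x0-x7)
//   [64, 192)  shadow of the FP/SIMD argument registers (v0-v7)
//   [192, ...) shadow of the arguments passed on the stack (overflow area)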
7610 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7611 unsigned GrOffset = AArch64GrBegOffset;
7612 unsigned VrOffset = AArch64VrBegOffset;
7613 unsigned OverflowOffset = AArch64VAEndOffset;
7614
7615 const DataLayout &DL = F.getDataLayout();
7616 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7617 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7618 auto [AK, RegNum] = classifyArgument(A->getType());
7619 if (AK == AK_GeneralPurpose &&
7620 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
7621 AK = AK_Memory;
7622 if (AK == AK_FloatingPoint &&
7623 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
7624 AK = AK_Memory;
7625 Value *Base;
7626 switch (AK) {
7627 case AK_GeneralPurpose:
7628 Base = getShadowPtrForVAArgument(IRB, GrOffset);
7629 GrOffset += 8 * RegNum;
7630 break;
7631 case AK_FloatingPoint:
7632 Base = getShadowPtrForVAArgument(IRB, VrOffset);
7633 VrOffset += 16 * RegNum;
7634 break;
7635 case AK_Memory:
7636 // Don't count fixed arguments in the overflow area - va_start will
7637 // skip right over them.
7638 if (IsFixed)
7639 continue;
7640 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7641 uint64_t AlignedSize = alignTo(ArgSize, 8);
7642 unsigned BaseOffset = OverflowOffset;
7643 Base = getShadowPtrForVAArgument(IRB, BaseOffset);
7644 OverflowOffset += AlignedSize;
7645 if (OverflowOffset > kParamTLSSize) {
7646 // We have no space to copy shadow there.
7647 CleanUnusedTLS(IRB, Base, BaseOffset);
7648 continue;
7649 }
7650 break;
7651 }
7652 // Fixed GR/VR arguments were counted into their respective offsets above,
7653 // but we don't bother to actually store a shadow for them.
7654 if (IsFixed)
7655 continue;
7656 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7657 }
7658 Constant *OverflowSize =
7659 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
7660 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7661 }
7662
7663 // Retrieve a va_list field of 'void*' size.
7664 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7665 Value *SaveAreaPtrPtr =
7666 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, offset));
7667 return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
7668 }
7669
7670 // Retrieve a va_list field of 'int' size.
7671 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7672 Value *SaveAreaPtr =
7673 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, offset));
7674 Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
7675 return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
7676 }
7677
7678 void finalizeInstrumentation() override {
7679 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7680 "finalizeInstrumentation called twice");
7681 if (!VAStartInstrumentationList.empty()) {
7682 // If there is a va_start in this function, make a backup copy of
7683 // va_arg_tls somewhere in the function entry block.
7684 IRBuilder<> IRB(MSV.FnPrologueEnd);
7685 VAArgOverflowSize =
7686 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7687 Value *CopySize = IRB.CreateAdd(
7688 ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
7689 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7690 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7691 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7692 CopySize, kShadowTLSAlignment, false);
7693
7694 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7695 Intrinsic::umin, CopySize,
7696 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7697 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7698 kShadowTLSAlignment, SrcSize);
7699 }
7700
7701 Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
7702 Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
7703
7704 // Instrument va_start, copy va_list shadow from the backup copy of
7705 // the TLS contents.
7706 for (CallInst *OrigInst : VAStartInstrumentationList) {
7707 NextNodeIRBuilder IRB(OrigInst);
7708
7709 Value *VAListTag = OrigInst->getArgOperand(0);
7710
7711 // The variadic ABI for AArch64 creates two areas that save the incoming
7712 // argument registers (one for the 64-bit general registers x0-x7 and
7713 // another for the 128-bit FP/SIMD registers v0-v7).
7714 // We then need to propagate the shadow arguments to both regions,
7715 // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
7716 // The shadow of the remaining arguments is written relative to 'va::stack'.
7717 // One caveat: only the non-named (variadic) arguments need to be
7718 // propagated, but the call-site instrumentation saved 'all' of them.
7719 // So, to copy the shadow values from the va_arg TLS array, we adjust
7720 // the offsets of both the GR and VR fields by the corresponding
7721 // __{gr,vr}_offs value (which reflects how many named arguments were
7722 // passed in registers).
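// For reference, the AAPCS64 va_list is (assuming the standard layout, which
// matches the fixed offsets passed to getVAField64/getVAField32 below):
//   struct {
//     void *__stack;    // byte 0
//     void *__gr_top;   // byte 8
//     void *__vr_top;   // byte 16
//     int   __gr_offs;  // byte 24
//     int   __vr_offs;  // byte 28
//   };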
7723 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
7724
7725 // Read the stack pointer from the va_list.
7726 Value *StackSaveAreaPtr =
7727 IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
7728
7729 // Read both the __gr_top and __gr_off and add them up.
7730 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
7731 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
7732
7733 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
7734 IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
7735
7736 // Read both the __vr_top and __vr_off and add them up.
7737 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
7738 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
7739
7740 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
7741 IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
7742
7743 // This pass does not know how many named arguments there are, and at the
7744 // call site all the arguments were saved. Since __gr_offs is defined as
7745 // '0 - ((8 - named_gr) * 8)', the idea is to propagate only the variadic
7746 // arguments by skipping the bytes of shadow that belong to named arguments.
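// Worked example: with two named GR arguments, __gr_offs == -48, so
// GrRegSaveAreaShadowPtrOff == 64 + (-48) == 16 and GrCopySize == 64 - 16 ==
// 48, i.e. the 16 bytes of shadow belonging to x0/x1 are skipped and only
// the shadow of the six potentially-variadic GR registers is copied.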
7747 Value *GrRegSaveAreaShadowPtrOff =
7748 IRB.CreateAdd(GrArgSize, GrOffSaveArea);
7749
7750 Value *GrRegSaveAreaShadowPtr =
7751 MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7752 Align(8), /*isStore*/ true)
7753 .first;
7754
7755 Value *GrSrcPtr =
7756 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
7757 Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
7758
7759 IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
7760 GrCopySize);
7761
7762 // Again, but for FP/SIMD values.
7763 Value *VrRegSaveAreaShadowPtrOff =
7764 IRB.CreateAdd(VrArgSize, VrOffSaveArea);
7765
7766 Value *VrRegSaveAreaShadowPtr =
7767 MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7768 Align(8), /*isStore*/ true)
7769 .first;
7770
7771 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
7772 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
7773 IRB.getInt32(AArch64VrBegOffset)),
7774 VrRegSaveAreaShadowPtrOff);
7775 Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
7776
7777 IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
7778 VrCopySize);
7779
7780 // And finally for remaining arguments.
7781 Value *StackSaveAreaShadowPtr =
7782 MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
7783 Align(16), /*isStore*/ true)
7784 .first;
7785
7786 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
7787 VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
7788
7789 IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
7790 Align(16), VAArgOverflowSize);
7791 }
7792 }
7793};
7794
7795/// PowerPC64-specific implementation of VarArgHelper.
7796struct VarArgPowerPC64Helper : public VarArgHelperBase {
7797 AllocaInst *VAArgTLSCopy = nullptr;
7798 Value *VAArgSize = nullptr;
7799
7800 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
7801 MemorySanitizerVisitor &MSV)
7802 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
7803
7804 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7805 // For PowerPC, we need to deal with the alignment of stack arguments:
7806 // they are mostly aligned to 8 bytes, but vectors and i128 arrays are
7807 // aligned to 16 bytes, and byvals can be aligned to 8 or 16 bytes.
7808 // For that reason, we compute the current offset from the stack pointer
7809 // (which is always properly aligned) and the offset of the first vararg,
7810 // then subtract them.
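// For example, a <4 x i32> vector vararg is 16-byte aligned in the parameter
// save area, so VAArgOffset is rounded up to a multiple of 16 before its
// shadow slot is computed, whereas a plain i32 uses the default 8-byte slot.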
7811 unsigned VAArgBase;
7812 Triple TargetTriple(F.getParent()->getTargetTriple());
7813 // Parameter save area starts at 48 bytes from frame pointer for ABIv1,
7814 // and 32 bytes for ABIv2. This is usually determined by target
7815 // endianness, but in theory could be overridden by function attribute.
7816 if (TargetTriple.isPPC64ELFv2ABI())
7817 VAArgBase = 32;
7818 else
7819 VAArgBase = 48;
7820 unsigned VAArgOffset = VAArgBase;
7821 const DataLayout &DL = F.getDataLayout();
7822 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7823 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7824 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7825 if (IsByVal) {
7826 assert(A->getType()->isPointerTy());
7827 Type *RealTy = CB.getParamByValType(ArgNo);
7828 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7829 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
7830 if (ArgAlign < 8)
7831 ArgAlign = Align(8);
7832 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7833 if (!IsFixed) {
7834 Value *Base =
7835 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7836 if (Base) {
7837 Value *AShadowPtr, *AOriginPtr;
7838 std::tie(AShadowPtr, AOriginPtr) =
7839 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7840 kShadowTLSAlignment, /*isStore*/ false);
7841
7842 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7843 kShadowTLSAlignment, ArgSize);
7844 }
7845 }
7846 VAArgOffset += alignTo(ArgSize, Align(8));
7847 } else {
7848 Value *Base;
7849 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7850 Align ArgAlign = Align(8);
7851 if (A->getType()->isArrayTy()) {
7852 // Arrays are aligned to element size, except for long double
7853 // arrays, which are aligned to 8 bytes.
7854 Type *ElementTy = A->getType()->getArrayElementType();
7855 if (!ElementTy->isPPC_FP128Ty())
7856 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7857 } else if (A->getType()->isVectorTy()) {
7858 // Vectors are naturally aligned.
7859 ArgAlign = Align(ArgSize);
7860 }
7861 if (ArgAlign < 8)
7862 ArgAlign = Align(8);
7863 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7864 if (DL.isBigEndian()) {
7865 // Adjust the shadow for arguments with size < 8 to match the
7866 // placement of bits in a big-endian system.
7867 if (ArgSize < 8)
7868 VAArgOffset += (8 - ArgSize);
7869 }
7870 if (!IsFixed) {
7871 Base =
7872 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7873 if (Base)
7874 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7875 }
7876 VAArgOffset += ArgSize;
7877 VAArgOffset = alignTo(VAArgOffset, Align(8));
7878 }
7879 if (IsFixed)
7880 VAArgBase = VAArgOffset;
7881 }
7882
7883 Constant *TotalVAArgSize =
7884 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
7885 // Here we use VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
7886 // new class member; it holds the total size of all varargs.
7887 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
7888 }
7889
7890 void finalizeInstrumentation() override {
7891 assert(!VAArgSize && !VAArgTLSCopy &&
7892 "finalizeInstrumentation called twice");
7893 IRBuilder<> IRB(MSV.FnPrologueEnd);
7894 VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7895 Value *CopySize = VAArgSize;
7896
7897 if (!VAStartInstrumentationList.empty()) {
7898 // If there is a va_start in this function, make a backup copy of
7899 // va_arg_tls somewhere in the function entry block.
7900
7901 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7902 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7903 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7904 CopySize, kShadowTLSAlignment, false);
7905
7906 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7907 Intrinsic::umin, CopySize,
7908 ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
7909 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7910 kShadowTLSAlignment, SrcSize);
7911 }
7912
7913 // Instrument va_start.
7914 // Copy va_list shadow from the backup copy of the TLS contents.
7915 for (CallInst *OrigInst : VAStartInstrumentationList) {
7916 NextNodeIRBuilder IRB(OrigInst);
7917 Value *VAListTag = OrigInst->getArgOperand(0);
7918 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
7919
7920 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
7921
7922 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7923 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7924 const DataLayout &DL = F.getDataLayout();
7925 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7926 const Align Alignment = Align(IntptrSize);
7927 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7928 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7929 Alignment, /*isStore*/ true);
7930 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7931 CopySize);
7932 }
7933 }
7934};
7935
7936/// PowerPC32-specific implementation of VarArgHelper.
7937struct VarArgPowerPC32Helper : public VarArgHelperBase {
7938 AllocaInst *VAArgTLSCopy = nullptr;
7939 Value *VAArgSize = nullptr;
7940
7941 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
7942 MemorySanitizerVisitor &MSV)
7943 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
7944
7945 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7946 unsigned VAArgBase;
7947 // Parameter save area is 8 bytes from frame pointer in PPC32
7948 VAArgBase = 8;
7949 unsigned VAArgOffset = VAArgBase;
7950 const DataLayout &DL = F.getDataLayout();
7951 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
7952 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7953 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7954 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7955 if (IsByVal) {
7956 assert(A->getType()->isPointerTy());
7957 Type *RealTy = CB.getParamByValType(ArgNo);
7958 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7959 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
7960 if (ArgAlign < IntptrSize)
7961 ArgAlign = Align(IntptrSize);
7962 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7963 if (!IsFixed) {
7964 Value *Base =
7965 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7966 if (Base) {
7967 Value *AShadowPtr, *AOriginPtr;
7968 std::tie(AShadowPtr, AOriginPtr) =
7969 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7970 kShadowTLSAlignment, /*isStore*/ false);
7971
7972 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7973 kShadowTLSAlignment, ArgSize);
7974 }
7975 }
7976 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
7977 } else {
7978 Value *Base;
7979 Type *ArgTy = A->getType();
7980
7981 // On PPC32, floating-point variable arguments are stored in a separate
7982 // area: fp_save_area = reg_save_area + 4*8. We do not copy shadow for
7983 // them, as they will be found when checking call arguments.
7984 if (!ArgTy->isFloatingPointTy()) {
7985 uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
7986 Align ArgAlign = Align(IntptrSize);
7987 if (ArgTy->isArrayTy()) {
7988 // Arrays are aligned to element size, except for long double
7989 // arrays, which are aligned to 8 bytes.
7990 Type *ElementTy = ArgTy->getArrayElementType();
7991 if (!ElementTy->isPPC_FP128Ty())
7992 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7993 } else if (ArgTy->isVectorTy()) {
7994 // Vectors are naturally aligned.
7995 ArgAlign = Align(ArgSize);
7996 }
7997 if (ArgAlign < IntptrSize)
7998 ArgAlign = Align(IntptrSize);
7999 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8000 if (DL.isBigEndian()) {
8001 // Adjust the shadow for arguments with size < IntptrSize to match
8002 // the placement of bits in a big-endian system.
8003 if (ArgSize < IntptrSize)
8004 VAArgOffset += (IntptrSize - ArgSize);
8005 }
8006 if (!IsFixed) {
8007 Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
8008 ArgSize);
8009 if (Base)
8010 IRB.CreateAlignedStore(MSV.getShadow(A), Base,
8011 kShadowTLSAlignment);
8012 }
8013 VAArgOffset += ArgSize;
8014 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8015 }
8016 }
8017 }
8018
8019 Constant *TotalVAArgSize =
8020 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
8021 // Here we use VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
8022 // new class member; it holds the total size of all varargs.
8023 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8024 }
8025
8026 void finalizeInstrumentation() override {
8027 assert(!VAArgSize && !VAArgTLSCopy &&
8028 "finalizeInstrumentation called twice");
8029 IRBuilder<> IRB(MSV.FnPrologueEnd);
8030 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8031 Value *CopySize = VAArgSize;
8032
8033 if (!VAStartInstrumentationList.empty()) {
8034 // If there is a va_start in this function, make a backup copy of
8035 // va_arg_tls somewhere in the function entry block.
8036
8037 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8038 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8039 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8040 CopySize, kShadowTLSAlignment, false);
8041
8042 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8043 Intrinsic::umin, CopySize,
8044 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8045 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8046 kShadowTLSAlignment, SrcSize);
8047 }
8048
8049 // Instrument va_start.
8050 // Copy va_list shadow from the backup copy of the TLS contents.
8051 for (CallInst *OrigInst : VAStartInstrumentationList) {
8052 NextNodeIRBuilder IRB(OrigInst);
8053 Value *VAListTag = OrigInst->getArgOperand(0);
8054 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
8055 Value *RegSaveAreaSize = CopySize;
8056
8057 // In PPC32 va_list_tag is a struct
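// (per the 32-bit PowerPC SVR4 ABI, roughly:
//    { char gpr; char fpr; short reserved;
//      void *overflow_arg_area;   // byte 4, read further below
//      void *reg_save_area; });   // byte 8, read here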
8058 RegSaveAreaPtrPtr =
8059 IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
8060
8061 // On PPC32, the reg_save_area can only hold 32 bytes of data.
8062 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
8063 Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
8064
8065 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
8066 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8067
8068 const DataLayout &DL = F.getDataLayout();
8069 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8070 const Align Alignment = Align(IntptrSize);
8071
8072 { // Copy reg save area
8073 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8074 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8075 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8076 Alignment, /*isStore*/ true);
8077 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
8078 Alignment, RegSaveAreaSize);
8079
8080 RegSaveAreaShadowPtr =
8081 IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
8082 Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
8083 ConstantInt::get(MS.IntptrTy, 32));
8084 FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
8085 // We fill the FP shadow with zeroes, as uninitialized FP args should
8086 // have been found during the call-base check.
8087 IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
8088 ConstantInt::get(MS.IntptrTy, 32), Alignment);
8089 }
8090
8091 { // Copy overflow area
8092 // RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
8093 Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
8094
8095 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
8096 OverflowAreaPtrPtr =
8097 IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
8098 OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
8099
8100 Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
8101
8102 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
8103 std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
8104 MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
8105 Alignment, /*isStore*/ true);
8106
8107 Value *OverflowVAArgTLSCopyPtr =
8108 IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
8109 OverflowVAArgTLSCopyPtr =
8110 IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
8111
8112 OverflowVAArgTLSCopyPtr =
8113 IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
8114 IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
8115 OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
8116 }
8117 }
8118 }
8119};
8120
8121/// SystemZ-specific implementation of VarArgHelper.
8122struct VarArgSystemZHelper : public VarArgHelperBase {
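// These constants mirror the s390x ELF ABI (assumed here): in the 160-byte
// register save area, the GPR argument registers r2-r6 live at bytes
// [16, 56) and the FPR argument registers f0/f2/f4/f6 at bytes [128, 160);
// stack (overflow) arguments start right after it, and the 32-byte va_list
// stores the __overflow_arg_area pointer at byte 16 and the __reg_save_area
// pointer at byte 24.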
8123 static const unsigned SystemZGpOffset = 16;
8124 static const unsigned SystemZGpEndOffset = 56;
8125 static const unsigned SystemZFpOffset = 128;
8126 static const unsigned SystemZFpEndOffset = 160;
8127 static const unsigned SystemZMaxVrArgs = 8;
8128 static const unsigned SystemZRegSaveAreaSize = 160;
8129 static const unsigned SystemZOverflowOffset = 160;
8130 static const unsigned SystemZVAListTagSize = 32;
8131 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
8132 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
8133
8134 bool IsSoftFloatABI;
8135 AllocaInst *VAArgTLSCopy = nullptr;
8136 AllocaInst *VAArgTLSOriginCopy = nullptr;
8137 Value *VAArgOverflowSize = nullptr;
8138
8139 enum class ArgKind {
8140 GeneralPurpose,
8141 FloatingPoint,
8142 Vector,
8143 Memory,
8144 Indirect,
8145 };
8146
8147 enum class ShadowExtension { None, Zero, Sign };
8148
8149 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
8150 MemorySanitizerVisitor &MSV)
8151 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
8152 IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
8153
8154 ArgKind classifyArgument(Type *T) {
8155 // T is a SystemZABIInfo::classifyArgumentType() output, and there are
8156 // only a few possibilities of what it can be. In particular, enums, single
8157 // element structs and large types have already been taken care of.
8158
8159 // Some i128 and fp128 arguments are converted to pointers only in the
8160 // back end.
8161 if (T->isIntegerTy(128) || T->isFP128Ty())
8162 return ArgKind::Indirect;
8163 if (T->isFloatingPointTy())
8164 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
8165 if (T->isIntegerTy() || T->isPointerTy())
8166 return ArgKind::GeneralPurpose;
8167 if (T->isVectorTy())
8168 return ArgKind::Vector;
8169 return ArgKind::Memory;
8170 }
8171
8172 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
8173 // ABI says: "One of the simple integer types no more than 64 bits wide.
8174 // ... If such an argument is shorter than 64 bits, replace it by a full
8175 // 64-bit integer representing the same number, using sign or zero
8176 // extension". Shadow for an integer argument has the same type as the
8177 // argument itself, so it can be sign or zero extended as well.
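// For example, a variadic i8 argument with the 'signext' attribute is passed
// as a sign-extended 64-bit value, so visitCallBase() below sign-extends its
// shadow to 64 bits (via CreateShadowCast) before storing it at GpOffset.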
8178 bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
8179 bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
8180 if (ZExt) {
8181 assert(!SExt);
8182 return ShadowExtension::Zero;
8183 }
8184 if (SExt) {
8185 assert(!ZExt);
8186 return ShadowExtension::Sign;
8187 }
8188 return ShadowExtension::None;
8189 }
8190
8191 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8192 unsigned GpOffset = SystemZGpOffset;
8193 unsigned FpOffset = SystemZFpOffset;
8194 unsigned VrIndex = 0;
8195 unsigned OverflowOffset = SystemZOverflowOffset;
8196 const DataLayout &DL = F.getDataLayout();
8197 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8198 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8199 // SystemZABIInfo does not produce ByVal parameters.
8200 assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
8201 Type *T = A->getType();
8202 ArgKind AK = classifyArgument(T);
8203 if (AK == ArgKind::Indirect) {
8204 T = MS.PtrTy;
8205 AK = ArgKind::GeneralPurpose;
8206 }
8207 if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
8208 AK = ArgKind::Memory;
8209 if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
8210 AK = ArgKind::Memory;
8211 if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
8212 AK = ArgKind::Memory;
8213 Value *ShadowBase = nullptr;
8214 Value *OriginBase = nullptr;
8215 ShadowExtension SE = ShadowExtension::None;
8216 switch (AK) {
8217 case ArgKind::GeneralPurpose: {
8218 // Always keep track of GpOffset, but store shadow only for varargs.
8219 uint64_t ArgSize = 8;
8220 if (GpOffset + ArgSize <= kParamTLSSize) {
8221 if (!IsFixed) {
8222 SE = getShadowExtension(CB, ArgNo);
8223 uint64_t GapSize = 0;
8224 if (SE == ShadowExtension::None) {
8225 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8226 assert(ArgAllocSize <= ArgSize);
8227 GapSize = ArgSize - ArgAllocSize;
8228 }
8229 ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
8230 if (MS.TrackOrigins)
8231 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
8232 }
8233 GpOffset += ArgSize;
8234 } else {
8235 GpOffset = kParamTLSSize;
8236 }
8237 break;
8238 }
8239 case ArgKind::FloatingPoint: {
8240 // Always keep track of FpOffset, but store shadow only for varargs.
8241 uint64_t ArgSize = 8;
8242 if (FpOffset + ArgSize <= kParamTLSSize) {
8243 if (!IsFixed) {
8244 // PoP says: "A short floating-point datum requires only the
8245 // left-most 32 bit positions of a floating-point register".
8246 // Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
8247 // don't extend shadow and don't mind the gap.
8248 ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
8249 if (MS.TrackOrigins)
8250 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
8251 }
8252 FpOffset += ArgSize;
8253 } else {
8254 FpOffset = kParamTLSSize;
8255 }
8256 break;
8257 }
8258 case ArgKind::Vector: {
8259 // Keep track of VrIndex. No need to store shadow, since vector varargs
8260 // go through AK_Memory.
8261 assert(IsFixed);
8262 VrIndex++;
8263 break;
8264 }
8265 case ArgKind::Memory: {
8266 // Keep track of OverflowOffset and store shadow only for varargs.
8267 // Ignore fixed args, since we need to copy only the vararg portion of
8268 // the overflow area shadow.
8269 if (!IsFixed) {
8270 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8271 uint64_t ArgSize = alignTo(ArgAllocSize, 8);
8272 if (OverflowOffset + ArgSize <= kParamTLSSize) {
8273 SE = getShadowExtension(CB, ArgNo);
8274 uint64_t GapSize =
8275 SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
8276 ShadowBase =
8277 getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
8278 if (MS.TrackOrigins)
8279 OriginBase =
8280 getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
8281 OverflowOffset += ArgSize;
8282 } else {
8283 OverflowOffset = kParamTLSSize;
8284 }
8285 }
8286 break;
8287 }
8288 case ArgKind::Indirect:
8289 llvm_unreachable("Indirect must be converted to GeneralPurpose");
8290 }
8291 if (ShadowBase == nullptr)
8292 continue;
8293 Value *Shadow = MSV.getShadow(A);
8294 if (SE != ShadowExtension::None)
8295 Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
8296 /*Signed*/ SE == ShadowExtension::Sign);
8297 ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
8298 IRB.CreateStore(Shadow, ShadowBase);
8299 if (MS.TrackOrigins) {
8300 Value *Origin = MSV.getOrigin(A);
8301 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
8302 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
8303 kMinOriginAlignment);
8304 }
8305 }
8306 Constant *OverflowSize = ConstantInt::get(
8307 IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
8308 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
8309 }
8310
8311 void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
8312 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
8313 IRB.CreateAdd(
8314 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8315 ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
8316 MS.PtrTy);
8317 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8318 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8319 const Align Alignment = Align(8);
8320 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8321 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
8322 /*isStore*/ true);
8323 // TODO(iii): copy only fragments filled by visitCallBase()
8324 // TODO(iii): support packed-stack && !use-soft-float
8325 // For use-soft-float functions, it is enough to copy just the GPRs.
8326 unsigned RegSaveAreaSize =
8327 IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
8328 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8329 RegSaveAreaSize);
8330 if (MS.TrackOrigins)
8331 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
8332 Alignment, RegSaveAreaSize);
8333 }
8334
8335 // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
8336 // don't know real overflow size and can't clear shadow beyond kParamTLSSize.
8337 void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
8338 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
8339 IRB.CreateAdd(
8340 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8341 ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
8342 MS.PtrTy);
8343 Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
8344 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
8345 const Align Alignment = Align(8);
8346 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
8347 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
8348 Alignment, /*isStore*/ true);
8349 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
8350 SystemZOverflowOffset);
8351 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
8352 VAArgOverflowSize);
8353 if (MS.TrackOrigins) {
8354 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
8355 SystemZOverflowOffset);
8356 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
8357 VAArgOverflowSize);
8358 }
8359 }
8360
8361 void finalizeInstrumentation() override {
8362 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
8363 "finalizeInstrumentation called twice");
8364 if (!VAStartInstrumentationList.empty()) {
8365 // If there is a va_start in this function, make a backup copy of
8366 // va_arg_tls somewhere in the function entry block.
8367 IRBuilder<> IRB(MSV.FnPrologueEnd);
8368 VAArgOverflowSize =
8369 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8370 Value *CopySize =
8371 IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
8372 VAArgOverflowSize);
8373 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8374 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8375 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8376 CopySize, kShadowTLSAlignment, false);
8377
8378 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8379 Intrinsic::umin, CopySize,
8380 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8381 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8382 kShadowTLSAlignment, SrcSize);
8383 if (MS.TrackOrigins) {
8384 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8385 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
8386 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
8387 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
8388 }
8389 }
8390
8391 // Instrument va_start.
8392 // Copy va_list shadow from the backup copy of the TLS contents.
8393 for (CallInst *OrigInst : VAStartInstrumentationList) {
8394 NextNodeIRBuilder IRB(OrigInst);
8395 Value *VAListTag = OrigInst->getArgOperand(0);
8396 copyRegSaveArea(IRB, VAListTag);
8397 copyOverflowArea(IRB, VAListTag);
8398 }
8399 }
8400};
8401
8402/// i386-specific implementation of VarArgHelper.
8403struct VarArgI386Helper : public VarArgHelperBase {
8404 AllocaInst *VAArgTLSCopy = nullptr;
8405 Value *VAArgSize = nullptr;
8406
8407 VarArgI386Helper(Function &F, MemorySanitizer &MS,
8408 MemorySanitizerVisitor &MSV)
8409 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
8410
8411 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8412 const DataLayout &DL = F.getDataLayout();
8413 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8414 unsigned VAArgOffset = 0;
8415 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8416 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8417 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8418 if (IsByVal) {
8419 assert(A->getType()->isPointerTy());
8420 Type *RealTy = CB.getParamByValType(ArgNo);
8421 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8422 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8423 if (ArgAlign < IntptrSize)
8424 ArgAlign = Align(IntptrSize);
8425 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8426 if (!IsFixed) {
8427 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8428 if (Base) {
8429 Value *AShadowPtr, *AOriginPtr;
8430 std::tie(AShadowPtr, AOriginPtr) =
8431 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8432 kShadowTLSAlignment, /*isStore*/ false);
8433
8434 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8435 kShadowTLSAlignment, ArgSize);
8436 }
8437 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8438 }
8439 } else {
8440 Value *Base;
8441 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8442 Align ArgAlign = Align(IntptrSize);
8443 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8444 if (DL.isBigEndian()) {
8445 // Adjust the shadow for arguments with size < IntptrSize to match
8446 // the placement of bits in a big-endian system.
8447 if (ArgSize < IntptrSize)
8448 VAArgOffset += (IntptrSize - ArgSize);
8449 }
8450 if (!IsFixed) {
8451 Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8452 if (Base)
8453 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8454 VAArgOffset += ArgSize;
8455 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8456 }
8457 }
8458 }
8459
8460 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8461 // Here we use VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
8462 // new class member; it holds the total size of all varargs.
8463 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8464 }
8465
8466 void finalizeInstrumentation() override {
8467 assert(!VAArgSize && !VAArgTLSCopy &&
8468 "finalizeInstrumentation called twice");
8469 IRBuilder<> IRB(MSV.FnPrologueEnd);
8470 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8471 Value *CopySize = VAArgSize;
8472
8473 if (!VAStartInstrumentationList.empty()) {
8474 // If there is a va_start in this function, make a backup copy of
8475 // va_arg_tls somewhere in the function entry block.
8476 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8477 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8478 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8479 CopySize, kShadowTLSAlignment, false);
8480
8481 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8482 Intrinsic::umin, CopySize,
8483 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8484 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8485 kShadowTLSAlignment, SrcSize);
8486 }
8487
8488 // Instrument va_start.
8489 // Copy va_list shadow from the backup copy of the TLS contents.
8490 for (CallInst *OrigInst : VAStartInstrumentationList) {
8491 NextNodeIRBuilder IRB(OrigInst);
8492 Value *VAListTag = OrigInst->getArgOperand(0);
8493 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8494 Value *RegSaveAreaPtrPtr =
8495 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8496 PointerType::get(*MS.C, 0));
8497 Value *RegSaveAreaPtr =
8498 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8499 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8500 const DataLayout &DL = F.getDataLayout();
8501 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8502 const Align Alignment = Align(IntptrSize);
8503 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8504 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8505 Alignment, /*isStore*/ true);
8506 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8507 CopySize);
8508 }
8509 }
8510};
8511
8512/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
8513/// LoongArch64.
8514struct VarArgGenericHelper : public VarArgHelperBase {
8515 AllocaInst *VAArgTLSCopy = nullptr;
8516 Value *VAArgSize = nullptr;
8517
8518 VarArgGenericHelper(Function &F, MemorySanitizer &MS,
8519 MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
8520 : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
8521
8522 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8523 unsigned VAArgOffset = 0;
8524 const DataLayout &DL = F.getDataLayout();
8525 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8526 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8527 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8528 if (IsFixed)
8529 continue;
8530 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8531 if (DL.isBigEndian()) {
8532 // Adjust the shadow for arguments with size < IntptrSize to match the
8533 // placement of bits in a big-endian system.
8534 if (ArgSize < IntptrSize)
8535 VAArgOffset += (IntptrSize - ArgSize);
8536 }
8537 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8538 VAArgOffset += ArgSize;
8539 VAArgOffset = alignTo(VAArgOffset, IntptrSize);
8540 if (!Base)
8541 continue;
8542 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8543 }
8544
8545 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8546 // Here we use VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creating a
8547 // new class member; it holds the total size of all varargs.
8548 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8549 }
8550
8551 void finalizeInstrumentation() override {
8552 assert(!VAArgSize && !VAArgTLSCopy &&
8553 "finalizeInstrumentation called twice");
8554 IRBuilder<> IRB(MSV.FnPrologueEnd);
8555 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8556 Value *CopySize = VAArgSize;
8557
8558 if (!VAStartInstrumentationList.empty()) {
8559 // If there is a va_start in this function, make a backup copy of
8560 // va_arg_tls somewhere in the function entry block.
8561 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8562 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8563 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8564 CopySize, kShadowTLSAlignment, false);
8565
8566 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8567 Intrinsic::umin, CopySize,
8568 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8569 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8570 kShadowTLSAlignment, SrcSize);
8571 }
8572
8573 // Instrument va_start.
8574 // Copy va_list shadow from the backup copy of the TLS contents.
8575 for (CallInst *OrigInst : VAStartInstrumentationList) {
8576 NextNodeIRBuilder IRB(OrigInst);
8577 Value *VAListTag = OrigInst->getArgOperand(0);
8578 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8579 Value *RegSaveAreaPtrPtr =
8580 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8581 PointerType::get(*MS.C, 0));
8582 Value *RegSaveAreaPtr =
8583 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8584 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8585 const DataLayout &DL = F.getDataLayout();
8586 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8587 const Align Alignment = Align(IntptrSize);
8588 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8589 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8590 Alignment, /*isStore*/ true);
8591 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8592 CopySize);
8593 }
8594 }
8595};
8596
8597 // ARM32, LoongArch64, MIPS and RISCV share the same calling conventions
8598// regarding VAArgs.
8599using VarArgARM32Helper = VarArgGenericHelper;
8600using VarArgRISCVHelper = VarArgGenericHelper;
8601using VarArgMIPSHelper = VarArgGenericHelper;
8602using VarArgLoongArch64Helper = VarArgGenericHelper;
8603
8604/// A no-op implementation of VarArgHelper.
8605struct VarArgNoOpHelper : public VarArgHelper {
8606 VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
8607 MemorySanitizerVisitor &MSV) {}
8608
8609 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
8610
8611 void visitVAStartInst(VAStartInst &I) override {}
8612
8613 void visitVACopyInst(VACopyInst &I) override {}
8614
8615 void finalizeInstrumentation() override {}
8616};
8617
8618} // end anonymous namespace
8619
8620static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
8621 MemorySanitizerVisitor &Visitor) {
8622 // VarArg handling is only implemented for the targets handled below. False
8623 // positives are possible on other platforms.
8624 Triple TargetTriple(Func.getParent()->getTargetTriple());
8625
8626 if (TargetTriple.getArch() == Triple::x86)
8627 return new VarArgI386Helper(Func, Msan, Visitor);
8628
8629 if (TargetTriple.getArch() == Triple::x86_64)
8630 return new VarArgAMD64Helper(Func, Msan, Visitor);
8631
8632 if (TargetTriple.isARM())
8633 return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8634
8635 if (TargetTriple.isAArch64())
8636 return new VarArgAArch64Helper(Func, Msan, Visitor);
8637
8638 if (TargetTriple.isSystemZ())
8639 return new VarArgSystemZHelper(Func, Msan, Visitor);
8640
8641 // On PowerPC32 VAListTag is a struct
8642 // {char, char, i16 padding, char *, char *}
8643 if (TargetTriple.isPPC32())
8644 return new VarArgPowerPC32Helper(Func, Msan, Visitor);
8645
8646 if (TargetTriple.isPPC64())
8647 return new VarArgPowerPC64Helper(Func, Msan, Visitor);
8648
8649 if (TargetTriple.isRISCV32())
8650 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8651
8652 if (TargetTriple.isRISCV64())
8653 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8654
8655 if (TargetTriple.isMIPS32())
8656 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8657
8658 if (TargetTriple.isMIPS64())
8659 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8660
8661 if (TargetTriple.isLoongArch64())
8662 return new VarArgLoongArch64Helper(Func, Msan, Visitor,
8663 /*VAListTagSize=*/8);
8664
8665 return new VarArgNoOpHelper(Func, Msan, Visitor);
8666}
8667
8668bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
8669 if (!CompileKernel && F.getName() == kMsanModuleCtorName)
8670 return false;
8671
8672 if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
8673 return false;
8674
8675 MemorySanitizerVisitor Visitor(F, *this, TLI);
8676
8677 // Clear out memory attributes.
8678 AttributeMask B;
8679 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
8680 F.removeFnAttrs(B);
8681
8682 return Visitor.runOnFunction();
8683}
#define Success
assert(UImm &&(UImm !=~static_cast< T >(0)) &&"Invalid immediate!")
constexpr LLT S1
AMDGPU Uniform Intrinsic Combine
This file implements a class to represent arbitrary precision integral constant values and operations...
static bool isStore(int Opcode)
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
static cl::opt< ITMode > IT(cl::desc("IT block support"), cl::Hidden, cl::init(DefaultIT), cl::values(clEnumValN(DefaultIT, "arm-default-it", "Generate any type of IT block"), clEnumValN(RestrictedIT, "arm-restrict-it", "Disallow complex IT blocks")))
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClWithComdat("asan-with-comdat", cl::desc("Place ASan constructors in comdat sections"), cl::Hidden, cl::init(true))
VarLocInsertPt getNextNode(const DbgRecord *DVR)
Atomic ordering constants.
This file contains the simple types necessary to represent the attributes associated with functions a...
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
static GCRegistry::Add< StatepointGC > D("statepoint-example", "an example strategy for statepoint")
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
Analysis containing CSE Info
Definition CSEInfo.cpp:27
This file contains the declarations for the subclasses of Constant, which represent the different fla...
const MemoryMapParams Linux_LoongArch64_MemoryMapParams
const MemoryMapParams Linux_X86_64_MemoryMapParams
static cl::opt< int > ClTrackOrigins("dfsan-track-origins", cl::desc("Track origins of labels"), cl::Hidden, cl::init(0))
static AtomicOrdering addReleaseOrdering(AtomicOrdering AO)
static AtomicOrdering addAcquireOrdering(AtomicOrdering AO)
const MemoryMapParams Linux_AArch64_MemoryMapParams
static bool isAMustTailRetVal(Value *RetVal)
This file provides an implementation of debug counters.
#define DEBUG_COUNTER(VARNAME, COUNTERNAME, DESC)
This file defines the DenseMap class.
This file builds on the ADT/GraphTraits.h file to build generic depth first graph iterator.
@ Default
static bool runOnFunction(Function &F, bool PostInlining)
This is the interface for a simple mod/ref and alias analysis over globals.
static size_t TypeSizeToSizeIndex(uint32_t TypeSize)
#define op(i)
Hexagon Common GEP
#define _
Module.h This file contains the declarations for the Module class.
static LVOptions Options
Definition LVOptions.cpp:25
#define F(x, y, z)
Definition MD5.cpp:55
#define I(x, y, z)
Definition MD5.cpp:58
Machine Check Debug Module
static const PlatformMemoryMapParams Linux_S390_MemoryMapParams
static const Align kMinOriginAlignment
static cl::opt< uint64_t > ClShadowBase("msan-shadow-base", cl::desc("Define custom MSan ShadowBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClPoisonUndef("msan-poison-undef", cl::desc("Poison fully undef temporary values. " "Partially undefined constant vectors " "are unaffected by this flag (see " "-msan-poison-undef-vectors)."), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_X86_MemoryMapParams
static cl::opt< uint64_t > ClOriginBase("msan-origin-base", cl::desc("Define custom MSan OriginBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClCheckConstantShadow("msan-check-constant-shadow", cl::desc("Insert checks for constant shadow values"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams
static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams
static const unsigned kOriginSize
static cl::opt< bool > ClWithComdat("msan-with-comdat", cl::desc("Place MSan constructors in comdat sections"), cl::Hidden, cl::init(false))
static cl::opt< int > ClTrackOrigins("msan-track-origins", cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden, cl::init(0))
Track origins of uninitialized values.
static cl::opt< int > ClInstrumentationWithCallThreshold("msan-instrumentation-with-call-threshold", cl::desc("If the function being instrumented requires more than " "this number of checks and origin stores, use callbacks instead of " "inline checks (-1 means never use callbacks)."), cl::Hidden, cl::init(3500))
static cl::opt< int > ClPoisonStackPattern("msan-poison-stack-pattern", cl::desc("poison uninitialized stack variables with the given pattern"), cl::Hidden, cl::init(0xff))
static const Align kShadowTLSAlignment
static cl::opt< bool > ClHandleICmpExact("msan-handle-icmp-exact", cl::desc("exact handling of relational integer ICmp"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams
static cl::opt< bool > ClDumpStrictInstructions("msan-dump-strict-instructions", cl::desc("print out instructions with default strict semantics i.e.," "check that all the inputs are fully initialized, and mark " "the output as fully initialized. These semantics are applied " "to instructions that could not be handled explicitly nor " "heuristically."), cl::Hidden, cl::init(false))
static Constant * getOrInsertGlobal(Module &M, StringRef Name, Type *Ty)
static cl::opt< bool > ClPreciseDisjointOr("msan-precise-disjoint-or", cl::desc("Precisely poison disjoint OR. If false (legacy behavior), " "disjointedness is ignored (i.e., 1|1 is initialized)."), cl::Hidden, cl::init(false))
static const MemoryMapParams Linux_S390X_MemoryMapParams
static cl::opt< bool > ClPoisonStack("msan-poison-stack", cl::desc("poison uninitialized stack variables"), cl::Hidden, cl::init(true))
static const MemoryMapParams Linux_I386_MemoryMapParams
const char kMsanInitName[]
static cl::opt< bool > ClPoisonUndefVectors("msan-poison-undef-vectors", cl::desc("Precisely poison partially undefined constant vectors. " "If false (legacy behavior), the entire vector is " "considered fully initialized, which may lead to false " "negatives. Fully undefined constant vectors are " "unaffected by this flag (see -msan-poison-undef)."), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPrintStackNames("msan-print-stack-names", cl::desc("Print name of local stack variable"), cl::Hidden, cl::init(true))
static cl::opt< uint64_t > ClAndMask("msan-and-mask", cl::desc("Define custom MSan AndMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleLifetimeIntrinsics("msan-handle-lifetime-intrinsics", cl::desc("when possible, poison scoped variables at the beginning of the scope " "(slower, but more precise)"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClKeepGoing("msan-keep-going", cl::desc("keep going after reporting a UMR"), cl::Hidden, cl::init(false))
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams
static GlobalVariable * createPrivateConstGlobalForString(Module &M, StringRef Str)
Create a non-const global initialized with the given string.
static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClEagerChecks("msan-eager-checks", cl::desc("check arguments and return values at function call boundaries"), cl::Hidden, cl::init(false))
static cl::opt< int > ClDisambiguateWarning("msan-disambiguate-warning-threshold", cl::desc("Define threshold for number of checks per " "debug location to force origin update."), cl::Hidden, cl::init(3))
static VarArgHelper * CreateVarArgHelper(Function &Func, MemorySanitizer &Msan, MemorySanitizerVisitor &Visitor)
static const MemoryMapParams Linux_MIPS64_MemoryMapParams
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams
static cl::opt< uint64_t > ClXorMask("msan-xor-mask", cl::desc("Define custom MSan XorMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleAsmConservative("msan-handle-asm-conservative", cl::desc("conservative handling of inline assembly"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams
static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams
static const unsigned kParamTLSSize
static cl::opt< bool > ClHandleICmp("msan-handle-icmp", cl::desc("propagate shadow through ICmpEQ and ICmpNE"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClEnableKmsan("msan-kernel", cl::desc("Enable KernelMemorySanitizer instrumentation"), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPoisonStackWithCall("msan-poison-stack-with-call", cl::desc("poison uninitialized stack variables with a call"), cl::Hidden, cl::init(false))
static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams
static cl::opt< bool > ClDumpHeuristicInstructions("msan-dump-heuristic-instructions", cl::desc("Prints 'unknown' instructions that were handled heuristically. " "Use -msan-dump-strict-instructions to print instructions that " "could not be handled explicitly nor heuristically."), cl::Hidden, cl::init(false))
static const unsigned kRetvalTLSSize
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams
const char kMsanModuleCtorName[]
static const MemoryMapParams FreeBSD_I386_MemoryMapParams
static cl::opt< bool > ClCheckAccessAddress("msan-check-access-address", cl::desc("report accesses through a pointer which has poisoned shadow"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClDisableChecks("msan-disable-checks", cl::desc("Apply no_sanitize to the whole file"), cl::Hidden, cl::init(false))
#define T
FunctionAnalysisManager FAM
if(PassOpts->AAPipeline)
const SmallVectorImpl< MachineOperand > & Cond
static const char * name
void visit(MachineFunction &MF, MachineBasicBlock &Start, std::function< void(MachineBasicBlock *)> op)
This file implements a set that has insertion order iteration characteristics.
This file defines the SmallPtrSet class.
This file defines the SmallVector class.
This file contains some functions that are useful when dealing with strings.
#define LLVM_DEBUG(...)
Definition Debug.h:114
static TableGen::Emitter::OptClass< SkeletonEmitter > X("gen-skeleton-class", "Generate example skeleton class")
static SymbolRef::Type getType(const Symbol *Sym)
Definition TapiFile.cpp:39
Value * RHS
Value * LHS
static APInt getSignedMinValue(unsigned numBits)
Gets minimum signed value of APInt for a specific bit width.
Definition APInt.h:220
void setAlignment(Align Align)
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
const T & front() const
front - Get the first element.
Definition ArrayRef.h:146
static LLVM_ABI ArrayType * get(Type *ElementType, uint64_t NumElements)
This static method is the primary way to construct an ArrayType.
This class stores enough information to efficiently remove some attributes from an existing AttrBuild...
AttributeMask & addAttribute(Attribute::AttrKind Val)
Add an attribute to the mask.
iterator end()
Definition BasicBlock.h:472
LLVM_ABI const_iterator getFirstInsertionPt() const
Returns an iterator to the first instruction in this block that is suitable for inserting a non-PHI i...
LLVM_ABI const BasicBlock * getSinglePredecessor() const
Return the predecessor of this block if it has a single predecessor block.
InstListType::iterator iterator
Instruction iterators...
Definition BasicBlock.h:170
bool isInlineAsm() const
Check if this call is an inline asm statement.
Function * getCalledFunction() const
Returns the function called, or null if this is an indirect function invocation or the function signa...
bool hasRetAttr(Attribute::AttrKind Kind) const
Determine whether the return value has the given attribute.
LLVM_ABI bool paramHasAttr(unsigned ArgNo, Attribute::AttrKind Kind) const
Determine whether the argument or parameter has the given attribute.
void removeFnAttrs(const AttributeMask &AttrsToRemove)
Removes the attributes from the function.
void setCannotMerge()
MaybeAlign getParamAlign(unsigned ArgNo) const
Extract the alignment for a call or parameter (0=unknown).
Type * getParamByValType(unsigned ArgNo) const
Extract the byval type for a call or parameter.
Value * getCalledOperand() const
Type * getParamElementType(unsigned ArgNo) const
Extract the elementtype type for a parameter.
Value * getArgOperand(unsigned i) const
void setArgOperand(unsigned i, Value *v)
FunctionType * getFunctionType() const
iterator_range< User::op_iterator > args()
Iteration adapter for range-for loops.
void addParamAttr(unsigned ArgNo, Attribute::AttrKind Kind)
Adds the attribute to the indicated argument.
Predicate
This enumeration lists the possible predicates for CmpInst subclasses.
Definition InstrTypes.h:676
@ ICMP_SLT
signed less than
Definition InstrTypes.h:705
@ ICMP_SLE
signed less or equal
Definition InstrTypes.h:706
@ ICMP_SGT
signed greater than
Definition InstrTypes.h:703
@ ICMP_SGE
signed greater or equal
Definition InstrTypes.h:704
static LLVM_ABI Constant * get(ArrayType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getString(LLVMContext &Context, StringRef Initializer, bool AddNull=true)
This method constructs a CDS and initializes it with a text string.
static LLVM_ABI Constant * get(LLVMContext &Context, ArrayRef< uint8_t > Elts)
get() constructors - Return a constant with vector type with an element count and element type matchi...
static ConstantInt * getSigned(IntegerType *Ty, int64_t V)
Return a ConstantInt with the specified value for the specified type.
Definition Constants.h:131
static LLVM_ABI ConstantInt * getBool(LLVMContext &Context, bool V)
static LLVM_ABI Constant * get(StructType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getSplat(ElementCount EC, Constant *Elt)
Return a ConstantVector with the specified constant in each element.
static LLVM_ABI Constant * get(ArrayRef< Constant * > V)
This is an important base class in LLVM.
Definition Constant.h:43
static LLVM_ABI Constant * getAllOnesValue(Type *Ty)
LLVM_ABI bool isAllOnesValue() const
Return true if this is the value that would be returned by getAllOnesValue.
static LLVM_ABI Constant * getNullValue(Type *Ty)
Constructor to create a '0' constant of arbitrary type.
LLVM_ABI Constant * getAggregateElement(unsigned Elt) const
For aggregates (struct/array/vector) return the constant that corresponds to the specified element if...
LLVM_ABI bool isZeroValue() const
Return true if the value is negative zero or the null value.
Definition Constants.cpp:76
LLVM_ABI bool isNullValue() const
Return true if this is the value that would be returned by getNullValue.
Definition Constants.cpp:90
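Constant::getNullValue and Constant::getAllOnesValue materialize an all-zero or all-ones constant for an arbitrary first-class type. A hedged sketch of using them for shadow constants (the helper names getCleanShadow and getPoisonedShadow are illustrative; all-ones as "fully poisoned" is the conventional pattern, not a claim about this file's exact helpers):

#include "llvm/IR/Constants.h"
#include "llvm/IR/Type.h"
using namespace llvm;

// All-zero shadow: every bit of the value is considered initialized.
Constant *getCleanShadow(Type *ShadowTy) {
  return Constant::getNullValue(ShadowTy);
}

// All-ones shadow: every bit of the value is considered uninitialized.
Constant *getPoisonedShadow(Type *ShadowTy) {
  return Constant::getAllOnesValue(ShadowTy);
}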
static bool shouldExecute(unsigned CounterName)
bool empty() const
Definition DenseMap.h:109
unsigned getNumElements() const
static LLVM_ABI FixedVectorType * get(Type *ElementType, unsigned NumElts)
Definition Type.cpp:803
static FixedVectorType * getHalfElementsVectorType(FixedVectorType *VTy)
A handy container for a FunctionType+Callee-pointer pair, which can be passed around as a single enti...
unsigned getNumParams() const
Return the number of fixed parameters this function type requires.
LLVM_ABI void setComdat(Comdat *C)
Definition Globals.cpp:214
@ PrivateLinkage
Like Internal, but omit from symbol table.
Definition GlobalValue.h:61
@ ExternalLinkage
Externally visible function.
Definition GlobalValue.h:53
Analysis pass providing a never-invalidated alias analysis result.
ConstantInt * getInt1(bool V)
Get a constant value representing either true or false.
Definition IRBuilder.h:497
Value * CreateInsertElement(Type *VecTy, Value *NewElt, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2579
Value * CreateConstGEP1_32(Type *Ty, Value *Ptr, unsigned Idx0, const Twine &Name="")
Definition IRBuilder.h:1939
AllocaInst * CreateAlloca(Type *Ty, unsigned AddrSpace, Value *ArraySize=nullptr, const Twine &Name="")
Definition IRBuilder.h:1833
IntegerType * getInt1Ty()
Fetch the type representing a single bit.
Definition IRBuilder.h:547
LLVM_ABI CallInst * CreateMaskedCompressStore(Value *Val, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr)
Create a call to Masked Compress Store intrinsic.
Value * CreateInsertValue(Value *Agg, Value *Val, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2633
Value * CreateExtractElement(Value *Vec, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2567
IntegerType * getIntNTy(unsigned N)
Fetch the type representing an N-bit integer.
Definition IRBuilder.h:575
LoadInst * CreateAlignedLoad(Type *Ty, Value *Ptr, MaybeAlign Align, const char *Name)
Definition IRBuilder.h:1867
Value * CreateZExtOrTrunc(Value *V, Type *DestTy, const Twine &Name="")
Create a ZExt or Trunc from the integer value V to DestTy.
Definition IRBuilder.h:2103
CallInst * CreateMemCpy(Value *Dst, MaybeAlign DstAlign, Value *Src, MaybeAlign SrcAlign, uint64_t Size, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memcpy between the specified pointers.
Definition IRBuilder.h:687
LLVM_ABI CallInst * CreateAndReduce(Value *Src)
Create a vector int AND reduction intrinsic of the source vector.
Value * CreatePointerCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2254
Value * CreateExtractValue(Value *Agg, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2626
LLVM_ABI CallInst * CreateMaskedLoad(Type *Ty, Value *Ptr, Align Alignment, Value *Mask, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Load intrinsic.
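CreateMaskedLoad wraps the llvm.masked.load intrinsic. One plausible use when instrumenting a masked load is to read the shadow through the same mask, so that disabled lanes never touch shadow memory; a sketch under that assumption (all parameters are assumed to be computed elsewhere):

#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/Alignment.h"
using namespace llvm;

// Load the shadow of a masked load with the same mask; masked-off lanes
// receive the provided pass-through shadow instead of reading memory.
Value *loadShadowForMaskedLoad(IRBuilder<> &IRB, Type *ShadowTy,
                               Value *ShadowPtr, Align Alignment,
                               Value *Mask, Value *PassThruShadow) {
  return IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
                              PassThruShadow, "_msmaskedld");
}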
LLVM_ABI Value * CreateSelect(Value *C, Value *True, Value *False, const Twine &Name="", Instruction *MDFrom=nullptr)
BasicBlock::iterator GetInsertPoint() const
Definition IRBuilder.h:202
Value * CreateSExt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2097
Value * CreateIntToPtr(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2202
Value * CreateLShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1513
IntegerType * getInt32Ty()
Fetch the type representing a 32-bit integer.
Definition IRBuilder.h:562
ConstantInt * getInt8(uint8_t C)
Get a constant 8-bit value.
Definition IRBuilder.h:512
Value * CreatePtrAdd(Value *Ptr, Value *Offset, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:2039
IntegerType * getInt64Ty()
Fetch the type representing a 64-bit integer.
Definition IRBuilder.h:567
Value * CreateUDiv(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1454
Value * CreateICmpNE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2336
Value * CreateGEP(Type *Ty, Value *Ptr, ArrayRef< Value * > IdxList, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:1926
Value * CreateNeg(Value *V, const Twine &Name="", bool HasNSW=false)
Definition IRBuilder.h:1784
LLVM_ABI CallInst * CreateOrReduce(Value *Src)
Create a vector int OR reduction intrinsic of the source vector.
LLVM_ABI Value * CreateBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with 2 operands which is mangled on the first type.
LLVM_ABI CallInst * CreateIntrinsic(Intrinsic::ID ID, ArrayRef< Type * > Types, ArrayRef< Value * > Args, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with Args, mangled using Types.
ConstantInt * getInt32(uint32_t C)
Get a constant 32-bit value.
Definition IRBuilder.h:522
PHINode * CreatePHI(Type *Ty, unsigned NumReservedValues, const Twine &Name="")
Definition IRBuilder.h:2497
Value * CreateNot(Value *V, const Twine &Name="")
Definition IRBuilder.h:1808
Value * CreateICmpEQ(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2332
LLVM_ABI DebugLoc getCurrentDebugLocation() const
Get location information used by debugging information.
Definition IRBuilder.cpp:64
Value * CreateSub(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1420
Value * CreateBitCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2207
ConstantInt * getIntN(unsigned N, uint64_t C)
Get a constant N-bit value, zero extended or truncated from a 64-bit value.
Definition IRBuilder.h:533
LoadInst * CreateLoad(Type *Ty, Value *Ptr, const char *Name)
Provided to resolve 'CreateLoad(Ty, Ptr, "...")' correctly, instead of converting the string to 'bool...
Definition IRBuilder.h:1850
Value * CreateShl(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1492
CallInst * CreateMemSet(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memset to the specified pointer and the specified value.
Definition IRBuilder.h:630
Value * CreateZExt(Value *V, Type *DestTy, const Twine &Name="", bool IsNonNeg=false)
Definition IRBuilder.h:2085
Value * CreateShuffleVector(Value *V1, Value *V2, Value *Mask, const Twine &Name="")
Definition IRBuilder.h:2601
LLVMContext & getContext() const
Definition IRBuilder.h:203
Value * CreateAnd(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1551
StoreInst * CreateStore(Value *Val, Value *Ptr, bool isVolatile=false)
Definition IRBuilder.h:1863
LLVM_ABI CallInst * CreateMaskedStore(Value *Val, Value *Ptr, Align Alignment, Value *Mask)
Create a call to Masked Store intrinsic.
Value * CreateAdd(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1403
Value * CreatePtrToInt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2197
Value * CreateIsNotNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg != 0.
Definition IRBuilder.h:2659
CallInst * CreateCall(FunctionType *FTy, Value *Callee, ArrayRef< Value * > Args={}, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:2511
Value * CreateTrunc(Value *V, Type *DestTy, const Twine &Name="", bool IsNUW=false, bool IsNSW=false)
Definition IRBuilder.h:2071
PointerType * getPtrTy(unsigned AddrSpace=0)
Fetch the type representing a pointer.
Definition IRBuilder.h:605
Value * CreateBinOp(Instruction::BinaryOps Opc, Value *LHS, Value *RHS, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:1708
Value * CreateICmpSLT(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2364
LLVM_ABI Value * CreateTypeSize(Type *Ty, TypeSize Size)
Create an expression which evaluates to the number of units in Size at runtime.
Value * CreateICmpUGE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2344
Value * CreateIntCast(Value *V, Type *DestTy, bool isSigned, const Twine &Name="")
Definition IRBuilder.h:2280
Value * CreateIsNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg == 0.
Definition IRBuilder.h:2654
void SetInsertPoint(BasicBlock *TheBB)
This specifies that created instructions should be appended to the end of the specified block.
Definition IRBuilder.h:207
Type * getVoidTy()
Fetch the type representing void.
Definition IRBuilder.h:600
StoreInst * CreateAlignedStore(Value *Val, Value *Ptr, MaybeAlign Align, bool isVolatile=false)
Definition IRBuilder.h:1886
LLVM_ABI CallInst * CreateMaskedExpandLoad(Type *Ty, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Expand Load intrinsic.
Value * CreateInBoundsPtrAdd(Value *Ptr, Value *Offset, const Twine &Name="")
Definition IRBuilder.h:2044
Value * CreateAShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1532
Value * CreateXor(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1599
Value * CreateICmp(CmpInst::Predicate P, Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2442
Value * CreateOr(Value *LHS, Value *RHS, const Twine &Name="", bool IsDisjoint=false)
Definition IRBuilder.h:1573
IntegerType * getInt8Ty()
Fetch the type representing an 8-bit integer.
Definition IRBuilder.h:552
Value * CreateMul(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1437
LLVM_ABI CallInst * CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment, Value *Mask=nullptr)
Create a call to Masked Scatter intrinsic.
LLVM_ABI CallInst * CreateMaskedGather(Type *Ty, Value *Ptrs, Align Alignment, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Gather intrinsic.
This provides a uniform API for creating instructions and inserting them into a basic block: either a...
Definition IRBuilder.h:2788
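Most of the IRBuilder helpers listed above create a single instruction at the current insertion point and return it as a Value. As a hedged illustration of how they compose, here is a deliberately conservative shadow-propagation rule for a two-operand instruction (OR the operand shadows, so the result is marked poisoned wherever either input is); the pass itself uses more precise per-opcode rules:

#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Conservative propagation: result shadow = ShadowA | ShadowB.
Value *propagateShadowConservatively(IRBuilder<> &IRB, Value *ShadowA,
                                     Value *ShadowB) {
  return IRB.CreateOr(ShadowA, ShadowB, "_msprop");
}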
std::vector< ConstraintInfo > ConstraintInfoVector
Definition InlineAsm.h:123
void visit(Iterator Start, Iterator End)
Definition InstVisitor.h:87
const DebugLoc & getDebugLoc() const
Return the debug location for this node as a DebugLoc.
LLVM_ABI InstListType::iterator eraseFromParent()
This method unlinks 'this' from the containing basic block and deletes it.
MDNode * getMetadata(unsigned KindID) const
Get the metadata of given kind attached to this Instruction.
LLVM_ABI bool comesBefore(const Instruction *Other) const
Given an instruction Other in the same basic block as this instruction, return true if this instructi...
static LLVM_ABI IntegerType * get(LLVMContext &C, unsigned NumBits)
This static method is the primary way of constructing an IntegerType.
Definition Type.cpp:319
LLVM_ABI MDNode * createUnlikelyBranchWeights()
Return metadata containing two branch weights, with significant bias towards false destination.
Definition MDBuilder.cpp:48
A Module instance is used to store all the information related to an LLVM module.
Definition Module.h:67
void addIncoming(Value *V, BasicBlock *BB)
Add an incoming value to the end of the PHI list.
static LLVM_ABI PoisonValue * get(Type *T)
Static factory methods - Return a 'poison' object of the specified type.
A set of analyses that are preserved following a run of a transformation pass.
Definition Analysis.h:112
static PreservedAnalyses none()
Convenience factory function for the empty preserved set.
Definition Analysis.h:115
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition Analysis.h:118
PreservedAnalyses & abandon()
Mark an analysis as abandoned.
Definition Analysis.h:171
bool remove(const value_type &X)
Remove an item from the set vector.
Definition SetVector.h:180
bool insert(const value_type &X)
Insert a new element into the SetVector.
Definition SetVector.h:150
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
void push_back(const T &Elt)
StringRef - Represent a constant reference to a string, i.e.
Definition StringRef.h:55
static LLVM_ABI StructType * get(LLVMContext &Context, ArrayRef< Type * > Elements, bool isPacked=false)
This static method is the primary way to create a literal StructType.
Definition Type.cpp:414
unsigned getNumElements() const
Random access to the elements.
Type * getElementType(unsigned N) const
Analysis pass providing the TargetLibraryInfo.
Provides information about what library functions are available for the current target.
AttributeList getAttrList(LLVMContext *C, ArrayRef< unsigned > ArgNos, bool Signed, bool Ret=false, AttributeList AL=AttributeList()) const
bool getLibFunc(StringRef funcName, LibFunc &F) const
Searches for a particular function name.
Triple - Helper class for working with autoconf configuration names.
Definition Triple.h:47
bool isMIPS64() const
Tests whether the target is MIPS 64-bit (little and big endian).
Definition Triple.h:1040
@ loongarch64
Definition Triple.h:65
bool isRISCV32() const
Tests whether the target is 32-bit RISC-V.
Definition Triple.h:1083
bool isPPC32() const
Tests whether the target is 32-bit PowerPC (little and big endian).
Definition Triple.h:1056
ArchType getArch() const
Get the parsed architecture type of this triple.
Definition Triple.h:413
bool isRISCV64() const
Tests whether the target is 64-bit RISC-V.
Definition Triple.h:1088
bool isLoongArch64() const
Tests whether the target is 64-bit LoongArch.
Definition Triple.h:1029
bool isMIPS32() const
Tests whether the target is MIPS 32-bit (little and big endian).
Definition Triple.h:1035
bool isARM() const
Tests whether the target is ARM (little and big endian).
Definition Triple.h:923
bool isPPC64() const
Tests whether the target is 64-bit PowerPC (little and big endian).
Definition Triple.h:1061
bool isAArch64() const
Tests whether the target is AArch64 (little and big endian).
Definition Triple.h:1008
bool isSystemZ() const
Tests whether the target is SystemZ.
Definition Triple.h:1107
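The Triple predicates above are how a pass typically dispatches on the target when selecting per-platform parameters. An illustrative dispatch (the returned strings are placeholders, not the real mapping tables):

#include "llvm/TargetParser/Triple.h"
using namespace llvm;

// Placeholder target dispatch in the spirit of the predicates listed above.
const char *describeTargetMapping(const Triple &TT) {
  if (TT.isAArch64())
    return "aarch64 parameters";
  if (TT.isPPC64())
    return "powerpc64 parameters";
  if (TT.isSystemZ())
    return "systemz parameters";
  if (TT.isRISCV64())
    return "riscv64 parameters";
  return "unhandled target";
}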
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45
LLVM_ABI unsigned getIntegerBitWidth() const
bool isVectorTy() const
True if this is an instance of VectorType.
Definition Type.h:273
bool isArrayTy() const
True if this is an instance of ArrayType.
Definition Type.h:264
LLVM_ABI bool isScalableTy(SmallPtrSetImpl< const Type * > &Visited) const
Return true if this is a type whose size is a known multiple of vscale.
Definition Type.cpp:62
bool isIntOrIntVectorTy() const
Return true if this is an integer type or a vector of integer types.
Definition Type.h:246
bool isPointerTy() const
True if this is an instance of PointerType.
Definition Type.h:267
Type * getArrayElementType() const
Definition Type.h:408
bool isPPC_FP128Ty() const
Return true if this is powerpc long double.
Definition Type.h:165
static LLVM_ABI Type * getVoidTy(LLVMContext &C)
Definition Type.cpp:281
Type * getScalarType() const
If this is a vector type, return the element type, otherwise return 'this'.
Definition Type.h:352
LLVM_ABI TypeSize getPrimitiveSizeInBits() const LLVM_READONLY
Return the basic size of this type if it is a primitive type.
Definition Type.cpp:198
bool isSized(SmallPtrSetImpl< Type * > *Visited=nullptr) const
Return true if it makes sense to take the size of this type.
Definition Type.h:311
LLVM_ABI unsigned getScalarSizeInBits() const LLVM_READONLY
If this is a vector type, return the getPrimitiveSizeInBits value for the element type.
Definition Type.cpp:231
bool isFloatingPointTy() const
Return true if this is one of the floating-point types.
Definition Type.h:184
bool isIntOrPtrTy() const
Return true if this is an integer type or a pointer type.
Definition Type.h:255
bool isIntegerTy() const
True if this is an instance of IntegerType.
Definition Type.h:240
bool isFPOrFPVectorTy() const
Return true if this is a FP type or a vector of FP.
Definition Type.h:225
bool isVoidTy() const
Return true if this is 'void'.
Definition Type.h:139
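A recurring pattern with these Type queries is deriving an integer type of matching width, for example to model a scalar or vector value's shadow. A sketch under that assumption (not a claim about this file's exact type mapping); unsupported types simply yield nullptr here:

#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Type.h"
#include "llvm/Support/Casting.h"
using namespace llvm;

// Integer type (or vector of integers) with the same bit width per element.
Type *getMatchingIntTy(Type *OrigTy) {
  if (auto *VT = dyn_cast<FixedVectorType>(OrigTy))
    return FixedVectorType::get(
        IntegerType::get(VT->getContext(), VT->getScalarSizeInBits()),
        VT->getNumElements());
  if (OrigTy->isIntegerTy() || OrigTy->isFloatingPointTy())
    return IntegerType::get(OrigTy->getContext(),
                            OrigTy->getPrimitiveSizeInBits().getFixedValue());
  return nullptr; // Other types are out of scope for this sketch.
}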
Value * getOperand(unsigned i) const
Definition User.h:232
unsigned getNumOperands() const
Definition User.h:254
size_type count(const KeyT &Val) const
Return 1 if the specified key is in the map, 0 otherwise.
Definition ValueMap.h:156
Type * getType() const
All values are typed, get the type of this value.
Definition Value.h:256
LLVM_ABI void setName(const Twine &Name)
Change the name of the value.
Definition Value.cpp:382
LLVM_ABI StringRef getName() const
Return a constant reference to the value's name.
Definition Value.cpp:314
ElementCount getElementCount() const
Return an ElementCount instance to represent the (possibly scalable) number of elements in the vector...
Type * getElementType() const
int getNumOccurrences() const
constexpr ScalarTy getFixedValue() const
Definition TypeSize.h:201
constexpr bool isScalable() const
Returns whether the quantity is scaled by a runtime quantity (vscale).
Definition TypeSize.h:169
An efficient, type-erasing, non-owning reference to a callable.
const ParentTy * getParent() const
Definition ilist_node.h:34
self_iterator getIterator()
Definition ilist_node.h:123
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition raw_ostream.h:53
CallInst * Call
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
constexpr char Align[]
Key for Kernel::Arg::Metadata::mAlign.
constexpr std::underlying_type_t< E > Mask()
Get a bitmask with 1s in all places up to the high-order bit of E's largest value.
@ C
The default llvm calling convention, compatible with C.
Definition CallingConv.h:34
@ BasicBlock
Various leaf nodes.
Definition ISDOpcodes.h:81
initializer< Ty > init(const Ty &Val)
Function * Kernel
Summary of a kernel (=entry point for target offloading).
Definition OpenMPOpt.h:21
NodeAddr< FuncNode * > Func
Definition RDFGraph.h:393
friend class Instruction
Iterator for Instructions in a BasicBlock.
Definition BasicBlock.h:73
This is an optimization pass for GlobalISel generic memory operations.
unsigned Log2_32_Ceil(uint32_t Value)
Return the ceil log base 2 of the specified value, 32 if the value is zero.
Definition MathExtras.h:344
FunctionAddr VTableAddr Value
Definition InstrProf.h:137
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1655
auto enumerate(FirstRange &&First, RestRanges &&...Rest)
Given two or more input ranges, returns a new range whose values are tuples (A, B,...
Definition STLExtras.h:2472
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:643
@ Done
Definition Threading.h:60
bool isAligned(Align Lhs, uint64_t SizeInBytes)
Checks that SizeInBytes is a multiple of the alignment.
Definition Alignment.h:134
LLVM_ABI std::pair< Instruction *, Value * > SplitBlockAndInsertSimpleForLoop(Value *End, BasicBlock::iterator SplitBefore)
Insert a for (int i = 0; i < End; i++) loop structure (with the exception that End is assumed > 0,...
InnerAnalysisManagerProxy< FunctionAnalysisManager, Module > FunctionAnalysisManagerModuleProxy
Provide the FunctionAnalysisManager to Module proxy.
constexpr bool isPowerOf2_64(uint64_t Value)
Return true if the argument is a power of two > 0 (64-bit edition).
Definition MathExtras.h:284
unsigned Log2_64(uint64_t Value)
Return the floor log base 2 of the specified value, -1 if the value is zero.
Definition MathExtras.h:337
auto dyn_cast_or_null(const Y &Val)
Definition Casting.h:753
LLVM_ABI std::pair< Function *, FunctionCallee > getOrCreateSanitizerCtorAndInitFunctions(Module &M, StringRef CtorName, StringRef InitName, ArrayRef< Type * > InitArgTypes, ArrayRef< Value * > InitArgs, function_ref< void(Function *, FunctionCallee)> FunctionsCreatedCallback, StringRef VersionCheckName=StringRef(), bool Weak=false)
Creates sanitizer constructor function lazily.
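getOrCreateSanitizerCtorAndInitFunctions pairs naturally with appendToGlobalCtors (listed further below): the first builds a module constructor that calls a runtime init function, the second registers it in llvm.global_ctors. A minimal sketch with placeholder symbol names (example.module_ctor and __example_init are illustrative, not the names this pass emits):

#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"
#include <utility>
using namespace llvm;

// Create (or reuse) a module ctor that calls __example_init() and register it.
void installExampleCtor(Module &M) {
  std::pair<Function *, FunctionCallee> CtorAndInit =
      getOrCreateSanitizerCtorAndInitFunctions(
          M, "example.module_ctor", "__example_init",
          /*InitArgTypes=*/{}, /*InitArgs=*/{},
          [](Function *, FunctionCallee) { /* nothing extra on creation */ });
  appendToGlobalCtors(M, CtorAndInit.first, /*Priority=*/0);
}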
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition Debug.cpp:207
LLVM_ABI void report_fatal_error(Error Err, bool gen_crash_diag=true)
Definition Error.cpp:167
class LLVM_GSL_OWNER SmallVector
Forward declaration of SmallVector so that calculateSmallVectorDefaultInlinedElements can reference s...
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type argu...
Definition Casting.h:547
LLVM_ABI bool isKnownNonZero(const Value *V, const SimplifyQuery &Q, unsigned Depth=0)
Return true if the given value is known to be non-zero when defined.
LLVM_ABI raw_fd_ostream & errs()
This returns a reference to a raw_ostream for standard error.
AtomicOrdering
Atomic ordering for LLVM's memory model.
@ First
Helpers to iterate all locations in the MemoryEffectsBase class.
Definition ModRef.h:71
IRBuilder(LLVMContext &, FolderTy, InserterTy, MDNode *, ArrayRef< OperandBundleDef >) -> IRBuilder< FolderTy, InserterTy >
@ Or
Bitwise or logical OR of integers.
@ And
Bitwise or logical AND of integers.
@ Add
Sum of integers.
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition Alignment.h:144
DWARFExpression::Operation Op
RoundingMode
Rounding mode.
ArrayRef(const T &OneElt) -> ArrayRef< T >
constexpr unsigned BitWidth
LLVM_ABI void appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data=nullptr)
Append F to the list of global ctors of module M with the given Priority.
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:559
iterator_range< df_iterator< T > > depth_first(const T &G)
LLVM_ABI Instruction * SplitBlockAndInsertIfThen(Value *Cond, BasicBlock::iterator SplitBefore, bool Unreachable, MDNode *BranchWeights=nullptr, DomTreeUpdater *DTU=nullptr, LoopInfo *LI=nullptr, BasicBlock *ThenBlock=nullptr)
Split the containing block at the specified instruction - everything before SplitBefore stays in the ...
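SplitBlockAndInsertIfThen is the usual way to emit a guarded slow path: it splits the block at the given point and wires up a conditional branch to a fresh 'then' block. A sketch that calls a hypothetical reporting function when Cond is true (__example_report is illustrative only):

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
using namespace llvm;

// Branch to a non-terminating 'then' block when Cond is true and call a
// reporting function there; control then falls through to the split point.
void emitGuardedReport(Instruction *SplitBefore, Value *Cond) {
  Instruction *Then = SplitBlockAndInsertIfThen(
      Cond, SplitBefore->getIterator(), /*Unreachable=*/false);
  IRBuilder<> IRB(Then);
  Module *M = Then->getModule();
  FunctionCallee Report = M->getOrInsertFunction(
      "__example_report", Type::getVoidTy(M->getContext()));
  IRB.CreateCall(Report);
}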
LLVM_ABI void maybeMarkSanitizerLibraryCallNoBuiltin(CallInst *CI, const TargetLibraryInfo *TLI)
Given a CallInst, check if it calls a string function known to CodeGen, and mark it with NoBuiltin if...
Definition Local.cpp:3865
LLVM_ABI bool removeUnreachableBlocks(Function &F, DomTreeUpdater *DTU=nullptr, MemorySSAUpdater *MSSAU=nullptr)
Remove all blocks that cannot be reached from the function's entry.
Definition Local.cpp:2883
LLVM_ABI bool checkIfAlreadyInstrumented(Module &M, StringRef Flag)
Check if the module has the flag attached; if not, add the flag.
std::string itostr(int64_t X)
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition MIRParser.h:39
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition Alignment.h:39
constexpr uint64_t value() const
This is a hole in the type system and should not be abused.
Definition Alignment.h:77
LLVM_ABI void printPipeline(raw_ostream &OS, function_ref< StringRef(StringRef)> MapClassName2PassName)
LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition PassManager.h:70