The LLVM Instrumentor Pass ¶

Introduction ¶

The Instrumentor is a highly configurable instrumentation pass for LLVM IR that lets users insert custom runtime function calls at various program points. Unlike traditional instrumentation tools that are hardcoded for a specific purpose (such as sanitizers or profilers), the Instrumentor takes a flexible, configuration-driven approach in which users specify:

What to instrument (loads, stores, allocations, function calls, etc.)
Where to instrument (before or after operations)
What information to pass to the runtime (pointers, values, sizes, types, etc.)
Whether to modify program behavior (replace values, redirect pointers, etc.)

The Instrumentor is designed to support a wide variety of use cases including:

Custom memory profilers and trackers
Performance analysis tools
Dynamic program analysis
Debugging and tracing utilities
Stack usage monitoring
Custom sanitizers and checkers

The recommended way to get started is the wizard script at ./llvm/utils/instrumentor-config-wizard.py. It interactively creates a configuration file and a stub runtime. A runtime implementation must be linked into the instrumented program.

Key Features ¶

Configurable Instrumentation Opportunities ¶

The Instrumentor supports instrumentation at multiple levels:

Instruction-level:

Load instructions: Instrument memory reads with access to pointer, loaded value, alignment, size, atomicity, etc.
Store instructions: Instrument memory writes with access to pointer, stored value, alignment, size, atomicity, etc.
Alloca instructions: Instrument stack allocations with access to size, alignment, and allocated address

Function-level:

Function entry: Instrument at function start with access to function name, address, arguments, etc.
Function exit: Instrument at function return

Future extensions:

Basic block entry/exit
Module-level initialization
Global variable access

PRE and POST Instrumentation ¶

Each instrumentation opportunity supports two positions:

PRE: Insert instrumentation before the operation occurs
- For loads: can inspect/modify the pointer before reading
- For stores: can inspect/modify the pointer and value before writing
- For allocas: can modify the allocation size
- For functions: instrument at function entry, inspect/replace arguments
POST: Insert instrumentation after the operation occurs
- For loads: can inspect/modify the loaded value
- For stores: instrument after the write completes
- For allocas: can inspect/modify the allocated address
- For functions: instrument at function exit

Selective Argument Passing ¶

For each instrumentation opportunity, users can individually enable/disable specific arguments to control:

What information is passed to the runtime function
The signature of the generated runtime function
Performance overhead (fewer arguments = faster calls)

For example, for load instrumentation, you can choose to pass:

Pointer address
Pointer address space
Loaded value
Value size
Alignment
Value type ID
Atomicity ordering
Synchronization scope
Volatility flag
Unique instrumentation ID

Value Replacement ¶

The Instrumentor supports replacing program values with values returned from the runtime:

Load replacement: The runtime can provide a different value than what was loaded from memory
Store replacement: The runtime can modify the pointer or value being stored
Alloca replacement: The runtime can provide a different allocation size or replace the allocated address
Argument replacement: The runtime can modify the arguments passed to a function

This enables use cases like:

Value redirection for debugging
Custom memory allocators
Fault injection
Taint tracking

Instrumentation Filtering ¶

The Instrumentor provides fine-grained control over what gets instrumented:

Target regex: Match against the target triple (e.g., x86_64-.*-linux)
Host/GPU toggle: Separately enable/disable CPU and GPU instrumentation
Function filtering: Exclude runtime functions from instrumentation via a regular expression
Property filters: Filter individual instrumentation points based on static properties (see Property Filtering below)

Property Filtering ¶

The Instrumentor supports fine-grained filtering of individual instrumentation opportunities based on their static properties. This allows you to instrument only specific operations that meet certain criteria, such as:

Only atomic loads with specific orderings
Only volatile memory accesses
Functions with specific name patterns
Allocations above a certain size threshold

Property filters are specified using the filter field in the configuration JSON for each instrumentation opportunity.

Filter Syntax ¶

The filter expression language supports:

Integer comparisons:

== (equal)
!= (not equal)
< (less than)
> (greater than)
<= (less than or equal)
>= (greater than or equal)

String comparisons:

== (equal, with quoted string)
!= (not equal, with quoted string)
.startswith("prefix") (prefix check, with quoted string)

Pointer comparisons:

==null (null pointer check)
!=null (non-null pointer check)

Logical operators:

&& (logical AND)
|| (logical OR)

Important notes:

If a property value is dynamic (not a compile-time constant), the filter is assumed to pass
Empty filters always pass
String literals must be enclosed in double quotes, which are escaped as \" inside the JSON filter string
Property names are specific to each instrumentation opportunity (see Available Properties below)
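
The first note above can be read as a three-valued check: a comparison on a constant property is evaluated normally, while a comparison on a dynamic property counts as passing. The sketch below models that rule in C. It is a hypothetical model for illustration, not the pass's implementation; `property_t` and `filter_ge` are names made up here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a filtered property: either a compile-time
   constant with a known value, or dynamic (unknown until run time). */
typedef struct {
  bool is_constant;
  int64_t value; /* only meaningful when is_constant is true */
} property_t;

/* Models a filter term such as "size>=64": a dynamic property is
   assumed to pass, a constant one is compared normally. */
static bool filter_ge(property_t prop, int64_t threshold) {
  if (!prop.is_constant)
    return true;
  return prop.value >= threshold;
}
```

Under this model, an operation with a dynamic property still reaches the runtime, so a runtime that depends on the threshold would need to re-check the value itself.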

Filter Examples ¶

Filter only atomic loads:

{
  "instruction_post": {
    "load": {
      "enabled": true,
      "filter": "atomicity_ordering>0",
      "pointer": true,
      "value": true
    }
  }
}

Filter volatile stores or stores with AcquireRelease ordering:

{
  "instruction_post": {
    "store": {
      "enabled": true,
      "filter": "is_volatile==1 || atomicity_ordering==6",
      "pointer": true,
      "value": true
    }
  }
}

Filter functions by name prefix:

{
  "function_pre": {
    "function": {
      "enabled": true,
      "filter": "name.startswith(\"test_\")",
      "name": true
    }
  }
}

Complex filter with multiple conditions:

{
  "instruction_post": {
    "load": {
      "enabled": true,
      "filter": "(atomicity_ordering==4 || atomicity_ordering==7) && sync_scope_id==0",
      "pointer": true,
      "value": true,
      "atomicity_ordering": true
    }
  }
}

Available Properties ¶

The properties available for filtering depend on the instrumentation opportunity. In general, every value that can be passed to the runtime is also available as a filter property under the same name.

Load/Store instructions:

atomicity_ordering (integer): 0=non-atomic, 1=Unordered, 2=Monotonic, 4=Acquire, 5=Release, 6=AcquireRelease, 7=SequentiallyConsistent
sync_scope_id (integer): synchronization scope identifier
is_volatile (integer): 1 if volatile, 0 otherwise
alignment (integer): alignment in bytes
value_size (integer): size of the loaded or stored value in bytes

Function instrumentation:

name (string): function name
num_arguments (integer): number of function arguments
is_main (integer): 1 if this is the main function, 0 otherwise

Alloca instructions:

size (integer): allocation size in bytes (if constant)
alignment (integer): allocation alignment in bytes
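
When atomicity_ordering is passed to the runtime, the integer can be turned back into a readable name. The helper below is a small sketch that uses only the numbering listed for Load/Store instructions above; `ordering_name` is a hypothetical name and not part of the Instrumentor.

```c
#include <stdint.h>

/* Maps the atomicity_ordering integer to a printable name, using the
   numbering from the property list above. */
static const char *ordering_name(int32_t ordering) {
  switch (ordering) {
  case 0: return "non-atomic";
  case 1: return "unordered";
  case 2: return "monotonic";
  case 4: return "acquire";
  case 5: return "release";
  case 6: return "acquire-release";
  case 7: return "sequentially-consistent";
  default: return "unknown"; /* e.g. 3, which is not listed above */
  }
}
```

A runtime hook that receives the ordering could pass it through this helper before printing, instead of logging the raw integer.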

Configuration System ¶

The Instrumentor uses a JSON-based configuration system that allows users to:

Generate a default configuration showing all available options
Interactively customize the configuration using the wizard
Load and modify existing configurations
Generate runtime stub implementations

Configuration File Format ¶

The configuration file is a JSON document with the following structure:

{
  "configuration": {
    "runtime_prefix": "__instrumentor_",
    "target_regex": "",
    "host_enabled": true,
    "gpu_enabled": true
  },
  "function_pre": {
    "function": {
      "enabled": true,
      "address": true,
      "name": true,
      "id": true
    }
  },
  "instruction_pre": {
    "load": {
      "enabled": true,
      "pointer": true,
      "pointer.replace": false,
      "value_size": true,
      "id": true
    },
    "store": {
      "enabled": true,
      "pointer": true,
      "value": true,
      "value_size": true
    }
  },
  "instruction_post": {
    "load": {
      "enabled": true,
      "value": true,
      "value.replace": false
    }
  }
}

Configuration Sections ¶

configuration

Global settings that apply to all instrumentation:

runtime_prefix: Prefix for all runtime function names (default: __instrumentor_)
target_regex: Regular expression to filter targets (empty = all targets)
host_enabled: Enable instrumentation for CPU targets (default: true)
gpu_enabled: Enable instrumentation for GPU targets (default: true)

function_pre / function_post

Function-level instrumentation configuration.

instruction_pre / instruction_post

Instruction-level instrumentation configuration, with subsections for each instruction type (load, store, alloca, etc.).

Argument Configuration ¶

Each instrumentation opportunity is configured with the following keys:

enabled: Boolean to enable/disable the entire opportunity
filter: Optional string expression to filter instrumentation based on static properties (see Property Filtering)
<argument_name>: Boolean to enable/disable passing this argument
<argument_name>.replace: Boolean to enable value replacement (only for replaceable arguments)
<argument_name>.description: Human-readable description of the argument

The Configuration Wizard ¶

The Instrumentor includes an interactive configuration wizard that simplifies the process of creating and modifying configurations.

Running the Wizard ¶

# Run the wizard interactively
./llvm/utils/instrumentor-config-wizard.py

# Specify output location
./llvm/utils/instrumentor-config-wizard.py -o my_config.json

# Use specific opt binary
./llvm/utils/instrumentor-config-wizard.py --opt-path /path/to/opt

# Load and modify existing configuration
./llvm/utils/instrumentor-config-wizard.py --input existing.json -o modified.json

Wizard Workflow ¶

The wizard guides you through five steps:

Step 1: Select Instrumentation Types

Choose which types of operations to instrument (load, store, alloca, function, etc.). This is a high-level selection - you can configure individual arguments later.

Step 2: PRE vs POST Configuration

Decide whether PRE and POST instrumentation should use the same configuration or different configurations. This saves time when you want both positions to have identical settings.

Step 3: Base Configuration

Configure global settings:

Runtime prefix for function names
Target regex for filtering
Enable/disable host (CPU) instrumentation
Enable/disable GPU instrumentation

Step 4: Configure Arguments

For each enabled instrumentation type, select which arguments to pass to the runtime function. You can:

Toggle individual arguments on/off
Enable value replacement for replaceable arguments
Enable all or disable all arguments
Configure PRE and POST separately (if selected in Step 2)

Step 5: Review and Save

Review your configuration and optionally generate runtime stub implementations. The wizard displays a summary and provides commands for using the configuration with opt and clang.

Generating Runtime Stubs ¶

The wizard can automatically generate C stub implementations of your runtime functions:

In Step 5, select ‘g’ to generate stubs
Specify the output file path (default: <config_name>_stubs.c)
The wizard creates a C file with stub implementations that print their arguments

The generated stubs are useful as:

Starting templates for implementing your runtime
Documentation of the expected function signatures
Quick prototypes for testing instrumentation

Example stub output:

void __instrumentor_pre_load(void *pointer, int32_t pointer_as,
                              uint64_t value_size, int32_t id) {
  printf("load pre -- pointer: %p, pointer_as: %i, "
         "value_size: %lu, id: %i\n",
         pointer, pointer_as, value_size, id);
}

Usage Examples ¶

Basic Usage with opt ¶

Step 1: (Optional) Generate a default configuration

opt -passes=instrumentor \
    -instrumentor-write-config-file=config.json \
    -disable-output \
    input.ll

This creates config.json with all available instrumentation opportunities and their arguments.

Step 2: Customize the configuration

Edit config.json manually or load it into the wizard. The wizard can also be run without an input file, which makes Step 1 unnecessary:

./llvm/utils/instrumentor-config-wizard.py --input config.json -o custom.json

Step 3: Apply instrumentation

opt -passes=instrumentor \
    -instrumentor-read-config-file=custom.json \
    input.ll -S -o instrumented.ll

The instrumented output contains calls to your runtime functions at the configured program points.

Using with Clang ¶

To instrument during compilation:

clang -mllvm -enable-instrumentor \
      -mllvm -instrumentor-read-config-file=config.json \
      source.c -o program

Complete Workflow Example ¶

Here’s a complete example for creating a simple memory access profiler:

1. Create configuration with the wizard:

./llvm/utils/instrumentor-config-wizard.py -o memory_profiler.json

# In the wizard:
# - Enable: load, store
# - Use same config for PRE/POST: yes
# - Base config: keep defaults
# - For load/store: enable pointer, value_size, id
# - Generate stubs: yes (memory_profiler_stubs.c)

2. Implement the runtime:

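// memory_runtime.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t load_count = 0;
static uint64_t store_count = 0;

void __instrumentor_pre_load(void *pointer, uint64_t value_size,
                             int32_t id) {
  load_count++;
  printf("Load from %p (size: %" PRIu64 ", id: %" PRId32 ")\n",
         pointer, value_size, id);
}

void __instrumentor_pre_store(void *pointer, uint64_t value_size,
                              int32_t id) {
  store_count++;
  printf("Store to %p (size: %" PRIu64 ", id: %" PRId32 ")\n",
         pointer, value_size, id);
}

// The wizard settings above apply the same configuration to POST, so the
// POST hooks must exist as well; they are left empty here.
void __instrumentor_post_load(void *pointer, uint64_t value_size,
                              int32_t id) {}
void __instrumentor_post_store(void *pointer, uint64_t value_size,
                               int32_t id) {}

// Print the totals when the program exits.
__attribute__((destructor))
void print_stats(void) {
  printf("Total loads: %" PRIu64 "\n", load_count);
  printf("Total stores: %" PRIu64 "\n", store_count);
}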

3. Instrument and compile:

# Instrument the program
clang -emit-llvm -S -o program.ll program.c
opt -passes=instrumentor \
    -instrumentor-read-config-file=memory_profiler.json \
    program.ll -S -o program_inst.ll

# Compile with runtime
clang program_inst.ll memory_runtime.c -o program

4. Run and observe:

./program
# Output includes:
# Load from 0x7ffc12345678 (size: 4, id: 1)
# Store to 0x7ffc12345680 (size: 8, id: 2)
# ...
# Total loads: 42
# Total stores: 27

Advanced Use Cases ¶

Stack Usage Profiling ¶

Configure alloca instrumentation to track stack allocations:

{
  "instruction_pre": {
    "alloca": {
      "enabled": true,
      "size": true,
      "alignment": true,
      "id": true
    }
  },
  "instruction_post": {
    "alloca": {
      "enabled": true,
      "address": true,
      "size": true,
      "id": true
    }
  }
}

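Runtime implementation of the POST hook (the PRE hook enabled above, __instrumentor_pre_alloca, must be defined as well):

#include <stdint.h>

static uint64_t total_stack_usage = 0;
static uint64_t peak_stack_usage = 0;
// Never decremented in this sketch, so the peak equals the total; a full
// profiler would subtract a frame's allocations when its function returns.
static uint64_t current_stack_usage = 0;

void __instrumentor_post_alloca(void *address, uint64_t size,
                                int32_t id) {
  current_stack_usage += size;
  total_stack_usage += size;
  if (current_stack_usage > peak_stack_usage) {
    peak_stack_usage = current_stack_usage;
  }
}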

Value Replacement for Fault Injection ¶

Use value replacement to inject faults:

{
  "instruction_post": {
    "load": {
      "enabled": true,
      "value": true,
      "value.replace": true,
      "pointer": true
    }
  }
}

Runtime implementation:

#include <stdint.h>
#include <stdio.h>

// Replace every 1000th loaded value with zero
static uint64_t load_counter = 0;

uint64_t __instrumentor_post_load(uint64_t value, void *pointer) {
  if (++load_counter % 1000 == 0) {
    printf("Injecting fault at %p\n", pointer);
    return 0;  // Return fault value
  }
  return value;  // Return original value
}

Function-Level Tracing ¶

Instrument function entry and exit:

{
  "function_pre": {
    "function": {
      "enabled": true,
      "name": true,
      "address": true,
      "num_arguments": true,
      "id": true
    }
  },
  "function_post": {
    "function": {
      "enabled": true,
      "name": true,
      "id": true
    }
  }
}

Runtime implementation:

#include <stdint.h>
#include <stdio.h>

// Current nesting depth, used to indent the trace output
static int call_depth = 0;

void __instrumentor_pre_function(char *name, void *address,
                                 int32_t num_args, int32_t id) {
  printf("%*sEntering %s (%p) with %d args\n",
         call_depth * 2, "", name, address, num_args);
  call_depth++;
}

void __instrumentor_post_function(char *name, int32_t id) {
  call_depth--;
  printf("%*sExiting %s\n", call_depth * 2, "", name);
}

GPU Instrumentation ¶

The Instrumentor supports GPU targets (AMDGPU and NVPTX). Configure GPU-specific instrumentation:

{
  "configuration": {
    "runtime_prefix": "__gpu_runtime_",
    "target_regex": "(amdgcn|nvptx).*",
    "host_enabled": false,
    "gpu_enabled": true
  },
  "instruction_pre": {
    "load": {
      "enabled": true,
      "pointer": true,
      "pointer_as": true
    }
  }
}

Note that GPU runtime functions must be implemented with appropriate device attributes.

Implementation Details ¶

Generated Runtime Function Signatures ¶

The Instrumentor generates runtime function names following this pattern:

<runtime_prefix><position>_<opportunity_name>[_ind]

Where:

<runtime_prefix>: Configurable prefix (default: __instrumentor_)
<position>: Either pre or post
<opportunity_name>: Name of the instrumentation opportunity (load, store, function, etc.)
_ind: Optional suffix when indirection is used (see below)

Examples:

__instrumentor_pre_load
__instrumentor_post_store
__instrumentor_pre_function
__instrumentor_pre_load_ind (with indirection)
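
The naming pattern above can be reproduced with a single format string, which may help when a tool needs to predict the symbol names a configuration will require. This is a minimal sketch; `runtime_function_name` is a hypothetical helper written for illustration and is not provided by LLVM.

```c
#include <stdbool.h>
#include <stdio.h>

/* Builds "<runtime_prefix><position>_<opportunity_name>[_ind]" into buf.
   Returns the length snprintf reports; a result >= size means the name
   was truncated. */
static int runtime_function_name(char *buf, size_t size, const char *prefix,
                                 const char *position,
                                 const char *opportunity, bool indirect) {
  return snprintf(buf, size, "%s%s_%s%s", prefix, position, opportunity,
                  indirect ? "_ind" : "");
}
```

For example, the arguments "__instrumentor_", "pre", "load" and true produce the last name in the list above.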

Direct vs Indirect Arguments ¶

The Instrumentor uses two modes for passing arguments:

Direct mode (default):

Arguments are passed by value. This is efficient but requires that all arguments fit in registers or can be passed through the stack efficiently.

Indirect mode:

Arguments are passed by pointer. This is used automatically when:

Replacement is enabled for more than one argument (all replaceable arguments are then passed indirectly)
An argument’s value is too large (aggregate types, large values)

When indirect mode is used, a separate function with the _ind suffix is generated:

// Direct mode
void __instrumentor_pre_load(void *pointer, uint64_t value_size);

// Indirect mode (automatically generated when needed)
void __instrumentor_pre_load_ind(void **pointer, uint32_t pointer_size,
                                  void *value_size, uint32_t value_size_size);

Users typically don’t need to worry about this - the Instrumentor handles it automatically and the wizard-generated stubs show the correct signatures.
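
For readers who do write an indirect hook by hand, the sketch below shows one way to read the arguments behind the pointers. It uses the `_ind` signature shown above and assumes that each `*_size` parameter gives the byte size of the object its partner points to; that reading is an assumption, so the generated stubs remain the reference. The `last_pointer` and `last_value_size` variables are hypothetical and exist only so the result can be inspected.

```c
#include <stdint.h>
#include <string.h>

static void *last_pointer;       /* hypothetical: last pointer seen */
static uint64_t last_value_size; /* hypothetical: last value_size seen */

/* Indirect-mode hook with the signature shown above. Assumption: each
   *_size parameter is the byte size of the object its partner points to. */
void __instrumentor_pre_load_ind(void **pointer, uint32_t pointer_size,
                                 void *value_size, uint32_t value_size_size) {
  uint64_t size = 0;
  /* Only read when the sizes agree; memcpy avoids alignment assumptions. */
  if (value_size_size == sizeof(size))
    memcpy(&size, value_size, sizeof(size));
  last_value_size = size;
  if (pointer_size == sizeof(void *))
    last_pointer = *pointer;
}
```

Checking the size before each read keeps the hook from interpreting bytes that belong to a differently sized argument.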

Unique IDs ¶

When the id argument is enabled, the Instrumentor assigns a unique 32-bit integer to each instrumentation call site:

PRE positions get positive IDs (1, 2, 3, …)
POST positions get negative IDs (-1, -2, -3, …)
IDs are consistent across multiple runs
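
Because the sign of the ID encodes the position, a runtime can recover it from the ID alone. The helpers below are a small sketch of that decoding; `id_is_pre` and `id_ordinal` are hypothetical names. The sketch does not assume that a PRE ID k and a POST ID -k refer to the same operation, since the text above does not state that.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for PRE call sites (positive IDs), false for POST (negative IDs). */
static bool id_is_pre(int32_t id) { return id > 0; }

/* Magnitude of the ID. Widened to int64_t first so that negating
   INT32_MIN cannot overflow. */
static int64_t id_ordinal(int32_t id) {
  int64_t wide = id;
  return wide < 0 ? -wide : wide;
}
```

The ordinal can serve as an index into separate per-position tables, for example one array of counters for PRE sites and one for POST sites.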

Caching ¶

The Instrumentor caches certain argument values between PRE and POST calls when possible:

Values computed in PRE are reused in POST (e.g., pointer value)
This reduces overhead and ensures consistency

Runtime Function Requirements ¶

Runtime functions must be:

Defined with external linkage
Fast and non-blocking (to minimize instrumentation overhead)
Thread-safe if the program is multi-threaded

Runtime functions must not:

Call back into instrumented code (to avoid infinite recursion)
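
The requirements above can be combined in a single hook. The following sketch uses a C11 atomic counter for thread safety and a thread-local flag as a guard against re-entering the runtime. It assumes the direct-mode signature with pointer and value_size enabled; `load_count`, `in_runtime` and `instrumentor_load_count` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint64_t load_count;   /* shared across threads */
static _Thread_local bool in_runtime; /* per-thread reentrancy flag */

/* Hook for a configuration that passes pointer and value_size. */
void __instrumentor_pre_load(void *pointer, uint64_t value_size) {
  (void)pointer;
  (void)value_size;
  if (in_runtime)
    return; /* already inside the runtime on this thread */
  in_runtime = true;
  /* Relaxed ordering is enough for a statistics counter. */
  atomic_fetch_add_explicit(&load_count, 1, memory_order_relaxed);
  in_runtime = false;
}

/* Accessor for reporting code. */
uint64_t instrumentor_load_count(void) {
  return atomic_load_explicit(&load_count, memory_order_relaxed);
}
```

The guard matters once the hook body calls code that might itself be instrumented. In this sketch the body is a single atomic add, so the flag only demonstrates the pattern.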

Performance Considerations ¶

Overhead Factors ¶

Instrumentation overhead depends on:

Number of instrumentation points: More instrumented operations = more overhead
Number of arguments passed: Each argument adds instructions and register pressure
Runtime function complexity: Complex runtime logic increases overhead
Frequency of instrumented operations: Instrumenting hot loops has high impact

Optimization Tips ¶

Minimize arguments:

Only enable arguments you actually need. Passing fewer arguments reduces overhead.

Use PRE or POST, not both:

If you only need one position, disable the other.

Target filtering:

Use target_regex to restrict instrumentation to modules whose target triple matches.

Efficient runtime:

Keep runtime functions simple and fast. Consider:

Lock-free data structures
Thread-local storage
Batching outputs instead of per-call I/O
Sampling (instrument 1 in N calls)
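
Three of the suggestions above (thread-local storage, batching and sampling) fit together in a few lines. The sketch below records one in every SAMPLE_PERIOD calls into a per-thread buffer and flushes the buffer when it is full. The constants, `record_t`, `should_sample` and `flush_batch` are hypothetical, and the hook assumes a configuration that passes pointer and value_size. The sampling happens inside the runtime, so the call itself still executes every time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_PERIOD 100  /* hypothetical: record 1 in 100 calls */
#define BATCH_CAPACITY 256 /* hypothetical: records per flush */

typedef struct {
  void *pointer;
  uint64_t value_size;
} record_t;

static _Thread_local uint32_t calls_until_sample = SAMPLE_PERIOD;
static _Thread_local record_t batch[BATCH_CAPACITY];
static _Thread_local size_t batch_len;
static _Thread_local uint64_t flush_count;

/* Returns true once every SAMPLE_PERIOD calls on the calling thread. */
static bool should_sample(void) {
  if (--calls_until_sample != 0)
    return false;
  calls_until_sample = SAMPLE_PERIOD;
  return true;
}

/* A real runtime would write the buffered records out here in one step. */
static void flush_batch(void) {
  batch_len = 0;
  flush_count++;
}

void __instrumentor_pre_load(void *pointer, uint64_t value_size) {
  if (!should_sample())
    return;
  batch[batch_len++] = (record_t){pointer, value_size};
  if (batch_len == BATCH_CAPACITY)
    flush_batch();
}
```

Per-thread buffers avoid locking on the hot path. The trade-off is that records left in a buffer when a thread exits are lost unless the runtime also flushes at that point.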

Build with optimizations:

Use -O2 or -O3 when compiling instrumented code. LLVM can optimize away some overhead.

Troubleshooting ¶

Common Issues ¶

“Could not find ‘opt’ binary”

The wizard can’t locate the opt binary.

Specify the path: --opt-path /path/to/opt

“Indirection needed but not indicated”

An argument value is too large for direct passing. The Instrumentor handles this automatically, but you might see this warning. It’s usually harmless - the indirect version of the function will be generated.

Infinite recursion / stack overflow

Your runtime function is calling back into instrumented code. Ensure that runtime functions, and everything they call, do not trigger more instrumentation.

Linking errors

Runtime functions are undefined. You must:

Implement all enabled runtime functions
Link the runtime implementation with your program
Use the exact function signatures (check generated stubs)

Unexpected instrumentation

More instrumentation than expected. Check:

The enabled flag for each opportunity
host_enabled / gpu_enabled settings
target_regex matches your target
Runtime functions aren’t being instrumented (they should be automatically excluded)
Property filters (filter field) are correctly specified

Less instrumentation than expected

Property filters may be excluding instrumentation points:

Check if the filter field is set for the instrumentation opportunity
Remember that filters only apply to static (compile-time constant) properties
Dynamic values always pass the filter
Use empty filter ("filter": "") or remove the field to disable filtering
Test your filter expression by examining the IR properties

Filter syntax errors

Invalid filter expressions will be reported as errors:

Ensure string literals are quoted: "name==\"foo\"" not "name==foo"
Use correct operators: && for AND, || for OR
Property names must match exactly
Close all parentheses and quotes

Debugging Instrumented Code ¶

View instrumented IR:

opt -passes=instrumentor \
    -instrumentor-read-config-file=config.json \
    input.ll -S -o output.ll

# Examine output.ll to see inserted calls

Print configuration:

opt -passes=instrumentor \
    -instrumentor-write-config-file=debug_config.json \
    input.ll -disable-output

# Examine debug_config.json to see all options

Verify IR: The Instrumentor automatically verifies the module after instrumentation. If verification fails, there’s a bug in the Instrumentor or the configuration is invalid.

Use debug builds: Build LLVM with assertions enabled (-DLLVM_ENABLE_ASSERTIONS=ON) to catch issues early.

Extending the Instrumentor ¶

The Instrumentor is designed to be extensible. To add new instrumentation opportunities:

Define the opportunity class inheriting from InstrumentationOpportunity
Implement getter/setter functions for the arguments
Add initialization to populate the opportunity with arguments
Register the opportunity in InstrumentationConfig::populate()
Add tests in llvm/test/Transforms/Instrumentor/

See llvm/lib/Transforms/IPO/Instrumentor.cpp and llvm/include/llvm/Transforms/IPO/Instrumentor.h for examples (LoadIO, StoreIO).

Future instrumentation opportunities being considered:

Basic block entry/exit
Branch instrumentation
Call instructions
Atomic operations
Vector operations
Exception handling
Global variable access

Reference ¶

Command-Line Options ¶

-instrumentor-read-config-file=<path>: Load instrumentation configuration from the specified JSON file.
-instrumentor-write-config-file=<path>: Write the default instrumentation configuration to the specified JSON file (useful for generating templates).