#include "llvm/Transforms/IPO/LowerTypeTests.h"
enum { BitsPerByte = 8 }
ByteArrayBuilder()
void allocate(const std::set<uint64_t> &Bits, uint64_t BitSize, uint64_t &AllocByteOffset, uint8_t &AllocMask)
std::vector<uint8_t> Bytes
uint64_t BitAllocs[BitsPerByte]
This class is used to build a byte array containing overlapping bit sets. By loading from indexed offsets into the byte array and applying a mask, a program can test bits from a bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out:
A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit)
These bits can be laid out in a 16-byte array like this:
      Byte Offset
    0123456789ABCDEF
Bit
  7 HHHHHHHHHIIIIIII
  6 GGGGGGGGGGJJJJJJ
  5 FFFFFFFFFFFKKKKK
  4 EEEEEEEEEEEELLLL
  3 DDDDDDDDDDDDDMMM
  2 CCCCCCCCCCCCCCNN
  1 BBBBBBBBBBBBBBBO
  0 AAAAAAAAAAAAAAAA
For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM.
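The two test expressions above can be written as one helper that takes a byte offset and a mask. The following is a minimal standalone sketch rather than LLVM code: the names `BitSetSlot`, `SlotA`, `SlotI`, `ByteArray`, `setBit`, and `testBit` are hypothetical and introduced only for illustration, and the offsets and masks for A and I are read off the example layout.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical descriptor for one bit set stored in the shared byte array.
struct BitSetSlot {
  std::size_t ByteOffset; // index of the byte holding bit 0 of the set
  std::uint8_t Mask;      // single-bit mask selecting the set's bit position
  std::size_t BitSize;    // number of bits in the set
};

// Slots for sets A and I, taken from the example layout (assumed values).
constexpr BitSetSlot SlotA{0, 0x01, 16}; // bit 0 of bytes 0x0..0xF
constexpr BitSetSlot SlotI{9, 0x80, 7};  // bit 7 of bytes 0x9..0xF

using ByteArray = std::array<std::uint8_t, 16>;

// Mark bit X of the set described by S as present.
inline void setBit(ByteArray &Bytes, const BitSetSlot &S, std::size_t X) {
  assert(X < S.BitSize && "bit index out of range for this set");
  Bytes[S.ByteOffset + X] |= S.Mask;
}

// Test bit X of the set described by S: one indexed load, one mask, one compare.
inline bool testBit(const ByteArray &Bytes, const BitSetSlot &S,
                    std::size_t X) {
  assert(X < S.BitSize && "bit index out of range for this set");
  return (Bytes[S.ByteOffset + X] & S.Mask) != 0;
}
```

Bit 3 of I and bit 12 of A both live in byte 12, but in different bit positions, so setting one does not affect a test of the other.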
This is a byte array, rather than (say) a 2-byte array or a 4-byte array, because for one thing it gives us better packing (the more bins there are, the less evenly they will be filled), and for another, the instruction sequences can be slightly shorter, both on x86 and ARM.
BitsPerByte 
void ByteArrayBuilder::allocate(const std::set<uint64_t> &Bits, uint64_t BitSize, uint64_t &AllocByteOffset, uint8_t &AllocMask)
Allocate BitSize bits in the byte array, where Bits contains the bits to set. AllocByteOffset is set to the offset within the byte array and AllocMask is set to the bitmask for those bits. This uses the LPT (Longest Processing Time) multiprocessor scheduling algorithm to lay out the bits efficiently; the pass allocates bit sets in decreasing size order.
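To make the LPT step concrete, here is a minimal standalone sketch of how such an allocation could work. It is an illustration under stated assumptions, not the LLVM source: the struct name `SketchByteArrayBuilder` is hypothetical, each of the 8 bit positions is treated as one "processor", and ties are assumed to go to the lowest bit position.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the documented class; member names mirror the
// documented ones, but the body is only a sketch.
struct SketchByteArrayBuilder {
  enum { BitsPerByte = 8 };

  std::vector<std::uint8_t> Bytes;           // the shared byte array
  std::uint64_t BitAllocs[BitsPerByte] = {}; // bytes used so far per bit position

  void allocate(const std::set<std::uint64_t> &Bits, std::uint64_t BitSize,
                std::uint64_t &AllocByteOffset, std::uint8_t &AllocMask) {
    // LPT step: choose the bit position with the smallest current allocation
    // (assumption: ties go to the lowest position).
    unsigned Lane = 0;
    for (unsigned I = 1; I != BitsPerByte; ++I)
      if (BitAllocs[I] < BitAllocs[Lane])
        Lane = I;

    // The new set starts where that position's previous sets ended.
    AllocByteOffset = BitAllocs[Lane];
    const std::uint64_t ReqSize = AllocByteOffset + BitSize;
    BitAllocs[Lane] = ReqSize;
    if (Bytes.size() < ReqSize)
      Bytes.resize(ReqSize); // new bytes are zero-initialized

    // Record the requested bits under this position's mask.
    AllocMask = static_cast<std::uint8_t>(1u << Lane);
    for (std::uint64_t B : Bits) {
      assert(B < BitSize && "bit index exceeds the declared set size");
      Bytes[AllocByteOffset + B] |= AllocMask;
    }
  }
};
```

Calling this sketch with the 15 example sizes in decreasing order (16, 15, ..., 9, 7, ..., 1) should reproduce the example layout: the first eight sets start at offset 0 in bit positions 0 through 7, and the remaining seven fill the gaps from bit position 7 downward, ending with a 16-byte array.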
uint64_t llvm::lowertypetests::ByteArrayBuilder::BitAllocs[BitsPerByte] 
std::vector<uint8_t> llvm::lowertypetests::ByteArrayBuilder::Bytes 
