LLVM 22.0.0git
llvm::cas::OnDiskTrieRawHashMap Class Reference

OnDiskTrieRawHashMap is a persistent trie data structure used as a hash map.

#include "llvm/CAS/OnDiskTrieRawHashMap.h"

Classes

class  const_pointer
struct  ConstValueProxy
 Const value proxy to access the records stored in TrieRawHashMap.
struct  ImplType
class  pointer
class  PointerImpl
 Template class to implement a pointer type into the trie data structure.
struct  ValueProxy
 Value proxy to access the records stored in TrieRawHashMap.

Public Types

using LazyInsertOnConstructCB
using LazyInsertOnLeakCB

Public Member Functions

LLVM_DUMP_METHOD void dump () const
void print (raw_ostream &OS, function_ref< void(ArrayRef< char >)> PrintRecordData=nullptr) const
Error validate (function_ref< Error(FileOffset, ConstValueProxy)> RecordVerifier) const
 Validate the trie data structure.
const_pointer find (ArrayRef< uint8_t > Hash) const
 Find the value from hash.
Expected< const_pointer > recoverFromFileOffset (FileOffset Offset) const
 Helper function to recover a pointer into the trie from file offset.
Expected< pointer > insertLazy (ArrayRef< uint8_t > Hash, LazyInsertOnConstructCB OnConstruct=nullptr, LazyInsertOnLeakCB OnLeak=nullptr)
 Insert lazily.
Expected< pointer > insert (const ConstValueProxy &Value)
size_t size () const
size_t capacity () const
 OnDiskTrieRawHashMap (OnDiskTrieRawHashMap &&RHS)
OnDiskTrieRawHashMap & operator= (OnDiskTrieRawHashMap &&RHS)
 ~OnDiskTrieRawHashMap ()

Static Public Member Functions

static bool validOffset (FileOffset Offset)
 Check the valid range of file offset for OnDiskTrieRawHashMap.
static Expected< OnDiskTrieRawHashMap > create (const Twine &Path, const Twine &TrieName, size_t NumHashBits, uint64_t DataSize, uint64_t MaxFileSize, std::optional< uint64_t > NewFileInitialSize, std::optional< size_t > NewTableNumRootBits=std::nullopt, std::optional< size_t > NewTableNumSubtrieBits=std::nullopt)
 Gets or creates a file at Path with a hash-mapped trie named TrieName.

Detailed Description

OnDiskTrieRawHashMap is a persistent trie data structure used as a hash map.

The keys are fixed length, and are expected to be binary hashes with a uniform distribution.

  • Thread-safety is achieved through the use of atomics within a shared memory mapping. Atomic access does not work on networked filesystems.
  • Filesystem locks are used, but only sparingly:
    • during initialization, for creating / opening an existing store;
    • for the lifetime of the instance, a shared/reader lock is held;
    • during destruction, if there are no concurrent readers, to shrink the files to their minimum size.
  • Path is used as a directory:
    • "index" stores the root trie and subtries.
    • "data" stores (most of) the entries, like a bump-ptr-allocator.
    • Large entries are stored externally in a file named by the key.
  • Code is system-dependent and binary format itself is not portable. These are not artifacts that can/should be moved between different systems; they are only appropriate for local storage.
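The bump-pointer style of the "data" file combined with atomic access can be illustrated with a small standalone sketch. This is not the LLVM implementation: `BumpRegion` and `allocate` are hypothetical names, and the region here is modelled as a counter in ordinary process memory rather than a shared memory mapping. It only shows how an atomic offset lets concurrent writers reserve non-overlapping ranges without a lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a bump-allocated data region. The next free
// offset is advanced atomically, so two writers never get the same range.
struct BumpRegion {
  std::atomic<uint64_t> NextOffset{0};
  uint64_t Capacity;

  explicit BumpRegion(uint64_t Capacity) : Capacity(Capacity) {}

  // Reserve Size bytes. Returns the start offset of the reserved range,
  // or std::nullopt when the region cannot hold Size more bytes.
  std::optional<uint64_t> allocate(uint64_t Size) {
    uint64_t Old = NextOffset.load(std::memory_order_relaxed);
    do {
      // Old never exceeds Capacity, so the subtraction cannot wrap.
      if (Size > Capacity - Old)
        return std::nullopt;
      // On failure, compare_exchange_weak reloads Old and the loop retries.
    } while (!NextOffset.compare_exchange_weak(Old, Old + Size,
                                               std::memory_order_relaxed));
    return Old;
  }
};
```

Space reserved this way is never handed back to the region, which is consistent with the bump-pointer description above.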

Definition at line 51 of file OnDiskTrieRawHashMap.h.

Member Typedef Documentation

◆ LazyInsertOnConstructCB

Initial value:
function_ref<void(FileOffset TentativeOffset, ValueProxy TentativeValue)>

Definition at line 169 of file OnDiskTrieRawHashMap.h.

◆ LazyInsertOnLeakCB

Initial value:
function_ref<void(FileOffset TentativeOffset, ValueProxy TentativeValue,
FileOffset FinalOffset, ValueProxy FinalValue)>

Definition at line 171 of file OnDiskTrieRawHashMap.h.

Constructor & Destructor Documentation

◆ OnDiskTrieRawHashMap()

OnDiskTrieRawHashMap::OnDiskTrieRawHashMap ( OnDiskTrieRawHashMap && RHS)
default

References OnDiskTrieRawHashMap(), and RHS.

Referenced by OnDiskTrieRawHashMap(), and operator=().

◆ ~OnDiskTrieRawHashMap()

OnDiskTrieRawHashMap::~OnDiskTrieRawHashMap ( )
default

Member Function Documentation

◆ capacity()

size_t OnDiskTrieRawHashMap::capacity ( ) const

Definition at line 1168 of file OnDiskTrieRawHashMap.cpp.

◆ create()

Expected< OnDiskTrieRawHashMap > OnDiskTrieRawHashMap::create ( const Twine & Path,
const Twine & TrieName,
size_t NumHashBits,
uint64_t DataSize,
uint64_t MaxFileSize,
std::optional< uint64_t > NewFileInitialSize,
std::optional< size_t > NewTableNumRootBits = std::nullopt,
std::optional< size_t > NewTableNumSubtrieBits = std::nullopt )
static

Gets or creates a file at Path with a hash-mapped trie named TrieName.

The hash size is NumHashBits (in bits) and the records store data of size DataSize (in bytes).

MaxFileSize controls the maximum file size to support, limiting the size of the mapped_file_region. NewFileInitialSize is the starting size if a new file is created.

NewTableNumRootBits and NewTableNumSubtrieBits are hints to configure the trie, if it doesn't already exist.

Precondition
NumHashBits is a multiple of 8 (byte-aligned).
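As a rough illustration of what root and subtrie bit counts mean in a hash-mapped trie, the sketch below slices a byte-aligned hash into slot indices. It is an assumption about the general technique, not a description of the on-disk layout used by this class: `extractBits` and `isByteAligned` are hypothetical helpers, and the most-significant-bit-first order is chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the documented precondition: the hash width must be byte-aligned.
inline bool isByteAligned(size_t NumHashBits) { return NumHashBits % 8 == 0; }

// Reads NumBits bits starting at bit StartBit, most significant bit first.
// Bits past the end of the hash are treated as zero.
inline size_t extractBits(const std::vector<uint8_t> &Hash, size_t StartBit,
                          size_t NumBits) {
  size_t Index = 0;
  for (size_t I = 0; I < NumBits; ++I) {
    size_t Bit = StartBit + I;
    size_t ByteIdx = Bit / 8;
    size_t Value = 0;
    if (ByteIdx < Hash.size())
      Value = (Hash[ByteIdx] >> (7 - Bit % 8)) & 1u;
    Index = (Index << 1) | Value;
  }
  return Index;
}
```

In this model, a trie with 4 root bits and 8 subtrie bits would pick the root slot with `extractBits(Hash, 0, 4)` and the first subtrie slot with `extractBits(Hash, 4, 8)`.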

Definition at line 1127 of file OnDiskTrieRawHashMap.cpp.

References llvm::createStringError(), llvm::DataSize, and llvm::make_error_code().

◆ dump()

LLVM_DUMP_METHOD void llvm::cas::OnDiskTrieRawHashMap::dump ( ) const

References LLVM_DUMP_METHOD.

◆ find()

OnDiskTrieRawHashMap::const_pointer OnDiskTrieRawHashMap::find ( ArrayRef< uint8_t > Hash) const

Find the value from hash.

Returns
A pointer to the value if it exists; otherwise, a non-value pointer that evaluates to false when converted to a boolean.
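The return convention (a pointer-like object that tests false when nothing was found) can be sketched without LLVM as follows. `ConstHandle` and `ToyMap` are hypothetical stand-ins built on `std::map`; they do not reflect how `const_pointer` is implemented, only how a caller would typically test the result.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical pointer-like handle: empty handles convert to false.
class ConstHandle {
  const std::vector<char> *Data = nullptr;

public:
  ConstHandle() = default;
  explicit ConstHandle(const std::vector<char> *Data) : Data(Data) {}
  explicit operator bool() const { return Data != nullptr; }
  const std::vector<char> &operator*() const { return *Data; }
};

// Hypothetical in-memory map keyed by a fixed-length hash.
struct ToyMap {
  std::map<std::vector<uint8_t>, std::vector<char>> Records;

  // Returns a handle to the record, or an empty handle if Hash is absent.
  ConstHandle find(const std::vector<uint8_t> &Hash) const {
    auto It = Records.find(Hash);
    return It == Records.end() ? ConstHandle() : ConstHandle(&It->second);
  }
};
```

A caller would write `if (auto P = Map.find(Hash)) { /* use *P */ }` and treat the false branch as "not present".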

Definition at line 1152 of file OnDiskTrieRawHashMap.cpp.

◆ insert()

Expected< pointer > llvm::cas::OnDiskTrieRawHashMap::insert ( const ConstValueProxy & Value)
inline

Definition at line 193 of file OnDiskTrieRawHashMap.h.

References insertLazy().

◆ insertLazy()

Expected< OnDiskTrieRawHashMap::pointer > OnDiskTrieRawHashMap::insertLazy ( ArrayRef< uint8_t > Hash,
LazyInsertOnConstructCB OnConstruct = nullptr,
LazyInsertOnLeakCB OnLeak = nullptr )

Insert lazily.

OnConstruct is called when ready to insert a value, after allocating space for the data. It is called at most once.

OnLeak is called only if OnConstruct has been called and a race occurred before insertion, causing the tentative offset and data to be abandoned. This allows clients to clean up other results or update any references.

NOTE: Does not guarantee that OnConstruct is only called on success. The in-memory TrieRawHashMap uses LazyAtomicPointer to synchronize simultaneous writes, but that seems dangerous to use in a memory-mapped file in case a process crashes in the busy state.
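The callback contract described above can be modelled with a single atomic slot and a compare-exchange. This is a minimal sketch under stated assumptions, not the code in OnDiskTrieRawHashMap.cpp: `Slot` and `insertLazySketch` are hypothetical, offsets are plain `long` values, and 0 is assumed to mean "empty". It shows why OnConstruct can run for a value that is later abandoned, and when OnLeak fires.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical single-slot model: 0 means empty, any other value is the
// offset of the record that has been published.
struct Slot {
  std::atomic<long> Published{0};
};

// Returns the offset that ends up published in the slot.
inline long insertLazySketch(Slot &S, long TentativeOffset,
                             const std::function<void(long)> &OnConstruct,
                             const std::function<void(long, long)> &OnLeak) {
  long Existing = S.Published.load(std::memory_order_acquire);
  if (Existing != 0)
    return Existing; // Already present: neither callback runs.

  // Space is assumed to be allocated at TentativeOffset; fill it in.
  // This runs at most once per call, before the outcome is known.
  if (OnConstruct)
    OnConstruct(TentativeOffset);

  long Expected = 0;
  if (S.Published.compare_exchange_strong(Expected, TentativeOffset,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire))
    return TentativeOffset; // Won the race: the tentative value is final.

  // Lost the race: the tentative value is abandoned, so report both the
  // abandoned offset and the one that was actually published.
  if (OnLeak)
    OnLeak(TentativeOffset, Expected);
  return Expected;
}
```

Because the value is constructed before the compare-exchange, a losing writer has already run OnConstruct, which matches the note that OnConstruct is not guaranteed to be called only on success.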

Definition at line 1138 of file OnDiskTrieRawHashMap.cpp.

References llvm::createStringError(), and llvm::make_error_code().

Referenced by insert().

◆ operator=()

OnDiskTrieRawHashMap & OnDiskTrieRawHashMap::operator= ( OnDiskTrieRawHashMap && RHS)
default

References OnDiskTrieRawHashMap(), and RHS.

◆ print()

void OnDiskTrieRawHashMap::print ( raw_ostream & OS,
function_ref< void(ArrayRef< char >)> PrintRecordData = nullptr ) const

Definition at line 1156 of file OnDiskTrieRawHashMap.cpp.

◆ recoverFromFileOffset()

Expected< OnDiskTrieRawHashMap::const_pointer > OnDiskTrieRawHashMap::recoverFromFileOffset ( FileOffset Offset) const

Helper function to recover a pointer into the trie from file offset.

Definition at line 1146 of file OnDiskTrieRawHashMap.cpp.

References llvm::createStringError(), llvm::make_error_code(), and llvm::Offset.

◆ size()

size_t OnDiskTrieRawHashMap::size ( ) const

◆ validate()

Error OnDiskTrieRawHashMap::validate ( function_ref< Error(FileOffset, ConstValueProxy)> RecordVerifier) const

Validate the trie data structure.

Callback receives the file offset to the data entry and the data stored.

Definition at line 1160 of file OnDiskTrieRawHashMap.cpp.

References llvm::createStringError(), and llvm::make_error_code().

◆ validOffset()

bool llvm::cas::OnDiskTrieRawHashMap::validOffset ( FileOffset Offset)
inline static

Check the valid range of file offset for OnDiskTrieRawHashMap.

Definition at line 90 of file OnDiskTrieRawHashMap.h.

References llvm::Offset.

Referenced by llvm::cas::OnDiskTrieRawHashMap::PointerImpl< ProxyT >::PointerImpl().


The documentation for this class was generated from the following files: