//===-- llvm/ADT/Hashing.h - Utilities for hashing --------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the newly proposed standard C++ interfaces for hashing
// arbitrary data and building hash functions for user-defined types. This
// interface was originally proposed in N3333[1] and is currently under review
// for inclusion in a future TR and/or standard.
//
// The primary interfaces provided comprise one type and three functions:
//
//  -- 'hash_code' class is an opaque type representing the hash code for some
//     data. It is the intended product of hashing, and can be used to implement
//     hash tables, checksumming, and other common uses of hashes. It is not an
//     integer type (although it can be converted to one) because it is risky
//     to assume much about the internals of a hash_code. In particular, each
//     execution of the program has a high probability of producing a different
//     hash_code for a given input. Thus their values are not stable to save or
//     persist, and should only be used during the execution for the
//     construction of hashing data structures.
//
//  -- 'hash_value' is a function designed to be overloaded for each
//     user-defined type which wishes to be used within a hashing context. It
//     should be overloaded within the user-defined type's namespace and found
//     via ADL. Overloads for primitive types are provided by this library.
//
//  -- 'hash_combine' and 'hash_combine_range' are functions designed to aid
//     programmers in easily and intuitively combining a set of data into
//     a single hash_code for their object. They should only logically be used
//     within the implementation of a 'hash_value' routine or similar context.
//
// Note that 'hash_combine_range' contains very special logic for hashing
// a contiguous array of integers or pointers. This logic is *extremely* fast:
// on a modern Intel "Gainestown" Xeon (Nehalem uarch) @2.2 GHz, these were
// benchmarked at over 6.5 GiB/s for large keys, and <20 cycles/hash for keys
// under 32 bytes.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_ADT_HASHING_H
#define LLVM_ADT_HASHING_H

#include "llvm/Support/DataTypes.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/SwapByteOrder.h"
#include "llvm/Support/type_traits.h"
#include <algorithm>
#include <cassert>
#include <cstring>
#include <string>
#include <tuple>
#include <utility>

namespace llvm {

/// An opaque object representing a hash code.
///
/// This object represents the result of hashing some entity. It is intended to
/// be used to implement hashtables or other hashing-based data structures.
/// While it wraps and exposes a numeric value, this value should not be
/// trusted to be stable or predictable across processes or executions.
///
/// In order to obtain the hash_code for an object 'x':
/// \code
///   using llvm::hash_value;
///   llvm::hash_code code = hash_value(x);
/// \endcode
class hash_code {
  size_t value;

public:
  /// Default construct a hash_code.
  /// Note that this leaves the value uninitialized.
  hash_code() = default;

  /// Form a hash code directly from a numerical value.
  hash_code(size_t value) : value(value) {}

  /// Convert the hash code to its numerical value for use.
  /*explicit*/ operator size_t() const { return value; }

  friend bool operator==(const hash_code &lhs, const hash_code &rhs) {
    return lhs.value == rhs.value;
  }
  friend bool operator!=(const hash_code &lhs, const hash_code &rhs) {
    return lhs.value != rhs.value;
  }

  /// Allow a hash_code to be directly run through hash_value.
  friend size_t hash_value(const hash_code &code) { return code.value; }
};

/// Compute a hash_code for any integer value.
///
/// Note that this function is intended to compute the same hash_code for
/// a particular value without regard to the pre-promotion type. This is in
/// contrast to hash_combine which may produce different hash_codes for
/// differing argument types even if they would implicitly promote to a common
/// type without changing the value.
template <typename T>
std::enable_if_t<is_integral_or_enum<T>::value, hash_code> hash_value(T value);

/// Compute a hash_code for a pointer's address.
///
/// N.B.: This hashes the *address*. Not the value and not the type.
template <typename T> hash_code hash_value(const T *ptr);

/// Compute a hash_code for a pair of objects.
template <typename T, typename U>
hash_code hash_value(const std::pair<T, U> &arg);

/// Compute a hash_code for a tuple.
template <typename... Ts>
hash_code hash_value(const std::tuple<Ts...> &arg);

/// Compute a hash_code for a standard string.
template <typename T>
hash_code hash_value(const std::basic_string<T> &arg);


/// Override the execution seed with a fixed value.
///
/// This hashing library uses a per-execution seed designed to change on each
/// run with high probability in order to ensure that the hash codes are not
/// attackable and to ensure that output which is intended to be stable does
/// not rely on the particulars of the hash codes produced.
///
/// That said, there are use cases where it is important to be able to
/// reproduce *exactly* a specific behavior. To that end, we provide a function
/// which will forcibly set the seed to a fixed value. This must be done at the
/// start of the program, before any hashes are computed. Also, it cannot be
/// undone. This makes it thread-hostile and very hard to use except
/// immediately at the start of a simple program designed for reproducible
/// behavior.
void set_fixed_execution_hash_seed(uint64_t fixed_value);


// All of the implementation details of actually computing the various hash
// code values are held within this namespace. These routines are included in
// the header file mainly to allow inlining and constant propagation.
namespace hashing {
namespace detail {

inline uint64_t fetch64(const char *p) {
  uint64_t result;
  memcpy(&result, p, sizeof(result));
  if (sys::IsBigEndianHost)
    sys::swapByteOrder(result);
  return result;
}

inline uint32_t fetch32(const char *p) {
  uint32_t result;
  memcpy(&result, p, sizeof(result));
  if (sys::IsBigEndianHost)
    sys::swapByteOrder(result);
  return result;
}

/// Some primes between 2^63 and 2^64 for various uses.
static constexpr uint64_t k0 = 0xc3a5c85c97cb3127ULL;
static constexpr uint64_t k1 = 0xb492b66fbe98f273ULL;
static constexpr uint64_t k2 = 0x9ae16a3b2f90404fULL;
static constexpr uint64_t k3 = 0xc949d7c7509e6557ULL;

/// Bitwise right rotate.
/// Normally this will compile to a single instruction, especially if the
/// shift is a manifest constant.
inline uint64_t rotate(uint64_t val, size_t shift) {
  // Avoid shifting by 64: doing so yields an undefined result.
  return shift == 0 ? val : ((val >> shift) | (val << (64 - shift)));
}

inline uint64_t shift_mix(uint64_t val) {
  return val ^ (val >> 47);
}

inline uint64_t hash_16_bytes(uint64_t low, uint64_t high) {
  // Murmur-inspired hashing.
  const uint64_t kMul = 0x9ddfea08eb382d69ULL;
  uint64_t a = (low ^ high) * kMul;
  a ^= (a >> 47);
  uint64_t b = (high ^ a) * kMul;
  b ^= (b >> 47);
  b *= kMul;
  return b;
}

inline uint64_t hash_1to3_bytes(const char *s, size_t len, uint64_t seed) {
  uint8_t a = s[0];
  uint8_t b = s[len >> 1];
  uint8_t c = s[len - 1];
  uint32_t y = static_cast<uint32_t>(a) + (static_cast<uint32_t>(b) << 8);
  uint32_t z = static_cast<uint32_t>(len) + (static_cast<uint32_t>(c) << 2);
  return shift_mix(y * k2 ^ z * k3 ^ seed) * k2;
}

inline uint64_t hash_4to8_bytes(const char *s, size_t len, uint64_t seed) {
  uint64_t a = fetch32(s);
  return hash_16_bytes(len + (a << 3), seed ^ fetch32(s + len - 4));
}

inline uint64_t hash_9to16_bytes(const char *s, size_t len, uint64_t seed) {
  uint64_t a = fetch64(s);
  uint64_t b = fetch64(s + len - 8);
  return hash_16_bytes(seed ^ a, rotate(b + len, len)) ^ b;
}

inline uint64_t hash_17to32_bytes(const char *s, size_t len, uint64_t seed) {
  uint64_t a = fetch64(s) * k1;
  uint64_t b = fetch64(s + 8);
  uint64_t c = fetch64(s + len - 8) * k2;
  uint64_t d = fetch64(s + len - 16) * k0;
  return hash_16_bytes(rotate(a - b, 43) + rotate(c ^ seed, 30) + d,
                       a + rotate(b ^ k3, 20) - c + len + seed);
}

inline uint64_t hash_33to64_bytes(const char *s, size_t len, uint64_t seed) {
  uint64_t z = fetch64(s + 24);
  uint64_t a = fetch64(s) + (len + fetch64(s + len - 16)) * k0;
  uint64_t b = rotate(a + z, 52);
  uint64_t c = rotate(a, 37);
  a += fetch64(s + 8);
  c += rotate(a, 7);
  a += fetch64(s + 16);
  uint64_t vf = a + z;
  uint64_t vs = b + rotate(a, 31) + c;
  a = fetch64(s + 16) + fetch64(s + len - 32);
  z = fetch64(s + len - 8);
  b = rotate(a + z, 52);
  c = rotate(a, 37);
  a += fetch64(s + len - 24);
  c += rotate(a, 7);
  a += fetch64(s + len - 16);
  uint64_t wf = a + z;
  uint64_t ws = b + rotate(a, 31) + c;
  uint64_t r = shift_mix((vf + ws) * k2 + (wf + vs) * k0);
  return shift_mix((seed ^ (r * k0)) + vs) * k2;
}

inline uint64_t hash_short(const char *s, size_t length, uint64_t seed) {
  if (length >= 4 && length <= 8)
    return hash_4to8_bytes(s, length, seed);
  if (length > 8 && length <= 16)
    return hash_9to16_bytes(s, length, seed);
  if (length > 16 && length <= 32)
    return hash_17to32_bytes(s, length, seed);
  if (length > 32)
    return hash_33to64_bytes(s, length, seed);
  if (length != 0)
    return hash_1to3_bytes(s, length, seed);

  return k2 ^ seed;
}

/// The intermediate state used during hashing.
/// Currently, the algorithm for computing hash codes is based on CityHash and
/// keeps 56 bytes of arbitrary state.
struct hash_state {
  uint64_t h0 = 0, h1 = 0, h2 = 0, h3 = 0, h4 = 0, h5 = 0, h6 = 0;

  /// Create a new hash_state structure and initialize it based on the
  /// seed and the first 64-byte chunk.
  /// This effectively performs the initial mix.
  static hash_state create(const char *s, uint64_t seed) {
    hash_state state = {
      0, seed, hash_16_bytes(seed, k1), rotate(seed ^ k1, 49),
      seed * k1, shift_mix(seed), 0 };
    state.h6 = hash_16_bytes(state.h4, state.h5);
    state.mix(s);
    return state;
  }

  /// Mix 32 bytes from the input sequence into the 16 bytes of 'a'
  /// and 'b', including whatever is already in 'a' and 'b'.
  static void mix_32_bytes(const char *s, uint64_t &a, uint64_t &b) {
    a += fetch64(s);
    uint64_t c = fetch64(s + 24);
    b = rotate(b + a + c, 21);
    uint64_t d = a;
    a += fetch64(s + 8) + fetch64(s + 16);
    b += rotate(a, 44) + d;
    a += c;
  }

  /// Mix in a 64-byte buffer of data.
  /// We mix all 64 bytes even when the chunk length is smaller, but we
  /// record the actual length.
  void mix(const char *s) {
    h0 = rotate(h0 + h1 + h3 + fetch64(s + 8), 37) * k1;
    h1 = rotate(h1 + h4 + fetch64(s + 48), 42) * k1;
    h0 ^= h6;
    h1 += h3 + fetch64(s + 40);
    h2 = rotate(h2 + h5, 33) * k1;
    h3 = h4 * k1;
    h4 = h0 + h5;
    mix_32_bytes(s, h3, h4);
    h5 = h2 + h6;
    h6 = h1 + fetch64(s + 16);
    mix_32_bytes(s + 32, h5, h6);
    std::swap(h2, h0);
  }

  /// Compute the final 64-bit hash code value based on the current
  /// state and the length of bytes hashed.
  uint64_t finalize(size_t length) {
    return hash_16_bytes(hash_16_bytes(h3, h5) + shift_mix(h1) * k1 + h2,
                         hash_16_bytes(h4, h6) + shift_mix(length) * k1 + h0);
  }
};

/// A global, fixed seed-override variable.
///
/// This variable can be set using the \see llvm::set_fixed_execution_hash_seed
/// function. See that function for details. Do not, under any circumstances,
/// set or read this variable.
extern uint64_t fixed_seed_override;

inline uint64_t get_execution_seed() {
  // FIXME: This needs to be a per-execution seed. This is just a placeholder
  // implementation. Switching to a per-execution seed is likely to flush out
  // instability bugs and so will happen as its own commit.
  //
  // However, if there is a fixed seed override set the first time this is
  // called, return that instead of the per-execution seed.
  const uint64_t seed_prime = 0xff51afd7ed558ccdULL;
  static uint64_t seed = fixed_seed_override ? fixed_seed_override : seed_prime;
  return seed;
}

/// Trait to indicate whether a type's bits can be hashed directly.
///
/// A type trait which is true if we want to combine values for hashing by
/// reading the underlying data. It is false if values of this type must
/// first be passed to hash_value, and the resulting hash_codes combined.
//
// FIXME: We want to replace is_integral_or_enum and is_pointer here with
// a predicate which asserts that comparing the underlying storage of two
// values of the type for equality is equivalent to comparing the two values
// for equality. For all the platforms we care about, this holds for integers
// and pointers, but there are platforms where it doesn't and we would like to
// support user-defined types which happen to satisfy this property.
template <typename T> struct is_hashable_data
  : std::integral_constant<bool, ((is_integral_or_enum<T>::value ||
                                   std::is_pointer<T>::value) &&
                                  64 % sizeof(T) == 0)> {};

// Special case std::pair to detect when both types are viable and when there
// is no alignment-derived padding in the pair. This is a bit of a lie because
// std::pair isn't truly POD, but it's close enough in all reasonable
// implementations for our use case of hashing the underlying data.
template <typename T, typename U> struct is_hashable_data<std::pair<T, U> >
  : std::integral_constant<bool, (is_hashable_data<T>::value &&
                                  is_hashable_data<U>::value &&
                                  (sizeof(T) + sizeof(U)) ==
                                  sizeof(std::pair<T, U>))> {};

/// Helper to get the hashable data representation for a type.
/// This variant is enabled when the type itself can be used.
template <typename T>
std::enable_if_t<is_hashable_data<T>::value, T>
get_hashable_data(const T &value) {
  return value;
}
/// Helper to get the hashable data representation for a type.
/// This variant is enabled when we must first call hash_value and use the
/// result as our data.
template <typename T>
std::enable_if_t<!is_hashable_data<T>::value, size_t>
get_hashable_data(const T &value) {
  using ::llvm::hash_value;
  return hash_value(value);
}

/// Helper to store data from a value into a buffer and advance the
/// pointer into that buffer.
///
/// This routine first checks whether there is enough space in the provided
/// buffer, and if not immediately returns false. If there is space, it
/// copies the underlying bytes of value into the buffer, advances the
/// buffer_ptr past the copied bytes, and returns true.
template <typename T>
bool store_and_advance(char *&buffer_ptr, char *buffer_end, const T &value,
                       size_t offset = 0) {
  size_t store_size = sizeof(value) - offset;
  if (buffer_ptr + store_size > buffer_end)
    return false;
  const char *value_data = reinterpret_cast<const char *>(&value);
  memcpy(buffer_ptr, value_data + offset, store_size);
  buffer_ptr += store_size;
  return true;
}

/// Implement the combining of integral values into a hash_code.
///
/// This overload is selected when the value type of the iterator is
/// integral. Rather than computing a hash_code for each object and then
/// combining them, this (as an optimization) directly combines the integers.
template <typename InputIteratorT>
hash_code hash_combine_range_impl(InputIteratorT first, InputIteratorT last) {
  const uint64_t seed = get_execution_seed();
  char buffer[64], *buffer_ptr = buffer;
  char *const buffer_end = std::end(buffer);
  while (first != last && store_and_advance(buffer_ptr, buffer_end,
                                            get_hashable_data(*first)))
    ++first;
  if (first == last)
    return hash_short(buffer, buffer_ptr - buffer, seed);
  assert(buffer_ptr == buffer_end);

  hash_state state = state.create(buffer, seed);
  size_t length = 64;
  while (first != last) {
    // Fill up the buffer. We don't clear it, which re-mixes the last round
    // when only a partial 64-byte chunk is left.
    buffer_ptr = buffer;
    while (first != last && store_and_advance(buffer_ptr, buffer_end,
                                              get_hashable_data(*first)))
      ++first;

    // Rotate the buffer if we did a partial fill in order to simulate doing
    // a mix of the last 64-bytes. That is how the algorithm works when we
    // have a contiguous byte sequence, and we want to emulate that here.
    std::rotate(buffer, buffer_ptr, buffer_end);

    // Mix this chunk into the current state.
    state.mix(buffer);
    length += buffer_ptr - buffer;
  }

  return state.finalize(length);
}

/// Implement the combining of integral values into a hash_code.
///
/// This overload is selected when the value type of the iterator is integral
/// and when the input iterator is actually a pointer. Rather than computing
/// a hash_code for each object and then combining them, this (as an
/// optimization) directly combines the integers. Also, because the integers
/// are stored in contiguous memory, this routine avoids copying each value
/// and directly reads from the underlying memory.
template <typename ValueT>
std::enable_if_t<is_hashable_data<ValueT>::value, hash_code>
hash_combine_range_impl(ValueT *first, ValueT *last) {
  const uint64_t seed = get_execution_seed();
  const char *s_begin = reinterpret_cast<const char *>(first);
  const char *s_end = reinterpret_cast<const char *>(last);
  const size_t length = std::distance(s_begin, s_end);
  if (length <= 64)
    return hash_short(s_begin, length, seed);

  const char *s_aligned_end = s_begin + (length & ~63);
  hash_state state = state.create(s_begin, seed);
  s_begin += 64;
  while (s_begin != s_aligned_end) {
    state.mix(s_begin);
    s_begin += 64;
  }
  if (length & 63)
    state.mix(s_end - 64);

  return state.finalize(length);
}

} // namespace detail
} // namespace hashing

/// Compute a hash_code for a sequence of values.
///
/// This hashes a sequence of values. It produces the same hash_code as
/// 'hash_combine(a, b, c, ...)', but can run over arbitrary sized sequences
/// and is significantly faster given pointers and types which can be hashed as
/// a sequence of bytes.
template <typename InputIteratorT>
hash_code hash_combine_range(InputIteratorT first, InputIteratorT last) {
  return ::llvm::hashing::detail::hash_combine_range_impl(first, last);
}

// Implementation details for hash_combine.
namespace hashing {
namespace detail {

/// Helper class to manage the recursive combining of hash_combine
/// arguments.
///
/// This class exists to manage the state and various calls involved in the
/// recursive combining of arguments used in hash_combine. It is particularly
/// useful at minimizing the code in the recursive calls to ease the pain
/// caused by a lack of variadic functions.
struct hash_combine_recursive_helper {
  char buffer[64] = {};
  hash_state state;
  const uint64_t seed;

public:
  /// Construct a recursive hash combining helper.
  ///
  /// This sets up the state for a recursive hash combine, including getting
  /// the seed and buffer setup.
  hash_combine_recursive_helper()
    : seed(get_execution_seed()) {}

  /// Combine one chunk of data into the current in-flight hash.
  ///
  /// This merges one chunk of data into the hash. First it tries to buffer
  /// the data. If the buffer is full, it hashes the buffer into its
  /// hash_state, empties it, and then merges the new chunk in. This also
  /// handles cases where the data straddles the end of the buffer.
  template <typename T>
  char *combine_data(size_t &length, char *buffer_ptr, char *buffer_end,
                     T data) {
    if (!store_and_advance(buffer_ptr, buffer_end, data)) {
      // Check for skew which prevents the buffer from being packed, and do
      // a partial store into the buffer to fill it. This is only a concern
      // with the variadic combine because that formation can have varying
      // argument types.
      size_t partial_store_size = buffer_end - buffer_ptr;
      memcpy(buffer_ptr, &data, partial_store_size);

      // If the store fails, our buffer is full and ready to hash. We have to
      // either initialize the hash state (on the first full buffer) or mix
      // this buffer into the existing hash state. Length tracks the *hashed*
      // length, not the buffered length.
      if (length == 0) {
        state = state.create(buffer, seed);
        length = 64;
      } else {
        // Mix this chunk into the current state and bump length up by 64.
        state.mix(buffer);
        length += 64;
      }
      // Reset the buffer_ptr to the head of the buffer for the next chunk of
      // data.
      buffer_ptr = buffer;

      // Try again to store into the buffer -- this cannot fail as we only
      // store types smaller than the buffer.
      if (!store_and_advance(buffer_ptr, buffer_end, data,
                             partial_store_size))
        llvm_unreachable("buffer smaller than stored type");
    }
    return buffer_ptr;
  }

  /// Recursive, variadic combining method.
  ///
  /// This function recurses through each argument, combining that argument
  /// into a single hash.
  template <typename T, typename ...Ts>
  hash_code combine(size_t length, char *buffer_ptr, char *buffer_end,
                    const T &arg, const Ts &...args) {
    buffer_ptr = combine_data(length, buffer_ptr, buffer_end,
                              get_hashable_data(arg));

    // Recurse to the next argument.
    return combine(length, buffer_ptr, buffer_end, args...);
  }

  /// Base case for recursive, variadic combining.
  ///
  /// The base case when combining arguments recursively is reached when all
  /// arguments have been handled. It flushes the remaining buffer and
  /// constructs a hash_code.
  hash_code combine(size_t length, char *buffer_ptr, char *buffer_end) {
    // Check whether the entire set of values fit in the buffer. If so, we'll
    // use the optimized short hashing routine and skip state entirely.
    if (length == 0)
      return hash_short(buffer, buffer_ptr - buffer, seed);

    // Mix the final buffer, rotating it if we did a partial fill in order to
    // simulate doing a mix of the last 64-bytes. That is how the algorithm
    // works when we have a contiguous byte sequence, and we want to emulate
    // that here.
    std::rotate(buffer, buffer_ptr, buffer_end);

    // Mix this chunk into the current state.
    state.mix(buffer);
    length += buffer_ptr - buffer;

    return state.finalize(length);
  }
};

} // namespace detail
} // namespace hashing

/// Combine values into a single hash_code.
///
/// This routine accepts a varying number of arguments of any type. It will
/// attempt to combine them into a single hash_code. For user-defined types it
/// attempts to call a \see hash_value overload (via ADL) for the type. For
/// integer and pointer types it directly combines their data into the
/// resulting hash_code.
///
/// The result is suitable for returning from a user's hash_value
/// *implementation* for their user-defined type. Consumers of a type should
/// *not* call this routine, they should instead call 'hash_value'.
template <typename ...Ts> hash_code hash_combine(const Ts &...args) {
  // Recursively hash each argument using a helper class.
  ::llvm::hashing::detail::hash_combine_recursive_helper helper;
  return helper.combine(0, helper.buffer, helper.buffer + 64, args...);
}

// Implementation details for implementations of hash_value overloads provided
// here.
namespace hashing {
namespace detail {

/// Helper to hash the value of a single integer.
///
/// Overloads for smaller integer types are not provided to ensure consistent
/// behavior in the presence of integral promotions. Essentially,
/// "hash_value('4')" and "hash_value('0' + 4)" should be the same.
inline hash_code hash_integer_value(uint64_t value) {
  // Similar to hash_4to8_bytes but using a seed instead of length.
  const uint64_t seed = get_execution_seed();
  const char *s = reinterpret_cast<const char *>(&value);
  const uint64_t a = fetch32(s);
  return hash_16_bytes(seed + (a << 3), fetch32(s + 4));
}

} // namespace detail
} // namespace hashing

// Declared and documented above, but defined here so that any of the hashing
// infrastructure is available.
template <typename T>
std::enable_if_t<is_integral_or_enum<T>::value, hash_code> hash_value(T value) {
  return ::llvm::hashing::detail::hash_integer_value(
      static_cast<uint64_t>(value));
}

// Declared and documented above, but defined here so that any of the hashing
// infrastructure is available.
template <typename T> hash_code hash_value(const T *ptr) {
  return ::llvm::hashing::detail::hash_integer_value(
      reinterpret_cast<uintptr_t>(ptr));
}

// Declared and documented above, but defined here so that any of the hashing
// infrastructure is available.
template <typename T, typename U>
hash_code hash_value(const std::pair<T, U> &arg) {
  return hash_combine(arg.first, arg.second);
}

// Implementation details for the hash_value overload for std::tuple<...>(...).
namespace hashing {
namespace detail {

template <typename... Ts, std::size_t... Indices>
hash_code hash_value_tuple_helper(const std::tuple<Ts...> &arg,
                                  std::index_sequence<Indices...>) {
  return hash_combine(std::get<Indices>(arg)...);
}

} // namespace detail
} // namespace hashing
template <typename... Ts>
hash_code hash_value(const std::tuple<Ts...> &arg) {
  // TODO: Use std::apply when LLVM starts using C++17.
  return ::llvm::hashing::detail::hash_value_tuple_helper(
      arg, typename std::index_sequence_for<Ts...>());
}

// Declared and documented above, but defined here so that any of the hashing
// infrastructure is available.
template <typename T>
hash_code hash_value(const std::basic_string<T> &arg) {
  return hash_combine_range(arg.begin(), arg.end());
}

} // namespace llvm

#endif
z
return z
Definition: README.txt:14
llvm::set_fixed_execution_hash_seed
void set_fixed_execution_hash_seed(uint64_t fixed_value)
Override the execution seed with a fixed value.
Definition: Hashing.cpp:26
llvm::hashing::detail::hash_state::h4
uint64_t h4
Definition: Hashing.h:264
llvm
Definition: AllocatorList.h:23
llvm::hashing::detail::hash_combine_recursive_helper::state
hash_state state
Definition: Hashing.h:500
llvm::sys::IsBigEndianHost
constexpr bool IsBigEndianHost
Definition: SwapByteOrder.h:98
llvm::sys::swapByteOrder
void swapByteOrder(T &Value)
Definition: SwapByteOrder.h:158
llvm::hashing::detail::hash_state::h3
uint64_t h3
Definition: Hashing.h:264
llvm::hash_code::hash_value
friend size_t hash_value(const hash_code &code)
Allow a hash_code to be directly run through hash_value.
Definition: Hashing.h:94
ErrorHandling.h
llvm::hashing::detail::hash_combine_recursive_helper::combine_data
char * combine_data(size_t &length, char *buffer_ptr, char *buffer_end, T data)
Combine one chunk of data into the current in-flight hash.
Definition: Hashing.h:518
llvm::hashing::detail::hash_17to32_bytes
uint64_t hash_17to32_bytes(const char *s, size_t len, uint64_t seed)
Definition: Hashing.h:213
SwapByteOrder.h
llvm::hashing::detail::hash_combine_recursive_helper::seed
const uint64_t seed
Definition: Hashing.h:501
llvm::sys::path::end
const_iterator end(StringRef path)
Get end iterator over path.
Definition: Path.cpp:233
llvm::hashing::detail::hash_state::finalize
uint64_t finalize(size_t length)
Compute the final 64-bit hash code value based on the current state and the length of bytes hashed.
Definition: Hashing.h:310
T
#define T
Definition: Mips16ISelLowering.cpp:341
llvm::hashing::detail::k2
static constexpr uint64_t k2
Definition: Hashing.h:167
llvm::hash_value
hash_code hash_value(const APFloat &Arg)
See friend declarations above.
Definition: APFloat.cpp:4803
p
the resulting code requires compare and branches when and if * p
Definition: README.txt:396
size_t
llvm::hashing::detail::shift_mix
uint64_t shift_mix(uint64_t val)
Definition: Hashing.h:178
a
=0.0 ? 0.0 :(a > 0.0 ? 1.0 :-1.0) a
Definition: README.txt:489
llvm::hashing::detail::hash_1to3_bytes
uint64_t hash_1to3_bytes(const char *s, size_t len, uint64_t seed)
Definition: Hashing.h:193
result
It looks like we only need to define PPCfmarto for these because according to these instructions perform RTO on fma s result
Definition: README_P9.txt:256
llvm::hashing::detail::get_hashable_data
std::enable_if_t< is_hashable_data< T >::value, T > get_hashable_data(const T &value)
Helper to get the hashable data representation for a type.
Definition: Hashing.h:368
llvm::hashing::detail::hash_combine_recursive_helper::combine
hash_code combine(size_t length, char *buffer_ptr, char *buffer_end, const T &arg, const Ts &...args)
Recursive, variadic combining method.
Definition: Hashing.h:557
llvm::hashing::detail::rotate
uint64_t rotate(uint64_t val, size_t shift)
Bitwise right rotate.
Definition: Hashing.h:173
llvm::hash_code::operator==
friend bool operator==(const hash_code &lhs, const hash_code &rhs)
Definition: Hashing.h:86
b
the resulting code requires compare and branches when and if the revised code is with conditional branches instead of More there is a byte word extend before each where there should be only and the condition codes are not remembered when the same two values are compared twice More LSR enhancements i8 and i32 load store addressing modes are identical int b
Definition: README.txt:418
llvm::hashing::detail::hash_combine_recursive_helper::combine
hash_code combine(size_t length, char *buffer_ptr, char *buffer_end)
Base case for recursive, variadic combining.
Definition: Hashing.h:570
llvm::hashing::detail::hash_value_tuple_helper
hash_code hash_value_tuple_helper(const std::tuple< Ts... > &arg, std::index_sequence< Indices... >)
Definition: Hashing.h:658
llvm::hashing::detail::hash_4to8_bytes
uint64_t hash_4to8_bytes(const char *s, size_t len, uint64_t seed)
Definition: Hashing.h:202
llvm::hashing::detail::fetch64
uint64_t fetch64(const char *p)
Definition: Hashing.h:148
llvm::hashing::detail::hash_state::h1
uint64_t h1
Definition: Hashing.h:264
llvm::hashing::detail::is_hashable_data
Trait to indicate whether a type's bits can be hashed directly.
Definition: Hashing.h:349
llvm::hashing::detail::hash_16_bytes
uint64_t hash_16_bytes(uint64_t low, uint64_t high)
Definition: Hashing.h:182
llvm::hashing::detail::fetch32
uint32_t fetch32(const char *p)
Definition: Hashing.h:156
llvm::hashing::detail::k1
static constexpr uint64_t k1
Definition: Hashing.h:166
llvm::hashing::detail::hash_state::mix_32_bytes
static void mix_32_bytes(const char *s, uint64_t &a, uint64_t &b)
Mix 32-bytes from the input sequence into the 16-bytes of 'a' and 'b', including whatever is already ...
Definition: Hashing.h:280
llvm::hashing::detail::k3
static constexpr uint64_t k3
Definition: Hashing.h:168
llvm::hash_code::operator!=
friend bool operator!=(const hash_code &lhs, const hash_code &rhs)
Definition: Hashing.h:89
llvm::hashing::detail::hash_combine_range_impl
hash_code hash_combine_range_impl(InputIteratorT first, InputIteratorT last)
Implement the combining of integral values into a hash_code.
Definition: Hashing.h:406
llvm::hashing::detail::hash_combine_recursive_helper::hash_combine_recursive_helper
hash_combine_recursive_helper()
Construct a recursive hash combining helper.
Definition: Hashing.h:508
llvm::hashing::detail::hash_state::h2
uint64_t h2
Definition: Hashing.h:264
llvm::hashing::detail::hash_9to16_bytes
uint64_t hash_9to16_bytes(const char *s, size_t len, uint64_t seed)
Definition: Hashing.h:207
llvm::hashing::detail::hash_combine_recursive_helper
Helper class to manage the recursive combining of hash_combine arguments.
Definition: Hashing.h:498
llvm::hashing::detail::get_execution_seed
uint64_t get_execution_seed()
Definition: Hashing.h:324
std::swap
void swap(llvm::BitVector &LHS, llvm::BitVector &RHS)
Implement std::swap in terms of BitVector swap.
Definition: BitVector.h:840
llvm::hashing::detail::hash_state::mix
void mix(const char *s)
Mix in a 64-byte buffer of data.
Definition: Hashing.h:293
llvm::hashing::detail::hash_integer_value
hash_code hash_integer_value(uint64_t value)
Helper to hash the value of a single integer.
Definition: Hashing.h:620
llvm::hashing::detail::fixed_seed_override
uint64_t fixed_seed_override
A global, fixed seed-override variable.
Definition: Hashing.cpp:22
llvm::hashing::detail::hash_state::h0
uint64_t h0
Definition: Hashing.h:264
llvm_unreachable
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
Definition: ErrorHandling.h:136
uint32_t
llvm::hashing::detail::hash_state
The intermediate state used during hashing.
Definition: Hashing.h:263
llvm::hashing::detail::hash_combine_recursive_helper::buffer
char buffer[64]
Definition: Hashing.h:499
llvm::hash_value
hash_code hash_value(const std::basic_string< T > &arg)
Compute a hash_code for a standard string.
Definition: Hashing.h:676
llvm::hashing::detail::hash_short
uint64_t hash_short(const char *s, size_t length, uint64_t seed)
Definition: Hashing.h:245
std
Definition: BitVector.h:838
llvm::hashing::detail::hash_combine_range_impl
std::enable_if_t< is_hashable_data< ValueT >::value, hash_code > hash_combine_range_impl(ValueT *first, ValueT *last)
Implement the combining of integral values into a hash_code.
Definition: Hashing.h:450
type_traits.h
llvm::hashing::detail::hash_state::create
static hash_state create(const char *s, uint64_t seed)
Create a new hash_state structure and initialize it based on the seed and the first 64-byte chunk.
Definition: Hashing.h:269
llvm::hash_code::hash_code
hash_code()=default
Default construct a hash_code.
llvm::hashing::detail::hash_state::h6
uint64_t h6
Definition: Hashing.h:264
llvm::hashing::detail::hash_33to64_bytes
uint64_t hash_33to64_bytes(const char *s, size_t len, uint64_t seed)
Definition: Hashing.h:222
llvm::hash_combine
hash_code hash_combine(const Ts &...args)
Combine values into a single hash_code.
Definition: Hashing.h:604
llvm::hashing::detail::hash_state::h5
uint64_t h5
Definition: Hashing.h:264
llvm::hashing::detail::store_and_advance
bool store_and_advance(char *&buffer_ptr, char *buffer_end, const T &value, size_t offset=0)
Helper to store data from a value into a buffer and advance the pointer into that buffer.
Definition: Hashing.h:389
llvm::hash_combine_range
hash_code hash_combine_range(InputIteratorT first, InputIteratorT last)
Compute a hash_code for a sequence of values.
Definition: Hashing.h:482
llvm::hash_code::hash_code
hash_code(size_t value)
Form a hash code directly from a numerical value.
Definition: Hashing.h:81
DataTypes.h
llvm::hashing::detail::k0
static constexpr uint64_t k0
Some primes between 2^63 and 2^64 for various uses.
Definition: Hashing.h:165
llvm::hash_code
An opaque object representing a hash code.
Definition: Hashing.h:72