Performance Ninja -- Replacing Branches with Lookup Tables

Поділитися
Вставка
  • Опубліковано 27 жов 2024

КОМЕНТАРІ • 11

  • @easyperf3992
    @easyperf3992  2 роки тому

    Here is the complete code of the solution:
    ```
    #include "solution.hpp"
    int buckets[100] = {
    // 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
    5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
    6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6
    };
    static size_t mapToBucket(size_t v) {
    if (v < (sizeof (buckets) / sizeof (int)))
    return buckets[v];
    return -1; // let it crash
    }
    std::array histogram(const std::vector &values) {
    std::array retBuckets{0};
    for (auto v : values) {
    retBuckets[mapToBucket(v)]++;
    }
    return retBuckets;
    }
    ```

    • @XCemaXX
      @XCemaXX 9 місяців тому

      Another version to fill an array:
      template
      struct LookUpTable {
      constexpr LookUpTable() : arr() {
      int i = 0;
      unsigned int v = 0;
      for (; i < N; ++i) {
      v += (i == 13 || i == 29 || i == 41 || i == 53 || i == 71 || i == 83);
      arr[i] = v;
      }
      }
      std::array arr;
      };
      constexpr auto look_up_table = LookUpTable();

  • @jwbaker
    @jwbaker Рік тому +1

    I am surprised these videos have only hundreds of views after years. I watched the whole series and I think they're great for practitioners at any level of experience. Thanks for making these.

  • @MaxAbramson3
    @MaxAbramson3 2 роки тому +3

    You get dramatically better performance by using uint8[] for buckets because the whole lookup table would likely fit on the L1 cache. With about 100 entries, the uint8 would only be about 100 bytes. As int[], it would use up about 400 bytes, or four cache lines in many systems. In two-way set associative caches, this would use up an entire set, leading to both capacity misses and associativity misses, as well as adding a lot of burden to the memory system.

    • @easyperf3992
      @easyperf3992  2 роки тому +1

      Agree, it could be a further improvement.

  • @ennio5763
    @ennio5763 6 місяців тому

    This is a bit of a contrived example.
    In most cases, a branch cannot be replaced by a table lookup, because the impact of the branch is not limited to just generating a scalar value. It may do a different set of actions altogether.

    • @charactername263
      @charactername263 13 днів тому

      table lookups can return addresses of executable memory, or in higher level systems programming could return function pointers. In languages with coroutines it could return a coroutine to switch to

    • @ennio5763
      @ennio5763 13 днів тому

      @@charactername263 branching on an executable address stored into an array is still branching: it's actually even worse than an `if` for the control flow of the cpu: it is even less likely to guess what's the correct next set of instructions, which is the whole point.

    • @charactername263
      @charactername263 13 днів тому

      @@ennio5763 there is no branch, that is the point of the lookup table. If a branch is not predictable, then the overhead of something like switching coroutine or doing a jump can be cheaper than the unpredictable branch. ymmv There's not really any point doing this in code that is not latency critical, and such codebases generally "warm" the hot path anyway.

    • @ennio5763
      @ennio5763 12 днів тому

      @@charactername263 the problem is not the branch per say, but rather the unpredictability of the jump address. "table lookups can return addresses of executable memory" doesn't help, because it doesn't make the code flow more predictable.

    • @charactername263
      @charactername263 12 днів тому

      @@ennio5763 Yes you are right, the executable memory to jump to can be outside of the cache leading to a fault, each technique has downsides :)