Skip to content

platanus

platanus is a modern fork of the B-tree library cpp-btree.

Comparing to STL set/map

A btree is both smaller and faster than STL set/map. The red-black tree implementation of STL set/map has an overhead of 3 pointers (left, right and parent) plus the node color information for each stored value. So a std::set<std::int32_t> consumes 32 bytes for each value stored. This btree implementation stores the fixed number of values on nodes (often around 64 when MaxNumOfValues is left at platanus::kAutoSize) and doesn't store child pointers for leaf nodes. The result is that a platanus::btree_set<std::int32> may use much less memory per stored value. For the random insertion benchmark, a platanus::btree_set<std::int32_t> with 64 values per node uses 4.64 bytes per stored value.

The packing of multiple values into nodes of a btree has another effect besides better space utilization: better cache locality due to fewer cache lines being accessed, that is, faster operations.

Feature

  • satisfying alignment requirement,
  • supporting stateful comparer,
  • the standard C++ container-like interface,
  • supporting CMake,
  • easier to read than the original,
  • supporting three-way comparison operator,
  • supporting std::pmr::polymorphic_allocator (in platanus::pmr namespace).

Performance

In benchmark/bench_result_with_absl.json, platanus(auto) outperforms STL in 45 out of 60 benchmark groups, with a geometric-mean speedup of 2.301x over STL. The largest wins appear in forward iteration, where platanus(auto) is faster in all 12 groups and reaches roughly 18.4x geometric-mean speedup. Insert and lookup are also usually faster than STL, while delete and merge are more mixed and can regress for some string-heavy cases.

Against absl, however, platanus(auto) is generally slower in this benchmark set. absl is the fastest implementation in 53 out of 60 groups, while platanus(auto) is the fastest in 1 group. So, in the author's current benchmark environment, platanus is often a strong improvement over STL, but not a drop-in performance win over absl.

Limitation and caveats

Supporting the orders of B-Tree

We need at least 3 values per a node in order to perform splitting (1 value for the two nodes involved in the split and 1 value propagated to the parent as the delimiter for the split). That is, We don't support the 3 order B-trees.

Invalidating iterators after insertions and deletions

Insertions and deletions on a btree can cause splitting, merging or rebalancing of btree nodes. And even without these operations, insertions and deletions on a btree will move values around within a node. In both cases, the result is that insertions and deletions can invalidate iterators pointing to values other than the one being inserted/deleted. This is notably different from STL set/map which takes care to not invalidate iterators on insert/erase except, of course, for iterators pointing to the value being erased. A partial workaround when erasing is available: erase() returns an iterator pointing to the item just after the one that was erased (or end() if none exists).

Thread-safety

All functions and classes in this library are not thread-safe.

Installation

platanus is a header-only library, so you don't have to install it. When using CMake, you only have to link your program to platanus::platanus. For example, target_link_libraries(your_program PUBLIC platanus::platanus).

Test

If you want to test, for example, run the following commands in the top directory of platanus.

cmake -S . -B build/debug --preset test
cd build/debug/test
./btree_test

Performance test

If you want to check how much platanus is faster than STL, run the following commands.

cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build/release --target btree_bench
cd build/release/benchmark
./btree_bench

The output has the following form.

BM_<(implementation)(container type)<(value type)[, (the max number of values per node or platanus::kAutoSize)]>>
  (average time[ns] per iteration)
  (average CPU time[ns] per iteration)
  (the total of iterations)

When PLATANUS_BENCHMARK_WITH_ABSL=ON (default), benchmark registration switches to comparison mode and the benchmark set becomes:

implementation note
STL std::set, std::multiset, std::map, std::multimap
Absl absl::btree_set, absl::btree_multiset, absl::btree_map, absl::btree_multimap
BTree(..., 64) platanus non-pmr containers with 64 values per node

In this comparison mode:

  • BTree(..., 128) is not registered.
  • platanus::pmr benchmark variants are not registered.
  • the intended comparison set is exactly STL / absl / platanus(64).

If you want the old wider benchmark set without absl, configure with:

cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DPLATANUS_BENCHMARK_WITH_ABSL=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build/release --target btree_bench

The test cases are:

test case meaning
Insert Benchmark insertion of random values into an empty container.
Lookup Benchmark lookup of random values in a container.
Delete Benchmark deletion of random values from a container.
FwdIter Benchmark iteration (forward) and reference through the container. Note that this benchmark includes a copy constructing of key_type.
Merge Benchmark merging two containers with the same size.

If you want to know a good size of values per node, run the following command.

cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release -DPLATANUS_BENCHMARK_VALUES_SIZE_TEST
cmake --build build/release --target btree_bench
cd build/release/benchmark
./btree_bench

For plotting benchmark JSON output, use:

python3 benchmark/compare_benchmarks.py left.json right.json -o compared.html

The plotting script understands STL, absl, platanus(N), and platanus(auto) benchmark names and can also render a single JSON file.

License

platanus is licensed under Apache License, Version 2.0.