platanus¶
platanus is a modern fork of the B-tree library cpp-btree.
Comparing to STL set/map¶
A btree is both smaller and faster than STL set/map.
The red-black tree implementation of STL set/map has an overhead of 3 pointers (left, right and parent) plus the node color information for each stored value.
So a std::set<std::int32_t> consumes 32 bytes for each value stored.
This btree implementation stores the fixed number of values on nodes (often around 64 when MaxNumOfValues is left at platanus::kAutoSize) and doesn't store child pointers for leaf nodes.
The result is that a platanus::btree_set<std::int32> may use much less memory per stored value.
For the random insertion benchmark, a platanus::btree_set<std::int32_t> with 64 values per node uses 4.64 bytes per stored value.
The packing of multiple values into nodes of a btree has another effect besides better space utilization: better cache locality due to fewer cache lines being accessed, that is, faster operations.
Feature¶
- satisfying alignment requirement,
- supporting stateful comparer,
- the standard C++ container-like interface,
- supporting CMake,
- easier to read than the original,
- supporting three-way comparison operator,
- supporting
std::pmr::polymorphic_allocator(inplatanus::pmrnamespace).
Performance¶
In benchmark/bench_result_with_absl.json, platanus(auto) outperforms STL in 45 out of 60 benchmark groups, with a geometric-mean speedup of 2.301x over STL.
The largest wins appear in forward iteration, where platanus(auto) is faster in all 12 groups and reaches roughly 18.4x geometric-mean speedup.
Insert and lookup are also usually faster than STL, while delete and merge are more mixed and can regress for some string-heavy cases.
Against absl, however, platanus(auto) is generally slower in this benchmark set.
absl is the fastest implementation in 53 out of 60 groups, while platanus(auto) is the fastest in 1 group.
So, in the author's current benchmark environment, platanus is often a strong improvement over STL, but not a drop-in performance win over absl.
Limitation and caveats¶
Supporting the orders of B-Tree¶
We need at least 3 values per a node in order to perform splitting (1 value for the two nodes involved in the split and 1 value propagated to the parent as the delimiter for the split). That is, We don't support the 3 order B-trees.
Invalidating iterators after insertions and deletions¶
Insertions and deletions on a btree can cause splitting, merging or rebalancing of btree nodes.
And even without these operations, insertions and deletions on a btree will move values around within a node.
In both cases, the result is that insertions and deletions can invalidate iterators pointing to values other than the one being inserted/deleted.
This is notably different from STL set/map which takes care to not invalidate iterators on insert/erase except, of course, for iterators pointing to the value being erased.
A partial workaround when erasing is available: erase() returns an iterator pointing to the item just after the one that was erased (or end() if none exists).
Thread-safety¶
All functions and classes in this library are not thread-safe.
Installation¶
platanus is a header-only library, so you don't have to install it.
When using CMake, you only have to link your program to platanus::platanus.
For example, target_link_libraries(your_program PUBLIC platanus::platanus).
Test¶
If you want to test, for example, run the following commands in the top directory of platanus.
Performance test¶
If you want to check how much platanus is faster than STL, run the following commands.
cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build/release --target btree_bench
cd build/release/benchmark
./btree_bench
The output has the following form.
BM_<(implementation)(container type)<(value type)[, (the max number of values per node or platanus::kAutoSize)]>>
(average time[ns] per iteration)
(average CPU time[ns] per iteration)
(the total of iterations)
When PLATANUS_BENCHMARK_WITH_ABSL=ON (default), benchmark registration switches to comparison mode and the benchmark set becomes:
| implementation | note |
|---|---|
STL |
std::set, std::multiset, std::map, std::multimap |
Absl |
absl::btree_set, absl::btree_multiset, absl::btree_map, absl::btree_multimap |
BTree(..., 64) |
platanus non-pmr containers with 64 values per node |
In this comparison mode:
BTree(..., 128)is not registered.platanus::pmrbenchmark variants are not registered.- the intended comparison set is exactly
STL / absl / platanus(64).
If you want the old wider benchmark set without absl, configure with:
cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DPLATANUS_BENCHMARK_WITH_ABSL=OFF -DCMAKE_BUILD_TYPE=Release
cmake --build build/release --target btree_bench
The test cases are:
| test case | meaning |
|---|---|
| Insert | Benchmark insertion of random values into an empty container. |
| Lookup | Benchmark lookup of random values in a container. |
| Delete | Benchmark deletion of random values from a container. |
| FwdIter | Benchmark iteration (forward) and reference through the container. Note that this benchmark includes a copy constructing of key_type. |
| Merge | Benchmark merging two containers with the same size. |
If you want to know a good size of values per node, run the following command.
cmake -S . -B build/release -DPLATANUS_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release -DPLATANUS_BENCHMARK_VALUES_SIZE_TEST
cmake --build build/release --target btree_bench
cd build/release/benchmark
./btree_bench
For plotting benchmark JSON output, use:
The plotting script understands STL, absl, platanus(N), and platanus(auto) benchmark names and can also render a single JSON file.
License¶
platanus is licensed under Apache License, Version 2.0.