Rope Data Structure (C++ Implementation)
This library provides an efficient rope implementation in C++ for large, mutable text. It supports fast insertion, deletion, concatenation, iteration, and regex-based search. The design uses fixed-size leaf chunks and automatically maintains a balanced binary tree.
Features
- Efficient handling of large text buffers
- Log-time insertion, deletion, and concatenation
- Optional tunable chunk sizes
- Line, leaf, and byte iterators
- Regex search using PCRE2 (DFA mode)
- Regex using
noosecan also be done but it is much slower for now, I am working on making it faster. - Zero-copy leaf iteration (safe immutable reads)
- Fully heap-allocated, manual memory management
- Deterministic balancing strategy (AVL-style depth tracking)
⚡ Performance Benchmarks
Performance comparison between this Rope implementation and the standard C++ std::string.
Test Context:
- Dataset: ~1 GiB generated text (1,073,741,856 bytes).
- Goal: Simulating heavy text-editing operations.
Key Highlights
When handling large datasets, the structural advantages of the Rope become massive:
- Appending Large Text: 2.4 Million x faster (0.001ms vs 1831ms)
- Splitting Text: 162,000x faster
- Insert (Middle): 87,000x faster
- Erase: 15,000x faster
Numbers
| Operation | Rope Time | String Time | String to Rope Ratio | Winner |
|---|---|---|---|---|
| Concat (Append Large) | 0.001 ms | 1831.666 ms | 2,429,265x | ✅ Rope |
| Split (Half) | 0.008 ms | 1353.086 ms | 162,240x | ✅ Rope |
| Insert (Middle) | 0.011 ms | 976.105 ms | 87,206x | ✅ Rope |
| Erase (5KB) | 0.013 ms | 207.760 ms | 15,953x | ✅ Rope |
| Iterate 1000 Lines | 0.063 ms | 5.723 ms | 90x | ✅ Rope |
| Concat (Append small) | 0.002 ms | 0.000 ms | 0.1 | ❌ String |
| Load / Create | 1512.449 ms | 871.330 ms | 0.58x | ❌ String |
| Free / Destruct | 194.167 ms | 153.365 ms | 0.79x | ❌ String |
| Read / Substr (1KiB) | 0.008 ms | 0.002 ms | 0.24x | ❌ String |
| Search (Regex) | 6417.719 ms | 1.526 ms | ~0x | ❌ String |
Why is the Rope faster?
Standard strings require contiguous memory. When you insert text into the middle of a 1GB std::string, the CPU must shift all subsequent bytes in memory (O(n) complexity). This Rope implementation uses a tree structure, allowing insertions and deletions by simply modifying pointers (O(\log n) complexity).
Also for concatenation ropes have amortized constant time vs strings with linear time.
So smaller concatenation strings are faster but for any larger concatenation the rope is best.
Why is the String faster for Regex/Read?
std::string is just a flat array of bytes, which is extremely cache-friendly for linear scanning and regex engines. Ropes require tree traversal to read sequential data, resulting in higher overhead for read-only operations.
I have also tested against ropey the rust library and rust strings.
Ropey has much faster load times but is slightly slower than my implementation in other operations.
I am not including the tests here as my test might not be in the most optimal condition and so have different results.
The tests included here are the ones that I have done myself with clang++ compiling both tests in the same binary. (see: tests/benchmark.cpp)
Also on a side note rust strings are faster than c++ strings apart from reading.
Rope Node Structure
Each rope node (Knot) is either:
- an internal node with
leftandrightchildren, or - a leaf node containing text data of size
chunk_size.
Metadata stored per node:
depth– subtree heightchunk_size– leaf capacityline_count– number of\nnewline characterschar_count– total byte length covered by the subtree
typedef struct Knot {
Knot *left;
Knot *right;
uint8_t depth;
uint32_t chunk_size;
uint32_t line_count;
uint32_t char_count;
char data[];
} Knot;
Chunk Size
The library supports arbitrary positive chunk sizes, but provides:
uint32_t optimal_chunk_size(uint64_t length);
This suggests a chunk size based on the target input size. Valid range is:
- MIN_CHUNK_SIZE: 64 bytes
- MAX_CHUNK_SIZE: 8192 bytes
You may choose any value; all ropes participating in concat operations must use the same chunk size.
API Overview
Construction and Loading
load(char *str, uint32_t len, uint32_t chunk_size)
Builds a rope from a raw byte buffer.
str is not consumed and may be freed after loading.
Structural Operations
Knot *concat(Knot *left, Knot *right)
Concatenates two ropes (must share chunk size). Both input roots are invalid after the call.
Knot *insert(Knot *node, uint32_t offset, char *str, uint32_t len)
Inserts text at a byte offset. Returns a new rope. The original node becomes invalid.
Knot *erase(Knot *node, uint32_t offset, uint32_t len)
Deletes a byte range. Returns a new rope. The original node becomes invalid.
void split(Knot *node, uint32_t offset, Knot **left, Knot **right)
Splits a rope into two ropes at the given byte offset. The original node becomes invalid.
char *read(Knot *root, uint32_t offset, uint32_t len)
Extracts a substring. Returns a null-terminated buffer; caller must free it.
Line & Byte Mapping
uint32_t byte_to_line(Knot *root, uint32_t offset)
Converts a byte offset into a line index.
uint32_t line_to_byte(Knot *root, uint32_t line, uint32_t *out_len)
Returns the byte offset of the beginning of a line and outputs that line’s length.
Iterators
Line Iterator
LineIterator *begin_l_iter(Knot *root, uint32_t start_line);
char *next_line(LineIterator *it); // caller frees result
Leaf Iterator
LeafIterator *begin_k_iter(Knot *root);
char *next_leaf(LeafIterator *it); // DO NOT free result
Byte Iterator
ByteIterator *begin_b_iter(Knot *root);
char next_byte(ByteIterator *it);
All iterators must be freed by the caller after use.
For
ByteIterator,ByteIterator.itmust be freed before the iterator is freed.
Searching
std::vector<std::pair<size_t, size_t>> search_rope(Knot *root, const char *pattern)
Searches the rope using PCRE2 in DFA mode.
Returns a vector of (start_offset, length) pairs.
Only deterministic patterns are supported (no backtracking).
Memory Management
void free_rope(Knot *root)
Recursively frees all nodes in the rope. Must be called once when the rope is no longer needed.
Example Usage
uint32_t chunk_size = optimal_chunk_size(strlen(input));
Knot *r = load(input, strlen(input), chunk_size);
r = insert(r, 5, "hello", 5);
r = erase(r, 20, 3);
char *sub = read(r, 10, 25);
printf("%s\n", sub);
free(sub);
free_rope(r);