Red-Black Tree with Okasaki's Functional Balancing
Tutorial Video
Text description (accessibility)
This video demonstrates the "Red-Black Tree with Okasaki's Functional Balancing" functional Rust example. Difficulty level: Advanced. Key concepts covered: Trees, Functional Data Structures. Implement a persistent red-black tree with insertion that maintains balance using Okasaki's elegant four-case rebalancing rule expressed as pattern matching. Key difference from OCaml: 1. **Heap allocation:** OCaml GC handles recursive types automatically; Rust requires explicit `Box<T>` for indirection
Tutorial
The Problem
Implement a persistent red-black tree with insertion that maintains balance using Okasaki's elegant four-case rebalancing rule expressed as pattern matching.
🎯 Learning Outcomes
type 'a rbtree) to Rust enums with Box<T> for recursive typesbox patterns using match guards and let-elseimpl blocks) for functional data structures — idiomatic Rust OOP wrapping a pure functional coreClone is needed for in-order traversal when the tree retains ownership of its values🦀 The Rust Way
Rust mirrors the OCaml type with enum RBTree<V> using Box for heap-allocated children. The
four balance cases use match guards (if matches!(*ll, T(Red, ..))) since Rust doesn't support
nested box patterns on stable. Methods live in impl<V: Ord> blocks, giving a fluent API
(tree.insert(x).insert(y)) while the internals remain purely functional (each insert returns
a new tree).
Code Example
impl<V: Ord> RBTree<V> {
pub fn insert(self, value: V) -> Self {
match Self::ins(value, self) {
T(_, left, v, right) => T(Black, left, v, right),
E => E,
}
}
fn ins(x: V, tree: Self) -> Self {
match tree {
E => T(Red, Box::new(E), x, Box::new(E)),
T(color, left, y, right) => {
if x < y {
Self::balance(color, Self::ins(x, *left), y, *right)
} else if x > y {
Self::balance(color, *left, y, Self::ins(x, *right))
} else {
T(color, left, y, right)
}
}
}
}
}Key Differences
Box<T> for indirectionlet-else destructuringinsert takes self by value (consuming the old tree); OCaml shares structure via GCimpl blocks for a natural tree.insert(x) API; OCaml uses free functions insert x tOCaml Approach
OCaml represents the tree as a sum type 'a rbtree with constructors E and T. The balance
function uses a single function match with four or-patterns to detect all red-red violations,
collapsing them into one canonical balanced node. Insertion is a nested let rec ins with the root
repainted black afterward. The entire implementation is around 20 lines.
Full Source
#![allow(clippy::all)]
// Red-Black Tree with Okasaki's Functional Balancing (Method Style)
//
// A purely functional red-black tree using Okasaki's four-case balance rule,
// expressed with methods on the tree type rather than free functions.
// The tree is persistent: `insert` returns a new tree, leaving the original intact.
/// Node color in a red-black tree.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Color {
Red,
Black,
}
/// A persistent red-black tree.
///
/// `E` is the empty leaf. `T` holds a color, left subtree, value, and right
/// subtree — a direct mirror of the OCaml `'a rbtree` type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RBTree<V> {
E,
T(Color, Box<RBTree<V>>, V, Box<RBTree<V>>),
}
use Color::{Black, Red};
use RBTree::{E, T};
// ---------------------------------------------------------------------------
// Solution 1: Idiomatic Rust — methods with Okasaki balance
// ---------------------------------------------------------------------------
impl<V: Ord> RBTree<V> {
/// Create an empty red-black tree.
pub fn new() -> Self {
E
}
/// Insert a value, returning a new balanced tree.
///
/// The inner recursive `ins` creates red nodes, then `balance` fixes any
/// red-red violations. The root is always repainted black.
pub fn insert(self, value: V) -> Self {
// Repaint root black after insertion
match Self::ins(value, self) {
T(_, left, v, right) => T(Black, left, v, right),
E => E,
}
}
/// Check whether `value` is present in the tree.
///
/// Mirrors the OCaml `mem` function: recursively descend left or right
/// based on comparison, returning `true` on an exact match.
pub fn mem(&self, value: &V) -> bool {
match self {
E => false,
T(_, left, v, right) => {
if value == v {
true
} else if value < v {
left.mem(value)
} else {
right.mem(value)
}
}
}
}
/// Internal recursive insert — colors new leaves red.
fn ins(x: V, tree: Self) -> Self {
match tree {
E => T(Red, Box::new(E), x, Box::new(E)),
T(color, left, y, right) => {
if x < y {
Self::balance(color, Self::ins(x, *left), y, *right)
} else if x > y {
Self::balance(color, *left, y, Self::ins(x, *right))
} else {
// Duplicate — return tree unchanged
T(color, left, y, right)
}
}
}
}
/// Okasaki's balance: four red-red violation cases → one balanced result.
///
/// Each arm detects a different position of the red-red violation and
/// rotates into the canonical form:
/// `T(Red, T(Black, a, x, b), y, T(Black, c, z, d))`
fn balance(color: Color, left: Self, val: V, right: Self) -> Self {
// Helper: produce the common rebalanced node
let balanced = |a: Self, x: V, b: Self, y: V, c: Self, z: V, d: Self| -> Self {
T(
Red,
Box::new(T(Black, Box::new(a), x, Box::new(b))),
y,
Box::new(T(Black, Box::new(c), z, Box::new(d))),
)
};
match (color, left, val, right) {
// Case 1: left-left red-red
(Black, T(Red, ll, y, c), z, d) if matches!(*ll, T(Red, ..)) => {
let T(_, a, x, b) = *ll else { unreachable!() };
balanced(*a, x, *b, y, *c, z, d)
}
// Case 2: left-right red-red
(Black, T(Red, a, x, lr), z, d) if matches!(*lr, T(Red, ..)) => {
let T(_, b, y, c) = *lr else { unreachable!() };
balanced(*a, x, *b, y, *c, z, d)
}
// Case 3: right-left red-red
(Black, a, x, T(Red, rl, z, d)) if matches!(*rl, T(Red, ..)) => {
let T(_, b, y, c) = *rl else { unreachable!() };
balanced(a, x, *b, y, *c, z, *d)
}
// Case 4: right-right red-red
(Black, a, x, T(Red, b, y, rr)) if matches!(*rr, T(Red, ..)) => {
let T(_, c, z, d) = *rr else { unreachable!() };
balanced(a, x, *b, y, *c, z, *d)
}
// No violation — reconstruct unchanged
(col, a, x, b) => T(col, Box::new(a), x, Box::new(b)),
}
}
}
// ---------------------------------------------------------------------------
// Solution 2: Accumulator-based in-order traversal
// ---------------------------------------------------------------------------
impl<V: Clone> RBTree<V> {
/// Collect all values in sorted (in-order) order.
///
/// Uses a mutable accumulator for O(n) traversal, avoiding the O(n log n)
/// cost of OCaml's repeated list append (`@`).
pub fn to_sorted_vec(&self) -> Vec<V> {
let mut result = Vec::new();
self.collect_inorder(&mut result);
result
}
fn collect_inorder(&self, acc: &mut Vec<V>) {
if let T(_, left, v, right) = self {
left.collect_inorder(acc);
acc.push(v.clone()); // clone: tree retains ownership of values
right.collect_inorder(acc);
}
}
}
// ---------------------------------------------------------------------------
// Solution 3: Recursive to_list — closer to OCaml style (allocates per level)
// ---------------------------------------------------------------------------
impl<V: Clone> RBTree<V> {
/// In-order list, OCaml style: `to_list a @ [v] @ to_list b`.
///
/// Allocates at every level, mirroring the OCaml original. Prefer
/// `to_sorted_vec` for real use.
pub fn to_list_recursive(&self) -> Vec<V> {
match self {
E => Vec::new(),
T(_, left, v, right) => {
let mut result = left.to_list_recursive();
result.push(v.clone()); // clone: tree keeps ownership
result.extend(right.to_list_recursive());
result
}
}
}
}
// ---------------------------------------------------------------------------
// Invariant validation helpers
// ---------------------------------------------------------------------------
impl<V: Ord> RBTree<V> {
/// Validate all red-black tree invariants:
/// 1. Root is black
/// 2. No red node has a red child
/// 3. Every root-to-leaf path has the same black-height
pub fn is_valid(&self) -> bool {
let root_black = !matches!(self, T(Red, ..));
root_black && self.no_red_red() && self.black_height().is_some()
}
/// Check that no red node has a red child.
fn no_red_red(&self) -> bool {
match self {
E => true,
T(Red, left, _, right) => {
!matches!(left.as_ref(), T(Red, ..))
&& !matches!(right.as_ref(), T(Red, ..))
&& left.no_red_red()
&& right.no_red_red()
}
T(Black, left, _, right) => left.no_red_red() && right.no_red_red(),
}
}
/// Compute the black-height (uniform black count on every root-to-leaf path).
/// Returns `None` if the invariant is violated.
fn black_height(&self) -> Option<usize> {
match self {
E => Some(1), // leaves count as black
T(color, left, _, right) => {
let lh = left.black_height()?;
let rh = right.black_height()?;
if lh != rh {
return None;
}
Some(lh + usize::from(*color == Black))
}
}
}
}
impl<V: Ord> Default for RBTree<V> {
fn default() -> Self {
Self::new()
}
}
/// Build a tree from an iterator by folding `insert` over each element.
///
/// Direct analog of OCaml's `List.fold_left (fun t x -> insert x t) E xs`.
impl<V: Ord> FromIterator<V> for RBTree<V> {
fn from_iter<I: IntoIterator<Item = V>>(iter: I) -> Self {
iter.into_iter().fold(E, |tree, x| tree.insert(x))
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_empty_tree() {
let tree: RBTree<i32> = RBTree::new();
assert!(!tree.mem(&1));
assert_eq!(tree.to_sorted_vec(), Vec::<i32>::new());
assert!(tree.is_valid());
}
#[test]
fn test_single_insert() {
let tree = RBTree::new().insert(42);
assert!(tree.mem(&42));
assert!(!tree.mem(&0));
assert_eq!(tree.to_sorted_vec(), vec![42]);
// Root must be black after insert
assert!(matches!(tree, T(Black, ..)));
assert!(tree.is_valid());
}
#[test]
fn test_multiple_inserts_sorted_output() {
// Same sequence as the OCaml example
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
assert_eq!(tree.to_sorted_vec(), vec![1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
#[test]
fn test_membership() {
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
for v in 1..=9 {
assert!(tree.mem(&v), "expected {v} in tree");
}
assert!(!tree.mem(&0));
assert!(!tree.mem(&10));
}
#[test]
fn test_duplicate_inserts_ignored() {
let tree = RBTree::from_iter([3, 1, 2, 3, 1, 2, 3]);
assert_eq!(tree.to_sorted_vec(), vec![1, 2, 3]);
}
#[test]
fn test_invariants_after_scrambled_inserts() {
let tree = RBTree::from_iter([
10, 5, 15, 3, 7, 12, 18, 1, 4, 6, 8, 11, 13, 17, 20, 2, 9, 14, 16, 19,
]);
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=20).collect::<Vec<_>>());
}
#[test]
fn test_ascending_insert_stays_balanced() {
let tree = RBTree::from_iter(1..=100);
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=100).collect::<Vec<_>>());
}
#[test]
fn test_descending_insert_stays_balanced() {
let tree = RBTree::from_iter((1..=50).rev());
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=50).collect::<Vec<_>>());
}
#[test]
fn test_from_iter_empty() {
let tree: RBTree<i32> = RBTree::from_iter(std::iter::empty());
assert_eq!(tree, E);
assert!(tree.is_valid());
}
#[test]
fn test_to_list_recursive_matches_sorted_vec() {
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
assert_eq!(tree.to_list_recursive(), tree.to_sorted_vec());
}
#[test]
fn test_string_keys() {
let tree = RBTree::from_iter(["cherry", "apple", "banana", "date"]);
assert_eq!(
tree.to_sorted_vec(),
vec!["apple", "banana", "cherry", "date"]
);
assert!(tree.mem(&"banana"));
assert!(!tree.mem(&"fig"));
assert!(tree.is_valid());
}
#[test]
fn test_default_is_empty() {
let tree: RBTree<i32> = RBTree::default();
assert_eq!(tree, E);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_empty_tree() {
let tree: RBTree<i32> = RBTree::new();
assert!(!tree.mem(&1));
assert_eq!(tree.to_sorted_vec(), Vec::<i32>::new());
assert!(tree.is_valid());
}
#[test]
fn test_single_insert() {
let tree = RBTree::new().insert(42);
assert!(tree.mem(&42));
assert!(!tree.mem(&0));
assert_eq!(tree.to_sorted_vec(), vec![42]);
// Root must be black after insert
assert!(matches!(tree, T(Black, ..)));
assert!(tree.is_valid());
}
#[test]
fn test_multiple_inserts_sorted_output() {
// Same sequence as the OCaml example
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
assert_eq!(tree.to_sorted_vec(), vec![1, 2, 3, 4, 5, 6, 7, 8, 9]);
}
#[test]
fn test_membership() {
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
for v in 1..=9 {
assert!(tree.mem(&v), "expected {v} in tree");
}
assert!(!tree.mem(&0));
assert!(!tree.mem(&10));
}
#[test]
fn test_duplicate_inserts_ignored() {
let tree = RBTree::from_iter([3, 1, 2, 3, 1, 2, 3]);
assert_eq!(tree.to_sorted_vec(), vec![1, 2, 3]);
}
#[test]
fn test_invariants_after_scrambled_inserts() {
let tree = RBTree::from_iter([
10, 5, 15, 3, 7, 12, 18, 1, 4, 6, 8, 11, 13, 17, 20, 2, 9, 14, 16, 19,
]);
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=20).collect::<Vec<_>>());
}
#[test]
fn test_ascending_insert_stays_balanced() {
let tree = RBTree::from_iter(1..=100);
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=100).collect::<Vec<_>>());
}
#[test]
fn test_descending_insert_stays_balanced() {
let tree = RBTree::from_iter((1..=50).rev());
assert!(tree.is_valid());
assert_eq!(tree.to_sorted_vec(), (1..=50).collect::<Vec<_>>());
}
#[test]
fn test_from_iter_empty() {
let tree: RBTree<i32> = RBTree::from_iter(std::iter::empty());
assert_eq!(tree, E);
assert!(tree.is_valid());
}
#[test]
fn test_to_list_recursive_matches_sorted_vec() {
let tree = RBTree::from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
assert_eq!(tree.to_list_recursive(), tree.to_sorted_vec());
}
#[test]
fn test_string_keys() {
let tree = RBTree::from_iter(["cherry", "apple", "banana", "date"]);
assert_eq!(
tree.to_sorted_vec(),
vec!["apple", "banana", "cherry", "date"]
);
assert!(tree.mem(&"banana"));
assert!(!tree.mem(&"fig"));
assert!(tree.is_valid());
}
#[test]
fn test_default_is_empty() {
let tree: RBTree<i32> = RBTree::default();
assert_eq!(tree, E);
}
}
Deep Comparison
OCaml vs Rust: Red-Black Tree with Okasaki's Functional Balancing
Side-by-Side Code
OCaml
type color = Red | Black
type 'a rbtree = E | T of color * 'a rbtree * 'a * 'a rbtree
let balance = function
| Black, T (Red, T (Red, a, x, b), y, c), z, d
| Black, T (Red, a, x, T (Red, b, y, c)), z, d
| Black, a, x, T (Red, T (Red, b, y, c), z, d)
| Black, a, x, T (Red, b, y, T (Red, c, z, d)) ->
T (Red, T (Black, a, x, b), y, T (Black, c, z, d))
| color, a, x, b -> T (color, a, x, b)
let insert x t =
let rec ins = function
| E -> T (Red, E, x, E)
| T (color, a, y, b) ->
if x < y then balance (color, ins a, y, b)
else if x > y then balance (color, a, y, ins b)
else T (color, a, y, b)
in
match ins t with T (_, a, y, b) -> T (Black, a, y, b) | E -> E
Rust (idiomatic — method style)
impl<V: Ord> RBTree<V> {
pub fn insert(self, value: V) -> Self {
match Self::ins(value, self) {
T(_, left, v, right) => T(Black, left, v, right),
E => E,
}
}
fn ins(x: V, tree: Self) -> Self {
match tree {
E => T(Red, Box::new(E), x, Box::new(E)),
T(color, left, y, right) => {
if x < y {
Self::balance(color, Self::ins(x, *left), y, *right)
} else if x > y {
Self::balance(color, *left, y, Self::ins(x, *right))
} else {
T(color, left, y, right)
}
}
}
}
}
Rust (balance function with match guards)
fn balance(color: Color, left: Self, val: V, right: Self) -> Self {
match (color, left, val, right) {
// Case 1: left-left red-red
(Black, T(Red, ll, y, c), z, d) if matches!(*ll, T(Red, ..)) => {
let T(_, a, x, b) = *ll else { unreachable!() };
balanced(*a, x, *b, y, *c, z, d)
}
// Case 2: left-right red-red
(Black, T(Red, a, x, lr), z, d) if matches!(*lr, T(Red, ..)) => {
let T(_, b, y, c) = *lr else { unreachable!() };
balanced(*a, x, *b, y, *c, z, d)
}
// ... cases 3 and 4 symmetric on the right side
(col, a, x, b) => T(col, Box::new(a), x, Box::new(b)),
}
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Tree type | 'a rbtree | RBTree<V> (enum with Box indirection) |
| Color type | color (sum type) | Color (enum) |
| Empty tree | E | RBTree::E |
| Node | T of color * 'a rbtree * 'a * 'a rbtree | T(Color, Box<RBTree<V>>, V, Box<RBTree<V>>) |
| Insert | val insert : 'a -> 'a rbtree -> 'a rbtree | fn insert(self, value: V) -> Self |
| Membership | val mem : 'a -> 'a rbtree -> bool | fn mem(&self, value: &V) -> bool |
| To list | val to_list : 'a rbtree -> 'a list | fn to_sorted_vec(&self) -> Vec<V> |
Key Insights
|. Rust stable cannot match inside Box, so each case needs a separate arm with a matches! guard followed by let-else destructuring. This is the largest syntactic gap between the two translations.insert takes self by value because it deconstructs the tree. OCaml's GC allows structural sharing implicitly; in Rust, the caller gives up the old tree and receives a new one. This makes the functional "persistent" style explicit at the type level.Box<T> as the price of recursion:** OCaml's runtime handles recursive types transparently. Rust requires Box to break the infinite-size cycle. Every node construction wraps children in Box::new(...), adding syntactic noise but making heap allocation visible.insert x t (free function, value first). Rust idiom is tree.insert(x) (method, receiver first). The impl block groups related operations, enabling chaining: RBTree::new().insert(5).insert(3).to_sorted_vec) must clone() each value to produce a Vec without consuming the tree. OCaml's GC allows returning shared references; Rust makes the copy cost explicit.When to Use Each Style
Use idiomatic Rust (method style) when: building a reusable data structure library — the impl block API is discoverable, chainable, and integrates with Rust's trait system (Default, Debug, Iterator).
Use recursive/OCaml-style Rust when: prototyping algorithms from functional programming literature or teaching — the structural recursion maps directly to Okasaki's pseudocode and makes the balance invariant visually obvious in the pattern matching.
Exercises
is_balanced — a function that verifies all red-black invariants (no two consecutive red nodes, equal black heights on all paths) and returns a descriptive error on violation.remove_min operation that deletes the smallest element while maintaining balance.MultiSet variant of the red-black tree that allows duplicate keys by storing a count alongside each element.