Example 1083: Red-Black Tree
Difficulty: โญโญโญ Category: Trees OCaml Source: Cornell CS3110 โ Functional Programming in OCaml (Okasaki's algorithm)Problem Statement
Implement a purely functional red-black tree supporting insert and membership lookup, using Okasaki's elegant balancing technique where all four rotation cases collapse into a single rewrite rule.Learning Outcomes
- How Rust enums model algebraic data types (sum types with data) just like OCaml variants
- Ownership-based tree restructuring via path copying โ Rust's move semantics naturally implement persistent data structures
- Pattern matching on nested owned data requires a peek-then-destructure strategy, unlike OCaml's direct nested pattern matching
- The `Box<T>` heap allocation pattern for recursive data structures
OCaml Approach
OCaml defines the tree as a simple algebraic type `'a rbtree = E | T of color 'a rbtree 'a * 'a rbtree`. The `balance` function uses a single `match` with four or-patterns to catch all red-red violations and rewrite them into one canonical balanced form. This is the essence of Okasaki's insight โ the code is remarkably concise because OCaml allows deep nested pattern matching across multiple cases in one arm.Rust Approach
Rust uses `enum RBTree<T>` with `Box` for recursive children. The `balance` function cannot match four nested patterns in one arm like OCaml, so it peeks at the structure via references (to determine which case applies) then destructures by moving owned values out. The tree is generic over `T: Ord` and uses `Ordering` for comparisons. Insert takes `self` by value and returns a new tree โ natural path-copying via Rust's move semantics.Key Differences
1. Pattern matching depth: OCaml matches 3 levels deep across 4 cases in one arm; Rust requires sequential peek-then-destructure for each case 2. Memory management: OCaml's GC handles sharing automatically; Rust uses `Box<T>` with explicit heap allocation and move semantics for path copying 3. Polymorphism: OCaml uses `'a` with structural equality; Rust uses `T: Ord` trait bound for ordered comparisons 4. Conciseness: The OCaml `balance` is ~6 lines; Rust's is ~80 lines due to explicit destructuring โ but both encode the same four-case logic/// Red-Black Tree โ Okasaki's purely functional balanced BST
use std::cmp::Ordering;
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Color {
Red,
Black,
}
#[derive(Debug, Clone, PartialEq, Eq)]
enum RBTree<T> {
Empty,
Node {
color: Color,
left: Box<RBTree<T>>,
value: T,
right: Box<RBTree<T>>,
},
}
use Color::{Black, Red};
use RBTree::{Empty, Node};
impl<T> RBTree<T> {
fn node(color: Color, left: RBTree<T>, value: T, right: RBTree<T>) -> Self {
Node {
color,
left: Box::new(left),
value,
right: Box::new(right),
}
}
fn is_red_node(&self) -> bool {
matches!(self, Node { color: Red, .. })
}
}
impl<T: Ord> RBTree<T> {
fn new() -> Self {
Empty
}
fn mem(&self, x: &T) -> bool {
match self {
Empty => false,
Node {
left, value, right, ..
} => match x.cmp(value) {
Ordering::Equal => true,
Ordering::Less => left.mem(x),
Ordering::Greater => right.mem(x),
},
}
}
fn insert(self, x: T) -> Self
where
T: Clone,
{
fn ins<T: Ord + Clone>(tree: RBTree<T>, x: &T) -> RBTree<T> {
match tree {
Empty => RBTree::node(Red, Empty, x.clone(), Empty),
Node {
color,
left,
value,
right,
} => match x.cmp(&value) {
Ordering::Less => balance(color, ins(*left, x), value, *right),
Ordering::Greater => balance(color, *left, value, ins(*right, x)),
Ordering::Equal => RBTree::node(color, *left, value, *right),
},
}
}
match ins(self, &x) {
Node {
left,
value,
right,
..
} => Node {
color: Black,
left,
value,
right,
},
Empty => Empty,
}
}
fn to_sorted_vec(&self) -> Vec<&T> {
match self {
Empty => vec![],
Node {
left, value, right, ..
} => {
let mut result = left.to_sorted_vec();
result.push(value);
result.extend(right.to_sorted_vec());
result
}
}
}
}
fn balance<T>(color: Color, left: RBTree<T>, value: T, right: RBTree<T>) -> RBTree<T> {
if color != Black {
return RBTree::node(color, left, value, right);
}
// Case 1: left-left
if left.is_red_node() {
if let Node { left: ref ll, .. } = left {
if ll.is_red_node() {
if let Node {
left: ll_box,
value: y,
right: c,
..
} = left
{
if let Node {
left: a,
value: x,
right: b,
..
} = *ll_box
{
return RBTree::node(
Red,
RBTree::node(Black, *a, x, *b),
y,
RBTree::node(Black, *c, value, right),
);
}
}
unreachable!();
}
}
}
// Case 2: left-right
if left.is_red_node() {
if let Node { right: ref lr, .. } = left {
if lr.is_red_node() {
if let Node {
left: a,
value: x,
right: lr_box,
..
} = left
{
if let Node {
left: b,
value: y,
right: c,
..
} = *lr_box
{
return RBTree::node(
Red,
RBTree::node(Black, *a, x, *b),
y,
RBTree::node(Black, *c, value, right),
);
}
}
unreachable!();
}
}
}
// Case 3: right-left
if right.is_red_node() {
if let Node { left: ref rl, .. } = right {
if rl.is_red_node() {
if let Node {
left: rl_box,
value: z,
right: d,
..
} = right
{
if let Node {
left: b,
value: y,
right: c,
..
} = *rl_box
{
return RBTree::node(
Red,
RBTree::node(Black, left, value, *b),
y,
RBTree::node(Black, *c, z, *d),
);
}
}
unreachable!();
}
}
}
// Case 4: right-right
if right.is_red_node() {
if let Node { right: ref rr, .. } = right {
if rr.is_red_node() {
if let Node {
left: b,
value: y,
right: rr_box,
..
} = right
{
if let Node {
left: c,
value: z,
right: d,
..
} = *rr_box
{
return RBTree::node(
Red,
RBTree::node(Black, left, value, *b),
y,
RBTree::node(Black, *c, z, *d),
);
}
}
unreachable!();
}
}
}
RBTree::node(Black, left, value, right)
}
fn from_iter<T: Ord + Clone>(iter: impl IntoIterator<Item = T>) -> RBTree<T> {
iter.into_iter().fold(RBTree::new(), |t, x| t.insert(x))
}
fn main() {
let tree = from_iter([5, 3, 7, 1, 4, 6, 8, 2, 9]);
print!("Sorted: ");
for v in tree.to_sorted_vec() {
print!("{v} ");
}
println!();
println!("mem(5) = {}", tree.mem(&5));
println!("mem(10) = {}", tree.mem(&10));
// Duplicate insert โ should not increase size
let tree2 = from_iter([3, 1, 3, 2, 1, 3]);
print!("Deduped: ");
for v in tree2.to_sorted_vec() {
print!("{v} ");
}
println!();
}
/* Output:
Sorted: 1 2 3 4 5 6 7 8 9
mem(5) = true
mem(10) = false
Deduped: 1 2 3
*/
type color = Red | Black
type 'a rbtree = E | T of color * 'a rbtree * 'a * 'a rbtree
(* Okasaki's balance: four rotation cases โ one canonical form *)
let balance = function
| Black, T (Red, T (Red, a, x, b), y, c), z, d
| Black, T (Red, a, x, T (Red, b, y, c)), z, d
| Black, a, x, T (Red, T (Red, b, y, c), z, d)
| Black, a, x, T (Red, b, y, T (Red, c, z, d)) ->
T (Red, T (Black, a, x, b), y, T (Black, c, z, d))
| color, a, x, b -> T (color, a, x, b)
(* Insert with rebalancing โ root is always repainted black *)
let insert x t =
let rec ins = function
| E -> T (Red, E, x, E)
| T (color, a, y, b) ->
if x < y then balance (color, ins a, y, b)
else if x > y then balance (color, a, y, ins b)
else T (color, a, y, b)
in
match ins t with T (_, a, y, b) -> T (Black, a, y, b) | E -> E
(* Membership test โ O(log n) *)
let rec mem x = function
| E -> false
| T (_, a, y, b) -> x = y || (if x < y then mem x a else mem x b)
(* In-order traversal to sorted list *)
let rec to_list = function
| E -> []
| T (_, a, v, b) -> to_list a @ [v] @ to_list b
(* Tests *)
let () =
let t = List.fold_left (fun t x -> insert x t) E [5;3;7;1;4;6;8;2;9] in
assert (to_list t = [1;2;3;4;5;6;7;8;9]);
assert (mem 5 t = true);
assert (mem 10 t = false);
(* Duplicate inserts are ignored *)
let t2 = List.fold_left (fun t x -> insert x t) E [3;1;3;2;1;3] in
assert (to_list t2 = [1;2;3]);
(* Root is always black *)
assert (match t with T (Black, _, _, _) -> true | _ -> false);
List.iter (Printf.printf "%d ") (to_list t);
print_endline "ok"
๐ Detailed Comparison
OCaml vs Rust: Red-Black Tree (Okasaki's Functional Balancing)
Side-by-Side Code
OCaml
๐ช Show OCaml equivalent
type color = Red | Black
type 'a rbtree = E | T of color * 'a rbtree * 'a * 'a rbtree
let balance = function
| Black, T (Red, T (Red, a, x, b), y, c), z, d
| Black, T (Red, a, x, T (Red, b, y, c)), z, d
| Black, a, x, T (Red, T (Red, b, y, c), z, d)
| Black, a, x, T (Red, b, y, T (Red, c, z, d)) ->
T (Red, T (Black, a, x, b), y, T (Black, c, z, d))
| color, a, x, b -> T (color, a, x, b)
let insert x t =
let rec ins = function
| E -> T (Red, E, x, E)
| T (color, a, y, b) ->
if x < y then balance (color, ins a, y, b)
else if x > y then balance (color, a, y, ins b)
else T (color, a, y, b)
in
match ins t with T (_, a, y, b) -> T (Black, a, y, b) | E -> ERust (idiomatic โ peek-then-destructure)
fn balance<T>(color: Color, left: RBTree<T>, value: T, right: RBTree<T>) -> RBTree<T> {
if color != Black {
return RBTree::node(color, left, value, right);
}
// Case 1: left-left โ peek at shape, then destructure by move
if left.is_red_node() {
if let Node { left: ref ll, .. } = left {
if ll.is_red_node() {
if let Node { left: ll_box, value: y, right: c, .. } = left {
if let Node { left: a, value: x, right: b, .. } = *ll_box {
return RBTree::node(Red,
RBTree::node(Black, *a, x, *b), y,
RBTree::node(Black, *c, value, right));
}
}
}
}
}
// ... Cases 2, 3, 4 follow the same peek-then-move pattern
RBTree::node(Black, left, value, right)
}Rust (insert โ functional path copying)
pub fn insert(self, x: T) -> Self where T: Clone {
fn ins<T: Ord + Clone>(tree: RBTree<T>, x: &T) -> RBTree<T> {
match tree {
Empty => RBTree::node(Red, Empty, x.clone(), Empty),
Node { color, left, value, right } => match x.cmp(&value) {
Ordering::Less => balance(color, ins(*left, x), value, *right),
Ordering::Greater => balance(color, *left, value, ins(*right, x)),
Ordering::Equal => RBTree::node(color, *left, value, *right),
},
}
}
match ins(self, &x) {
Node { left, value, right, .. } => Node { color: Black, left, value, right },
Empty => Empty,
}
}Type Signatures
| Concept | OCaml | Rust | |
|---|---|---|---|
| Color type | `type color = Red \ | Black` | `enum Color { Red, Black }` |
| Tree type | `'a rbtree = E \ | T of color 'a rbtree 'a 'a rbtree` | `enum RBTree<T> { Empty, Node { color, left: Box<RBTree<T>>, value: T, right: Box<RBTree<T>> } }` |
| Insert | `val insert : 'a -> 'a rbtree -> 'a rbtree` | `fn insert(self, x: T) -> Self where T: Ord + Clone` | |
| Membership | `val mem : 'a -> 'a rbtree -> bool` | `fn mem(&self, x: &T) -> bool` | |
| Balance | `val balance : color 'a rbtree 'a 'a rbtree -> 'a rbtree` | `fn balance(color, left, value, right) -> RBTree<T>` |
Key Insights
1. Or-patterns vs sequential checks: OCaml's `balance` uses four or-patterns in a single match arm โ arguably the most elegant expression of Okasaki's algorithm. Rust lacks this capability for nested owned destructuring, requiring sequential peek-then-move checks for each case.
2. Move semantics as path copying: Rust's ownership system naturally implements the functional "path copying" technique. When `insert` takes `self` by value and reconstructs the spine, moved subtrees are shared for free โ no reference counting or GC needed.
3. Box for recursive types: OCaml's recursive types are heap-allocated automatically by the runtime. Rust requires explicit `Box<T>` to break the infinite-size recursion, making the indirection visible in the type signature.
4. Comparison traits: OCaml uses structural equality (`=`) and comparison (`<`, `>`) via polymorphic operators. Rust requires explicit `T: Ord` trait bounds, making the ordering requirement part of the type contract.
5. Clone for insert: OCaml's `insert x t` captures `x` by reference implicitly (GC keeps it alive). Rust's `insert` requires `T: Clone` because the inner `ins` function may need to clone `x` at each recursive call when creating new leaf nodes.
When to Use Each Style
Use idiomatic Rust (peek-then-destructure) when: building production-quality balanced trees where you need compile-time ownership guarantees and zero-cost abstraction โ the verbose balance function compiles to the same efficient code as hand-written rotations.
Use OCaml/functional style when: prototyping tree algorithms or writing educational code where the elegance of pattern matching makes the algorithm's structure immediately visible โ Okasaki's four-case balance is a canonical example of how pattern matching shines.