1166 Advanced

Trie-based Persistent Vector for O(log n) Random Access

Functional Programming

Tutorial

The Problem

Purely functional data structures must never mutate in place — every update produces a new version while preserving the old one. Naive approaches copy the entire structure on each update, giving O(n) time. A trie-based persistent vector solves this by sharing unchanged subtrees between versions, achieving O(log n) reads and writes while remaining fully immutable. This structure underpins Clojure's persistent vector and any system requiring cheap snapshots or time-travel debugging.

🎯 Learning Outcomes

• Understand why persistent data structures require structural sharing to be efficient

• Learn how a 32-way branching trie achieves O(log₃₂ n) practical amortized access

• See how Arc in Rust enables safe structural sharing without a garbage collector

• Appreciate the trade-off between mutability performance and version-preserving guarantees

Code Example

#![allow(clippy::all)]
// placeholder

(* Simplified persistent vector using balanced binary tree *)
type 'a pvec = Nil | One of 'a | Two of 'a pvec * 'a pvec

let rec size = function
  | Nil -> 0 | One _ -> 1
  | Two (l, r) -> size l + size r

let rec get i = function
  | One x -> if i = 0 then x else failwith "index"
  | Two (l, r) ->
    let ls = size l in
    if i < ls then get i l else get (i - ls) r
  | Nil -> failwith "empty"

let rec set i v = function
  | One _ -> if i = 0 then One v else failwith "index"
  | Two (l, r) ->
    let ls = size l in
    if i < ls then Two (set i v l, r)
    else Two (l, set (i - ls) v r)
  | Nil -> failwith "empty"

let of_list lst =
  let rec build = function
    | [] -> Nil | [x] -> One x
    | lst ->
      let n = List.length lst in
      let left = List.filteri (fun i _ -> i < n/2) lst in
      let right = List.filteri (fun i _ -> i >= n/2) lst in
      Two (build left, build right)
  in build lst

let () =
  let v = of_list [10;20;30;40;50] in
  Printf.printf "v[2] = %d\n" (get 2 v);
  let v = set 2 99 v in
  Printf.printf "v[2] = %d\n" (get 2 v)

Key Differences

Memory management: OCaml relies on GC for sharing; Rust uses Arc<T> with explicit reference counting and no GC pauses.

Mutability default: OCaml values are immutable by default, making persistence natural; Rust requires Arc wrapping to opt into sharing.

Performance profile: Rust's Arc has atomic overhead on clone/drop; OCaml's GC has periodic pause overhead — different latency trade-offs.

Type encoding: OCaml uses polymorphic recursive types directly; Rust requires Box or Arc to break recursive type size cycles.

OCaml Approach

OCaml's garbage collector handles structural sharing automatically — the GC keeps shared nodes alive as long as any version references them. Persistent vectors are typically implemented as trees of arrays (type 'a t = Leaf of 'a array | Node of 'a t array). Because OCaml values are immutable by default, sharing is the natural result of returning old sub-nodes without copying.

Full Source

#![allow(clippy::all)]
// placeholder

(* Simplified persistent vector using balanced binary tree *)
type 'a pvec = Nil | One of 'a | Two of 'a pvec * 'a pvec

let rec size = function
  | Nil -> 0 | One _ -> 1
  | Two (l, r) -> size l + size r

let rec get i = function
  | One x -> if i = 0 then x else failwith "index"
  | Two (l, r) ->
    let ls = size l in
    if i < ls then get i l else get (i - ls) r
  | Nil -> failwith "empty"

let rec set i v = function
  | One _ -> if i = 0 then One v else failwith "index"
  | Two (l, r) ->
    let ls = size l in
    if i < ls then Two (set i v l, r)
    else Two (l, set (i - ls) v r)
  | Nil -> failwith "empty"

let of_list lst =
  let rec build = function
    | [] -> Nil | [x] -> One x
    | lst ->
      let n = List.length lst in
      let left = List.filteri (fun i _ -> i < n/2) lst in
      let right = List.filteri (fun i _ -> i >= n/2) lst in
      Two (build left, build right)
  in build lst

let () =
  let v = of_list [10;20;30;40;50] in
  Printf.printf "v[2] = %d\n" (get 2 v);
  let v = set 2 99 v in
  Printf.printf "v[2] = %d\n" (get 2 v)

Deep Comparison

Exercises

Implement get(index) on a simple binary trie (branching factor 2) and verify O(log n) node visits with a counter.

Add set(index, value) returning a new root, sharing all unchanged subtrees via Arc::clone.

Extend to support push_back that grows the trie by one level when the current depth is exhausted.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust