Persistent Vector
Tutorial
The Problem
Implement a persistent (immutable) vector backed by a balanced binary tree, where set returns a new version sharing all unchanged subtrees with the original.
🎯 Learning Outcomes
Rc<T> enables structural sharing — unchanged subtrees are pointer-aliased, not copiedset is O(log n) new allocations, not O(n) full copyRc<T> (shared ownership, no sharing guarantees in OCaml) and OCaml's GC (automatic sharing via value identity)🦀 The Rust Way
Rust requires explicit ownership management. Rc<PVec<T>> gives shared ownership so unchanged subtrees can be aliased across versions. Rc::clone in the set path only bumps a reference count (O(1)), while only the nodes on the path from root to the updated leaf are newly allocated (O(log n) total).
Code Example
use std::rc::Rc;
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
Leaf(T),
Branch(Rc<PVec<T>>, Rc<PVec<T>>),
}
impl<T: Clone> PVec<T> {
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::Leaf(_) => (i == 0).then(|| PVec::Leaf(v)),
PVec::Branch(l, r) => {
let ls = l.size();
if i < ls {
// Rc::clone(r) = pointer bump, not deep copy
l.set(i, v)
.map(|new_l| PVec::Branch(Rc::new(new_l), Rc::clone(r)))
} else {
r.set(i - ls, v)
.map(|new_r| PVec::Branch(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
}Key Differences
Rc<T> to express shared ownership explicitly.failwith (exceptions); Rust returns Option<T> — out-of-bounds is a value, not a control-flow exception.Rc::clone is a pointer bump (O(1)); cloning a Box<T> is a deep copy (O(subtree size)).from_slice and set require T: Clone explicitly.OCaml Approach
OCaml uses a sum type 'a pvec = Nil | One of 'a | Two of 'a pvec * 'a pvec. Since OCaml is garbage-collected, set naturally shares the unchanged branch — the GC handles aliasing automatically. The programmer writes purely functional code and the runtime provides structural sharing for free.
Full Source
#![allow(dead_code)]
use std::rc::Rc;
/// Persistent vector using a balanced binary tree with structural sharing via `Rc`.
///
/// `set` returns a new `PVec` sharing all unchanged subtrees — O(log n) new allocations.
/// The original vector is untouched, making this safe to use as an immutable value.
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
Leaf(T),
Branch(Rc<PVec<T>>, Rc<PVec<T>>),
}
impl<T> PVec<T> {
pub fn empty() -> Self {
PVec::Nil
}
pub fn size(&self) -> usize {
match self {
PVec::Nil => 0,
PVec::Leaf(_) => 1,
PVec::Branch(l, r) => l.size() + r.size(),
}
}
/// Returns a reference to element at index `i`, or `None` if out of bounds.
pub fn get(&self, i: usize) -> Option<&T> {
match self {
PVec::Nil => None,
PVec::Leaf(x) => (i == 0).then_some(x),
PVec::Branch(l, r) => {
let ls = l.size();
if i < ls {
l.get(i)
} else {
r.get(i - ls)
}
}
}
}
}
impl<T: Clone> PVec<T> {
/// Returns a new `PVec` with index `i` replaced by `v`.
///
/// Shares unchanged subtrees via `Rc::clone` — only the path from root to the
/// modified leaf is newly allocated (O(log n) nodes). The original is unmodified.
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::Leaf(_) => (i == 0).then(|| PVec::Leaf(v)),
PVec::Branch(l, r) => {
let ls = l.size();
if i < ls {
l.set(i, v)
.map(|new_l| PVec::Branch(Rc::new(new_l), Rc::clone(r)))
} else {
r.set(i - ls, v)
.map(|new_r| PVec::Branch(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
/// Build a balanced `PVec` from a slice in O(n) time.
pub fn from_slice(items: &[T]) -> Self {
match items {
[] => PVec::Nil,
[x] => PVec::Leaf(x.clone()),
_ => {
let mid = items.len() / 2;
PVec::Branch(
Rc::new(Self::from_slice(&items[..mid])),
Rc::new(Self::from_slice(&items[mid..])),
)
}
}
}
/// Flatten the tree back to a `Vec<T>` in index order.
pub fn to_vec(&self) -> Vec<T> {
match self {
PVec::Nil => vec![],
PVec::Leaf(x) => vec![x.clone()],
PVec::Branch(l, r) => l.to_vec().into_iter().chain(r.to_vec()).collect(),
}
}
}
impl<T: Clone> Default for PVec<T> {
fn default() -> Self {
PVec::empty()
}
}
/// Box-based persistent vector — mirrors the OCaml style more directly.
///
/// No structural sharing: each `set` allocates the entire path from root to leaf
/// and clones the unchanged subtrees. Simpler, but O(n) per update in the worst case.
#[derive(Debug, Clone)]
pub enum PVecBox<T> {
Nil,
Leaf(T),
Branch(Box<PVecBox<T>>, Box<PVecBox<T>>),
}
impl<T> PVecBox<T> {
pub fn size(&self) -> usize {
match self {
PVecBox::Nil => 0,
PVecBox::Leaf(_) => 1,
PVecBox::Branch(l, r) => l.size() + r.size(),
}
}
pub fn get(&self, i: usize) -> Option<&T> {
match self {
PVecBox::Nil => None,
PVecBox::Leaf(x) => (i == 0).then_some(x),
PVecBox::Branch(l, r) => {
let ls = l.size();
if i < ls {
l.get(i)
} else {
r.get(i - ls)
}
}
}
}
}
impl<T: Clone> PVecBox<T> {
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVecBox::Nil => None,
PVecBox::Leaf(_) => (i == 0).then(|| PVecBox::Leaf(v)),
PVecBox::Branch(l, r) => {
let ls = l.size();
if i < ls {
// Right subtree cloned wholesale (no sharing)
l.set(i, v)
.map(|new_l| PVecBox::Branch(Box::new(new_l), r.clone()))
} else {
r.set(i - ls, v)
.map(|new_r| PVecBox::Branch(l.clone(), Box::new(new_r)))
}
}
}
}
pub fn from_slice(items: &[T]) -> Self {
match items {
[] => PVecBox::Nil,
[x] => PVecBox::Leaf(x.clone()),
_ => {
let mid = items.len() / 2;
PVecBox::Branch(
Box::new(Self::from_slice(&items[..mid])),
Box::new(Self::from_slice(&items[mid..])),
)
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
// --- PVec (Rc-based) tests ---
#[test]
fn test_empty_size_and_get() {
let v: PVec<i32> = PVec::empty();
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
}
#[test]
fn test_single_element() {
let v = PVec::from_slice(&[42]);
assert_eq!(v.size(), 1);
assert_eq!(v.get(0), Some(&42));
assert_eq!(v.get(1), None);
}
#[test]
fn test_from_slice_and_get() {
let v = PVec::from_slice(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(0), Some(&10));
assert_eq!(v.get(2), Some(&30));
assert_eq!(v.get(4), Some(&50));
assert_eq!(v.get(5), None);
}
#[test]
fn test_set_returns_new_version() {
let v1 = PVec::from_slice(&[10, 20, 30, 40, 50]);
let v2 = v1.set(2, 99).expect("valid index");
// v2 has the update
assert_eq!(v2.get(2), Some(&99));
// v1 is unchanged (persistence)
assert_eq!(v1.get(2), Some(&30));
// Other positions preserved in v2
assert_eq!(v2.get(0), Some(&10));
assert_eq!(v2.get(4), Some(&50));
}
#[test]
fn test_set_out_of_bounds_returns_none() {
let v = PVec::from_slice(&[1, 2, 3]);
assert!(v.set(3, 99).is_none());
assert!(v.set(10, 0).is_none());
}
#[test]
fn test_to_vec_round_trip() {
let input = vec![1, 2, 3, 4, 5, 6, 7, 8];
let pv = PVec::from_slice(&input);
assert_eq!(pv.to_vec(), input);
}
#[test]
fn test_multiple_persistent_versions() {
let v0 = PVec::from_slice(&[1, 2, 3]);
let v1 = v0.set(0, 10).unwrap();
let v2 = v1.set(1, 20).unwrap();
let v3 = v2.set(2, 30).unwrap();
assert_eq!(v0.to_vec(), vec![1, 2, 3]);
assert_eq!(v1.to_vec(), vec![10, 2, 3]);
assert_eq!(v2.to_vec(), vec![10, 20, 3]);
assert_eq!(v3.to_vec(), vec![10, 20, 30]);
}
// --- PVecBox (Box-based) tests ---
#[test]
fn test_box_empty() {
let v: PVecBox<i32> = PVecBox::Nil;
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
}
#[test]
fn test_box_from_slice_and_get() {
let v = PVecBox::from_slice(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(2), Some(&30));
}
#[test]
fn test_box_set_persistence() {
let v1 = PVecBox::from_slice(&[10, 20, 30]);
let v2 = v1.set(1, 99).expect("valid index");
assert_eq!(v2.get(1), Some(&99));
assert_eq!(v1.get(1), Some(&20)); // original unchanged
}
#[test]
fn test_box_set_out_of_bounds() {
let v = PVecBox::from_slice(&[1, 2]);
assert!(v.set(5, 0).is_none());
}
}#[cfg(test)]
mod tests {
use super::*;
// --- PVec (Rc-based) tests ---
#[test]
fn test_empty_size_and_get() {
let v: PVec<i32> = PVec::empty();
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
}
#[test]
fn test_single_element() {
let v = PVec::from_slice(&[42]);
assert_eq!(v.size(), 1);
assert_eq!(v.get(0), Some(&42));
assert_eq!(v.get(1), None);
}
#[test]
fn test_from_slice_and_get() {
let v = PVec::from_slice(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(0), Some(&10));
assert_eq!(v.get(2), Some(&30));
assert_eq!(v.get(4), Some(&50));
assert_eq!(v.get(5), None);
}
#[test]
fn test_set_returns_new_version() {
let v1 = PVec::from_slice(&[10, 20, 30, 40, 50]);
let v2 = v1.set(2, 99).expect("valid index");
// v2 has the update
assert_eq!(v2.get(2), Some(&99));
// v1 is unchanged (persistence)
assert_eq!(v1.get(2), Some(&30));
// Other positions preserved in v2
assert_eq!(v2.get(0), Some(&10));
assert_eq!(v2.get(4), Some(&50));
}
#[test]
fn test_set_out_of_bounds_returns_none() {
let v = PVec::from_slice(&[1, 2, 3]);
assert!(v.set(3, 99).is_none());
assert!(v.set(10, 0).is_none());
}
#[test]
fn test_to_vec_round_trip() {
let input = vec![1, 2, 3, 4, 5, 6, 7, 8];
let pv = PVec::from_slice(&input);
assert_eq!(pv.to_vec(), input);
}
#[test]
fn test_multiple_persistent_versions() {
let v0 = PVec::from_slice(&[1, 2, 3]);
let v1 = v0.set(0, 10).unwrap();
let v2 = v1.set(1, 20).unwrap();
let v3 = v2.set(2, 30).unwrap();
assert_eq!(v0.to_vec(), vec![1, 2, 3]);
assert_eq!(v1.to_vec(), vec![10, 2, 3]);
assert_eq!(v2.to_vec(), vec![10, 20, 3]);
assert_eq!(v3.to_vec(), vec![10, 20, 30]);
}
// --- PVecBox (Box-based) tests ---
#[test]
fn test_box_empty() {
let v: PVecBox<i32> = PVecBox::Nil;
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
}
#[test]
fn test_box_from_slice_and_get() {
let v = PVecBox::from_slice(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(2), Some(&30));
}
#[test]
fn test_box_set_persistence() {
let v1 = PVecBox::from_slice(&[10, 20, 30]);
let v2 = v1.set(1, 99).expect("valid index");
assert_eq!(v2.get(1), Some(&99));
assert_eq!(v1.get(1), Some(&20)); // original unchanged
}
#[test]
fn test_box_set_out_of_bounds() {
let v = PVecBox::from_slice(&[1, 2]);
assert!(v.set(5, 0).is_none());
}
}
Deep Comparison
OCaml vs Rust: Persistent Vector
Side-by-Side Code
OCaml
type 'a pvec = Nil | One of 'a | Two of 'a pvec * 'a pvec
let rec get i = function
| One x -> if i = 0 then x else failwith "index"
| Two (l, r) ->
let ls = size l in
if i < ls then get i l else get (i - ls) r
| Nil -> failwith "empty"
let rec set i v = function
| One _ -> if i = 0 then One v else failwith "index"
| Two (l, r) ->
let ls = size l in
if i < ls then Two (set i v l, r) (* right branch shared by GC *)
else Two (l, set (i - ls) v r) (* left branch shared by GC *)
| Nil -> failwith "empty"
Rust (idiomatic — explicit sharing with Rc)
use std::rc::Rc;
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
Leaf(T),
Branch(Rc<PVec<T>>, Rc<PVec<T>>),
}
impl<T: Clone> PVec<T> {
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::Leaf(_) => (i == 0).then(|| PVec::Leaf(v)),
PVec::Branch(l, r) => {
let ls = l.size();
if i < ls {
// Rc::clone(r) = pointer bump, not deep copy
l.set(i, v)
.map(|new_l| PVec::Branch(Rc::new(new_l), Rc::clone(r)))
} else {
r.set(i - ls, v)
.map(|new_r| PVec::Branch(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
}
Rust (Box-based — mirrors OCaml pattern, no structural sharing)
#[derive(Debug, Clone)]
pub enum PVecBox<T> {
Nil,
Leaf(T),
Branch(Box<PVecBox<T>>, Box<PVecBox<T>>),
}
impl<T: Clone> PVecBox<T> {
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVecBox::Nil => None,
PVecBox::Leaf(_) => (i == 0).then(|| PVecBox::Leaf(v)),
PVecBox::Branch(l, r) => {
let ls = l.size();
if i < ls {
// r.clone() = deep copy of entire right subtree
l.set(i, v)
.map(|new_l| PVecBox::Branch(Box::new(new_l), r.clone()))
} else {
r.set(i - ls, v)
.map(|new_r| PVecBox::Branch(l.clone(), Box::new(new_r)))
}
}
}
}
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| ADT definition | type 'a pvec = Nil \| One of 'a \| Two of 'a pvec * 'a pvec | enum PVec<T> { Nil, Leaf(T), Branch(Rc<PVec<T>>, Rc<PVec<T>>) } |
| Get signature | val get : int -> 'a pvec -> 'a | fn get(&self, i: usize) -> Option<&T> |
| Set signature | val set : int -> 'a -> 'a pvec -> 'a pvec | fn set(&self, i: usize, v: T) -> Option<Self> |
| Shared node | implicit (GC alias) | Rc<PVec<T>> |
| Error mode | failwith "index" (exception) | None (Option) |
Key Insights
Two (set i v l, r) is aliased to the original r. In Rust, Rc<T> makes this explicit: Rc::clone(r) is a reference-count bump, not a value copy.Box<T> vs Rc<T> for trees:** Box<T> expresses unique ownership — updating a Box-tree requires deep-cloning all untouched subtrees. Rc<T> expresses shared ownership — unchanged subtrees can be aliased safely across versions. Choosing Rc is what makes this data structure genuinely persistent.failwith). Rust makes errors part of the type: Option<T> forces the caller to handle the absent case at compile time, not at runtime.'a pvec is polymorphic with no constraints — set works for any element type. Rust's set and from_slice require T: Clone explicitly because they copy values into new nodes. Operations that only traverse the tree (get, size) need no bound.Box<T> or Rc<T>). Without the pointer indirection, the compiler cannot determine the size of PVec<T> on the stack.When to Use Each Style
**Use Rc-based PVec when:** you need genuine persistence — multiple versions of a data structure coexisting, e.g. implementing undo history, a persistent map, or a CRDT.
**Use Box-based PVecBox when:** you want the clearest structural analogy to OCaml for learning purposes, or you only ever need one version at a time and functional updates are infrequent.
**Use Vec::clone() when:** you need "copy-on-write" semantics but don't care about structural sharing — simplest code, fine for small vectors or infrequent updates.