Persistent Vector (Functional Array)
Tutorial
The Problem
Implement a persistent (immutable) vector using a balanced binary tree where get retrieves an element by index and set returns a new version of the vector with one element replaced, leaving the original completely unchanged.
🎯 Learning Outcomes
Rc<T> enables structural sharing β shared subtrees avoid deep copies on updateset operationOption instead of panicking (failwith) for safe out-of-bounds handling🦀 The Rust Way
Rust uses Rc<T> (reference-counted pointer) to achieve the same structural sharing without a GC. Rc::clone increments the reference count in O(1) β no deep copy occurs. Out-of-bounds returns Option<T> rather than panicking. The set method takes &self (shared borrow) and returns an owned Option<Self>, reflecting the persistent semantics.
Code Example
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
One(T),
Two(Rc<PVec<T>>, Rc<PVec<T>>),
}
impl<T> PVec<T> {
pub fn get(&self, i: usize) -> Option<&T> {
match self {
PVec::Nil => None,
PVec::One(x) => (i == 0).then_some(x),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls { l.get(i) } else { r.get(i - ls) }
}
}
}
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::One(_) => (i == 0).then(|| PVec::One(v)),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls {
let new_l = l.set(i, v)?;
Some(PVec::Two(Rc::new(new_l), Rc::clone(r)))
} else {
let new_r = r.set(i - ls, v)?;
Some(PVec::Two(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
}Key Differences
Rc<T> for explicit reference counting with deterministic drop.failwith); Rust returns Option<T> for safe, composable error propagation.Rc::clone(r) makes the sharing explicit β you can see exactly what is shared.get i v (index before collection); Rust idiom uses method syntax v.get(i), though free-function style mirrors OCaml.OCaml Approach
OCaml's garbage collector and immutable-by-default values make persistent structures natural: set constructs new Two nodes along the path to the modified leaf, while unchanged subtrees are automatically shared (the GC tracks all references). The failwith calls handle errors by raising exceptions.
Full Source
use std::rc::Rc;
/// Persistent vector using a balanced binary tree.
///
/// `Rc` enables structural sharing: `set` creates only O(log n) new nodes,
/// sharing the unchanged subtrees with the original version. The original
/// is never mutated β both old and new versions coexist.
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
One(T),
/// Internal node: left subtree and right subtree, shared via Rc.
Two(Rc<PVec<T>>, Rc<PVec<T>>),
}
// --- Idiomatic Rust: method-based API ---
impl<T> PVec<T> {
/// Number of elements in the persistent vector.
pub fn size(&self) -> usize {
match self {
PVec::Nil => 0,
PVec::One(_) => 1,
PVec::Two(l, r) => l.size() + r.size(),
}
}
/// Get element at index i, or None if out of bounds.
pub fn get(&self, i: usize) -> Option<&T> {
match self {
PVec::Nil => None,
PVec::One(x) => (i == 0).then_some(x),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls {
l.get(i)
} else {
r.get(i - ls)
}
}
}
}
/// Return a new persistent vector with element at index i replaced by v.
///
/// The original remains unmodified. Only the O(log n) path from root to
/// the modified leaf is newly allocated; all other subtrees are shared
/// via `Rc::clone` (reference-count increment only, no deep copy).
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::One(_) => (i == 0).then(|| PVec::One(v)),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls {
let new_l = l.set(i, v)?;
// Rc::clone shares r β O(1), no deep copy
Some(PVec::Two(Rc::new(new_l), Rc::clone(r)))
} else {
let new_r = r.set(i - ls, v)?;
Some(PVec::Two(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
}
impl<T: Clone> PVec<T> {
/// Build a balanced persistent vector from a slice. O(n) time and space.
pub fn of_list(lst: &[T]) -> Self {
match lst {
[] => PVec::Nil,
[x] => PVec::One(x.clone()),
lst => {
let mid = lst.len() / 2;
PVec::Two(
Rc::new(Self::of_list(&lst[..mid])),
Rc::new(Self::of_list(&lst[mid..])),
)
}
}
}
}
// --- Functional style: free functions mirroring OCaml argument order ---
// OCaml convention: index before collection (get i v, set i val v)
/// Size as a free function β mirrors OCaml's `size v`.
pub fn pvec_size<T>(v: &PVec<T>) -> usize {
match v {
PVec::Nil => 0,
PVec::One(_) => 1,
PVec::Two(l, r) => pvec_size(l) + pvec_size(r),
}
}
/// Get as a free function β `pvec_get(i, v)` mirrors OCaml's `get i v`.
pub fn pvec_get<T>(i: usize, v: &PVec<T>) -> Option<&T> {
match v {
PVec::Nil => None,
PVec::One(x) => (i == 0).then_some(x),
PVec::Two(l, r) => {
let ls = pvec_size(l);
if i < ls {
pvec_get(i, l)
} else {
pvec_get(i - ls, r)
}
}
}
}
/// Set as a free function β `pvec_set(i, val, v)` mirrors OCaml's `set i v pvec`.
/// Returns a new persistent vector; the original is unchanged.
pub fn pvec_set<T>(i: usize, val: T, v: &PVec<T>) -> Option<PVec<T>> {
match v {
PVec::Nil => None,
PVec::One(_) => (i == 0).then(|| PVec::One(val)),
PVec::Two(l, r) => {
let ls = pvec_size(l);
if i < ls {
let new_l = pvec_set(i, val, l)?;
Some(PVec::Two(Rc::new(new_l), Rc::clone(r)))
} else {
let new_r = pvec_set(i - ls, val, r)?;
Some(PVec::Two(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_empty() {
let v: PVec<i32> = PVec::Nil;
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
assert!(v.set(0, 1).is_none());
}
#[test]
fn test_single_element() {
let v = PVec::of_list(&[42]);
assert_eq!(v.size(), 1);
assert_eq!(v.get(0), Some(&42));
assert_eq!(v.get(1), None);
let v2 = v.set(0, 99).unwrap();
assert_eq!(v2.get(0), Some(&99));
assert_eq!(v.get(0), Some(&42)); // original unchanged
}
#[test]
fn test_get_multiple_elements() {
let v = PVec::of_list(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(0), Some(&10));
assert_eq!(v.get(2), Some(&30));
assert_eq!(v.get(4), Some(&50));
assert_eq!(v.get(5), None);
}
#[test]
fn test_set_is_persistent() {
let v1 = PVec::of_list(&[10, 20, 30, 40, 50]);
let v2 = v1.set(2, 99).unwrap();
// v1 is unchanged β structural sharing preserved original
assert_eq!(v1.get(2), Some(&30));
// v2 reflects the update
assert_eq!(v2.get(2), Some(&99));
// shared elements are accessible in v2
assert_eq!(v2.get(0), Some(&10));
assert_eq!(v2.get(4), Some(&50));
}
#[test]
fn test_set_out_of_bounds_returns_none() {
let v = PVec::of_list(&[1, 2, 3]);
assert!(v.set(5, 99).is_none());
assert!(PVec::<i32>::Nil.set(0, 99).is_none());
}
#[test]
fn test_set_first_and_last() {
let v = PVec::of_list(&[1, 2, 3, 4, 5]);
let v_first = v.set(0, 100).unwrap();
let v_last = v.set(4, 500).unwrap();
assert_eq!(v_first.get(0), Some(&100));
assert_eq!(v_first.get(4), Some(&5));
assert_eq!(v_last.get(0), Some(&1));
assert_eq!(v_last.get(4), Some(&500));
assert_eq!(v.get(0), Some(&1)); // original still intact
}
#[test]
fn test_functional_free_functions() {
let v = PVec::of_list(&[10, 20, 30, 40, 50]);
assert_eq!(pvec_size(&v), 5);
assert_eq!(pvec_get(2, &v), Some(&30));
let v2 = pvec_set(2, 99, &v).unwrap();
assert_eq!(pvec_get(2, &v), Some(&30)); // original unchanged
assert_eq!(pvec_get(2, &v2), Some(&99));
}
#[test]
fn test_chained_updates() {
let v0 = PVec::of_list(&[1, 2, 3, 4, 5]);
let v1 = v0.set(1, 20).unwrap();
let v2 = v1.set(3, 40).unwrap();
// v0 untouched
assert_eq!(v0.get(1), Some(&2));
assert_eq!(v0.get(3), Some(&4));
// v1 has first update
assert_eq!(v1.get(1), Some(&20));
assert_eq!(v1.get(3), Some(&4));
// v2 has both updates
assert_eq!(v2.get(1), Some(&20));
assert_eq!(v2.get(3), Some(&40));
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_empty() {
let v: PVec<i32> = PVec::Nil;
assert_eq!(v.size(), 0);
assert_eq!(v.get(0), None);
assert!(v.set(0, 1).is_none());
}
#[test]
fn test_single_element() {
let v = PVec::of_list(&[42]);
assert_eq!(v.size(), 1);
assert_eq!(v.get(0), Some(&42));
assert_eq!(v.get(1), None);
let v2 = v.set(0, 99).unwrap();
assert_eq!(v2.get(0), Some(&99));
assert_eq!(v.get(0), Some(&42)); // original unchanged
}
#[test]
fn test_get_multiple_elements() {
let v = PVec::of_list(&[10, 20, 30, 40, 50]);
assert_eq!(v.size(), 5);
assert_eq!(v.get(0), Some(&10));
assert_eq!(v.get(2), Some(&30));
assert_eq!(v.get(4), Some(&50));
assert_eq!(v.get(5), None);
}
#[test]
fn test_set_is_persistent() {
let v1 = PVec::of_list(&[10, 20, 30, 40, 50]);
let v2 = v1.set(2, 99).unwrap();
// v1 is unchanged β structural sharing preserved original
assert_eq!(v1.get(2), Some(&30));
// v2 reflects the update
assert_eq!(v2.get(2), Some(&99));
// shared elements are accessible in v2
assert_eq!(v2.get(0), Some(&10));
assert_eq!(v2.get(4), Some(&50));
}
#[test]
fn test_set_out_of_bounds_returns_none() {
let v = PVec::of_list(&[1, 2, 3]);
assert!(v.set(5, 99).is_none());
assert!(PVec::<i32>::Nil.set(0, 99).is_none());
}
#[test]
fn test_set_first_and_last() {
let v = PVec::of_list(&[1, 2, 3, 4, 5]);
let v_first = v.set(0, 100).unwrap();
let v_last = v.set(4, 500).unwrap();
assert_eq!(v_first.get(0), Some(&100));
assert_eq!(v_first.get(4), Some(&5));
assert_eq!(v_last.get(0), Some(&1));
assert_eq!(v_last.get(4), Some(&500));
assert_eq!(v.get(0), Some(&1)); // original still intact
}
#[test]
fn test_functional_free_functions() {
let v = PVec::of_list(&[10, 20, 30, 40, 50]);
assert_eq!(pvec_size(&v), 5);
assert_eq!(pvec_get(2, &v), Some(&30));
let v2 = pvec_set(2, 99, &v).unwrap();
assert_eq!(pvec_get(2, &v), Some(&30)); // original unchanged
assert_eq!(pvec_get(2, &v2), Some(&99));
}
#[test]
fn test_chained_updates() {
let v0 = PVec::of_list(&[1, 2, 3, 4, 5]);
let v1 = v0.set(1, 20).unwrap();
let v2 = v1.set(3, 40).unwrap();
// v0 untouched
assert_eq!(v0.get(1), Some(&2));
assert_eq!(v0.get(3), Some(&4));
// v1 has first update
assert_eq!(v1.get(1), Some(&20));
assert_eq!(v1.get(3), Some(&4));
// v2 has both updates
assert_eq!(v2.get(1), Some(&20));
assert_eq!(v2.get(3), Some(&40));
}
}
Deep Comparison
OCaml vs Rust: Persistent Vector (Functional Array)
Side-by-Side Code
OCaml
type 'a pvec = Nil | One of 'a | Two of 'a pvec * 'a pvec
let rec get i = function
| One x -> if i = 0 then x else failwith "index"
| Two (l, r) ->
let ls = size l in
if i < ls then get i l else get (i - ls) r
| Nil -> failwith "empty"
let rec set i v = function
| One _ -> if i = 0 then One v else failwith "index"
| Two (l, r) ->
let ls = size l in
if i < ls then Two (set i v l, r)
else Two (l, set (i - ls) v r)
| Nil -> failwith "empty"
Rust (idiomatic β method API)
#[derive(Debug, Clone)]
pub enum PVec<T> {
Nil,
One(T),
Two(Rc<PVec<T>>, Rc<PVec<T>>),
}
impl<T> PVec<T> {
pub fn get(&self, i: usize) -> Option<&T> {
match self {
PVec::Nil => None,
PVec::One(x) => (i == 0).then_some(x),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls { l.get(i) } else { r.get(i - ls) }
}
}
}
pub fn set(&self, i: usize, v: T) -> Option<Self> {
match self {
PVec::Nil => None,
PVec::One(_) => (i == 0).then(|| PVec::One(v)),
PVec::Two(l, r) => {
let ls = l.size();
if i < ls {
let new_l = l.set(i, v)?;
Some(PVec::Two(Rc::new(new_l), Rc::clone(r)))
} else {
let new_r = r.set(i - ls, v)?;
Some(PVec::Two(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
}
Rust (functional style β free functions mirroring OCaml argument order)
pub fn pvec_get<T>(i: usize, v: &PVec<T>) -> Option<&T> {
match v {
PVec::Nil => None,
PVec::One(x) => (i == 0).then_some(x),
PVec::Two(l, r) => {
let ls = pvec_size(l);
if i < ls { pvec_get(i, l) } else { pvec_get(i - ls, r) }
}
}
}
pub fn pvec_set<T>(i: usize, val: T, v: &PVec<T>) -> Option<PVec<T>> {
match v {
PVec::Nil => None,
PVec::One(_) => (i == 0).then(|| PVec::One(val)),
PVec::Two(l, r) => {
let ls = pvec_size(l);
if i < ls {
let new_l = pvec_set(i, val, l)?;
Some(PVec::Two(Rc::new(new_l), Rc::clone(r)))
} else {
let new_r = pvec_set(i - ls, val, r)?;
Some(PVec::Two(Rc::clone(l), Rc::new(new_r)))
}
}
}
}
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| ADT definition | type 'a pvec = Nil \| One of 'a \| Two of 'a pvec * 'a pvec | enum PVec<T> { Nil, One(T), Two(Rc<PVec<T>>, Rc<PVec<T>>) } |
| Get signature | val get : int -> 'a pvec -> 'a | fn get(&self, i: usize) -> Option<&T> |
| Set signature | val set : int -> 'a -> 'a pvec -> 'a pvec | fn set(&self, i: usize, v: T) -> Option<Self> |
| Error handling | failwith "index" (exception) | None (Option) |
| Shared subtree | Implicit via GC | Rc::clone(r) β explicit, O(1) |
Key Insights
Rc::clone(r) at every sharing point, making the O(log n) allocation budget visible in the code itself.Rc replaces the GC for this pattern.** Reference counting gives the same persistent-structure semantics as a tracing GC: a subtree lives as long as any version references it, and is freed deterministically when the last reference drops.Option replaces exceptions.** OCaml's failwith "index" raises an exception that crosses stack frames; Rust's Option with the ? operator propagates the failure value up the call chain without unwinding β composable, type-safe, zero-cost.usize vs int.** OCaml uses signed int for indices (so negative indices are representable, though not useful here); Rust uses usize β unsigned, platform-width β making the precondition i >= 0 enforced by the type system with no runtime check needed.v.get(i) (method syntax); the OCaml convention is get i v (index before collection). Both styles are provided: the method API for ergonomics and the free-function API for a direct structural translation from OCaml.When to Use Each Style
Use idiomatic Rust (method API) when: writing new Rust code, composing with other Rust types, or when method chaining (v.set(2, 99)?.set(3, 42)) improves readability.
Use functional free-function style when: translating OCaml algorithms directly, keeping argument order identical to the source for comparison, or teaching the OCamlβRust translation explicitly.