๐Ÿฆ€ Functional Rust

178: Length-Indexed Lists

Difficulty: โญโญโญ Level: Advanced Track a list's length in its type so that `head` on an empty list and `zip` on mismatched lengths are compile-time errors, not runtime panics.

The Problem This Solves

`Vec::first()` returns `Option<T>` because the compiler doesn't know if the vec is empty. This is honest but verbose: you unwrap at every call site, scatter `expect("this can't be empty")` through your code, and trust that the invariant actually holds. If you're wrong, you panic at runtime. The type system shrugs. The deeper issue arises with operations like `zip`. If you zip two lists of different lengths, what do you get? Different languages make different choices (truncate, panic, error), but all of them make the choice at runtime. A length-indexed list catches the mismatch at compile time โ€” the type `[T; 3]` can only zip with another `[U; 3]`, full stop. OCaml's GADT approach uses Peano-encoded naturals at the type level: `type zero`, `type 'n succ`, and `('a, 'n succ) vec` is a list with at least one element. Rust's const generics offer a cleaner modern alternative: `Vec2<T, const N: usize>` carries the length directly in the type parameter. Either way, the length becomes part of the type, and the compiler enforces it.

The Intuition

Think of differently-sized bags. A `Bag<3>` holds exactly 3 items. You can zip `Bag<3>` with another `Bag<3>` โ€” both have the same type parameter. You can't zip `Bag<3>` with `Bag<5>` โ€” the compiler sees different types. And `head` works on any `Bag<N>` where the compiler can prove `N โ‰ฅ 1`, which you encode via trait bounds or const constraints. Rust const generics make this ergonomic: `struct Vec2<T, const N: usize>` is a real compile-time feature. The length `N` is part of the type signature, visible in documentation, and checked by the compiler. The Peano approach (`Zero`, `Succ<N>`) is more academic but mirrors OCaml exactly โ€” useful for understanding the theory.

How It Works in Rust

// Approach 1: const generics โ€” most ergonomic in modern Rust
#[derive(Debug, Clone)]
struct Vec2<T, const N: usize> {
 data: [T; N],
}

// zip only compiles when BOTH arrays have the same length N
fn zip_vec<T: Copy, U: Copy, const N: usize>(
 a: &Vec2<T, N>,
 b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
 let mut result = [(a.data[0], b.data[0]); N]; // simplified
 for i in 0..N { result[i] = (a.data[i], b.data[i]); }
 Vec2 { data: result }
}

// Usage:
let xs = Vec2 { data: [1, 2, 3] };      // Vec2<i32, 3>
let ys = Vec2 { data: [4, 5, 6] };      // Vec2<i32, 3>
let zipped = zip_vec(&xs, &ys);          // Vec2<(i32,i32), 3> โœ“

// This fails to compile โ€” different lengths:
// let ys2 = Vec2 { data: [4, 5] };     // Vec2<i32, 2>
// zip_vec(&xs, &ys2);                  // N=3 โ‰  N=2

// Approach 2: Peano types โ€” mirrors OCaml GADTs
struct Zero;
struct Succ<N>(std::marker::PhantomData<N>);

enum TypeVec<T, N> {
 VNil,                                // only valid when N = Zero
 VCons(T, Box<TypeVec<T, N>>),        // only valid when N = Succ<M>
}
// OCaml: ('a, zero) vec = VNil | ('a, 'n succ) vec = VCons
The const generic approach is what you'd use in practice. The Peano approach demonstrates the theoretical connection to type-level natural number arithmetic.

What This Unlocks

Key Differences

ConceptOCamlRust
Length encodingPeano types: `type zero`, `type 'n succ`Const generics: `const N: usize` (or manual Peano structs)
Non-empty guarantee`(a, 'n succ) vec` โ€” type-level proof of โ‰ฅ1 element`const N: usize` alone can't express `N > 0`; use `NonEmptyVec<T>` pattern instead
Zip`('a, 'n) vec -> ('b, 'n) vec` โ€” same `'n` forces equal length`Vec2<T, N>` and `Vec2<U, N>` โ€” same `N` enforces equal length
ErgonomicsNatural with GADTs but verbose PeanoConst generics are clean; Peano is verbose in Rust
Arithmetic on lengthsType-level addition possible with GADTsConst generics support `const { N + M }` in nightly; stable is limited
// Example 178: Length-Indexed Lists
// Rust uses const generics to track length at compile time

// === Approach 1: Const generic array wrapper ===

#[derive(Debug, Clone)]
struct Vec2<T, const N: usize> {
    data: [T; N],
}

impl<T: Default + Copy, const N: usize> Vec2<T, N> {
    fn replicate(val: T) -> Self {
        Vec2 { data: [val; N] }
    }
}

impl<T: Copy, const N: usize> Vec2<T, N> {
    fn head(&self) -> T {
        assert!(N > 0, "Cannot take head of empty Vec2");
        self.data[0]
    }

    fn map<U: Default + Copy>(&self, f: impl Fn(T) -> U) -> Vec2<U, N> {
        let mut result = [U::default(); N];
        for i in 0..N {
            result[i] = f(self.data[i]);
        }
        Vec2 { data: result }
    }
}

fn zip_vec<T: Copy + Default, U: Copy + Default, const N: usize>(
    a: &Vec2<T, N>,
    b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
    let mut result = [(T::default(), U::default()); N];
    for i in 0..N {
        result[i] = (a.data[i], b.data[i]);
    }
    Vec2 { data: result }
}

// === Approach 2: Type-level Peano numbers with nested tuples ===

trait Nat {}
struct Zero;
struct Succ<N: Nat>(std::marker::PhantomData<N>);
impl Nat for Zero {}
impl<N: Nat> Nat for Succ<N> {}

trait TypeVec<T> {
    fn to_vec(&self) -> std::vec::Vec<T> where T: Clone;
}

#[derive(Debug)]
struct VNil;

#[derive(Debug)]
struct VCons<T, N: Nat, Rest: TypeVec<T>>(T, Rest, std::marker::PhantomData<N>);

impl<T: Clone> TypeVec<T> for VNil {
    fn to_vec(&self) -> std::vec::Vec<T> { vec![] }
}

impl<T: Clone, N: Nat, Rest: TypeVec<T>> TypeVec<T> for VCons<T, N, Rest> {
    fn to_vec(&self) -> std::vec::Vec<T> {
        let mut v = vec![self.0.clone()];
        v.extend(self.1.to_vec());
        v
    }
}

impl<T, N: Nat, Rest: TypeVec<T>> VCons<T, N, Rest> {
    fn head(&self) -> &T { &self.0 }
    fn tail(&self) -> &Rest { &self.1 }
}

fn vnil() -> VNil { VNil }

fn vcons<T, N: Nat, R: TypeVec<T>>(x: T, rest: R) -> VCons<T, Succ<N>, R>
where R: TypeVec<T>
{
    VCons(x, rest, std::marker::PhantomData)
}

// === Approach 3: Recursive type with compile-time length encoding ===

trait SizedList {
    type Elem;
    const LEN: usize;
    fn to_vec(&self) -> std::vec::Vec<Self::Elem> where Self::Elem: Clone;
}

struct LNil<T>(std::marker::PhantomData<T>);
struct LCons<T, Tail: SizedList<Elem = T>>(T, Tail);

impl<T> SizedList for LNil<T> {
    type Elem = T;
    const LEN: usize = 0;
    fn to_vec(&self) -> std::vec::Vec<T> where T: Clone { vec![] }
}

impl<T, Tail: SizedList<Elem = T>> SizedList for LCons<T, Tail> {
    type Elem = T;
    const LEN: usize = 1 + Tail::LEN;
    fn to_vec(&self) -> std::vec::Vec<T> where T: Clone {
        let mut v = vec![self.0.clone()];
        v.extend(self.1.to_vec());
        v
    }
}

impl<T, Tail: SizedList<Elem = T>> LCons<T, Tail> {
    fn head(&self) -> &T { &self.0 }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_const_generic_head() {
        let v = Vec2 { data: [1, 2, 3] };
        assert_eq!(v.head(), 1);
    }

    #[test]
    fn test_const_generic_map() {
        let v = Vec2 { data: [1, 2, 3] };
        let d = v.map(|x| x * 2);
        assert_eq!(d.data, [2, 4, 6]);
    }

    #[test]
    fn test_const_generic_zip() {
        let a = Vec2 { data: [1, 2] };
        let b = Vec2 { data: [10, 20] };
        let z = zip_vec(&a, &b);
        assert_eq!(z.data, [(1, 10), (2, 20)]);
    }

    #[test]
    fn test_const_generic_replicate() {
        let v: Vec2<i32, 3> = Vec2::replicate(42);
        assert_eq!(v.data, [42, 42, 42]);
    }

    #[test]
    fn test_peano_vec() {
        let inner = vcons::<i32, Zero, VNil>(3, vnil());
        let mid = vcons::<i32, Succ<Zero>, _>(2, inner);
        let pv = vcons::<i32, Succ<Succ<Zero>>, _>(1, mid);
        assert_eq!(*pv.head(), 1);
        assert_eq!(pv.to_vec(), vec![1, 2, 3]);
    }

    #[test]
    fn test_recursive_list() {
        let l = LCons(10, LCons(20, LNil(std::marker::PhantomData)));
        assert_eq!(*l.head(), 10);
        assert_eq!(l.to_vec(), vec![10, 20]);
        assert_eq!(<LCons<i32, LCons<i32, LNil<i32>>> as SizedList>::LEN, 2);
    }
}
(* Example 178: Length-Indexed Lists (Vectors) *)
(* Lists whose length is tracked at the type level *)

(* Approach 1: Peano-encoded length-indexed vector *)
type zero = Zero
type 'n succ = Succ

type ('a, 'n) vec =
  | Nil  : ('a, zero) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec

let head : type a n. (a, n succ) vec -> a = function
  | Cons (x, _) -> x
  (* Nil case is impossible โ€” the type prevents it! *)

let tail : type a n. (a, n succ) vec -> (a, n) vec = function
  | Cons (_, xs) -> xs

let rec to_list : type a n. (a, n) vec -> a list = function
  | Nil -> []
  | Cons (x, xs) -> x :: to_list xs

let rec map : type a b n. (a -> b) -> (a, n) vec -> (b, n) vec =
  fun f -> function
    | Nil -> Nil
    | Cons (x, xs) -> Cons (f x, map f xs)

let rec append : type a m n. (a, m) vec -> (a, n) vec -> a list =
  fun v1 v2 -> match v1 with
    | Nil -> to_list v2
    | Cons (x, xs) -> x :: append xs v2

(* Approach 2: Using GADTs for safe zip *)
let rec zip : type a b n. (a, n) vec -> (b, n) vec -> (a * b, n) vec =
  fun v1 v2 -> match v1, v2 with
    | Nil, Nil -> Nil
    | Cons (x, xs), Cons (y, ys) -> Cons ((x, y), zip xs ys)

(* Approach 3: Length witness for runtime operations *)
type _ length =
  | LZ : zero length
  | LS : 'n length -> 'n succ length

let rec replicate : type a n. n length -> a -> (a, n) vec =
  fun n x -> match n with
    | LZ -> Nil
    | LS n' -> Cons (x, replicate n' x)

let () =
  let v1 = Cons (1, Cons (2, Cons (3, Nil))) in
  let v2 = Cons (10, Cons (20, Cons (30, Nil))) in

  assert (head v1 = 1);
  assert (to_list (tail v1) = [2; 3]);
  assert (to_list (map (fun x -> x * 2) v1) = [2; 4; 6]);
  assert (to_list (zip v1 v2) = [(1,10); (2,20); (3,30)]);

  let v3 = replicate (LS (LS (LS LZ))) 42 in
  assert (to_list v3 = [42; 42; 42]);

  print_endline "โœ“ All tests passed"

๐Ÿ“Š Detailed Comparison

Comparison: Example 178 โ€” Length-Indexed Lists

Type Definition

OCaml

๐Ÿช Show OCaml equivalent
type zero = Zero
type 'n succ = Succ

type ('a, 'n) vec =
| Nil  : ('a, zero) vec
| Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec

Rust (const generics)

struct Vec2<T, const N: usize> {
 data: [T; N],
}

Rust (Peano)

struct VNil;
struct VCons<T, N: Nat, Rest: TypeVec<T>>(T, Rest, PhantomData<N>);

Safe Head

OCaml

๐Ÿช Show OCaml equivalent
let head : type a n. (a, n succ) vec -> a = function
| Cons (x, _) -> x

Rust

impl<T: Copy, const N: usize> Vec2<T, N> {
 fn head(&self) -> T { self.data[0] }
}

Zip (same length guaranteed)

OCaml

๐Ÿช Show OCaml equivalent
let rec zip : type a b n. (a, n) vec -> (b, n) vec -> (a * b, n) vec =
fun v1 v2 -> match v1, v2 with
 | Nil, Nil -> Nil
 | Cons (x, xs), Cons (y, ys) -> Cons ((x, y), zip xs ys)

Rust

fn zip_vec<T: Copy + Default, U: Copy + Default, const N: usize>(
 a: &Vec2<T, N>, b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
 let mut result = [(T::default(), U::default()); N];
 for i in 0..N { result[i] = (a.data[i], b.data[i]); }
 Vec2 { data: result }
}

Replicate

OCaml

๐Ÿช Show OCaml equivalent
let rec replicate : type a n. n length -> a -> (a, n) vec =
fun n x -> match n with
 | LZ -> Nil
 | LS n' -> Cons (x, replicate n' x)

Rust

impl<T: Default + Copy, const N: usize> Vec2<T, N> {
 fn replicate(val: T) -> Self {
     Vec2 { data: [val; N] }
 }
}