178: Length-Indexed Lists
Difficulty: โญโญโญ Level: Advanced Track a list's length in its type so that `head` on an empty list and `zip` on mismatched lengths are compile-time errors, not runtime panics.The Problem This Solves
`Vec::first()` returns `Option<T>` because the compiler doesn't know if the vec is empty. This is honest but verbose: you unwrap at every call site, scatter `expect("this can't be empty")` through your code, and trust that the invariant actually holds. If you're wrong, you panic at runtime. The type system shrugs. The deeper issue arises with operations like `zip`. If you zip two lists of different lengths, what do you get? Different languages make different choices (truncate, panic, error), but all of them make the choice at runtime. A length-indexed list catches the mismatch at compile time โ the type `[T; 3]` can only zip with another `[U; 3]`, full stop. OCaml's GADT approach uses Peano-encoded naturals at the type level: `type zero`, `type 'n succ`, and `('a, 'n succ) vec` is a list with at least one element. Rust's const generics offer a cleaner modern alternative: `Vec2<T, const N: usize>` carries the length directly in the type parameter. Either way, the length becomes part of the type, and the compiler enforces it.The Intuition
Think of differently-sized bags. A `Bag<3>` holds exactly 3 items. You can zip `Bag<3>` with another `Bag<3>` โ both have the same type parameter. You can't zip `Bag<3>` with `Bag<5>` โ the compiler sees different types. And `head` works on any `Bag<N>` where the compiler can prove `N โฅ 1`, which you encode via trait bounds or const constraints. Rust const generics make this ergonomic: `struct Vec2<T, const N: usize>` is a real compile-time feature. The length `N` is part of the type signature, visible in documentation, and checked by the compiler. The Peano approach (`Zero`, `Succ<N>`) is more academic but mirrors OCaml exactly โ useful for understanding the theory.How It Works in Rust
// Approach 1: const generics โ most ergonomic in modern Rust
#[derive(Debug, Clone)]
struct Vec2<T, const N: usize> {
data: [T; N],
}
// zip only compiles when BOTH arrays have the same length N
fn zip_vec<T: Copy, U: Copy, const N: usize>(
a: &Vec2<T, N>,
b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
let mut result = [(a.data[0], b.data[0]); N]; // simplified
for i in 0..N { result[i] = (a.data[i], b.data[i]); }
Vec2 { data: result }
}
// Usage:
let xs = Vec2 { data: [1, 2, 3] }; // Vec2<i32, 3>
let ys = Vec2 { data: [4, 5, 6] }; // Vec2<i32, 3>
let zipped = zip_vec(&xs, &ys); // Vec2<(i32,i32), 3> โ
// This fails to compile โ different lengths:
// let ys2 = Vec2 { data: [4, 5] }; // Vec2<i32, 2>
// zip_vec(&xs, &ys2); // N=3 โ N=2
// Approach 2: Peano types โ mirrors OCaml GADTs
struct Zero;
struct Succ<N>(std::marker::PhantomData<N>);
enum TypeVec<T, N> {
VNil, // only valid when N = Zero
VCons(T, Box<TypeVec<T, N>>), // only valid when N = Succ<M>
}
// OCaml: ('a, zero) vec = VNil | ('a, 'n succ) vec = VCons
The const generic approach is what you'd use in practice. The Peano approach demonstrates the theoretical connection to type-level natural number arithmetic.
What This Unlocks
- Matrix operations โ `Matrix<T, R, C>` where matrix multiplication `Matrix<T, M, N> ร Matrix<T, N, P> -> Matrix<T, M, P>` checks dimension compatibility at compile time.
- Fixed-size protocol buffers โ network messages with a guaranteed field count; serialization/deserialization functions can be written without bounds checks.
- SIMD/vectorized code โ wrapping aligned chunks of a known fixed size, ensuring operations only apply to correctly-sized vectors.
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Length encoding | Peano types: `type zero`, `type 'n succ` | Const generics: `const N: usize` (or manual Peano structs) |
| Non-empty guarantee | `(a, 'n succ) vec` โ type-level proof of โฅ1 element | `const N: usize` alone can't express `N > 0`; use `NonEmptyVec<T>` pattern instead |
| Zip | `('a, 'n) vec -> ('b, 'n) vec` โ same `'n` forces equal length | `Vec2<T, N>` and `Vec2<U, N>` โ same `N` enforces equal length |
| Ergonomics | Natural with GADTs but verbose Peano | Const generics are clean; Peano is verbose in Rust |
| Arithmetic on lengths | Type-level addition possible with GADTs | Const generics support `const { N + M }` in nightly; stable is limited |
// Example 178: Length-Indexed Lists
// Rust uses const generics to track length at compile time
// === Approach 1: Const generic array wrapper ===
#[derive(Debug, Clone)]
struct Vec2<T, const N: usize> {
data: [T; N],
}
impl<T: Default + Copy, const N: usize> Vec2<T, N> {
fn replicate(val: T) -> Self {
Vec2 { data: [val; N] }
}
}
impl<T: Copy, const N: usize> Vec2<T, N> {
fn head(&self) -> T {
assert!(N > 0, "Cannot take head of empty Vec2");
self.data[0]
}
fn map<U: Default + Copy>(&self, f: impl Fn(T) -> U) -> Vec2<U, N> {
let mut result = [U::default(); N];
for i in 0..N {
result[i] = f(self.data[i]);
}
Vec2 { data: result }
}
}
fn zip_vec<T: Copy + Default, U: Copy + Default, const N: usize>(
a: &Vec2<T, N>,
b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
let mut result = [(T::default(), U::default()); N];
for i in 0..N {
result[i] = (a.data[i], b.data[i]);
}
Vec2 { data: result }
}
// === Approach 2: Type-level Peano numbers with nested tuples ===
trait Nat {}
struct Zero;
struct Succ<N: Nat>(std::marker::PhantomData<N>);
impl Nat for Zero {}
impl<N: Nat> Nat for Succ<N> {}
trait TypeVec<T> {
fn to_vec(&self) -> std::vec::Vec<T> where T: Clone;
}
#[derive(Debug)]
struct VNil;
#[derive(Debug)]
struct VCons<T, N: Nat, Rest: TypeVec<T>>(T, Rest, std::marker::PhantomData<N>);
impl<T: Clone> TypeVec<T> for VNil {
fn to_vec(&self) -> std::vec::Vec<T> { vec![] }
}
impl<T: Clone, N: Nat, Rest: TypeVec<T>> TypeVec<T> for VCons<T, N, Rest> {
fn to_vec(&self) -> std::vec::Vec<T> {
let mut v = vec![self.0.clone()];
v.extend(self.1.to_vec());
v
}
}
impl<T, N: Nat, Rest: TypeVec<T>> VCons<T, N, Rest> {
fn head(&self) -> &T { &self.0 }
fn tail(&self) -> &Rest { &self.1 }
}
fn vnil() -> VNil { VNil }
fn vcons<T, N: Nat, R: TypeVec<T>>(x: T, rest: R) -> VCons<T, Succ<N>, R>
where R: TypeVec<T>
{
VCons(x, rest, std::marker::PhantomData)
}
// === Approach 3: Recursive type with compile-time length encoding ===
trait SizedList {
type Elem;
const LEN: usize;
fn to_vec(&self) -> std::vec::Vec<Self::Elem> where Self::Elem: Clone;
}
struct LNil<T>(std::marker::PhantomData<T>);
struct LCons<T, Tail: SizedList<Elem = T>>(T, Tail);
impl<T> SizedList for LNil<T> {
type Elem = T;
const LEN: usize = 0;
fn to_vec(&self) -> std::vec::Vec<T> where T: Clone { vec![] }
}
impl<T, Tail: SizedList<Elem = T>> SizedList for LCons<T, Tail> {
type Elem = T;
const LEN: usize = 1 + Tail::LEN;
fn to_vec(&self) -> std::vec::Vec<T> where T: Clone {
let mut v = vec![self.0.clone()];
v.extend(self.1.to_vec());
v
}
}
impl<T, Tail: SizedList<Elem = T>> LCons<T, Tail> {
fn head(&self) -> &T { &self.0 }
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_const_generic_head() {
let v = Vec2 { data: [1, 2, 3] };
assert_eq!(v.head(), 1);
}
#[test]
fn test_const_generic_map() {
let v = Vec2 { data: [1, 2, 3] };
let d = v.map(|x| x * 2);
assert_eq!(d.data, [2, 4, 6]);
}
#[test]
fn test_const_generic_zip() {
let a = Vec2 { data: [1, 2] };
let b = Vec2 { data: [10, 20] };
let z = zip_vec(&a, &b);
assert_eq!(z.data, [(1, 10), (2, 20)]);
}
#[test]
fn test_const_generic_replicate() {
let v: Vec2<i32, 3> = Vec2::replicate(42);
assert_eq!(v.data, [42, 42, 42]);
}
#[test]
fn test_peano_vec() {
let inner = vcons::<i32, Zero, VNil>(3, vnil());
let mid = vcons::<i32, Succ<Zero>, _>(2, inner);
let pv = vcons::<i32, Succ<Succ<Zero>>, _>(1, mid);
assert_eq!(*pv.head(), 1);
assert_eq!(pv.to_vec(), vec![1, 2, 3]);
}
#[test]
fn test_recursive_list() {
let l = LCons(10, LCons(20, LNil(std::marker::PhantomData)));
assert_eq!(*l.head(), 10);
assert_eq!(l.to_vec(), vec![10, 20]);
assert_eq!(<LCons<i32, LCons<i32, LNil<i32>>> as SizedList>::LEN, 2);
}
}
(* Example 178: Length-Indexed Lists (Vectors) *)
(* Lists whose length is tracked at the type level *)
(* Approach 1: Peano-encoded length-indexed vector *)
type zero = Zero
type 'n succ = Succ
type ('a, 'n) vec =
| Nil : ('a, zero) vec
| Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vec
let head : type a n. (a, n succ) vec -> a = function
| Cons (x, _) -> x
(* Nil case is impossible โ the type prevents it! *)
let tail : type a n. (a, n succ) vec -> (a, n) vec = function
| Cons (_, xs) -> xs
let rec to_list : type a n. (a, n) vec -> a list = function
| Nil -> []
| Cons (x, xs) -> x :: to_list xs
let rec map : type a b n. (a -> b) -> (a, n) vec -> (b, n) vec =
fun f -> function
| Nil -> Nil
| Cons (x, xs) -> Cons (f x, map f xs)
let rec append : type a m n. (a, m) vec -> (a, n) vec -> a list =
fun v1 v2 -> match v1 with
| Nil -> to_list v2
| Cons (x, xs) -> x :: append xs v2
(* Approach 2: Using GADTs for safe zip *)
let rec zip : type a b n. (a, n) vec -> (b, n) vec -> (a * b, n) vec =
fun v1 v2 -> match v1, v2 with
| Nil, Nil -> Nil
| Cons (x, xs), Cons (y, ys) -> Cons ((x, y), zip xs ys)
(* Approach 3: Length witness for runtime operations *)
type _ length =
| LZ : zero length
| LS : 'n length -> 'n succ length
let rec replicate : type a n. n length -> a -> (a, n) vec =
fun n x -> match n with
| LZ -> Nil
| LS n' -> Cons (x, replicate n' x)
let () =
let v1 = Cons (1, Cons (2, Cons (3, Nil))) in
let v2 = Cons (10, Cons (20, Cons (30, Nil))) in
assert (head v1 = 1);
assert (to_list (tail v1) = [2; 3]);
assert (to_list (map (fun x -> x * 2) v1) = [2; 4; 6]);
assert (to_list (zip v1 v2) = [(1,10); (2,20); (3,30)]);
let v3 = replicate (LS (LS (LS LZ))) 42 in
assert (to_list v3 = [42; 42; 42]);
print_endline "โ All tests passed"
๐ Detailed Comparison
Comparison: Example 178 โ Length-Indexed Lists
Type Definition
OCaml
๐ช Show OCaml equivalent
type zero = Zero
type 'n succ = Succ
type ('a, 'n) vec =
| Nil : ('a, zero) vec
| Cons : 'a * ('a, 'n) vec -> ('a, 'n succ) vecRust (const generics)
struct Vec2<T, const N: usize> {
data: [T; N],
}Rust (Peano)
struct VNil;
struct VCons<T, N: Nat, Rest: TypeVec<T>>(T, Rest, PhantomData<N>);Safe Head
OCaml
๐ช Show OCaml equivalent
let head : type a n. (a, n succ) vec -> a = function
| Cons (x, _) -> xRust
impl<T: Copy, const N: usize> Vec2<T, N> {
fn head(&self) -> T { self.data[0] }
}Zip (same length guaranteed)
OCaml
๐ช Show OCaml equivalent
let rec zip : type a b n. (a, n) vec -> (b, n) vec -> (a * b, n) vec =
fun v1 v2 -> match v1, v2 with
| Nil, Nil -> Nil
| Cons (x, xs), Cons (y, ys) -> Cons ((x, y), zip xs ys)Rust
fn zip_vec<T: Copy + Default, U: Copy + Default, const N: usize>(
a: &Vec2<T, N>, b: &Vec2<U, N>,
) -> Vec2<(T, U), N> {
let mut result = [(T::default(), U::default()); N];
for i in 0..N { result[i] = (a.data[i], b.data[i]); }
Vec2 { data: result }
}Replicate
OCaml
๐ช Show OCaml equivalent
let rec replicate : type a n. n length -> a -> (a, n) vec =
fun n x -> match n with
| LZ -> Nil
| LS n' -> Cons (x, replicate n' x)Rust
impl<T: Default + Copy, const N: usize> Vec2<T, N> {
fn replicate(val: T) -> Self {
Vec2 { data: [val; N] }
}
}