597: Scott Encoding for Algebraic Types
Difficulty: 4 Level: Advanced Represent algebraic data types as functions โ the Church encoding family, but with O(1) pattern matching.The Problem This Solves
In lambda calculus and type theory, data types don't exist as primitives โ everything is a function. Church encoding and Scott encoding both represent ADTs (Option, List, Bool) purely as functions. The difference: Church encoding folds over the entire structure, while Scott encoding matches on exactly one level, giving O(1) pattern match and O(n) recursion โ the same as native `enum`. Why study this in Rust? It illuminates what `enum` and `match` mean computationally. It shows that pattern matching is fundamentally function application. It also demonstrates the limits of Rust's type system when encoding recursive types (you'll reach for `Rc<dyn Fn...>` quickly), and it bridges directly to Church encoding in proof assistants and dependently typed languages. It's also genuinely mind-bending: a Scott-encoded `Option<T>` is a function that takes "what to do if None" and "what to do if Some(T)" and applies the right one. That is a match expression, as a value.The Intuition
A vending machine as a data value. Instead of inspecting a coin to determine if it's a quarter, the coin is a function: give it a slot for each denomination, and it activates the right slot. `scott_some(42)` is a value that, when given a "nothing" handler and a "has value" handler, calls the latter with 42. The data and the dispatch are unified.How It Works in Rust
1. Scott-encoded `Option<T, R>` โ a function that takes two handlers and calls the right one:use std::rc::Rc;
type ScottOption<T, R> = Rc<dyn Fn(Box<dyn Fn() -> R>, Box<dyn Fn(T) -> R>) -> R>;
fn scott_none<T: 'static, R: 'static>() -> ScottOption<T, R> {
Rc::new(|on_none: Box<dyn Fn() -> R>, _| on_none())
}
fn scott_some<T: Clone + 'static, R: 'static>(v: T) -> ScottOption<T, R> {
Rc::new(move |_, on_some: Box<dyn Fn(T) -> R>| on_some(v.clone()))
}
2. Pattern matching is function application:
fn scott_match<T, R>(m: &ScottOption<T, R>,
on_none: impl Fn() -> R + 'static,
on_some: impl Fn(T) -> R + 'static) -> R {
m(Box::new(on_none), Box::new(on_some))
}
let result = scott_match(&scott_some(42),
|| "nothing".to_string(),
|v| format!("got {}", v)); // "got 42"
3. Scott-encoded Bool โ simpler, no `Box` needed with concrete types:
type SBool<R> = Box<dyn Fn(R, R) -> R>;
fn s_true<R: Clone + 'static>() -> SBool<R> { Box::new(|t, _f| t) }
fn s_false<R: Clone + 'static>() -> SBool<R> { Box::new(|_t, f| f) }
fn s_if<R: Clone + 'static>(b: SBool<R>, t: R, f: R) -> R { b(t, f) }
4. Recursive types require `Rc` โ `ScottList` would need `Rc<dyn Fn(...)>` to break the infinite type cycle that `Box<dyn Fn(...)>` hits.
What This Unlocks
- Denotational clarity โ you see exactly what `match` means: apply a function indexed by constructor.
- Lambda calculus bridge โ Scott encoding is the standard ADT representation in type theory; understanding it prepares you for Coq, Agda, and Lean.
- Introspect your type system โ the contortions Rust needs (`Rc<dyn Fn>`, `'static` bounds, `Clone` constraints) reveal what a GC buys you in a functional language.
Key Differences
| Concept | OCaml | Rust | |
|---|---|---|---|
| Scott encoding | Clean with polymorphic types | Requires `Rc<dyn Fn>` and `'static` bounds | |
| Pattern match | `match` (native) | Function application (encoded) | |
| Recursive type | `type 'a list = Nil \ | Cons of 'a * 'a list` | Needs `Rc` to break type cycle |
| Performance | GC overhead | `Rc` + `Box` allocations per constructor |
// Scott-encoded Option<i32>
// None = |on_none, on_some| on_none()
// Some v = |on_none, on_some| on_some(v)
use std::rc::Rc;
type ScottOption<T, R> = Rc<dyn Fn(Box<dyn Fn() -> R>, Box<dyn Fn(T) -> R>) -> R>;
fn scott_none<T: 'static, R: 'static>() -> ScottOption<T, R> {
Rc::new(|on_none: Box<dyn Fn() -> R>, _| on_none())
}
fn scott_some<T: Clone + 'static, R: 'static>(v: T) -> ScottOption<T, R> {
Rc::new(move |_, on_some: Box<dyn Fn(T) -> R>| on_some(v.clone()))
}
fn scott_match<T, R>(
m: &ScottOption<T, R>,
on_none: impl Fn() -> R + 'static,
on_some: impl Fn(T) -> R + 'static,
) -> R {
m(Box::new(on_none), Box::new(on_some))
}
// Simpler version with concrete type for clarity
type SBool<R> = Box<dyn Fn(R, R) -> R>;
fn s_true<R: Clone + 'static>() -> SBool<R> { Box::new(|t, _f| t) }
fn s_false<R: Clone + 'static>() -> SBool<R> { Box::new(|_t, f| f) }
fn s_if<R: Clone + 'static>(b: SBool<R>, t: R, f: R) -> R { b(t, f) }
fn main() {
// Scott Option
let none = scott_none::<i32,String>();
let some42 = scott_some::<i32,String>(42);
let r1 = scott_match(&none, || "nothing".into(), |v| format!("got {}", v));
let r2 = scott_match(&some42,|| "nothing".into(), |v| format!("got {}", v));
println!("none: {}", r1);
println!("some42:{}", r2);
// Scott Bool
let t: SBool<&str> = s_true();
let f: SBool<&str> = s_false();
println!("s_if true = {}", s_if(t, "yes", "no"));
println!("s_if false = {}", s_if(f, "yes", "no"));
// Scott encoding reveals: match IS function application
println!("Scott encoding: match is just function application over continuations");
}
#[cfg(test)]
mod tests {
use super::*;
#[test] fn scott_none_test() {
let n = scott_none::<i32,i32>();
let r = scott_match(&n, || -1, |v| v);
assert_eq!(r, -1);
}
#[test] fn scott_some_test() {
let s = scott_some::<i32,i32>(42);
let r = scott_match(&s, || -1, |v| v);
assert_eq!(r, 42);
}
}
(* Scott encoding in OCaml *)
(* Scott Bool: same as Church Bool *)
(* Scott List: cons h t nil_case cons_case = cons_case h t *)
(* Scott Maybe *)
let scott_none none_case _some_case = none_case
let scott_some v _none_case some_case = some_case v
(* Usage: match on encoded Maybe *)
let scott_match_maybe m on_none on_some = m on_none on_some
let () =
let none = scott_none in
let some42 = scott_some 42 in
Printf.printf "none: %d\n" (scott_match_maybe none 0 (fun v -> v));
Printf.printf "some 42: %d\n" (scott_match_maybe some42 0 (fun v -> v));
(* Scott List *)
let scott_nil nil_f _cons_f = nil_f () in
let scott_cons h t nil_f cons_f = cons_f h t in
let lst = scott_cons 1 (scott_cons 2 (scott_cons 3 scott_nil)) in
let rec to_list encoded =
encoded (fun () -> []) (fun h t -> h :: to_list t) in
Printf.printf "list: %s\n"
(String.concat "," (List.map string_of_int (to_list lst)))