๐Ÿฆ€ Functional Rust

176: Introduction to GADTs

Difficulty: โญโญโญ Level: Advanced Simulate Generalized Algebraic Data Types in Rust using phantom types, so different enum variants can carry provably different return types.

The Problem This Solves

In a normal Rust enum, every variant collapses to the same type. If you build an expression tree with `enum Expr { Int(i64), Bool(bool), Add(Box<Expr>, Box<Expr>) }`, the `eval` function must return `Box<dyn Any>` โ€” you've thrown away type information and now need runtime casts. This means `Add(Bool(true), Int(5))` is representable, and its wrongness is only caught at runtime. Generalized Algebraic Data Types (GADTs) solve this by letting each constructor carry its own type index. In OCaml, `Int : int -> int expr` is a constructor whose type says "I produce an `int expr`". The type system can then prove that `eval : 'a expr -> 'a` โ€” call `eval` on an `int expr` and you get an `int`; call it on a `bool expr` and you get a `bool`. No casts, no `Any`. The payoff is an expression tree where the type of the result is part of the node's type. You literally cannot write `Add(Bool(true), Int(5))` โ€” the compiler rejects it before your program ever runs.

The Intuition

Think of a vending machine with typed buttons. A standard enum is like a machine where all buttons return a bag of "something" โ€” you have to reach in and hope it's what you wanted. A GADT is like a machine where button A returns a `Drink` and button B returns a `Snack`, guaranteed by the machine's own type. The return type of each button is built into the button's type signature. In Rust we don't have native GADTs, but we can get close with phantom types: `Expr<T>` is a struct that carries a `PhantomData<T>`. The type `T` is never stored โ€” it takes zero bytes โ€” but the compiler tracks it statically. Smart constructors enforce the rules: `Expr::<i64>::int(5)` only compiles when building an integer node. `Expr::<bool>::bool_val(true)` only compiles for booleans. The type parameter flows through the tree and prevents mixing.

How It Works in Rust

use std::marker::PhantomData;

// The phantom type T says "what this expression evaluates to"
struct Expr<T> {
 inner: ExprInner,       // actual data, untyped
 _phantom: PhantomData<T>, // type tag โ€” zero bytes
}

// Only specific impl blocks exist:
impl Expr<i64> {
 fn int(n: i64) -> Self { /* only callable for i64 */ }
 fn add(a: Expr<i64>, b: Expr<i64>) -> Expr<i64> { /* enforces both i64 */ }
 fn eval(&self) -> i64 { /* return type matches phantom */ }
}

impl Expr<bool> {
 fn bool_val(b: bool) -> Self { /* only callable for bool */ }
 fn eval(&self) -> bool { /* return type matches phantom */ }
}

// This compiles:
let e: Expr<i64> = Expr::add(Expr::int(3), Expr::int(4));
let result: i64 = e.eval(); // statically known: returns i64

// This does NOT compile โ€” types mismatch:
// Expr::add(Expr::int(3), Expr::bool_val(true));
//                         ^^^^^^^^^^^^^^^^^ expected Expr<i64>, found Expr<bool>
The key insight: `PhantomData<T>` lets you brand a struct with a type without storing any value of that type. Different `impl` blocks on different `Expr<T>` specializations enforce GADT-like rules.

What This Unlocks

Key Differences

ConceptOCamlRust
GADT syntax`type _ expr = Int : int -> int expr``struct Expr<T>` + `PhantomData<T>` + separate `impl` blocks
Type refinement in matchAutomatic per branch โ€” compiler knows `a = int` inside `Int` branchNo refinement; must use `unreachable!()` for provably-impossible arms
Enforcing constructor rulesBuilt into the type system via GADT constructor return typesEnforced by only exposing smart constructors (no direct struct construction)
Runtime overheadNoneNone โ€” `PhantomData` is zero-sized
ExhaustivenessFull GADT checkingPattern matching on inner untyped enum; type safety at construction time only
// Example 176: Introduction to GADTs
// Rust doesn't have native GADTs, but we can simulate them with
// PhantomData, traits, and sealed type-level markers.

use std::marker::PhantomData;

// === Approach 1: PhantomData + sealed traits to simulate GADTs ===

mod sealed {
    pub trait ExprType {}
    impl ExprType for i64 {}
    impl ExprType for bool {}
}

#[derive(Debug)]
enum ExprInner {
    Int(i64),
    Bool(bool),
    Add(Box<ExprInner>, Box<ExprInner>),
    If(Box<ExprInner>, Box<ExprInner>, Box<ExprInner>),
}

#[derive(Debug)]
struct Expr<T: sealed::ExprType> {
    inner: ExprInner,
    _phantom: PhantomData<T>,
}

impl Expr<i64> {
    fn int(n: i64) -> Self {
        Expr { inner: ExprInner::Int(n), _phantom: PhantomData }
    }
    fn add(a: Expr<i64>, b: Expr<i64>) -> Self {
        Expr { inner: ExprInner::Add(Box::new(a.inner), Box::new(b.inner)), _phantom: PhantomData }
    }
    fn eval(&self) -> i64 {
        match &self.inner {
            ExprInner::Int(n) => *n,
            ExprInner::Add(a, b) => {
                let a = Expr::<i64> { inner: *a.clone(), _phantom: PhantomData };
                let b = Expr::<i64> { inner: *b.clone(), _phantom: PhantomData };
                a.eval() + b.eval()
            }
            ExprInner::If(c, t, f) => {
                let c = Expr::<bool> { inner: *c.clone(), _phantom: PhantomData };
                let t = Expr::<i64> { inner: *t.clone(), _phantom: PhantomData };
                let f = Expr::<i64> { inner: *f.clone(), _phantom: PhantomData };
                if c.eval() { t.eval() } else { f.eval() }
            }
            _ => unreachable!(),
        }
    }
}

impl Expr<bool> {
    fn bool_val(b: bool) -> Self {
        Expr { inner: ExprInner::Bool(b), _phantom: PhantomData }
    }
    fn eval(&self) -> bool {
        match &self.inner {
            ExprInner::Bool(b) => *b,
            _ => unreachable!(),
        }
    }
}

impl Clone for ExprInner {
    fn clone(&self) -> Self {
        match self {
            ExprInner::Int(n) => ExprInner::Int(*n),
            ExprInner::Bool(b) => ExprInner::Bool(*b),
            ExprInner::Add(a, b) => ExprInner::Add(Box::new((**a).clone()), Box::new((**b).clone())),
            ExprInner::If(a, b, c) => ExprInner::If(Box::new((**a).clone()), Box::new((**b).clone()), Box::new((**c).clone())),
        }
    }
}

fn if_expr(cond: Expr<bool>, then: Expr<i64>, else_: Expr<i64>) -> Expr<i64> {
    Expr {
        inner: ExprInner::If(Box::new(cond.inner), Box::new(then.inner), Box::new(else_.inner)),
        _phantom: PhantomData,
    }
}

// === Approach 2: Enum-based with trait for evaluation ===

trait Eval {
    type Output;
    fn eval(&self) -> Self::Output;
}

struct IntLit(i64);
struct BoolLit(bool);
struct AddExpr(Box<dyn Eval<Output = i64>>, Box<dyn Eval<Output = i64>>);

impl Eval for IntLit {
    type Output = i64;
    fn eval(&self) -> i64 { self.0 }
}

impl Eval for BoolLit {
    type Output = bool;
    fn eval(&self) -> bool { self.0 }
}

impl Eval for AddExpr {
    type Output = i64;
    fn eval(&self) -> i64 { self.0.eval() + self.1.eval() }
}

// === Approach 3: Type-safe heterogeneous list with tuples ===

// Rust's type system naturally supports this via nested tuples
// (42, ("hello", (true, ())))  is the HList equivalent

trait HList {}
impl HList for () {}
impl<H, T: HList> HList for (H, T) {}

trait Head {
    type Item;
    fn head(&self) -> &Self::Item;
}

impl<H, T: HList> Head for (H, T) {
    type Item = H;
    fn head(&self) -> &H { &self.0 }
}

fn main() {
    // Approach 1
    let e = Expr::int(42);
    println!("Int 42 = {}", e.eval());

    let sum = Expr::add(Expr::int(1), Expr::int(2));
    println!("1 + 2 = {}", sum.eval());

    let cond = if_expr(Expr::bool_val(true), Expr::int(10), Expr::int(20));
    println!("if true then 10 else 20 = {}", cond.eval());

    // Approach 2
    let expr = AddExpr(Box::new(IntLit(10)), Box::new(IntLit(32)));
    println!("10 + 32 = {}", expr.eval());

    // Approach 3
    let hlist = (42, ("hello", (true, ())));
    println!("head = {}", hlist.head());

    println!("โœ“ All examples running");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_phantom_gadt_int() {
        assert_eq!(Expr::int(42).eval(), 42);
    }

    #[test]
    fn test_phantom_gadt_add() {
        assert_eq!(Expr::add(Expr::int(1), Expr::int(2)).eval(), 3);
    }

    #[test]
    fn test_phantom_gadt_bool() {
        assert_eq!(Expr::bool_val(true).eval(), true);
    }

    #[test]
    fn test_phantom_gadt_if() {
        let e = if_expr(Expr::bool_val(true), Expr::int(10), Expr::int(20));
        assert_eq!(e.eval(), 10);
        let e2 = if_expr(Expr::bool_val(false), Expr::int(10), Expr::int(20));
        assert_eq!(e2.eval(), 20);
    }

    #[test]
    fn test_trait_eval() {
        assert_eq!(IntLit(5).eval(), 5);
        assert_eq!(BoolLit(false).eval(), false);
        assert_eq!(AddExpr(Box::new(IntLit(3)), Box::new(IntLit(4))).eval(), 7);
    }

    #[test]
    fn test_hlist() {
        let hlist = (42, ("hello", (true, ())));
        assert_eq!(*hlist.head(), 42);
        assert_eq!(*hlist.1.head(), "hello");
        assert_eq!(*hlist.1 .1.head(), true);
    }
}
(* Example 176: Introduction to GADTs *)
(* GADTs (Generalized Algebraic Data Types) allow constructors to return
   different type instantiations of the same type *)

(* Approach 1: Basic GADT with type-safe expressions *)
type _ expr =
  | Int  : int  -> int expr
  | Bool : bool -> bool expr
  | Add  : int expr * int expr -> int expr
  | If   : bool expr * 'a expr * 'a expr -> 'a expr

let rec eval : type a. a expr -> a = function
  | Int n -> n
  | Bool b -> b
  | Add (a, b) -> eval a + eval b
  | If (cond, t, f) -> if eval cond then eval t else eval f

(* Approach 2: GADT for type-safe formatting *)
type _ fmt =
  | Lit  : string -> string fmt
  | FInt : int fmt
  | FStr : string fmt
  | Cat  : 'a fmt * 'b fmt -> ('a * 'b) fmt

let rec format_to_string : type a. a fmt -> a -> string = fun fmt v ->
  match fmt with
  | Lit s -> s
  | FInt -> string_of_int v
  | FStr -> v
  | Cat (a, b) ->
    let (va, vb) = v in
    format_to_string a va ^ format_to_string b vb

(* Approach 3: GADT for type-safe heterogeneous lists *)
type _ hlist =
  | HNil  : unit hlist
  | HCons : 'a * 'b hlist -> ('a * 'b) hlist

let example_list = HCons (42, HCons ("hello", HCons (true, HNil)))

let hd : type a b. (a * b) hlist -> a = function
  | HCons (x, _) -> x

(* Tests *)
let () =
  (* Test Approach 1 *)
  assert (eval (Int 42) = 42);
  assert (eval (Bool true) = true);
  assert (eval (Add (Int 1, Int 2)) = 3);
  assert (eval (If (Bool true, Int 10, Int 20)) = 10);
  assert (eval (If (Bool false, Int 10, Int 20)) = 20);

  (* Test Approach 2 *)
  assert (format_to_string (Lit "hello") "hello" = "hello");
  assert (format_to_string FInt 42 = "42");
  assert (format_to_string FStr "world" = "world");
  assert (format_to_string (Cat (Lit "n=", FInt)) ("n=", 42) = "n=42");

  (* Test Approach 3 *)
  assert (hd example_list = 42);

  print_endline "โœ“ All tests passed"

๐Ÿ“Š Detailed Comparison

Comparison: Example 176 โ€” Introduction to GADTs

GADT Type Definition

OCaml

๐Ÿช Show OCaml equivalent
type _ expr =
| Int  : int  -> int expr
| Bool : bool -> bool expr
| Add  : int expr * int expr -> int expr
| If   : bool expr * 'a expr * 'a expr -> 'a expr

Rust

use std::marker::PhantomData;

enum ExprInner {
 Int(i64),
 Bool(bool),
 Add(Box<ExprInner>, Box<ExprInner>),
 If(Box<ExprInner>, Box<ExprInner>, Box<ExprInner>),
}

struct Expr<T> {
 inner: ExprInner,
 _phantom: PhantomData<T>,
}

Type-Safe Evaluation

OCaml

๐Ÿช Show OCaml equivalent
let rec eval : type a. a expr -> a = function
| Int n -> n
| Bool b -> b
| Add (a, b) -> eval a + eval b
| If (cond, t, f) -> if eval cond then eval t else eval f

Rust

impl Expr<i64> {
 fn eval(&self) -> i64 {
     match &self.inner {
         ExprInner::Int(n) => *n,
         ExprInner::Add(a, b) => { /* reconstruct typed wrappers */ }
         _ => unreachable!(),
     }
 }
}

impl Expr<bool> {
 fn eval(&self) -> bool {
     match &self.inner {
         ExprInner::Bool(b) => *b,
         _ => unreachable!(),
     }
 }
}

Trait-Based Alternative (Rust)

Rust

trait Eval {
 type Output;
 fn eval(&self) -> Self::Output;
}

struct IntLit(i64);
impl Eval for IntLit {
 type Output = i64;
 fn eval(&self) -> i64 { self.0 }
}

struct AddExpr(Box<dyn Eval<Output = i64>>, Box<dyn Eval<Output = i64>>);
impl Eval for AddExpr {
 type Output = i64;
 fn eval(&self) -> i64 { self.0.eval() + self.1.eval() }
}

Heterogeneous Lists

OCaml

๐Ÿช Show OCaml equivalent
type _ hlist =
| HNil  : unit hlist
| HCons : 'a * 'b hlist -> ('a * 'b) hlist

let hd : type a b. (a * b) hlist -> a = function
| HCons (x, _) -> x

Rust

trait HList {}
impl HList for () {}
impl<H, T: HList> HList for (H, T) {}

trait Head {
 type Item;
 fn head(&self) -> &Self::Item;
}

impl<H, T: HList> Head for (H, T) {
 type Item = H;
 fn head(&self) -> &H { &self.0 }
}

// Usage: (42, ("hello", (true, ())))