๐Ÿฆ€ Functional Rust

734: Typestate Pattern: Encode State in the Type

Difficulty: 4 Level: Expert Use zero-sized type parameters to make invalid state transitions a compile error โ€” the type system enforces your protocol, with zero runtime overhead.

The Problem This Solves

State machines are everywhere: a traffic light that can only go Red โ†’ Green โ†’ Yellow โ†’ Red. A TCP connection that must Connect before it can Send. A file handle that must be opened before it can be read. Traditionally, you enforce these transitions at runtime: check a flag, return an error, maybe panic. This means bugs are caught at runtime โ€” or worse, in production. With the typestate pattern, the state is encoded in the type, not in a field. `Light<Red>` and `Light<Green>` are different types. The `go()` method only exists on `Light<Red>`, not on `Light<Green>`. You can't call `.stop()` on a green light because there is no `stop` method on `Light<Green>` โ€” the compiler simply doesn't know what you're talking about. The result: invalid state transitions don't compile. There's no runtime check, no error handling, no test needed for the "impossible" case. The bug cannot exist in a compiled binary.

The Intuition

In Python, you'd use an enum for the state and check it in every method: `if self.state != State.RED: raise ValueError("not red")`. This works but requires defensive programming everywhere and can only catch errors at runtime. The typestate pattern moves that check to the type system. Instead of `self.state == State.RED`, the type `Light<Red>` itself is the check. If you have a `Light<Red>`, it's red โ€” the type guarantees it. No flag needed. `PhantomData<State>` is the key: it's a zero-sized field that holds the type parameter without occupying any bytes. `Light<Red>` and `Light<Green>` are distinct types despite being structurally identical at the byte level.

How It Works in Rust

use std::marker::PhantomData;

// Zero-sized state marker types โ€” no data, just type identity
pub struct Red;
pub struct Green;
pub struct Yellow;

// The light struct carries the state in its type parameter
// PhantomData<State> is 0 bytes at runtime
pub struct Light<State> {
 _state: PhantomData<State>,
}

// Methods only defined on specific states
impl Light<Red> {
 pub fn new() -> Self {                  // can only start Red
     Light { _state: PhantomData }
 }

 pub fn go(self) -> Light<Green> {       // consumes self: Red is GONE
     Light { _state: PhantomData }       // returns a new Light<Green>
 }
 // No `slow()` or `stop()` methods โ€” can't call them on Red
}

impl Light<Green> {
 pub fn slow(self) -> Light<Yellow> {
     Light { _state: PhantomData }
 }
 // No `go()` โ€” can't transition Green โ†’ Green
}

impl Light<Yellow> {
 pub fn stop(self) -> Light<Red> {
     Light { _state: PhantomData }
 }
}

// Valid โ€” the full cycle compiles
let red    = Light::<Red>::new();
let green  = red.go();
let yellow = green.slow();
let red2   = yellow.stop();

// These would NOT compile:
// red.slow();    // ERROR: no method `slow` found for `Light<Red>`
// green.go();    // ERROR: no method `go` found for `Light<Green>`
// red.stop();    // ERROR: no method `stop` found for `Light<Red>`

// Size check โ€” PhantomData is zero bytes
assert_eq!(std::mem::size_of::<Light<Red>>(), 0);
Key points:

What This Unlocks

Key Differences

ConceptOCamlRust
State in typePhantom types with GADTs`PhantomData<State>` with zero-sized markers
Invalid transitionsCompile error via type mismatchCompile error โ€” method doesn't exist on that type
State consumptionFunctional style: return new stateMove semantics: `self` is consumed, new type returned
Runtime overheadZero with phantom typesZero โ€” `PhantomData` and markers are erased
State marker type`type 'red light` phantom`struct Red;` zero-sized struct
Protocol machinesType-indexed by phantomGeneric struct `Session<State>` + per-state `impl`
/// 734: Typestate Basics โ€” compile-time state machine
/// Invalid transitions DO NOT compile. No runtime checks needed.

use std::marker::PhantomData;

// โ”€โ”€ State marker types (zero-sized) โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

pub struct Red;
pub struct Green;
pub struct Yellow;

// โ”€โ”€ Traffic Light โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// `PhantomData<State>` is zero bytes at runtime.
/// The type parameter encodes which state we're in.
pub struct Light<State> {
    _state: PhantomData<State>,
}

impl Light<Red> {
    /// Only way to create a light โ€” must start Red.
    pub fn new() -> Self {
        println!("Light: Red");
        Light { _state: PhantomData }
    }

    /// Red โ†’ Green (consumes self, returns different type)
    pub fn go(self) -> Light<Green> {
        println!("Light: Green");
        Light { _state: PhantomData }
    }
}

impl Light<Green> {
    /// Green โ†’ Yellow
    pub fn slow(self) -> Light<Yellow> {
        println!("Light: Yellow");
        Light { _state: PhantomData }
    }
}

impl Light<Yellow> {
    /// Yellow โ†’ Red
    pub fn stop(self) -> Light<Red> {
        println!("Light: Red");
        Light { _state: PhantomData }
    }
}

// โ”€โ”€ Size check โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

fn main() {
    println!("Size of Light<Red>   = {} bytes", std::mem::size_of::<Light<Red>>());
    println!("Size of Light<Green> = {} bytes", std::mem::size_of::<Light<Green>>());
    println!();

    // Valid state machine
    let red    = Light::<Red>::new();
    let green  = red.go();
    let yellow = green.slow();
    let red2   = yellow.stop();

    // Cycle again
    let _green2 = red2.go();

    // โ”€โ”€ The following would NOT compile: โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
    // let r = Light::<Red>::new();
    // let _ = r.slow();   // ERROR: no method `slow` on `Light<Red>`
    // let _ = r.stop();   // ERROR: no method `stop` on `Light<Red>`
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn full_cycle_compiles() {
        let red    = Light::<Red>::new();
        let green  = red.go();
        let yellow = green.slow();
        let _red2  = yellow.stop();
    }

    #[test]
    fn light_is_zero_sized() {
        assert_eq!(std::mem::size_of::<Light<Red>>(), 0);
        assert_eq!(std::mem::size_of::<Light<Green>>(), 0);
        assert_eq!(std::mem::size_of::<Light<Yellow>>(), 0);
    }

    // Uncommenting the test below would fail to COMPILE โ€” which is the point:
    //
    // #[test]
    // fn invalid_transition_compile_error() {
    //     let r = Light::<Red>::new();
    //     let _ = r.slow();  // compile error: method not found
    // }
}
(* 734: Typestate Basics โ€” OCaml GADT approach
   OCaml doesn't have zero-cost phantom types like Rust, but GADTs let us
   encode state in the type safely. *)

(* State phantom types *)
type red = Red
type green = Green
type yellow = Yellow

(* Traffic light parameterized by phantom state *)
(* We use a simple record + witness value *)
type 'state light = {
  state_name: string;
  _phantom: 'state;  (* phantom field to carry state type *)
}

(* Type-safe constructors and transitions *)
let red_light () : red light = { state_name = "Red"; _phantom = Red }

let green_of_red (l : red light) : green light =
  let _ = l in
  { state_name = "Green"; _phantom = Green }

let yellow_of_green (l : green light) : yellow light =
  let _ = l in
  { state_name = "Yellow"; _phantom = Yellow }

let red_of_yellow (l : yellow light) : red light =
  let _ = l in
  { state_name = "Red"; _phantom = Red }

let print_light l = Printf.printf "Light is: %s\n" l.state_name

let () =
  let r = red_light () in
  print_light r;
  let g = green_of_red r in
  print_light g;
  let y = yellow_of_green g in
  print_light y;
  let r2 = red_of_yellow y in
  print_light r2;
  (* Invalid: green_of_red y  โ† would be a type error in OCaml too! *)
  ignore r2