🦀 Functional Rust

433: State Machine via Macro

Difficulty: 4 Level: Expert

Generate typestate state machine boilerplate (states as zero-sized types, transitions as consuming methods) so that invalid sequences are compile errors, not runtime panics.

The Problem This Solves

Protocol implementations (network connections, file handles, authentication flows, payment processors) have strict sequencing rules. You can only `send()` after `connect()`. You can only `commit()` inside a transaction. Calling methods out of order is a logic error that should be caught immediately, not at runtime in production when a user hits an unexpected state.

Encoding these rules as runtime enums plus guards (`if self.state != State::Connected { panic!() }`) is fragile: the guard is easy to forget, every method has boilerplate, and the compiler won't help you find callers that got the sequence wrong.

The typestate pattern moves state into the type system: each state is a zero-sized type, and methods are only defined on the correct state. Call `send()` on a `Connection<Disconnected>` and the compiler refuses. The compiler is your state machine validator.

Implementing this by hand requires state structs, a generic struct, `PhantomData`, and an `impl` block per state. A state machine macro generates all of this from a concise declaration.
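For contrast, the runtime-guard approach criticized above might look like the following sketch. The names `RuntimeConnection` and `State` are made up for illustration, and the guard returns an `Err` rather than panicking so the example can run to completion.

```rust
// Hypothetical runtime-checked connection: the state is a value, not a type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Disconnected,
    Connected,
}

struct RuntimeConnection {
    state: State,
    messages_sent: u32,
}

impl RuntimeConnection {
    fn new() -> Self {
        RuntimeConnection { state: State::Disconnected, messages_sent: 0 }
    }

    fn connect(&mut self) {
        self.state = State::Connected;
    }

    // Every state-dependent method has to repeat a guard like this one.
    fn send(&mut self, msg: &str) -> Result<(), String> {
        if self.state != State::Connected {
            return Err(format!("cannot send {:?} while {:?}", msg, self.state));
        }
        self.messages_sent += 1;
        Ok(())
    }
}

fn main() {
    let mut conn = RuntimeConnection::new();
    // This compiles; the mistake only surfaces when the line actually runs.
    assert!(conn.send("too early").is_err());

    conn.connect();
    assert!(conn.send("hello").is_ok());
    assert_eq!(conn.messages_sent, 1);
}
```

Nothing stops a caller from writing `send` before `connect` here. The typestate version turns that same mistake into a compile error.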

The Intuition

Each state (`Disconnected`, `Connected`, `Authenticated`) is a zero-sized struct: it carries no data and costs nothing at runtime. The connection struct `Connection<S>` is generic over state `S`. Methods exist only on specific `Connection<ConcreteState>` types:
Connection<Disconnected>  →  .connect()  →  Connection<Connected>
Connection<Connected>     →  .authenticate()  →  Connection<Authenticated>
Connection<Authenticated> →  .send()  →  same state
Connection<Authenticated> →  .disconnect()  →  Connection<Closed>
Each transition consumes `self` and returns a new typed value. Once consumed, you can't call the old state's methods, because the original variable has been moved. Impossible sequences are literally unrepresentable.
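The "costs nothing at runtime" claim can be checked with `std::mem::size_of`. The sketch below uses a deliberately tiny, hypothetical `Conn<S>` struct (one `u16` field) so the sizes are easy to reason about.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical marker types, for illustration only.
#[allow(dead_code)]
struct Disconnected;
#[allow(dead_code)]
struct Connected;

// Hypothetical machine with a single data field plus the state tag.
#[allow(dead_code)]
struct Conn<S> {
    port: u16,
    _state: PhantomData<S>,
}

fn main() {
    // Marker structs and PhantomData occupy no space.
    assert_eq!(size_of::<Disconnected>(), 0);
    assert_eq!(size_of::<PhantomData<Connected>>(), 0);

    // The state parameter does not change the layout of the struct:
    // both instantiations are exactly as large as their one u16 field.
    assert_eq!(size_of::<Conn<Disconnected>>(), size_of::<u16>());
    assert_eq!(size_of::<Conn<Connected>>(), size_of::<u16>());
}
```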

How It Works in Rust

use std::marker::PhantomData;

// State marker types: zero bytes at runtime
struct Disconnected;
struct Connected;
struct Authenticated;
struct Closed;

// The state machine struct, generic over the current state
struct Connection<State> {
    host: String,
    port: u16,
    messages_sent: u32,
    _state: PhantomData<State>, // zero cost, carries type info only
}

// Methods only on Disconnected
impl Connection<Disconnected> {
    fn new(host: &str, port: u16) -> Self {
        Connection { host: host.into(), port, messages_sent: 0, _state: PhantomData }
    }
    // Consuming transition: Disconnected → Connected
    fn connect(self) -> Connection<Connected> {
        println!("Connecting to {}:{}", self.host, self.port);
        // Fields are moved one by one: on stable Rust, `..self` cannot be used
        // here because the base must have the same type as the struct being built.
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: PhantomData,
        }
    }
}

// Methods only on Connected
impl Connection<Connected> {
    // Consuming transition: Connected → Authenticated
    fn authenticate(self, _token: &str) -> Connection<Authenticated> {
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: PhantomData,
        }
    }
}

// Methods only on Authenticated
impl Connection<Authenticated> {
    // &mut self: stays in the same state
    fn send(&mut self, msg: &str) {
        println!("→ {}", msg);
        self.messages_sent += 1;
    }
    // Consuming transition: Authenticated → Closed
    fn disconnect(self) -> Connection<Closed> {
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: PhantomData,
        }
    }
}

fn main() {
    // Valid sequence: compiles
    let mut conn = Connection::new("api.example.com", 443)
        .connect()
        .authenticate("secret-token");
    conn.send("hello");
    let closed = conn.disconnect();

    // These DO NOT COMPILE:
    // Connection::new("h", 80).send("x");  // Disconnected has no send()
    // closed.send("x");                    // Closed has no send()
    let _ = closed;
}
The state machine macro (`state_machine!` in the example) generates the state structs, the generic struct with `PhantomData`, and the `impl` blocks for each transition from the same concise DSL.
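To make the idea of a concise declaration concrete, here is one way such a macro and its invocation could look. This is a simplified sketch under stated assumptions, not the article's exact macro: the name `typestate!`, its DSL, and the `into_state` helper are invented for illustration, and it only supports transitions that take no arguments.

```rust
use std::marker::PhantomData;

// Hypothetical `typestate!` macro (simplified stand-in for `state_machine!`).
macro_rules! typestate {
    (
        struct $name:ident { $($field:ident : $fty:ty),* $(,)? }
        states { $($state:ident),* $(,)? }
        transitions { $($from:ident => $method:ident => $to:ident),* $(,)? }
    ) => {
        // One zero-sized marker type per declared state.
        $( #[derive(Debug)] struct $state; )*

        // The machine itself, generic over the current state.
        #[derive(Debug)]
        struct $name<S> {
            $($field: $fty,)*
            _state: ::std::marker::PhantomData<S>,
        }

        impl<S> $name<S> {
            // Helper: moves every field into a value tagged with another state.
            fn into_state<T>(self) -> $name<T> {
                $name { $($field: self.$field,)* _state: ::std::marker::PhantomData }
            }
        }

        // One consuming method per declared transition.
        $(
            impl $name<$from> {
                fn $method(self) -> $name<$to> {
                    self.into_state()
                }
            }
        )*
    };
}

// The whole machine in one declaration.
typestate! {
    struct Connection { host: String, port: u16 }
    states { Disconnected, Connected, Closed }
    transitions {
        Disconnected => connect => Connected,
        Connected => disconnect => Closed,
    }
}

// Constructors and methods with arguments are still written by hand.
impl Connection<Disconnected> {
    fn new(host: &str, port: u16) -> Self {
        Connection { host: host.to_string(), port, _state: PhantomData }
    }
}

fn main() {
    let closed = Connection::new("api.example.com", 443).connect().disconnect();
    println!("{:?}", closed);
    // closed.connect(); // does not compile: Connection<Closed> has no connect()
}
```

The field list is expanded once, in the generic `into_state` helper, and each transition delegates to it. Repeating the fields inside the per-transition repetition would nest `$field` one level deeper than it was matched, which `macro_rules!` rejects when the macro is expanded.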

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| State encoding | Variant in a GADT or module type | Zero-sized struct as phantom type parameter |
| Invalid transitions | Runtime exception or explicit result type | Compile error: the method doesn't exist on the wrong type |
| Transition cost | Pattern match + allocation if boxing | Zero cost: the state type is erased, only the generic parameter changes |
| State machine DSL | GADT or first-class modules | `macro_rules!` generating `impl` blocks per state |
| Mutating within a state | Record update syntax | `&mut self` methods on specific `impl Connection<State>` |

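// State machine via macro in Rust: typestate pattern

// Macro that generates a typestate state machine.
// It is not invoked in this listing; the Connection machine further down
// is written out by hand.
macro_rules! state_machine {
    (
        struct $name:ident<$state_param:ident> {
            $($field:ident : $fty:ty),* $(,)?
        }
        states { $($state:ident),* $(,)? }
        transitions {
            $( $from:ident => $method:ident => $to:ident { $($body:tt)* } )*
        }
    ) => {
        // State marker types
        $(
            #[derive(Debug)]
            struct $state;
        )*

        // The state machine struct
        #[derive(Debug)]
        struct $name<$state_param> {
            $($field: $fty,)*
            _state: std::marker::PhantomData<$state_param>,
        }

        // Shared helper: move every field into a value tagged with a new state.
        // The field list is expanded here, outside the per-transition repetition,
        // because `$field` may only repeat at the depth at which it was matched.
        impl<$state_param> $name<$state_param> {
            fn into_state<Next>(self) -> $name<Next> {
                $name {
                    $($field: self.$field,)*
                    _state: std::marker::PhantomData,
                }
            }
        }

        // Transition impls
        $(
            impl $name<$from> {
                fn $method(self) -> $name<$to> {
                    // Statements from the declaration run before the transition.
                    // Macro hygiene keeps them from seeing this method's `self`,
                    // so keep them free of `self`.
                    $($body)*
                    self.into_state()
                }
            }
        )*
    };
}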

// Define a Connection state machine
#[derive(Debug)]
struct Disconnected;
#[derive(Debug)]
struct Connected;
#[derive(Debug)]
struct Authenticated;
#[derive(Debug)]
struct Closed;

#[derive(Debug)]
struct Connection<State> {
    host: String,
    port: u16,
    messages_sent: u32,
    _state: std::marker::PhantomData<State>,
}

impl Connection<Disconnected> {
    fn new(host: &str, port: u16) -> Self {
        Connection {
            host: host.to_string(),
            port,
            messages_sent: 0,
            _state: std::marker::PhantomData,
        }
    }

    fn connect(self) -> Connection<Connected> {
        println!("Connecting to {}:{}", self.host, self.port);
        // Fields are moved explicitly: on stable Rust, `..self` requires the
        // base to have the same type as the struct being built.
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: std::marker::PhantomData,
        }
    }
}

impl Connection<Connected> {
    fn authenticate(self, _token: &str) -> Connection<Authenticated> {
        println!("Authenticating...");
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: std::marker::PhantomData,
        }
    }

    fn disconnect(self) -> Connection<Closed> {
        println!("Disconnecting (unauthenticated)");
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: std::marker::PhantomData,
        }
    }
}

impl Connection<Authenticated> {
    fn send(&mut self, message: &str) {
        println!("Sending: {}", message);
        self.messages_sent += 1;
    }

    fn disconnect(self) -> Connection<Closed> {
        println!("Disconnecting (sent {} messages)", self.messages_sent);
        Connection {
            host: self.host,
            port: self.port,
            messages_sent: self.messages_sent,
            _state: std::marker::PhantomData,
        }
    }
}

impl Connection<Closed> {
    fn stats(&self) {
        println!("Connection to {} closed. Messages sent: {}",
                 self.host, self.messages_sent);
    }
}

fn main() {
    let conn = Connection::new("api.example.com", 443);
    let conn = conn.connect();
    let mut conn = conn.authenticate("secret-token");

    conn.send("GET /users HTTP/1.1");
    conn.send("Host: api.example.com");

    let closed = conn.disconnect();
    closed.stats();

    // Type safety: these would NOT compile:
    // conn.send("too late!");   // `conn` was moved by disconnect()
    // closed.send("too late!"); // Connection<Closed> has no send()
    // let unauthenticated = Connection::new("h", 80).connect();
    // unauthenticated.send("no auth!"); // Connection<Connected> has no send()

    println!("\nState machine enforced at compile time!");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_state_machine() {
        let conn = Connection::new("test.com", 80)
            .connect()
            .authenticate("token");
        let closed = conn.disconnect();
        // Can only call stats() on Closed
        closed.stats();
    }

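    fn assert_send_type<T: Send>() {}

    #[test]
    fn test_connection_types_distinct() {
        // These are different types at compile time:
        let _disc: Connection<Disconnected> = Connection::new("h", 80);
        let _conn: Connection<Connected> = Connection::new("h", 80).connect();
        // They don't have the same methods; the type system enforces this.

        // The zero-sized state marker does not stop a connection from being Send.
        assert_send_type::<Connection<Disconnected>>();
        assert_send_type::<Connection<Authenticated>>();
    }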
}
(* State machine in OCaml with phantom types; no macros are needed here *)

(* Phantom types for state tracking *)
type unconnected
type connected
type closed

type 'state connection = {
  host: string;
  port: int;
  mutable data: string list;
}

(* Each function returns the "next state" *)
let connect host port : connected connection =
  Printf.printf "Connecting to %s:%d\n" host port;
  { host; port; data = [] }

let send (conn : connected connection) msg : connected connection =
  Printf.printf "Sending: %s\n" msg;
  { conn with data = msg :: conn.data }

let disconnect (conn : connected connection) : closed connection =
  Printf.printf "Disconnecting from %s\n" conn.host;
  { conn with data = [] }

(* Cannot send on closed connection โ€” type error! *)
(* let bad = send (disconnect (connect "localhost" 80)) "oops" *)

let () =
  let conn = connect "api.example.com" 443 in
  let conn2 = send conn "GET / HTTP/1.1" in
  let conn3 = send conn2 "Host: api.example.com" in
  let _closed = disconnect conn3 in
  Printf.printf "Protocol followed correctly\n"