🦀 Functional Rust

181: Type-Safe SQL-Like Query Builder

Difficulty: ⭐⭐⭐ Level: Advanced

Use phantom type states to enforce that a query builder is used in the right order (`SELECT` before `FROM`, `FROM` before `WHERE`), turning protocol violations into compile-time errors.

The Problem This Solves

Builder APIs have ordering requirements. A SQL query needs `SELECT` before `FROM`. An HTTP request needs a URL before headers. A test fixture needs setup before assertions. The typical builder pattern accepts calls in any order and validates at `.build()` time, which means errors surface only at runtime, possibly in production, possibly intermittently when the code path that constructs a malformed query is hit.

The subtler issue: a runtime-validated builder forces every consumer to handle `Result<Query, BuildError>`. Half your callers know the query is valid because they're using the builder correctly, but they still have to write `.unwrap()` or propagate an error they didn't cause. The type system has let them down by treating correct usage and incorrect usage identically.

Type-state builders solve this by making each stage of the builder a distinct type. After calling `.select(...)` you hold a `Query<HasSelect, NoFrom, NoWhere>`. After `.from(...)` you hold a `Query<HasSelect, HasFrom, NoWhere>`. The `.build()` method only exists when all required stages are complete, and it returns `Query` directly, not `Result<Query, Error>`, because correctness is guaranteed.

The Intuition

A passport application form has required sections. The clerk won't stamp "received" until you've filled in name, date of birth, and nationality. Each section you complete moves you forward in a process. You can't go to the "photo attached" step without the basics done first.

In Rust, each completed stage is a type parameter. `PhantomData<(HasSelect, HasFrom, NoWhere)>` is a zero-sized triple that records exactly what you've done. The compiler reads it. A call to `.build()` type-checks only in a ready state. One way to express that is a bound such as `S: IsReady`, with `IsReady` implemented only for the states where all required fields are set. The code in this section gets the same effect without a trait: `build()` is defined only in the `impl` block for `QueryBuilder<HasSelect, HasFrom, W>`.
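The trait-bound formulation can be sketched as follows. This is a minimal illustration, not the code used in the rest of this example: the `Builder` struct, the tuple-shaped state parameter, and the helper `ready_query` are assumptions made for the sketch, and `IsReady` is implemented for exactly one state.

```rust
use std::marker::PhantomData;

// Zero-sized stage markers.
struct NoSelect;
struct HasSelect;
struct NoFrom;
struct HasFrom;

// Hypothetical marker trait: implemented only for the "everything required
// is set" state, so it can gate build().
trait IsReady {}
impl IsReady for (HasSelect, HasFrom) {}

// Assumed builder shape: one state parameter holding a tuple of markers.
struct Builder<S> {
    select: Option<String>,
    from: Option<String>,
    _state: PhantomData<S>,
}

impl Builder<(NoSelect, NoFrom)> {
    fn new() -> Self {
        Builder { select: None, from: None, _state: PhantomData }
    }

    fn select(self, cols: &str) -> Builder<(HasSelect, NoFrom)> {
        Builder { select: Some(cols.to_string()), from: self.from, _state: PhantomData }
    }
}

impl Builder<(HasSelect, NoFrom)> {
    fn from(self, table: &str) -> Builder<(HasSelect, HasFrom)> {
        Builder { select: self.select, from: Some(table.to_string()), _state: PhantomData }
    }
}

// build() is available for any state S that implements IsReady.
impl<S: IsReady> Builder<S> {
    fn build(self) -> String {
        format!(
            "SELECT {} FROM {}",
            self.select.expect("ready state implies a SELECT clause"),
            self.from.expect("ready state implies a FROM clause"),
        )
    }
}

// Hypothetical helper that walks the full chain.
fn ready_query() -> String {
    Builder::new().select("id").from("users").build()
}
```

Calling `Builder::new().build()` is rejected here because `(NoSelect, NoFrom)` does not implement `IsReady`. The bound is mainly useful when several distinct states should all be buildable, since each one only needs an extra `impl IsReady`.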

How It Works in Rust

use std::marker::PhantomData;

// Stage marker types: zero-sized
struct NoSelect;  struct HasSelect;
struct NoFrom;    struct HasFrom;
struct NoWhere;   struct HasWhere;
// HasWhere is produced by the where_() step in the full listing further down.

struct QueryBuilder<S, F, W> {
    select_clause: Option<String>,
    from_clause:   Option<String>,
    where_clause:  Option<String>,
    _state: PhantomData<(S, F, W)>,
}

// Start: no stages set
impl QueryBuilder<NoSelect, NoFrom, NoWhere> {
    fn new() -> Self {
        QueryBuilder {
            select_clause: None,
            from_clause: None,
            where_clause: None,
            _state: PhantomData,
        }
    }
}

// select() transitions NoSelect -> HasSelect
impl<F, W> QueryBuilder<NoSelect, F, W> {
    fn select(self, cols: &str) -> QueryBuilder<HasSelect, F, W> {
        QueryBuilder {
            select_clause: Some(cols.to_string()),
            from_clause: self.from_clause,
            where_clause: self.where_clause,
            _state: PhantomData,
        }
    }
}

// from() requires HasSelect (you can't have FROM without SELECT)
impl<W> QueryBuilder<HasSelect, NoFrom, W> {
    fn from(self, table: &str) -> QueryBuilder<HasSelect, HasFrom, W> {
        QueryBuilder {
            select_clause: self.select_clause,
            from_clause: Some(table.to_string()),
            where_clause: self.where_clause,
            _state: PhantomData,
        }
    }
}

// build() only exists when both SELECT and FROM are set; returns String, not Result
impl<W> QueryBuilder<HasSelect, HasFrom, W> {
    fn build(self) -> String {
        // These unwraps cannot fail: the HasSelect and HasFrom states are
        // reachable only through select() and from(), which set the fields.
        let mut sql = format!("SELECT {} FROM {}",
            self.select_clause.unwrap(),
            self.from_clause.unwrap());
        if let Some(w) = self.where_clause {
            sql.push_str(&format!(" WHERE {}", w));
        }
        sql
    }
}

fn main() {
    // Correct usage: builds cleanly with no Result
    let sql = QueryBuilder::new()
        .select("id, name")
        .from("users")
        .build(); // returns String directly
    println!("{}", sql);

    // This fails to compile:
    // QueryBuilder::new().from("users").build();
    //                     ^^^^ error: no method named `from` found for
    //                          `QueryBuilder<NoSelect, NoFrom, NoWhere>`
}
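To make the transitions visible, each intermediate value can be bound with an explicit type annotation. The sketch below is a trimmed copy of the builder with only the SELECT and FROM markers (the WHERE parameter is dropped for brevity), and `staged` is a hypothetical helper name.

```rust
use std::marker::PhantomData;

// Stage markers (WHERE omitted to keep the sketch short).
struct NoSelect;
struct HasSelect;
struct NoFrom;
struct HasFrom;

struct QueryBuilder<S, F> {
    select_clause: Option<String>,
    from_clause: Option<String>,
    _state: PhantomData<(S, F)>,
}

impl QueryBuilder<NoSelect, NoFrom> {
    fn new() -> Self {
        QueryBuilder { select_clause: None, from_clause: None, _state: PhantomData }
    }
}

impl<F> QueryBuilder<NoSelect, F> {
    fn select(self, cols: &str) -> QueryBuilder<HasSelect, F> {
        QueryBuilder {
            select_clause: Some(cols.to_string()),
            from_clause: self.from_clause,
            _state: PhantomData,
        }
    }
}

impl QueryBuilder<HasSelect, NoFrom> {
    fn from(self, table: &str) -> QueryBuilder<HasSelect, HasFrom> {
        QueryBuilder {
            select_clause: self.select_clause,
            from_clause: Some(table.to_string()),
            _state: PhantomData,
        }
    }
}

impl QueryBuilder<HasSelect, HasFrom> {
    fn build(self) -> String {
        format!(
            "SELECT {} FROM {}",
            self.select_clause.expect("HasSelect implies a SELECT clause"),
            self.from_clause.expect("HasFrom implies a FROM clause"),
        )
    }
}

// Hypothetical helper: every binding spells out the state it is in.
fn staged() -> String {
    let empty: QueryBuilder<NoSelect, NoFrom> = QueryBuilder::new();
    let selected: QueryBuilder<HasSelect, NoFrom> = empty.select("id, name");
    let ready: QueryBuilder<HasSelect, HasFrom> = selected.from("users");
    // Annotating `selected` as QueryBuilder<HasSelect, HasFrom> would be a
    // mismatched-types error: select() leaves the FROM marker unchanged.
    ready.build()
}
```

In normal use the annotations are inferred. Writing them out is mostly helpful when a partially built query is stored in a variable or passed to another function.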

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| State encoding | Phantom type parameters on a record, e.g. `type ('s, 'f, 'w) builder` | `PhantomData<(S, F, W)>` triple, or separate marker traits |
| Method gating | Functions accept only specific phantom type combinations | `impl` blocks scoped to specific type parameter combinations |
| Style | Pipeline `\|>` with typed intermediate values | Method chaining; each call returns a new typed builder |
| Error quality | "Type mismatch" with phantom types in error message | "method not found in `QueryBuilder<NoSelect, ...>`", very readable |
| Zero-cost | Yes | Yes; `PhantomData` is erased entirely at runtime |

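The zero-cost row can be checked with `std::mem::size_of`. The following is a small sketch under assumptions: `Plain`, `Tagged`, and `sizes` are hypothetical names, and the comparison between the plain and tagged structs reflects how current compilers lay out a zero-sized field rather than a documented layout guarantee.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Zero-sized stage markers.
struct NoSelect;
struct HasSelect;

// Hypothetical builder without any state tracking.
struct Plain {
    select_clause: Option<String>,
}

// The same data, tagged with a phantom state parameter.
struct Tagged<S> {
    select_clause: Option<String>,
    _state: PhantomData<S>,
}

// Returns (marker size, PhantomData size, plain size, tagged sizes in two states).
fn sizes() -> (usize, usize, usize, usize, usize) {
    (
        size_of::<NoSelect>(),
        size_of::<PhantomData<(NoSelect, HasSelect)>>(),
        size_of::<Plain>(),
        size_of::<Tagged<NoSelect>>(),
        size_of::<Tagged<HasSelect>>(),
    )
}
```

The markers and the `PhantomData` field occupy no bytes, so the state parameter changes which methods are available without changing what is stored.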
// Example 181: Type-Safe SQL-like Query Builder
// Enforce SELECT before WHERE at compile time using phantom types

use std::marker::PhantomData;

// === Approach 1: Type-state builder ===

struct NoSelect;
struct HasSelect;
struct NoFrom;
struct HasFrom;
struct NoWhere;
struct HasWhere;

struct Query<S, F, W> {
    select: Option<String>,
    from: Option<String>,
    where_: Option<String>,
    order_by: Option<String>,
    _s: PhantomData<(S, F, W)>,
}

impl Query<NoSelect, NoFrom, NoWhere> {
    fn new() -> Self {
        Query {
            select: None, from: None, where_: None, order_by: None,
            _s: PhantomData,
        }
    }
}

impl<F, W> Query<NoSelect, F, W> {
    fn select(self, cols: &str) -> Query<HasSelect, F, W> {
        Query {
            select: Some(cols.to_string()),
            from: self.from, where_: self.where_, order_by: self.order_by,
            _s: PhantomData,
        }
    }
}

impl<W> Query<HasSelect, NoFrom, W> {
    fn from(self, table: &str) -> Query<HasSelect, HasFrom, W> {
        Query {
            select: self.select,
            from: Some(table.to_string()),
            where_: self.where_, order_by: self.order_by,
            _s: PhantomData,
        }
    }
}

impl Query<HasSelect, HasFrom, NoWhere> {
    fn where_(self, cond: &str) -> Query<HasSelect, HasFrom, HasWhere> {
        Query {
            select: self.select, from: self.from,
            where_: Some(cond.to_string()),
            order_by: self.order_by,
            _s: PhantomData,
        }
    }
}

impl<W> Query<HasSelect, HasFrom, W> {
    fn order_by(mut self, col: &str) -> Self {
        self.order_by = Some(col.to_string());
        self
    }

    fn build(&self) -> String {
        let mut sql = format!("SELECT {} FROM {}",
            self.select.as_ref().unwrap(),
            self.from.as_ref().unwrap());
        if let Some(w) = &self.where_ {
            sql.push_str(&format!(" WHERE {}", w));
        }
        if let Some(o) = &self.order_by {
            sql.push_str(&format!(" ORDER BY {}", o));
        }
        sql
    }
}

// === Approach 2: Trait-based builder with marker traits ===

trait BuilderState {}
trait CanAddFrom: BuilderState {}
trait CanAddWhere: BuilderState {}
trait CanBuild: BuilderState {}

struct Selected;
struct FromAdded;
struct WhereAdded;

impl BuilderState for Selected {}
impl BuilderState for FromAdded {}
impl BuilderState for WhereAdded {}
impl CanAddFrom for Selected {}
impl CanAddWhere for FromAdded {}
impl CanBuild for FromAdded {}
impl CanBuild for WhereAdded {}

struct QueryBuilder<S: BuilderState> {
    parts: Vec<String>,
    _state: PhantomData<S>,
}

impl QueryBuilder<Selected> {
    fn select(cols: &str) -> Self {
        QueryBuilder {
            parts: vec![format!("SELECT {}", cols)],
            _state: PhantomData,
        }
    }
}

impl<S: CanAddFrom> QueryBuilder<S> {
    fn from(mut self, table: &str) -> QueryBuilder<FromAdded> {
        self.parts.push(format!("FROM {}", table));
        QueryBuilder { parts: self.parts, _state: PhantomData }
    }
}

impl<S: CanAddWhere> QueryBuilder<S> {
    fn where_clause(mut self, cond: &str) -> QueryBuilder<WhereAdded> {
        self.parts.push(format!("WHERE {}", cond));
        QueryBuilder { parts: self.parts, _state: PhantomData }
    }
}

impl<S: CanBuild> QueryBuilder<S> {
    fn build(&self) -> String {
        self.parts.join(" ")
    }
}

// === Approach 3: Runtime builder for comparison ===

#[derive(Default)]
struct FluentQuery {
    select: Option<String>,
    from: Option<String>,
    where_: Option<String>,
}

impl FluentQuery {
    fn select(mut self, cols: &str) -> Self { self.select = Some(cols.into()); self }
    fn from(mut self, table: &str) -> Self { self.from = Some(table.into()); self }
    fn where_(mut self, cond: &str) -> Self { self.where_ = Some(cond.into()); self }

    fn build(&self) -> Result<String, &'static str> {
        match (&self.select, &self.from) {
            (Some(s), Some(f)) => {
                let mut sql = format!("SELECT {} FROM {}", s, f);
                if let Some(w) = &self.where_ { sql.push_str(&format!(" WHERE {}", w)); }
                Ok(sql)
            }
            _ => Err("SELECT and FROM are required"),
        }
    }
}

fn main() {
    // Approach 1: Type-state
    let sql = Query::new()
        .select("*")
        .from("users")
        .where_("age > 18")
        .build();
    println!("{}", sql);

    // This won't compile:
    // Query::new().from("users");  // No from() on NoSelect
    // Query::new().select("*").where_("x=1");  // No where_() on NoFrom

    // Approach 2: Trait-based
    let sql2 = QueryBuilder::select("name")
        .from("products")
        .where_clause("price > 10")
        .build();
    println!("{}", sql2);

    // Approach 3: Runtime
    let sql3 = FluentQuery::default()
        .select("*").from("orders").where_("total > 100")
        .build().unwrap();
    println!("{}", sql3);

    println!("✓ All examples running");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_type_state_basic() {
        let sql = Query::new().select("*").from("users").build();
        assert_eq!(sql, "SELECT * FROM users");
    }

    #[test]
    fn test_type_state_where() {
        let sql = Query::new().select("name").from("users").where_("age > 18").build();
        assert_eq!(sql, "SELECT name FROM users WHERE age > 18");
    }

    #[test]
    fn test_type_state_order() {
        let sql = Query::new().select("*").from("users").order_by("name").build();
        assert_eq!(sql, "SELECT * FROM users ORDER BY name");
    }

    #[test]
    fn test_trait_builder() {
        let sql = QueryBuilder::select("*").from("t").build();
        assert_eq!(sql, "SELECT * FROM t");
    }

    #[test]
    fn test_trait_builder_where() {
        let sql = QueryBuilder::select("a").from("b").where_clause("c=1").build();
        assert_eq!(sql, "SELECT a FROM b WHERE c=1");
    }

    #[test]
    fn test_fluent_ok() {
        let r = FluentQuery::default().select("*").from("t").build();
        assert!(r.is_ok());
    }

    #[test]
    fn test_fluent_missing() {
        let r = FluentQuery::default().build();
        assert!(r.is_err());
    }
}
(* Example 181: Type-Safe SQL-like Query Builder *)
(* Enforce SELECT before FROM, FROM before WHERE at the type level *)

(* Approach 1: Phantom-type query builder *)
type empty_q
type has_select
type has_from
type has_where

type ('select, 'from, 'where_) query = {
  select_clause: string option;
  from_clause: string option;
  where_clause: string option;
}

let empty_query : (empty_q, empty_q, empty_q) query =
  { select_clause = None; from_clause = None; where_clause = None }

(* select requires nothing *)
let select cols (q : (empty_q, 'f, 'w) query) : (has_select, 'f, 'w) query =
  { q with select_clause = Some cols }

(* from requires select *)
let from table (q : (has_select, empty_q, 'w) query) : (has_select, has_from, 'w) query =
  { q with from_clause = Some table }

(* where_ requires from *)
let where_ cond (q : (has_select, has_from, empty_q) query) : (has_select, has_from, has_where) query =
  { q with where_clause = Some cond }

let to_sql (q : (has_select, has_from, _) query) : string =
  let base = Printf.sprintf "SELECT %s FROM %s"
    (Option.get q.select_clause)
    (Option.get q.from_clause) in
  match q.where_clause with
  | None -> base
  | Some w -> base ^ " WHERE " ^ w

(* Approach 2: Functor-based builder *)
module type BUILDER_STATE = sig
  type select_state
  type from_state
end

module Query (S : BUILDER_STATE) = struct
  type t = { select: string; from_: string; where_: string option }
end

(* Approach 3: Simple fluent builder with runtime checks for comparison *)
module FluentQuery = struct
  type t = {
    select: string option;
    from_: string option;
    where_: string option;
    order_by: string option;
  }

  let create () = { select = None; from_ = None; where_ = None; order_by = None }
  let select cols q = { q with select = Some cols }
  let from_ table q = { q with from_ = Some table }
  let where_ cond q = { q with where_ = Some cond }
  let order_by col q = { q with order_by = Some col }

  let build q =
    match q.select, q.from_ with
    | Some s, Some f ->
      let base = Printf.sprintf "SELECT %s FROM %s" s f in
      let base = match q.where_ with Some w -> base ^ " WHERE " ^ w | None -> base in
      let base = match q.order_by with Some o -> base ^ " ORDER BY " ^ o | None -> base in
      Ok base
    | _ -> Error "SELECT and FROM are required"
end

let () =
  (* Test Approach 1 *)
  let q = empty_query |> select "*" |> from "users" in
  assert (to_sql q = "SELECT * FROM users");

  let q2 = empty_query |> select "name, age" |> from "users" |> where_ "age > 18" in
  assert (to_sql q2 = "SELECT name, age FROM users WHERE age > 18");

  (* This would NOT compile:
     let bad = empty_query |> from "users"  -- needs select first
     let bad = empty_query |> select "*" |> where_ "x=1"  -- needs from first
  *)

  (* Test Approach 3 *)
  let q3 = FluentQuery.(create () |> select "*" |> from_ "products" |> where_ "price > 10" |> order_by "name") in
  (match FluentQuery.build q3 with
   | Ok sql -> assert (sql = "SELECT * FROM products WHERE price > 10 ORDER BY name")
   | Error _ -> assert false);

  (match FluentQuery.(build (create ())) with
   | Ok _ -> assert false
   | Error _ -> ());

  print_endline "✓ All tests passed"

📊 Detailed Comparison

Comparison: Example 181, Type-Safe Query Builder

State-Tracked Builder

OCaml

let select cols (q : (empty_q, 'f, 'w) query) : (has_select, 'f, 'w) query =
  { q with select_clause = Some cols }

let from table (q : (has_select, empty_q, 'w) query) : (has_select, has_from, 'w) query =
  { q with from_clause = Some table }

let where_ cond (q : (has_select, has_from, empty_q) query) : (has_select, has_from, has_where) query =
  { q with where_clause = Some cond }

(* Usage *)
let sql = empty_query |> select "*" |> from "users" |> where_ "age > 18" |> to_sql

Rust

impl<F, W> Query<NoSelect, F, W> {
    fn select(self, cols: &str) -> Query<HasSelect, F, W> { /* ... */ }
}
impl<W> Query<HasSelect, NoFrom, W> {
    fn from(self, table: &str) -> Query<HasSelect, HasFrom, W> { /* ... */ }
}
impl Query<HasSelect, HasFrom, NoWhere> {
    fn where_(self, cond: &str) -> Query<HasSelect, HasFrom, HasWhere> { /* ... */ }
}

// Usage
let sql = Query::new().select("*").from("users").where_("age > 18").build();

Compile-Time Error

OCaml

(* Won't compile: from needs has_select *)
let _ = empty_query |> from "users"
(* Error: This expression has type (empty_q, empty_q, empty_q) query
but expected (has_select, empty_q, 'w) query *)

Rust

// Won't compile: no from() on NoSelect
Query::new().from("users");
// Error: no method named `from` found for `Query<NoSelect, NoFrom, NoWhere>`