🦀 Functional Rust

762: Custom Deserialization with Visitor Pattern

Difficulty: 4 Level: Advanced

Implement the serde Visitor pattern from scratch to understand how `Deserialize` and type-driven dispatch work.

The Problem This Solves

Deserialization is harder than serialization: you're converting untyped wire data into a strongly-typed value, and different types expect different input shapes. A `Person` expects a map; a `u32` expects an integer; a `bool` expects true/false. The Visitor pattern is how `serde` solves this: the type being deserialized describes what it expects, and the deserializer drives it through a type-specific visitor.

When `#[derive(Deserialize)]` falls short (non-standard wire formats, complex validation during parsing, types from other crates you can't annotate), you implement `Deserialize` manually. Understanding the visitor pattern is the key to doing this correctly. It also explains why serde's error messages mention "expected a map" or "expected a string": that's the `Visitor::expecting()` method at work.

This example implements the core of serde's machinery from scratch: a `Token` type (what the deserializer emits), a `Visitor` trait (what the type being deserialized expects), a `SimpleDeserializer` (drives the visitor), and a concrete `PersonVisitor` that consumes a map token and constructs a `Person`.

The Intuition

The visitor pattern inverts the usual control flow. Instead of the caller saying "give me a map", the deserializer says "I have a map, so I'll call `visit_map`". The visitor overrides only the methods for the types it can accept; the trait's default methods return `Err(InvalidType)` for everything else. This lets the same `Deserialize` impl work across different deserializers (JSON, TOML, binary): they all call the same visitor methods with different underlying data.
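To make the "same visitor, different deserializers" point concrete, here is a minimal sketch. Everything in it is hypothetical and trimmed down for illustration: the lifetime-free `Visitor` trait, the `Syntax` error variant, and the two toy formats (`drive_text` for a text encoding, `drive_binary` for a tag-byte encoding) are assumptions, not part of serde or of the listing further down.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub enum DeError {
    InvalidType { got: &'static str, expected: &'static str },
    Syntax, // hypothetical: the input is malformed for this toy format
}

// Simplified visitor (no lifetime parameter, only two input shapes).
pub trait Visitor: Sized {
    type Value;
    fn expecting(&self) -> &'static str;

    // Defaults reject the input; a visitor overrides only what it accepts.
    fn visit_bool(self, _v: bool) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "bool", expected: self.expecting() })
    }
    fn visit_i64(self, _v: i64) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "i64", expected: self.expecting() })
    }
}

// One visitor, written once: it only knows how to build a bool.
pub struct BoolVisitor;
impl Visitor for BoolVisitor {
    type Value = bool;
    fn expecting(&self) -> &'static str { "a boolean" }
    fn visit_bool(self, v: bool) -> Result<bool, DeError> { Ok(v) }
}

// Toy text format: "true" / "false" are booleans, anything else must be an integer.
pub fn drive_text<V: Visitor>(input: &str, visitor: V) -> Result<V::Value, DeError> {
    match input {
        "true" => visitor.visit_bool(true),
        "false" => visitor.visit_bool(false),
        other => {
            let n = other.parse::<i64>().map_err(|_| DeError::Syntax)?;
            visitor.visit_i64(n)
        }
    }
}

// Toy binary format: tag 0x01 + one byte = bool, tag 0x02 + 8 little-endian bytes = i64.
pub fn drive_binary<V: Visitor>(input: &[u8], visitor: V) -> Result<V::Value, DeError> {
    match input {
        [0x01, flag] => visitor.visit_bool(*flag != 0),
        [0x02, rest @ ..] => {
            let bytes = <[u8; 8]>::try_from(rest).map_err(|_| DeError::Syntax)?;
            visitor.visit_i64(i64::from_le_bytes(bytes))
        }
        _ => Err(DeError::Syntax),
    }
}
```

Under these assumptions, `drive_text("true", BoolVisitor)` and `drive_binary(&[0x01, 0x01], BoolVisitor)` both yield `Ok(true)`, while `drive_text("42", BoolVisitor)` falls through to the default `visit_i64` and is rejected with `InvalidType { got: "i64", expected: "a boolean" }`. The visitor never learns which format the data came from.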

How It Works in Rust

// The Visitor trait: a type says what it expects
pub trait Visitor<'de>: Sized {
    type Value;
    fn expecting(&self) -> &'static str;  // human-readable type description

    // Default: return an InvalidType error; override the ones you accept.
    // Parameters are underscore-prefixed because the defaults ignore them.
    fn visit_str(self, _v: &'de str) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "str", expected: self.expecting() })
    }
    fn visit_map(self, _m: Vec<(&'de str, &'de str)>) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "map", expected: self.expecting() })
    }
    // ... visit_i64, visit_f64, visit_bool ...
}

// Concrete visitor for Person: accepts only a map
pub struct PersonVisitor;
impl<'de> Visitor<'de> for PersonVisitor {
    type Value = Person;
    fn expecting(&self) -> &'static str { "a map with name and age" }

    fn visit_map(self, m: Vec<(&'de str, &'de str)>) -> Result<Person, DeError> {
        let name = m.iter().find(|(k, _)| *k == "name")
            .map(|(_, v)| v.to_string())
            .ok_or(DeError::MissingField("name"))?;
        // Look the field up first, then parse it, so that a non-numeric age
        // is reported as a ParseError rather than as a missing field.
        let age_str = m.iter().find(|(k, _)| *k == "age")
            .map(|(_, v)| *v)
            .ok_or(DeError::MissingField("age"))?;
        let age = age_str.parse::<u32>().map_err(|e| DeError::ParseError(e.to_string()))?;
        Ok(Person { name, age })
    }
}

// The Deserialize trait delegates to the visitor
impl<'de> Deserialize<'de> for Person {
    fn deserialize(de: SimpleDeserializer<'de>) -> Result<Self, DeError> {
        de.deserialize_any(PersonVisitor)  // "drive this visitor with my data"
    }
}
The `'de` lifetime is the key: it ties the visitor's output lifetime to the input data, enabling zero-copy deserialization (borrowed `&str` slices from the wire data without allocation). Even when a visitor returns owned `String` values, as `PersonVisitor` does, `'de` still appears in the impl because the `Visitor<'de>` trait and its `visit_*` signatures are parameterized by it.
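The difference between borrowing and owning visitors can be sketched as follows. This is an illustrative reduction, not the listing's actual trait: the one-method `Visitor<'de>`, the `BorrowedStr` and `OwnedString` visitors, the `deserialize_str` driver, and the `borrows_from` pointer check are all hypothetical names introduced here, and errors are plain `String`s to keep it short.

```rust
// Trimmed-down trait: only the string case is shown.
pub trait Visitor<'de>: Sized {
    type Value;
    fn visit_str(self, v: &'de str) -> Result<Self::Value, String>;
}

// Borrowing visitor: the output is a slice of the input, so nothing is allocated.
// The `'de` in `Value` is what lets the result point back into the wire data.
pub struct BorrowedStr;
impl<'de> Visitor<'de> for BorrowedStr {
    type Value = &'de str;
    fn visit_str(self, v: &'de str) -> Result<&'de str, String> { Ok(v) }
}

// Owning visitor: copies the bytes, so the result may outlive the input.
pub struct OwnedString;
impl<'de> Visitor<'de> for OwnedString {
    type Value = String;
    fn visit_str(self, v: &'de str) -> Result<String, String> { Ok(v.to_owned()) }
}

// Driver for the "str:" prefix used by this example's wire format.
pub fn deserialize_str<'de, V: Visitor<'de>>(
    wire: &'de str,
    visitor: V,
) -> Result<V::Value, String> {
    let payload = wire
        .strip_prefix("str:")
        .ok_or_else(|| format!("expected a string, got {wire:?}"))?;
    visitor.visit_str(payload)
}

// Helper for demonstration: true when `inner` lies inside the memory of `outer`.
pub fn borrows_from(outer: &str, inner: &str) -> bool {
    let start = outer.as_ptr() as usize;
    let p = inner.as_ptr() as usize;
    p >= start && p + inner.len() <= start + outer.len()
}
```

With a wire string `"str:hello"`, the `BorrowedStr` result is a `&str` that points into the original buffer (so `borrows_from` returns `true` and the compiler will not let it outlive the input), whereas the `OwnedString` result is a separate heap allocation with the same contents.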

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| Visitor pattern | Typically via first-class modules | Trait with default methods for each token type |
| Lifetime in deserialization | Garbage collected; no lifetime tracking | `'de` ties output borrows to input data lifetime |
| Error propagation | Exceptions or `Result` | `?` operator throughout; clean early-return style |
| Type-driven dispatch | GADT or polymorphic variants | Trait objects / monomorphization via generics |
// 762. Custom Deserialization with Visitor Pattern
// Implements the core serde Visitor mechanism from scratch

// ── Error ──────────────────────────────────────────────────────────────────

#[derive(Debug)]
pub enum DeError {
    InvalidType { got: &'static str, expected: &'static str },
    MissingField(&'static str),
    ParseError(String),
    Custom(String),
}

impl std::fmt::Display for DeError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::InvalidType { got, expected } => write!(f, "expected {expected}, got {got}"),
            Self::MissingField(n) => write!(f, "missing field `{n}`"),
            Self::ParseError(s) => write!(f, "parse error: {s}"),
            Self::Custom(s) => write!(f, "{s}"),
        }
    }
}

// ── Token (what the deserializer produces) ─────────────────────────────────

#[derive(Debug, Clone)]
pub enum Token<'a> {
    Str(&'a str),
    Int(i64),
    Float(f64),
    Bool(bool),
    Map(Vec<(&'a str, &'a str)>),
    Seq(Vec<&'a str>),
}

// ── Visitor trait ──────────────────────────────────────────────────────────

pub trait Visitor<'de>: Sized {
    type Value;
    fn expecting(&self) -> &'static str;

    // Defaults reject the input; a visitor overrides only the methods it accepts.
    // Parameters are underscore-prefixed because the defaults ignore them.
    fn visit_str(self, _v: &'de str) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "str", expected: self.expecting() })
    }
    fn visit_i64(self, _v: i64) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "i64", expected: self.expecting() })
    }
    fn visit_f64(self, _v: f64) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "f64", expected: self.expecting() })
    }
    fn visit_bool(self, _v: bool) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "bool", expected: self.expecting() })
    }
    fn visit_map(self, _m: Vec<(&'de str, &'de str)>) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "map", expected: self.expecting() })
    }
}

// ── Deserializer ───────────────────────────────────────────────────────────

pub struct SimpleDeserializer<'de>(&'de str);

impl<'de> SimpleDeserializer<'de> {
    pub fn new(s: &'de str) -> Self { Self(s) }

    pub fn deserialize_any<V: Visitor<'de>>(self, visitor: V) -> Result<V::Value, DeError> {
        let s = self.0;
        if let Some(rest) = s.strip_prefix("str:") {
            visitor.visit_str(rest)
        } else if let Some(rest) = s.strip_prefix("int:") {
            let n = rest.parse::<i64>().map_err(|e| DeError::ParseError(e.to_string()))?;
            visitor.visit_i64(n)
        } else if let Some(rest) = s.strip_prefix("float:") {
            let f = rest.parse::<f64>().map_err(|e| DeError::ParseError(e.to_string()))?;
            visitor.visit_f64(f)
        } else if s == "true" {
            visitor.visit_bool(true)
        } else if s == "false" {
            visitor.visit_bool(false)
        } else if let Some(rest) = s.strip_prefix("map:") {
            let pairs: Vec<(&'de str, &'de str)> = rest
                .split(',')
                .filter_map(|p| {
                    let mut it = p.splitn(2, '=');
                    Some((it.next()?, it.next()?))
                })
                .collect();
            visitor.visit_map(pairs)
        } else {
            visitor.visit_str(s) // fallback
        }
    }
}

// ── Domain type and its Visitor ────────────────────────────────────────────

#[derive(Debug, PartialEq)]
pub struct Person {
    pub name: String,
    pub age: u32,
}

pub struct PersonVisitor;

impl<'de> Visitor<'de> for PersonVisitor {
    type Value = Person;
    fn expecting(&self) -> &'static str { "a map with name and age" }

    fn visit_map(self, m: Vec<(&'de str, &'de str)>) -> Result<Person, DeError> {
        let name = m.iter().find(|(k, _)| *k == "name")
            .map(|(_, v)| v.to_string())
            .ok_or(DeError::MissingField("name"))?;
        let age_str = m.iter().find(|(k, _)| *k == "age")
            .map(|(_, v)| *v)
            .ok_or(DeError::MissingField("age"))?;
        let age = age_str.parse::<u32>().map_err(|e| DeError::ParseError(e.to_string()))?;
        Ok(Person { name, age })
    }
}

pub trait Deserialize<'de>: Sized {
    fn deserialize(de: SimpleDeserializer<'de>) -> Result<Self, DeError>;
}

impl<'de> Deserialize<'de> for Person {
    fn deserialize(de: SimpleDeserializer<'de>) -> Result<Self, DeError> {
        de.deserialize_any(PersonVisitor)
    }
}

fn main() {
    let wire = "map:name=Alice,age=30";
    let de = SimpleDeserializer::new(wire);
    match Person::deserialize(de) {
        Ok(p)  => println!("Got: {p:?}"),
        Err(e) => println!("Error: {e}"),
    }

    // Wrong type: should error with a helpful message
    let wrong = SimpleDeserializer::new("str:hello");
    match Person::deserialize(wrong) {
        Ok(_)  => println!("unexpected ok"),
        Err(e) => println!("Expected error: {e}"),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parse_person_from_map() {
        let de = SimpleDeserializer::new("map:name=Bob,age=25");
        let p = Person::deserialize(de).unwrap();
        assert_eq!(p, Person { name: "Bob".into(), age: 25 });
    }

    #[test]
    fn wrong_type_returns_error() {
        let de = SimpleDeserializer::new("str:notamap");
        let result = Person::deserialize(de);
        assert!(matches!(result, Err(DeError::InvalidType { .. })));
    }

    #[test]
    fn missing_age_returns_error() {
        let de = SimpleDeserializer::new("map:name=Alice");
        let result = Person::deserialize(de);
        assert!(matches!(result, Err(DeError::MissingField("age"))));
    }
}
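The `Custom` variant of `DeError` in the listing above is never constructed by `PersonVisitor`. One plausible use for it is validation inside a visitor, which the problem statement names as a reason to write `Deserialize` by hand. The sketch below is an assumption-laden illustration: the `Age` newtype, `AgeVisitor`, the 0 to 150 bound, and the `drive_int` / `drive_bool` stand-ins for a deserializer are all hypothetical, and the trait is reduced to two methods without the `'de` parameter.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
pub enum DeError {
    InvalidType { got: &'static str, expected: &'static str },
    Custom(String),
}

// Reduced visitor trait: two input shapes, defaults reject everything.
pub trait Visitor: Sized {
    type Value;
    fn expecting(&self) -> &'static str;
    fn visit_i64(self, _v: i64) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "i64", expected: self.expecting() })
    }
    fn visit_bool(self, _v: bool) -> Result<Self::Value, DeError> {
        Err(DeError::InvalidType { got: "bool", expected: self.expecting() })
    }
}

// Hypothetical newtype whose values are meant to come out of validation.
#[derive(Debug, PartialEq)]
pub struct Age(u8);

impl Age {
    pub fn get(&self) -> u8 { self.0 }
}

pub struct AgeVisitor;
impl Visitor for AgeVisitor {
    type Value = Age;
    fn expecting(&self) -> &'static str { "an integer between 0 and 150" }

    // Right shape (an integer), but the value itself is still checked.
    // The 0..=150 bound is an arbitrary choice for illustration.
    fn visit_i64(self, v: i64) -> Result<Age, DeError> {
        match u8::try_from(v) {
            Ok(n) if n <= 150 => Ok(Age(n)),
            _ => Err(DeError::Custom(format!(
                "invalid age {v}: expected {}",
                self.expecting()
            ))),
        }
    }
}

// Stand-ins for a deserializer that has already recognised a token.
pub fn drive_int<V: Visitor>(n: i64, visitor: V) -> Result<V::Value, DeError> {
    visitor.visit_i64(n)
}
pub fn drive_bool<V: Visitor>(b: bool, visitor: V) -> Result<V::Value, DeError> {
    visitor.visit_bool(b)
}
```

This separates two kinds of failure: a wrong input shape (a bool where an integer is expected) is reported through the default method as `InvalidType`, while a well-shaped but unacceptable value (200, or -1) is reported as `Custom` with a message built from `expecting()`.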
(* Custom deserialization with visitor pattern in OCaml
   We implement a visitor-style callback interface *)

(* The "deserializer" calls into a visitor *)
type 'a visit_result = Ok of 'a | Err of string

(* Visitor module type: the type being built provides this *)
module type VISITOR = sig
  type output
  val visit_string : string -> output visit_result
  val visit_int    : int    -> output visit_result
  val visit_float  : float  -> output visit_result
  val visit_bool   : bool   -> output visit_result
  val visit_seq    : string list -> output visit_result
  val visit_map    : (string * string) list -> output visit_result
  val expecting    : string  (* human-readable description *)
end

(* Deserializer drives the process *)
module Deserializer = struct
  type token =
    | TString of string
    | TInt    of int
    | TFloat  of float
    | TBool   of bool
    | TMap    of (string * string) list

  (* Parse a simple wire format into a token *)
  let parse s =
    if String.length s > 4 && String.sub s 0 4 = "str:" then
      TString (String.sub s 4 (String.length s - 4))
    else if String.length s > 4 && String.sub s 0 4 = "int:" then
      TInt (int_of_string (String.sub s 4 (String.length s - 4)))
    else if String.length s > 6 && String.sub s 0 6 = "float:" then
      TFloat (float_of_string (String.sub s 6 (String.length s - 6)))
    else if s = "true" then TBool true
    else if s = "false" then TBool false
    else if String.length s > 4 && String.sub s 0 4 = "map:" then
      let rest = String.sub s 4 (String.length s - 4) in
      let pairs =
        String.split_on_char ',' rest
        |> List.filter_map (fun p ->
          match String.split_on_char '=' p with
          | [k; v] -> Some (k, v)
          | _ -> None)
      in
      TMap pairs
    else TString s   (* fallback *)

  let drive (type a) (module V : VISITOR with type output = a) token =
    match token with
    | TString s -> V.visit_string s
    | TInt    i -> V.visit_int i
    | TFloat  f -> V.visit_float f
    | TBool   b -> V.visit_bool b
    | TMap    m -> V.visit_map m
end

(* ---------- Domain type ---------- *)
type person = { name: string; age: int }

module PersonVisitor : VISITOR with type output = person = struct
  type output = person
  let expecting = "a map with name and age"
  let visit_string _ = Err "expected map, got string"
  let visit_int    _ = Err "expected map, got int"
  let visit_float  _ = Err "expected map, got float"
  let visit_bool   _ = Err "expected map, got bool"
  let visit_seq    _ = Err "expected map, got seq"
  let visit_map pairs =
    match List.assoc_opt "name" pairs, List.assoc_opt "age" pairs with
    | Some name, Some age_s ->
      (try Ok { name; age = int_of_string age_s }
       with Failure _ -> Err "age is not an int")
    | _ -> Err ("missing field; expecting: " ^ expecting)
end

let () =
  let wire = "map:name=Alice,age=30" in
  let token = Deserializer.parse wire in
  match Deserializer.drive (module PersonVisitor) token with
  | Ok p  -> Printf.printf "Got person: %s, age %d\n" p.name p.age
  | Err e -> Printf.printf "Error: %s\n" e