🦀 Functional Rust

760: Derive-Based Serialization: How derive(Serialize) Works

Difficulty: 3
Level: Intermediate

Understand what `#[derive(Serialize, Deserialize)]` generates by manually writing the equivalent code.

The Problem This Solves

`#[derive(Serialize, Deserialize)]` from the `serde` crate is magic to many Rust developers: it "just works" for most structs, but when it doesn't (custom formats, enums with complex shapes, backwards compatibility), you need to implement the traits manually. Before you can do that, you need to understand what the derive macro generates.

This example manually implements the Serialize/Deserialize trait pattern. It does not use `serde`'s actual traits (which require the crate); instead it uses a simplified analog that captures the same conceptual structure: a `Serialize` trait that emits fields into an output, and a `Deserialize` trait that reconstructs a value from a map of fields. Understanding this makes the real `serde` machinery approachable.

The key insight is that `#[derive(Serialize)]` generates a `Serialize` implementation that calls the serializer's `serialize_struct` method once and then `serialize_field` for each field by name. There's no runtime reflection: the field names and types are baked in at compile time by the proc macro.
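As a rough illustration of the backwards-compatibility case, the sketch below uses a simplified `Deserialize` analog (not serde's real trait) to read a record that gained a field in a later version. The `Rgba` type, its `a` field, and the `fields` helper are hypothetical names introduced only for this illustration.

```rust
use std::collections::HashMap;

// Simplified analog of a Deserialize trait (not serde's real trait).
pub trait Deserialize: Sized {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self>;
}

// Hypothetical type: version 1 stored only r, g, b; version 2 added `a`.
#[derive(Debug, PartialEq)]
pub struct Rgba {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

// Hand-written impl: a mechanical field-by-field expansion would require
// every key, so old data without `a` would be rejected. Here a missing `a`
// falls back to fully opaque (255).
impl Deserialize for Rgba {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self> {
        let a: u8 = match map.get("a") {
            Some(raw) => raw.parse().ok()?, // present but malformed: fail
            None => 255,                    // absent: default for old data
        };
        Some(Rgba {
            r: map.get("r")?.parse().ok()?,
            g: map.get("g")?.parse().ok()?,
            b: map.get("b")?.parse().ok()?,
            a,
        })
    }
}

// Hypothetical helper: build a field map from literal pairs.
pub fn fields(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

Calling `Rgba::deserialize_fields(&fields(&[("r", "1"), ("g", "2"), ("b", "3")]))` yields a value with `a` set to 255. Real `serde` covers this common case with the `#[serde(default)]` field attribute; a manual impl is needed once the fallback logic gets more involved than a default value.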

The Intuition

`Serialize` is a trait with one method: emit your fields. `Deserialize` is a trait with one method: reconstruct yourself from field data. `#[derive(Serialize)]` is a proc macro that reads your struct definition and writes the `impl Serialize` block that lists every field by name. At runtime, there's no reflection, just a sequence of `insert("field_name", value.to_string())` calls generated at compile time.
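To make "reads your struct definition and writes the impl block" concrete, here is a minimal sketch that uses a declarative `macro_rules!` macro as a stand-in for the proc macro. Real derives are procedural macros that operate on token streams; the macro name `serializable_struct!` and the `Point` type are hypothetical, and the trait is a simplified analog rather than serde's.

```rust
use std::collections::HashMap;

// Simplified analog of serde::Serialize.
pub trait Serialize {
    fn serialize_fields(&self, out: &mut HashMap<String, String>);
}

// Hypothetical stand-in for #[derive(Serialize)]: it receives the struct
// definition as tokens, re-emits it (with public fields), and writes the
// impl block alongside it.
macro_rules! serializable_struct {
    (struct $name:ident { $($field:ident : $ty:ty),* $(,)? }) => {
        #[derive(Debug, PartialEq)]
        pub struct $name { $(pub $field: $ty),* }

        impl Serialize for $name {
            fn serialize_fields(&self, out: &mut HashMap<String, String>) {
                // One insert per field; stringify! turns the field
                // identifier into a string literal at compile time.
                $(
                    out.insert(
                        stringify!($field).to_string(),
                        self.$field.to_string(),
                    );
                )*
            }
        }
    };
}

// Expands to the struct plus an impl with two insert calls: "x" and "y".
serializable_struct! {
    struct Point { x: i32, y: i32 }
}
```

After expansion, `Point` behaves like any hand-written implementor: calling `serialize_fields` on a `Point` fills the map with the keys `x` and `y`. Nothing inspects the struct at runtime; the field list was fixed when the macro expanded.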

How It Works in Rust

// The trait shape (simplified analog of serde::Serialize)
pub trait Serialize {
 fn serialize_fields(&self, out: &mut HashMap<String, String>);
}

// For struct Color { r: u8, g: u8, b: u8 }
// This is what #[derive(Serialize)] generates conceptually:
impl Serialize for Color {
 fn serialize_fields(&self, out: &mut HashMap<String, String>) {
     out.insert("r".to_string(), self.r.to_string());  // field name -> stringified value
     out.insert("g".to_string(), self.g.to_string());
     out.insert("b".to_string(), self.b.to_string());
 }
}

// This is what #[derive(Deserialize)] generates conceptually:
impl Deserialize for Color {
 fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self> {
     Some(Color {
         r: map.get("r")?.parse().ok()?,  // lookup by field name, parse to type
         g: map.get("g")?.parse().ok()?,
         b: map.get("b")?.parse().ok()?,
     })
 }
}

// Usage: the same round-trip shape as serde (serialize, then deserialize)
let red = Color { r: 255, g: 0, b: 0 };
let serialized = red.serialize();            // "b=0|g=0|r=255" (sorted, deterministic)
let decoded = Color::deserialize(&serialized).unwrap();
assert_eq!(red, decoded);
Real `serde` uses a `Serializer` trait (not a `HashMap`) to support many output formats without re-implementing the `Serialize` trait. The proc macro generates the same `serialize_fields`-equivalent code, just against `serde`'s `Serializer` interface instead of a `HashMap`.
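The format abstraction can be sketched in the same toy style. The `Serializer` trait below is a hypothetical miniature, not serde's actual API (the real trait has many more methods and associated types); it only illustrates how one `Serialize` impl can drive several output formats. The names `PipeFormat`, `JsonLikeFormat`, `to_pipe`, and `to_json_like` are assumptions made for this example.

```rust
// Hypothetical miniature of a format-agnostic serializer. serde's real
// Serializer trait is considerably richer; this keeps only the shape.
pub trait Serializer {
    fn begin_struct(&mut self, name: &str);
    fn field(&mut self, key: &str, value: &str);
    fn end_struct(&mut self);
}

// The data type describes itself once, against the trait.
pub trait Serialize {
    fn serialize<S: Serializer>(&self, s: &mut S);
}

pub struct Color { pub r: u8, pub g: u8, pub b: u8 }

// Roughly what a derive would emit: one call per field, names baked in.
impl Serialize for Color {
    fn serialize<S: Serializer>(&self, s: &mut S) {
        s.begin_struct("Color");
        s.field("r", &self.r.to_string());
        s.field("g", &self.g.to_string());
        s.field("b", &self.b.to_string());
        s.end_struct();
    }
}

// Format 1: key=value pairs joined with '|' (no escaping; toy only).
#[derive(Default)]
pub struct PipeFormat { out: String }

impl Serializer for PipeFormat {
    fn begin_struct(&mut self, _name: &str) {}
    fn field(&mut self, key: &str, value: &str) {
        if !self.out.is_empty() {
            self.out.push('|');
        }
        self.out.push_str(&format!("{key}={value}"));
    }
    fn end_struct(&mut self) {}
}

// Format 2: a JSON-like object with every value written as a string.
#[derive(Default)]
pub struct JsonLikeFormat { out: String, wrote_field: bool }

impl Serializer for JsonLikeFormat {
    fn begin_struct(&mut self, _name: &str) {
        self.out.push_str("{ ");
        self.wrote_field = false;
    }
    fn field(&mut self, key: &str, value: &str) {
        if self.wrote_field {
            self.out.push_str(", ");
        }
        self.out.push_str(&format!("\"{key}\": \"{value}\""));
        self.wrote_field = true;
    }
    fn end_struct(&mut self) {
        self.out.push_str(" }");
    }
}

// The same Serialize impl feeds both formats.
pub fn to_pipe<T: Serialize>(value: &T) -> String {
    let mut fmt = PipeFormat::default();
    value.serialize(&mut fmt);
    fmt.out
}

pub fn to_json_like<T: Serialize>(value: &T) -> String {
    let mut fmt = JsonLikeFormat::default();
    value.serialize(&mut fmt);
    fmt.out
}
```

With `Color { r: 255, g: 0, b: 0 }`, `to_pipe` returns `r=255|g=0|b=0` and `to_json_like` returns `{ "r": "255", "g": "0", "b": "0" }`. Adding a third format means writing another `Serializer` implementor; the `Color` impl stays untouched.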

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
| --- | --- | --- |
| Serialization derive | `ppx_sexp_conv`, `ppx_yojson_conv` | `#[derive(Serialize, Deserialize)]` from `serde` |
| Proc macro output | Generated OCaml code | Generated Rust `impl Trait` blocks |
| Field name access | String literals in the `ppx`-generated code | Compile-time string literals in the generated impl |
| Format abstraction | A separate ppx deriver per format | One `Serialize` trait, many `Serializer` implementations |
// 760. Derive-Based Serialization: How derive(Serialize) Works
// Manually writing what #[derive(Serialize)] would generate

use std::collections::HashMap;

// ── Traits (hand-written derive targets) ─────────────────────────────────────

pub trait Serialize {
    /// Emit key=value pairs into the provided map
    fn serialize_fields(&self, out: &mut HashMap<String, String>);

    fn serialize(&self) -> String {
        let mut map = HashMap::new();
        self.serialize_fields(&mut map);
        let mut parts: Vec<String> = map.into_iter()
            .map(|(k, v)| format!("{k}={v}"))
            .collect();
        parts.sort(); // deterministic output
        parts.join("|")
    }
}

pub trait Deserialize: Sized {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self>;

    fn deserialize(s: &str) -> Option<Self> {
        // Split "k=v|k=v" into pairs; segments without '=' are skipped.
        let map: HashMap<String, String> = s
            .split('|')
            .filter_map(|f| f.split_once('='))
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();
        Self::deserialize_fields(&map)
    }
}

// ── Domain struct ─────────────────────────────────────────────────────────────

#[derive(Debug, PartialEq)]
pub struct Color {
    pub r: u8,
    pub g: u8,
    pub b: u8,
}

// This is what #[derive(Serialize)] generates (conceptually):
impl Serialize for Color {
    fn serialize_fields(&self, out: &mut HashMap<String, String>) {
        out.insert("r".to_string(), self.r.to_string());
        out.insert("g".to_string(), self.g.to_string());
        out.insert("b".to_string(), self.b.to_string());
    }
}

// This is what #[derive(Deserialize)] generates (conceptually):
impl Deserialize for Color {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self> {
        Some(Color {
            r: map.get("r")?.parse().ok()?,
            g: map.get("g")?.parse().ok()?,
            b: map.get("b")?.parse().ok()?,
        })
    }
}

// ── Nested example ────────────────────────────────────────────────────────────

#[derive(Debug, PartialEq)]
pub struct Pixel {
    pub x: i32,
    pub y: i32,
    // Note: nested structs need a flattening strategy; shown here as a key prefix
    pub color_r: u8,
    pub color_g: u8,
    pub color_b: u8,
}

impl Serialize for Pixel {
    fn serialize_fields(&self, out: &mut HashMap<String, String>) {
        out.insert("x".to_string(), self.x.to_string());
        out.insert("y".to_string(), self.y.to_string());
        out.insert("color_r".to_string(), self.color_r.to_string());
        out.insert("color_g".to_string(), self.color_g.to_string());
        out.insert("color_b".to_string(), self.color_b.to_string());
    }
}

fn main() {
    let red = Color { r: 255, g: 0, b: 0 };
    let s = red.serialize();
    println!("Serialized Color : {s}");

    let decoded = Color::deserialize(&s).expect("decode failed");
    println!("Deserialized     : {decoded:?}");

    let pixel = Pixel { x: 10, y: 20, color_r: 128, color_g: 64, color_b: 32 };
    println!("Pixel serialized : {}", pixel.serialize());
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn color_round_trip() {
        let c = Color { r: 10, g: 20, b: 30 };
        let s = c.serialize();
        assert_eq!(Color::deserialize(&s), Some(Color { r: 10, g: 20, b: 30 }));
    }

    #[test]
    fn missing_field_returns_none() {
        assert_eq!(Color::deserialize("r=255|g=0"), None);
    }

    #[test]
    fn serialize_deterministic() {
        let c1 = Color { r: 1, g: 2, b: 3 };
        let c2 = Color { r: 1, g: 2, b: 3 };
        assert_eq!(c1.serialize(), c2.serialize());
    }
}
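The flattened `Pixel` above stores the colour channels as separate fields. As a sketch of the prefix strategy its comment mentions, a nested field can delegate to the inner type's own impl and re-key the result. The `NestedPixel` name and the `color_` prefix convention are assumptions made for this illustration; the traits are reduced to their required methods.

```rust
use std::collections::HashMap;

pub trait Serialize {
    fn serialize_fields(&self, out: &mut HashMap<String, String>);
}

pub trait Deserialize: Sized {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self>;
}

#[derive(Debug, PartialEq)]
pub struct Color { pub r: u8, pub g: u8, pub b: u8 }

impl Serialize for Color {
    fn serialize_fields(&self, out: &mut HashMap<String, String>) {
        out.insert("r".to_string(), self.r.to_string());
        out.insert("g".to_string(), self.g.to_string());
        out.insert("b".to_string(), self.b.to_string());
    }
}

impl Deserialize for Color {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self> {
        Some(Color {
            r: map.get("r")?.parse().ok()?,
            g: map.get("g")?.parse().ok()?,
            b: map.get("b")?.parse().ok()?,
        })
    }
}

// Hypothetical nested variant of Pixel: holds a Color instead of three u8s.
#[derive(Debug, PartialEq)]
pub struct NestedPixel { pub x: i32, pub y: i32, pub color: Color }

impl Serialize for NestedPixel {
    fn serialize_fields(&self, out: &mut HashMap<String, String>) {
        out.insert("x".to_string(), self.x.to_string());
        out.insert("y".to_string(), self.y.to_string());
        // Delegate to the inner impl, then re-key with a "color_" prefix.
        let mut inner = HashMap::new();
        self.color.serialize_fields(&mut inner);
        for (k, v) in inner {
            out.insert(format!("color_{k}"), v);
        }
    }
}

impl Deserialize for NestedPixel {
    fn deserialize_fields(map: &HashMap<String, String>) -> Option<Self> {
        // Strip the prefix to rebuild the map the inner impl expects.
        let inner: HashMap<String, String> = map
            .iter()
            .filter_map(|(k, v)| {
                k.strip_prefix("color_")
                    .map(|rest| (rest.to_string(), v.clone()))
            })
            .collect();
        Some(NestedPixel {
            x: map.get("x")?.parse().ok()?,
            y: map.get("y")?.parse().ok()?,
            color: Color::deserialize_fields(&inner)?,
        })
    }
}
```

The serialized keys match those of the flattened `Pixel` (`x`, `y`, `color_r`, `color_g`, `color_b`), but the `Color` logic lives in one place. The trade-off is that an outer field whose name happens to start with `color_` would collide with the nested keys.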
(* Derive-based serialization concept in OCaml
   We simulate what [@@deriving] does by writing the expansion manually. *)

(* ---------- The "trait" (module type) ---------- *)
module type SERIALIZABLE = sig
  type t
  val fields : string list           (* field names in order *)
  val to_assoc : t -> (string * string) list
  val of_assoc : (string * string) list -> t option
end

(* ---------- Generic serializer using the module ---------- *)
module JsonLike (S : SERIALIZABLE) = struct
  let serialize v =
    let pairs = S.to_assoc v in
    let inner =
      List.map (fun (k, v) -> Printf.sprintf "%S: %S" k v) pairs
      |> String.concat ", "
    in
    Printf.sprintf "{ %s }" inner

  let deserialize s =
    (* Toy parser: strip braces, split on ',', split each pair on ':', then
       unquote. Breaks if a key or value itself contains ',' or ':'. *)
    let s = String.trim s in
    let s = String.sub s 1 (String.length s - 2) |> String.trim in
    let pairs =
      String.split_on_char ',' s
      |> List.filter_map (fun pair ->
        match String.split_on_char ':' (String.trim pair) with
        | [k; v] ->
          let unquote x =
            let x = String.trim x in
            if String.length x >= 2 && x.[0] = '"'
            then String.sub x 1 (String.length x - 2)
            else x
          in
          Some (unquote k, unquote v)
        | _ -> None)
    in
    S.of_assoc pairs
end

(* ---------- Domain type ---------- *)
(* The record a [@@deriving ...] attribute would be attached to: *)
type color = { r: int; g: int; b: int }

(* Manual "derive" expansion *)
module ColorSerializable : SERIALIZABLE with type t = color = struct
  type t = color
  let fields = ["r"; "g"; "b"]
  let to_assoc c = [
    ("r", string_of_int c.r);
    ("g", string_of_int c.g);
    ("b", string_of_int c.b);
  ]
  let of_assoc pairs =
    let find k = List.assoc_opt k pairs in
    match find "r", find "g", find "b" with
    | Some r, Some g, Some b ->
      (try Some { r = int_of_string r; g = int_of_string g; b = int_of_string b }
       with Failure _ -> None)
    | _ -> None
end

module ColorJson = JsonLike(ColorSerializable)

let () =
  let red = { r = 255; g = 0; b = 0 } in
  let encoded = ColorJson.serialize red in
  Printf.printf "Serialized: %s\n" encoded;
  match ColorJson.deserialize encoded with
  | Some c -> Printf.printf "Deserialized: r=%d g=%d b=%d\n" c.r c.g c.b
  | None   -> Printf.printf "Failed to deserialize\n"