🦀 Functional Rust

763: JSON-Like Format Built From Scratch

Difficulty: 4 Level: Advanced

A complete recursive AST, a hand-written serializer via `Display`, and a recursive-descent parser, all in ~200 lines of safe, std-only Rust.

The Problem This Solves

Most web developers use JSON every day without ever looking at the parsing machinery behind it. When you need to parse a domain-specific language, a configuration format, or a protocol, you have to write a parser yourself. The JSON grammar is small enough to implement in a few hundred lines, yet rich enough to exercise the core techniques: recursive types, string escaping, number parsing, whitespace handling, and proper error messages.

Understanding how JSON works from the inside also makes you a better user of `serde_json`. You'll know why certain shapes are valid, why nested objects require recursion, and why string escaping is non-trivial. When `serde_json` produces an unexpected result, you'll be able to reason about what the parser is doing.

This is also a practical foundation. Real-world formats like TOML, custom DSLs, and binary protocols use the same structural patterns: an AST (an enum with recursive variants), a serializer (walking the tree), and a parser (consuming bytes and building the tree).
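To make the escaping point concrete, here is a minimal sketch of a string escaper. `escape_json` is a hypothetical helper, not part of this example's listing. It assumes the JSON rule that control characters below U+0020 may not appear unescaped inside a string, so it goes one step past the serializer shown later by also handling `\r` and the remaining control characters.

```rust
use std::fmt::Write;

/// Hypothetical helper: render `s` as a quoted JSON string literal.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Any other control character is written as \u00XX.
            c if (c as u32) < 0x20 => write!(out, "\\u{:04x}", c as u32).unwrap(),
            // Everything else, including non-ASCII text, passes through unchanged.
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

fn main() {
    println!("{}", escape_json("say \"hi\"\n"));
}
```

The order of the arms matters: the named escapes come first so that `\n` and `\t` keep their short forms instead of falling into the generic `\u00XX` branch.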

The Intuition

A JSON value is one of: null, bool, number, string, array (of values), or object (of string→value pairs). In Rust that is a recursive enum: `Vec<Json>` and `Vec<(String, Json)>` refer to the type being defined. No explicit `Box<T>` is needed here, because `Vec` already stores its elements on the heap and so provides the indirection that gives the recursive type a finite size.

The serializer is a `Display` implementation that walks the tree and writes output. The parser is a `Parser` struct that holds a byte slice and a position; it advances through the input and calls itself recursively for nested values. The same helpers recur in most hand-written parsers: `peek()` looks at the current byte, `next()` consumes it, `skip_ws()` skips whitespace, and specialized methods like `parse_string()` and `parse_value()` handle each grammar production.
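As a small illustration of the recursive shape, the sketch below repeats the enum and adds a hypothetical `depth` function (not part of the original listing) that walks the tree the same way a serializer would. It also prints `size_of::<Json>()` to show that the type has a fixed, finite size even though it contains itself through `Vec`.

```rust
use std::mem::size_of;

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Hypothetical helper: nesting depth of a value (scalars are depth 0).
fn depth(v: &Json) -> usize {
    match v {
        // Containers recurse into their children, exactly like a serializer.
        Json::Array(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
        Json::Object(pairs) => 1 + pairs.iter().map(|(_, v)| depth(v)).max().unwrap_or(0),
        _ => 0,
    }
}

fn main() {
    let doc = Json::Object(vec![
        ("tags".to_string(), Json::Array(vec![Json::Str("a".into()), Json::Null])),
        ("n".to_string(), Json::Number(1.0)),
    ]);
    // The size is fixed at compile time because Vec holds a pointer, not the elements.
    println!("size_of::<Json>() = {} bytes", size_of::<Json>());
    println!("depth = {}", depth(&doc));
}
```

Every function over the AST tends to have this shape: one `match`, with the container arms calling the function again on each child.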

How It Works in Rust

// Recursive enum: Vec<Json> means this type contains itself
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),   // Vec of pairs keeps keys in insertion order
}

// Serializer: implement Display to write compact JSON text
impl fmt::Display for Json {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Json::Null          => write!(f, "null"),
            Json::Bool(b)       => write!(f, "{b}"),
            Json::Str(s)        => {
                write!(f, "\"")?;
                for c in s.chars() {
                    match c {
                        '"'  => write!(f, "\\\"")?,
                        '\\' => write!(f, "\\\\")?,
                        '\n' => write!(f, "\\n")?,
                        c    => write!(f, "{c}")?,
                    }
                }
                write!(f, "\"")
            }
            Json::Array(arr)    => {
                write!(f, "[")?;
                for (i, v) in arr.iter().enumerate() {
                    if i > 0 { write!(f, ", ")?; }
                    write!(f, "{v}")?;   // recursion via Display
                }
                write!(f, "]")
            }
            // Object and Number similar...
        }
    }
}

// Parser: position-based, recursive
struct Parser<'a> { s: &'a [u8], pos: usize }

impl<'a> Parser<'a> {
    fn parse_value(&mut self) -> Result<Json, ParseError> {
        self.skip_ws();
        // peek() returns Option<u8>, so end of input must be mapped to a ParseError
        match self.peek().ok_or_else(|| ParseError("unexpected EOF".into()))? {
            b'"' => Ok(Json::Str(self.parse_string()?)),
            // Check the whole literal before skipping its 4 bytes
            b't' if self.s[self.pos..].starts_with(b"true") => { self.pos += 4; Ok(Json::Bool(true)) }
            b'[' => {
                self.pos += 1;
                let mut arr = vec![];
                // (the empty-array check is omitted in this excerpt)
                loop {
                    arr.push(self.parse_value()?);   // RECURSION
                    self.skip_ws();
                    match self.peek() {
                        Some(b',') => self.pos += 1,
                        Some(b']') => { self.pos += 1; break }
                        _ => return Err(ParseError("expected ',' or ']'".into())),
                    }
                }
                Ok(Json::Array(arr))
            }
            // null, false, numbers, objects...
        }
    }
}

pub fn parse(s: &str) -> Result<Json, ParseError> {
    Parser::new(s).parse_value()
}
Key points:

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| Recursive type | `type json = Null \| Array of json list` | `enum Json { Array(Vec<Json>) }`; `Vec` provides indirection |
| Serializer | `Format.fprintf` or `Buffer.add_string` | `impl fmt::Display`, which integrates with `println!` |
| Parser state | Functional: thread `pos` through returns | Imperative: `Parser` struct with mutable `pos` |
| String escaping | Manual char-by-char | Same: match on `char` and write escape sequences |
| Error type | Exception or `result` | `struct ParseError(String)` |
| Production library | `yojson`, `ezjsonm` | `serde_json`, `json` |
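The "Parser state" row can be made concrete in Rust itself. The sketch below writes two tiny parsing steps in the threaded style, where each function takes a position and returns the new one, rather than mutating a `Parser` struct. The names `skip_ws_at` and `parse_bool_at` are hypothetical and exist only for this comparison.

```rust
/// Threaded style: take a position, return the position after any whitespace.
fn skip_ws_at(s: &[u8], mut pos: usize) -> usize {
    while matches!(s.get(pos).copied(), Some(b' ' | b'\t' | b'\n' | b'\r')) {
        pos += 1;
    }
    pos
}

/// Parse `true` or `false` at `pos`; on success return the value and the next position.
fn parse_bool_at(s: &[u8], pos: usize) -> Option<(bool, usize)> {
    let pos = skip_ws_at(s, pos);
    let rest = s.get(pos..)?; // None if pos is past the end
    if rest.starts_with(b"true") {
        Some((true, pos + 4))
    } else if rest.starts_with(b"false") {
        Some((false, pos + 5))
    } else {
        None
    }
}

fn main() {
    let input = b"  true false";
    // Each call hands its end position to the next call; nothing is mutated in place.
    let (first, pos) = parse_bool_at(input, 0).expect("first literal");
    let (second, pos) = parse_bool_at(input, pos).expect("second literal");
    println!("{first} {second}, stopped at byte {pos}");
}
```

Both styles do the same work. The struct version keeps call sites shorter, while the threaded version makes backtracking trivial, because the old position is still available after a failed attempt.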
// 763. JSON-Like Format Built From Scratch
// Recursive AST + hand-written serializer and parser

use std::fmt;

// ── Value type ─────────────────────────────────────────────────────────────────

#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

// ── Serializer ─────────────────────────────────────────────────────────────────

impl fmt::Display for Json {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Json::Null        => write!(f, "null"),
            Json::Bool(b)     => write!(f, "{b}"),
            Json::Number(n)   => {
                if *n == n.floor() && n.abs() < 1e15 {
                    write!(f, "{}", *n as i64)
                } else {
                    write!(f, "{n}")
                }
            }
            Json::Str(s)      => {
                write!(f, "\"")?;
                for c in s.chars() {
                    match c {
                        '"'  => write!(f, "\\\"")?,
                        '\\' => write!(f, "\\\\")?,
                        '\n' => write!(f, "\\n")?,
                        '\t' => write!(f, "\\t")?,
                        c    => write!(f, "{c}")?,
                    }
                }
                write!(f, "\"")
            }
            Json::Array(arr)  => {
                write!(f, "[")?;
                for (i, v) in arr.iter().enumerate() {
                    if i > 0 { write!(f, ", ")?; }
                    write!(f, "{v}")?;
                }
                write!(f, "]")
            }
            Json::Object(obj) => {
                write!(f, "{{")?;
                for (i, (k, v)) in obj.iter().enumerate() {
                    if i > 0 { write!(f, ", ")?; }
                    // Reuse the Str arm so keys get the same escaping as string values
                    write!(f, "{}: {v}", Json::Str(k.clone()))?;
                }
                write!(f, "}}")
            }
        }
    }
}

// ── Parser ─────────────────────────────────────────────────────────────────────

#[derive(Debug)]
pub struct ParseError(String);

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { write!(f, "{}", self.0) }
}

struct Parser<'a> {
    s: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(s: &'a str) -> Self { Parser { s: s.as_bytes(), pos: 0 } }

    fn peek(&self) -> Option<u8> { self.s.get(self.pos).copied() }

    fn next(&mut self) -> Option<u8> {
        let b = self.s.get(self.pos).copied()?;
        self.pos += 1;
        Some(b)
    }

    fn skip_ws(&mut self) {
        while matches!(self.peek(), Some(b' ' | b'\t' | b'\n' | b'\r')) {
            self.pos += 1;
        }
    }

    fn expect(&mut self, b: u8) -> Result<(), ParseError> {
        match self.next() {
            Some(c) if c == b => Ok(()),
            c => Err(ParseError(format!("expected '{}' got {:?}", b as char, c.map(|x| x as char)))),
        }
    }

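    fn parse_string(&mut self) -> Result<String, ParseError> {
        self.expect(b'"')?;
        // Collect raw bytes and decode once at the end. Pushing each byte as a
        // `char` would corrupt multi-byte UTF-8 sequences such as "é".
        let mut out: Vec<u8> = Vec::new();
        loop {
            match self.next().ok_or_else(|| ParseError("unterminated string".into()))? {
                b'"' => {
                    return String::from_utf8(out)
                        .map_err(|e| ParseError(format!("invalid UTF-8 in string: {e}")));
                }
                b'\\' => {
                    match self.next().ok_or_else(|| ParseError("escape at EOF".into()))? {
                        b'n' => out.push(b'\n'),
                        b't' => out.push(b'\t'),
                        b'"' => out.push(b'"'),
                        b'\\' => out.push(b'\\'),
                        // Unknown escapes pass the escaped byte through unchanged
                        c => out.push(c),
                    }
                }
                c => out.push(c),
            }
        }
    }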

    fn parse_value(&mut self) -> Result<Json, ParseError> {
        self.skip_ws();
        match self.peek().ok_or_else(|| ParseError("unexpected EOF".into()))? {
            b'"' => Ok(Json::Str(self.parse_string()?)),
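            // Verify the whole literal before skipping it; a stray `t`, `f`, or `n`
            // falls through to the final "unexpected char" arm.
            b't' if self.s[self.pos..].starts_with(b"true") => { self.pos += 4; Ok(Json::Bool(true)) }
            b'f' if self.s[self.pos..].starts_with(b"false") => { self.pos += 5; Ok(Json::Bool(false)) }
            b'n' if self.s[self.pos..].starts_with(b"null") => { self.pos += 4; Ok(Json::Null) }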
            b'[' => {
                self.pos += 1;
                self.skip_ws();
                if self.peek() == Some(b']') { self.pos += 1; return Ok(Json::Array(vec![])); }
                let mut arr = vec![];
                loop {
                    arr.push(self.parse_value()?);
                    self.skip_ws();
                    match self.peek() {
                        Some(b',') => { self.pos += 1; }
                        Some(b']') => { self.pos += 1; break; }
                        _ => return Err(ParseError("expected ',' or ']'".into())),
                    }
                }
                Ok(Json::Array(arr))
            }
            b'{' => {
                self.pos += 1;
                self.skip_ws();
                if self.peek() == Some(b'}') { self.pos += 1; return Ok(Json::Object(vec![])); }
                let mut obj = vec![];
                loop {
                    self.skip_ws();
                    let k = self.parse_string()?;
                    self.skip_ws();
                    self.expect(b':')?;
                    let v = self.parse_value()?;
                    obj.push((k, v));
                    self.skip_ws();
                    match self.peek() {
                        Some(b',') => { self.pos += 1; }
                        Some(b'}') => { self.pos += 1; break; }
                        _ => return Err(ParseError("expected ',' or '}'".into())),
                    }
                }
                Ok(Json::Object(obj))
            }
            c if c == b'-' || c.is_ascii_digit() => {
                let start = self.pos;
                while matches!(self.peek(), Some(b'0'..=b'9' | b'.' | b'-' | b'e' | b'E' | b'+')) {
                    self.pos += 1;
                }
                let tok = std::str::from_utf8(&self.s[start..self.pos]).unwrap();
                tok.parse::<f64>().map(Json::Number)
                    .map_err(|e| ParseError(format!("bad number {tok}: {e}")))
            }
            c => Err(ParseError(format!("unexpected char '{}'", c as char))),
        }
    }
}

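pub fn parse(s: &str) -> Result<Json, ParseError> {
    let mut p = Parser::new(s);
    let v = p.parse_value()?;
    // Reject trailing input such as `1 2` instead of silently ignoring it
    p.skip_ws();
    match p.peek() {
        None => Ok(v),
        Some(c) => Err(ParseError(format!("trailing char '{}'", c as char))),
    }
}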

fn main() {
    let v = Json::Object(vec![
        ("name".into(), Json::Str("Alice".into())),
        ("age".into(), Json::Number(30.0)),
        ("scores".into(), Json::Array(vec![Json::Number(95.0), Json::Number(87.0)])),
        ("active".into(), Json::Bool(true)),
        ("address".into(), Json::Null),
    ]);

    let s = v.to_string();
    println!("Serialized:\n{s}\n");

    let v2 = parse(&s).expect("parse failed");
    println!("Re-serialized:\n{v2}");
    println!("\nEqual: {}", v == v2);
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn round_trip_primitives() {
        for s in ["null", "true", "false", "42", "\"hello\""] {
            let v = parse(s).expect(s);
            assert_eq!(v.to_string(), s, "round-trip {s}");
        }
    }

    #[test]
    fn round_trip_array() {
        let s = "[1, 2, 3]";
        let v = parse(s).unwrap();
        assert!(matches!(v, Json::Array(_)));
        assert_eq!(v.to_string(), "[1, 2, 3]");
    }

    #[test]
    fn nested_object() {
        let s = r#"{"a": {"b": 1}}"#;
        let v = parse(s).unwrap();
        assert!(matches!(v, Json::Object(_)));
    }

    #[test]
    fn escaped_string() {
        let s = r#""hel\"lo""#;
        let v = parse(s).unwrap();
        assert_eq!(v, Json::Str("hel\"lo".into()));
    }
}
(* JSON-like format built from scratch in OCaml *)

(* ── Value type ─────────────────────────────────────────────────────────────── *)
type json =
  | JNull
  | JBool   of bool
  | JInt    of int
  | JFloat  of float
  | JString of string
  | JArray  of json list
  | JObject of (string * json) list

(* ── Serializer ──────────────────────────────────────────────────────────────── *)
let rec to_string = function
  | JNull       -> "null"
  | JBool true  -> "true"
  | JBool false -> "false"
  | JInt n      -> string_of_int n
  | JFloat f    -> Printf.sprintf "%g" f
  | JString s   -> Printf.sprintf "%S" s
  | JArray arr  ->
    let items = String.concat ", " (List.map to_string arr) in
    Printf.sprintf "[%s]" items
  | JObject obj ->
    let pairs =
      List.map (fun (k, v) -> Printf.sprintf "%S: %s" k (to_string v)) obj
      |> String.concat ", "
    in
    Printf.sprintf "{%s}" pairs

(* ── Minimal parser ──────────────────────────────────────────────────────────── *)
let parse_error msg = failwith ("JSON parse error: " ^ msg)

let skip_ws s pos =
  while !pos < String.length s && (s.[!pos] = ' ' || s.[!pos] = '\n' || s.[!pos] = '\t' || s.[!pos] = '\r') do
    incr pos
  done

let parse_string s pos =
  (* pos points to opening '"' *)
  incr pos;
  let buf = Buffer.create 16 in
  let continue = ref true in
  while !continue do
    if !pos >= String.length s then parse_error "unterminated string";
    let c = s.[!pos] in
    incr pos;
    if c = '"' then continue := false
    else if c = '\\' then begin
      if !pos >= String.length s then parse_error "escape at EOF";
      let esc = s.[!pos] in incr pos;
      Buffer.add_char buf (match esc with
        | 'n' -> '\n' | 't' -> '\t' | '"' -> '"' | '\\' -> '\\'
        | c -> c)
    end else Buffer.add_char buf c
  done;
  Buffer.contents buf

let rec parse_value s pos =
  skip_ws s pos;
  if !pos >= String.length s then parse_error "unexpected EOF";
  match s.[!pos] with
  | '"' -> JString (parse_string s pos)
  | 't' -> pos := !pos + 4; JBool true
  | 'f' -> pos := !pos + 5; JBool false
  | 'n' -> pos := !pos + 4; JNull
  | '[' ->
    incr pos; skip_ws s pos;
    if s.[!pos] = ']' then (incr pos; JArray [])
    else begin
      let items = ref [] in
      let go = ref true in
      while !go do
        items := parse_value s pos :: !items;
        skip_ws s pos;
        if !pos < String.length s && s.[!pos] = ',' then incr pos
        else go := false
      done;
      skip_ws s pos;
      if s.[!pos] <> ']' then parse_error "expected ']'";
      incr pos;
      JArray (List.rev !items)
    end
  | '{' ->
    incr pos; skip_ws s pos;
    if s.[!pos] = '}' then (incr pos; JObject [])
    else begin
      let pairs = ref [] in
      let go = ref true in
      while !go do
        skip_ws s pos;
        let k = parse_string s pos in
        skip_ws s pos;
        if s.[!pos] <> ':' then parse_error "expected ':'";
        incr pos;
        let v = parse_value s pos in
        pairs := (k, v) :: !pairs;
        skip_ws s pos;
        if !pos < String.length s && s.[!pos] = ',' then incr pos
        else go := false
      done;
      skip_ws s pos;
      if s.[!pos] <> '}' then parse_error "expected '}'";
      incr pos;
      JObject (List.rev !pairs)
    end
  | c when c >= '0' && c <= '9' || c = '-' ->
    let start = !pos in
    while !pos < String.length s &&
          (let ch = s.[!pos] in
           (ch >= '0' && ch <= '9') || ch = '.' || ch = '-' || ch = '+' || ch = 'e' || ch = 'E')
    do incr pos done;
    let tok = String.sub s start (!pos - start) in
    (try JInt (int_of_string tok)
     with _ -> try JFloat (float_of_string tok)
     with _ -> parse_error ("bad number: " ^ tok))
  | c -> parse_error (Printf.sprintf "unexpected char '%c'" c)

let parse s =
  let pos = ref 0 in
  parse_value s pos

let () =
  let v = JObject [
    ("name", JString "Alice");
    ("age", JInt 30);
    ("scores", JArray [JInt 95; JInt 87; JInt 100]);
    ("active", JBool true);
    ("address", JNull);
  ] in
  let s = to_string v in
  Printf.printf "Serialized:\n%s\n\n" s;
  let v2 = parse s in
  Printf.printf "Re-serialized:\n%s\n" (to_string v2)