🦀 Functional Rust

161: Digit Parser

Difficulty: ⭐⭐⭐
Level: Foundations

A practical application: compose `satisfy`, `many1`, `map`, and `opt` to parse single digits, natural numbers, and signed integers.

The Problem This Solves

Numbers are everywhere in text: config values, JSON numbers, source code literals, protocol fields. Every time you parse one, you face the same sub-problems: which characters count as digits, how to turn several digit characters into a numeric value, and how to handle an optional sign.

This example is less about new combinators and more about putting the existing ones together. Everything from examples 153–159 combines here into something practical: a `Parser<i64>` that correctly parses `"42"`, `"-17"`, and `"+100"`.

The implementation also shows a subtle point: Rust does not define subtraction between two `char` values, so the C-style `'3' - '0'` does not compile. You cast both sides to an integer type first, as in `'3' as u32 - '0' as u32`. Getting this right is one of those details that matters in practice.
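As a small illustration of the cast, the sketch below converts a digit character by hand. The helper name `ascii_digit_value` is made up for this illustration and is not part of the example's parser code; the standard library's `char::to_digit` is the ready-made alternative.

```rust
/// Hypothetical helper: numeric value of an ASCII digit, or None for anything else.
fn ascii_digit_value(c: char) -> Option<u32> {
    if c.is_ascii_digit() {
        // `char - char` is not defined in Rust, so cast both sides to u32 first.
        Some(c as u32 - '0' as u32)
    } else {
        None
    }
}
```

`ascii_digit_value('3')` gives `Some(3)`, the same as `'3'.to_digit(10)`. For a non-digit such as `'x'`, both return `None`.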

The Intuition

A number is a sequence of digits. "Sequence of digits" means `many1(satisfy(is_digit))`, which gives you a `Vec<char>`. But a `Vec<char>` of `['4', '2']` isn't the number 42 yet. You need to fold it: start with 0, and for each digit character multiply the accumulator by 10 and add the digit's value.

Digit value: `'7'` has code point 55 in ASCII and Unicode, and `'0'` has code point 48, so `'7' as u32 - '0' as u32` = 7. This is the standard trick for char-to-digit conversion.

A signed integer is an optional sign character followed by a natural number. `opt(satisfy(|c| c == '+' || c == '-', "sign"))` returns `Some('+')`, `Some('-')`, or `None`. Pattern-match on that to decide whether to negate.
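The fold and the sign decision can be tried in isolation, without any parser machinery. This is a minimal sketch: `digits_to_number` and `apply_sign` are names invented here, and the input is assumed to contain only ASCII digits and to be short enough not to overflow.

```rust
/// Hypothetical helper: fold digit chars into a number, left to right.
/// Assumes every char is an ASCII digit and the result fits in u64.
fn digits_to_number(digits: &[char]) -> u64 {
    digits
        .iter()
        .fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64))
}

/// Hypothetical helper: negate only when the sign is '-'.
/// Assumes n fits in i64.
fn apply_sign(sign: Option<char>, n: u64) -> i64 {
    match sign {
        Some('-') => -(n as i64),
        _ => n as i64, // '+' or no sign: positive
    }
}
```

For `['4', '2']` the accumulator goes 0, 4, 42, so `digits_to_number(&['4', '2'])` is 42 and `apply_sign(Some('-'), 42)` is -42.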

How It Works in Rust

Single digit → `u32`:

fn digit<'a>() -> Parser<'a, u32> {
    map(
        satisfy(|c| c.is_ascii_digit(), "digit"),  // parse one digit char
        |c| c as u32 - '0' as u32,                 // convert char to numeric value
    )
}
// digit()("5rest") = Ok((5, "rest"))

Natural number (unsigned) → `u64`:

fn natural<'a>() -> Parser<'a, u64> {
    map(
        many1(satisfy(|c| c.is_ascii_digit(), "digit")),  // Vec<char>: ['4','2']
        |digits| digits.iter().fold(0u64, |acc, &d| {
            acc * 10 + (d as u64 - '0' as u64)  // positional value: 0 → 4 → 42
        }),
    )
}
// natural()("42rest") = Ok((42, "rest"))
// natural()("abc")    = Err (many1 requires at least one digit)

`iter().fold(init, f)` replaces OCaml's `List.fold_left`. It starts with `0u64`, and for each digit char `d`, computes `acc * 10 + digit_value(d)`.

Signed integer → `i64`:

fn integer<'a>() -> Parser<'a, i64> {
    Box::new(|input: &'a str| {
        // opt returns Some('+'), Some('-'), or None
        let (sign, rest) = opt(satisfy(|c| c == '+' || c == '-', "sign"))(input)?;
        let (n, rem) = natural()(rest)?;
        let value = match sign {
            Some('-') => -(n as i64),  // negate for minus
            _         => n as i64,     // plus or absent: positive
        };
        Ok((value, rem))
    })
}
// integer()("42")   = Ok((42,  ""))
// integer()("-42")  = Ok((-42, ""))
// integer()("+42")  = Ok((42,  ""))
// integer()("abc")  = Err

`n as i64` preserves the value only when `n` fits in `i64`; a larger `u64` silently wraps to a negative number. The fold in `natural` has a similar limit: `acc * 10` overflows `u64` on very long digit strings, which by default panics in debug builds and wraps in release builds. For production code you'd add bounds checks. The `_` arm handles both `Some('+')` and `None`, since both mean positive.
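One way to add the bounds checking mentioned above is to use checked arithmetic in the fold and a range test before the cast. The sketch below is an assumption about what such a check could look like, not the example's actual code. `checked_natural_value` and `checked_signed_value` are hypothetical names, and they return `None` on overflow rather than a parser error.

```rust
/// Hypothetical helper: like the fold in `natural`, but returns None on overflow.
/// Assumes every char is an ASCII digit.
fn checked_natural_value(digits: &[char]) -> Option<u64> {
    digits.iter().try_fold(0u64, |acc, &d| {
        acc.checked_mul(10)?.checked_add(d as u64 - '0' as u64)
    })
}

/// Hypothetical helper: apply the sign, returning None if the result
/// does not fit in i64.
fn checked_signed_value(sign: Option<char>, n: u64) -> Option<i64> {
    let max = i64::MAX as u64;
    match sign {
        // i64::MIN has magnitude i64::MAX + 1, so it needs its own case.
        Some('-') if n == max + 1 => Some(i64::MIN),
        Some('-') if n <= max => Some(-(n as i64)),
        Some('-') => None,
        _ if n <= max => Some(n as i64),
        _ => None,
    }
}
```

With these helpers, a run of twenty `9`s yields `None` from `checked_natural_value` instead of panicking or wrapping, and `checked_signed_value(Some('-'), 9223372036854775808)` yields `Some(i64::MIN)`.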

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| Digit-to-int | `Char.code c - Char.code '0'` | `c as u32 - '0' as u32` |
| Number types | `int` (platform-sized: 63 bits on 64-bit systems) | `u32`, `u64`, `i64` (explicit width) |
| Fold over list | `List.fold_left (fun acc d -> ...) 0 digits` | `digits.iter().fold(0u64, \|acc, &d\| ...)` |
| Negation | `- n` | `-(n as i64)` (cast needed: `u64 → i64`) |
| Optional sign | `opt sign >>= fun s -> ...` | `opt(satisfy(...))(input)?` then `match sign` |

// Example 161: Digit Parser
// Parse digits: single digit, multi-digit integer, positive/negative

type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where F: Fn(char) -> bool + 'a {
    let desc = desc.to_string();
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("Expected {}", desc)),
    })
}

fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut rem) = p(input)?;
        let mut v = vec![first];
        while let Ok((val, r)) = p(rem) { v.push(val); rem = r; }
        Ok((v, rem))
    })
}

fn map<'a, A: 'a, B: 'a, F>(p: Parser<'a, A>, f: F) -> Parser<'a, B>
where F: Fn(A) -> B + 'a {
    Box::new(move |input: &'a str| { let (v, r) = p(input)?; Ok((f(v), r)) })
}

fn opt<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Option<T>> {
    Box::new(move |input: &'a str| match p(input) {
        Ok((v, r)) => Ok((Some(v), r)),
        Err(_) => Ok((None, input)),
    })
}

// ============================================================
// Approach 1: Single digit → u32
// ============================================================

fn digit<'a>() -> Parser<'a, u32> {
    map(satisfy(|c| c.is_ascii_digit(), "digit"), |c| c as u32 - '0' as u32)
}

// ============================================================
// Approach 2: Natural number (unsigned) → u64
// ============================================================

fn natural<'a>() -> Parser<'a, u64> {
    map(
        many1(satisfy(|c| c.is_ascii_digit(), "digit")),
        |digits| digits.iter().fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64)),
    )
}

// ============================================================
// Approach 3: Signed integer → i64
// ============================================================

fn integer<'a>() -> Parser<'a, i64> {
    Box::new(|input: &'a str| {
        let (sign, rest) = opt(satisfy(|c| c == '+' || c == '-', "sign"))(input)?;
        let (n, rem) = natural()(rest)?;
        let value = match sign {
            Some('-') => -(n as i64),
            _ => n as i64,
        };
        Ok((value, rem))
    })
}

fn main() {
    println!("=== digit ===");
    let p = digit();
    println!("{:?}", p("5rest")); // Ok((5, "rest"))

    println!("\n=== natural ===");
    let p = natural();
    println!("{:?}", p("42rest")); // Ok((42, "rest"))
    println!("{:?}", p("0"));      // Ok((0, ""))

    println!("\n=== integer ===");
    let p = integer();
    println!("{:?}", p("42"));   // Ok((42, ""))
    println!("{:?}", p("-42"));  // Ok((-42, ""))
    println!("{:?}", p("+42"));  // Ok((42, ""))

    println!("\n✓ All examples completed");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_digit() {
        assert_eq!(digit()("5rest"), Ok((5, "rest")));
    }

    #[test]
    fn test_digit_zero() {
        assert_eq!(digit()("0x"), Ok((0, "x")));
    }

    #[test]
    fn test_digit_fail() {
        assert!(digit()("abc").is_err());
    }

    #[test]
    fn test_natural() {
        assert_eq!(natural()("42rest"), Ok((42, "rest")));
    }

    #[test]
    fn test_natural_zero() {
        assert_eq!(natural()("0"), Ok((0, "")));
    }

    #[test]
    fn test_natural_large() {
        assert_eq!(natural()("123456"), Ok((123456, "")));
    }

    #[test]
    fn test_integer_positive() {
        assert_eq!(integer()("42"), Ok((42, "")));
    }

    #[test]
    fn test_integer_negative() {
        assert_eq!(integer()("-42"), Ok((-42, "")));
    }

    #[test]
    fn test_integer_plus() {
        assert_eq!(integer()("+42"), Ok((42, "")));
    }

    #[test]
    fn test_integer_zero() {
        assert_eq!(integer()("0"), Ok((0, "")));
    }

    #[test]
    fn test_integer_fail() {
        assert!(integer()("abc").is_err());
    }
}
(* Example 161: Digit Parser *)
(* Parse digits: single digit, multi-digit integer, positive/negative *)

type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result

let satisfy pred desc : char parser = fun input ->
  if String.length input > 0 && pred input.[0] then
    Ok (input.[0], String.sub input 1 (String.length input - 1))
  else Error (Printf.sprintf "Expected %s" desc)

let many0 p : 'a list parser = fun input ->
  let rec go acc r = match p r with Ok (v, r') -> go (v::acc) r' | Error _ -> Ok (List.rev acc, r)
  in go [] input

let many1 p : 'a list parser = fun input ->
  match p input with
  | Error e -> Error e
  | Ok (v, r) -> match many0 p r with Ok (vs, r') -> Ok (v::vs, r') | Error e -> Error e

let map f p : 'b parser = fun input ->
  match p input with Ok (v, r) -> Ok (f v, r) | Error e -> Error e

let opt p : 'a option parser = fun input ->
  match p input with Ok (v, r) -> Ok (Some v, r) | Error _ -> Ok (None, input)

(* Approach 1: Single digit *)
let digit : int parser =
  map (fun c -> Char.code c - Char.code '0')
    (satisfy (fun c -> c >= '0' && c <= '9') "digit")

(* Approach 2: Natural number (unsigned) *)
let natural : int parser =
  map (fun digits -> List.fold_left (fun acc d -> acc * 10 + d) 0 digits)
    (many1 digit)

(* Approach 3: Signed integer *)
let integer : int parser = fun input ->
  match opt (satisfy (fun c -> c = '+' || c = '-') "sign") input with
  | Ok (sign, rest) ->
    (match natural rest with
     | Ok (n, rem) ->
       let value = match sign with Some '-' -> -n | _ -> n in
       Ok (value, rem)
     | Error e -> Error e)
  | Error e -> Error e

(* Tests *)
let () =
  assert (digit "5rest" = Ok (5, "rest"));
  assert (Result.is_error (digit "abc"));

  assert (natural "42rest" = Ok (42, "rest"));
  assert (natural "0" = Ok (0, ""));
  assert (natural "100" = Ok (100, ""));

  assert (integer "42" = Ok (42, ""));
  assert (integer "-42" = Ok (-42, ""));
  assert (integer "+42" = Ok (42, ""));
  assert (integer "0" = Ok (0, ""));
  assert (Result.is_error (integer "abc"));

  print_endline "✓ All tests passed"

📊 Detailed Comparison

Comparison: Example 161: Digit Parser

Single digit

OCaml:

๐Ÿช Show OCaml equivalent
let digit : int parser =
  map (fun c -> Char.code c - Char.code '0')
    (satisfy (fun c -> c >= '0' && c <= '9') "digit")

Rust:

fn digit<'a>() -> Parser<'a, u32> {
    map(satisfy(|c| c.is_ascii_digit(), "digit"), |c| c as u32 - '0' as u32)
}

Natural number

OCaml:

๐Ÿช Show OCaml equivalent
let natural : int parser =
  map (fun digits -> List.fold_left (fun acc d -> acc * 10 + d) 0 digits)
    (many1 digit)

Rust:

fn natural<'a>() -> Parser<'a, u64> {
    map(
        many1(satisfy(|c| c.is_ascii_digit(), "digit")),
        |digits| digits.iter().fold(0u64, |acc, &d| acc * 10 + (d as u64 - '0' as u64)),
    )
}

Signed integer

OCaml:

๐Ÿช Show OCaml equivalent
let integer : int parser = fun input ->
  match opt (satisfy (fun c -> c = '+' || c = '-') "sign") input with
  | Ok (sign, rest) ->
    (match natural rest with
     | Ok (n, rem) ->
       let value = match sign with Some '-' -> -n | _ -> n in
       Ok (value, rem)
     | Error e -> Error e)
  | Error e -> Error e

Rust:

fn integer<'a>() -> Parser<'a, i64> {
    Box::new(|input: &'a str| {
        let (sign, rest) = opt(satisfy(|c| c == '+' || c == '-', "sign"))(input)?;
        let (n, rem) = natural()(rest)?;
        let value = match sign {
            Some('-') => -(n as i64),
            _ => n as i64,
        };
        Ok((value, rem))
    })
}