๐Ÿฆ€ Functional Rust

155: Many Parser

Difficulty: โญโญโญ Level: Foundations `many0` and `many1` repeat a parser until it fails โ€” the parser equivalent of regex `*` and `+`.

The Problem This Solves

Single-shot parsers are useful, but most real tokens are sequences: a number is many digits, an identifier is many letters-and-underscores, a string is many non-quote characters. Every real parser needs "repeat this until it stops working." Without a repetition combinator, you'd need to manually write a loop every time: call the parser, check if it succeeded, if yes accumulate the result and call again, if no stop. That loop is identical for every repeating pattern โ€” it belongs in a combinator, not duplicated across your codebase. `many0` (zero or more) and `many1` (one or more) are those combinators. They're the difference between writing a number parser in one line versus ten.

The Intuition

`many0` is like a greedy regex `*`: keep going as long as the parser succeeds, stop the moment it fails. Crucially, zero successes is still a success โ€” `many0` never returns an error. If the parser fails on the first try, you just get an empty `Vec`. `many1` is like regex `+`: same as `many0`, but requires at least one match. If the very first attempt fails, `many1` fails too. `many_till` adds a terminator: "keep parsing until this other parser succeeds." Useful for parsing strings ("take chars until you hit a `"`") or comments ("take chars until you hit `*/`").

How It Works in Rust

`many0` โ€” zero or more:
fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
 Box::new(move |mut input: &'a str| {
     let mut results = Vec::new();
     while let Ok((value, rest)) = parser(input) {
         results.push(value);
         input = rest;   // advance: next iteration starts where this one left off
     }
     Ok((results, input))  // always Ok โ€” even if results is empty
 })
}
`while let Ok(...) = parser(input)` is the idiom: destructure the success, or break on error. The `mut input` variable acts as a cursor โ€” we rebind it to `rest` each iteration. No recursion, no `List.rev`, just a simple loop. `many1` โ€” one or more:
fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
 Box::new(move |input: &'a str| {
     let (first, mut remaining) = parser(input)?;  // ? propagates failure immediately
     let mut results = vec![first];
     while let Ok((value, rest)) = parser(remaining) {
         results.push(value);
         remaining = rest;
     }
     Ok((results, remaining))
 })
}
The `?` on the first parse does the work: if the parser fails even once, we return `Err` immediately. After that, it's `many0` logic. `many_till` โ€” parse until a terminator:
fn many_till<'a, T: 'a, U: 'a>(
 parser: Parser<'a, T>,
 stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
 Box::new(move |mut input: &'a str| {
     let mut results = Vec::new();
     loop {
         if let Ok((term, rest)) = stop(input) {
             // Terminator matched โ€” return everything collected plus the terminator
             return Ok(((results, term), rest));
         }
         // Terminator didn't match โ€” try the main parser; fail if it can't advance either
         let (value, rest) = parser(input)?;
         results.push(value);
         input = rest;
     }
 })
}
`many0_str` โ€” collect chars into a `String`:
fn many0_str<'a>(parser: Parser<'a, char>) -> Parser<'a, String> {
 Box::new(move |mut input: &'a str| {
     let mut s = String::new();
     while let Ok((c, rest)) = parser(input) {
         s.push(c);
         input = rest;
     }
     Ok((s, input))
 })
}
Usage:
let digits = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digits("123abc")); // Ok((vec!['1','2','3'], "abc"))
println!("{:?}", digits("abc"));    // Ok((vec![], "abc")) โ€” zero, still Ok

let digits1 = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digits1("abc"));   // Err โ€” at least one required

let digit_str = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digit_str("456xy")); // Ok(("456", "xy"))

What This Unlocks

Key Differences

ConceptOCamlRust
Loop styleTail-recursive `go` helper`while let Ok(...) = ...` loop
Result accumulationReverse a list at the end (`List.rev`)`Vec::push` in order โ€” no reverse needed
Always succeeds`many0` returns `Ok ([], input)``many0` returns `Ok((vec![], input))`
String collection`String.concat` from char list`String::push` directly in loop
Early exit (`many1`)Pattern match on first call`?` operator on first call
// Example 155: Many Parser
// many0 and many1: parse zero or more / one or more

type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where F: Fn(char) -> bool + 'a {
    let desc = desc.to_string();
    Box::new(move |input: &'a str| {
        match input.chars().next() {
            Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
            _ => Err(format!("Expected {}", desc)),
        }
    })
}

// ============================================================
// Approach 1: many0 โ€” zero or more, always succeeds
// ============================================================

fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |mut input: &'a str| {
        let mut results = Vec::new();
        while let Ok((value, rest)) = parser(input) {
            results.push(value);
            input = rest;
        }
        Ok((results, input))
    })
}

// ============================================================
// Approach 2: many1 โ€” one or more, fails if zero
// ============================================================

fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |input: &'a str| {
        let (first, mut remaining) = parser(input)?;
        let mut results = vec![first];
        while let Ok((value, rest)) = parser(remaining) {
            results.push(value);
            remaining = rest;
        }
        Ok((results, remaining))
    })
}

// ============================================================
// Approach 3: many_till โ€” parse until terminator succeeds
// ============================================================

fn many_till<'a, T: 'a, U: 'a>(
    parser: Parser<'a, T>,
    stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
    Box::new(move |mut input: &'a str| {
        let mut results = Vec::new();
        loop {
            if let Ok((term, rest)) = stop(input) {
                return Ok(((results, term), rest));
            }
            let (value, rest) = parser(input)?;
            results.push(value);
            input = rest;
        }
    })
}

/// Collect many0 chars into a String
fn many0_str<'a>(parser: Parser<'a, char>) -> Parser<'a, String> {
    Box::new(move |mut input: &'a str| {
        let mut s = String::new();
        while let Ok((c, rest)) = parser(input) {
            s.push(c);
            input = rest;
        }
        Ok((s, input))
    })
}

fn main() {
    let digits = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
    println!("many0 digits on '123abc': {:?}", digits("123abc"));
    println!("many0 digits on 'abc': {:?}", digits("abc"));

    let digits1 = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
    println!("many1 digits on '123abc': {:?}", digits1("123abc"));
    println!("many1 digits on 'abc': {:?}", digits1("abc"));

    let digit_str = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
    println!("many0_str on '456xy': {:?}", digit_str("456xy"));

    println!("\nโœ“ All examples completed");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_many0_some() {
        let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (digits, rest) = p("123abc").unwrap();
        assert_eq!(digits, vec!['1', '2', '3']);
        assert_eq!(rest, "abc");
    }

    #[test]
    fn test_many0_none() {
        let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (digits, rest) = p("abc").unwrap();
        assert!(digits.is_empty());
        assert_eq!(rest, "abc");
    }

    #[test]
    fn test_many0_empty_input() {
        let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (digits, rest) = p("").unwrap();
        assert!(digits.is_empty());
        assert_eq!(rest, "");
    }

    #[test]
    fn test_many1_some() {
        let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (digits, rest) = p("123abc").unwrap();
        assert_eq!(digits, vec!['1', '2', '3']);
        assert_eq!(rest, "abc");
    }

    #[test]
    fn test_many1_none_fails() {
        let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
        assert!(p("abc").is_err());
    }

    #[test]
    fn test_many_till() {
        let letter = satisfy(|c| c.is_ascii_alphabetic(), "letter");
        let dot = satisfy(|c| c == '.', "dot");
        let p = many_till(letter, dot);
        let ((letters, term), rest) = p("abc.rest").unwrap();
        assert_eq!(letters, vec!['a', 'b', 'c']);
        assert_eq!(term, '.');
        assert_eq!(rest, "rest");
    }

    #[test]
    fn test_many0_str() {
        let p = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (s, rest) = p("456xy").unwrap();
        assert_eq!(s, "456");
        assert_eq!(rest, "xy");
    }

    #[test]
    fn test_many1_single() {
        let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
        let (digits, rest) = p("5abc").unwrap();
        assert_eq!(digits, vec!['5']);
        assert_eq!(rest, "abc");
    }
}
(* Example 155: Many Parser *)
(* many0 and many1: parse zero or more / one or more *)

type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result

let satisfy pred desc : char parser = fun input ->
  if String.length input > 0 && pred input.[0] then
    Ok (input.[0], String.sub input 1 (String.length input - 1))
  else Error (Printf.sprintf "Expected %s" desc)

(* Approach 1: many0 โ€” zero or more, always succeeds *)
let many0 (p : 'a parser) : 'a list parser = fun input ->
  let rec go acc remaining =
    match p remaining with
    | Ok (v, rest) -> go (v :: acc) rest
    | Error _ -> Ok (List.rev acc, remaining)
  in
  go [] input

(* Approach 2: many1 โ€” one or more, fails if zero matches *)
let many1 (p : 'a parser) : 'a list parser = fun input ->
  match p input with
  | Error e -> Error e
  | Ok (first, rest) ->
    match many0 p rest with
    | Ok (others, remaining) -> Ok (first :: others, remaining)
    | Error e -> Error e

(* Approach 3: many_till โ€” parse until a terminator succeeds *)
let many_till (p : 'a parser) (stop : 'b parser) : ('a list * 'b) parser = fun input ->
  let rec go acc remaining =
    match stop remaining with
    | Ok (term, rest) -> Ok ((List.rev acc, term), rest)
    | Error _ ->
      match p remaining with
      | Ok (v, rest) -> go (v :: acc) rest
      | Error e -> Error e
  in
  go [] input

(* Collect chars into string *)
let many0_str (p : char parser) : string parser = fun input ->
  match many0 p input with
  | Ok (chars, rest) -> Ok (String.init (List.length chars) (List.nth chars), rest)
  | Error e -> Error e

let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"

(* Tests *)
let () =
  (* many0 *)
  (match many0 is_digit "123abc" with
   | Ok (digits, "abc") -> assert (digits = ['1'; '2'; '3'])
   | _ -> failwith "Test 1 failed");

  (match many0 is_digit "abc" with
   | Ok ([], "abc") -> ()
   | _ -> failwith "Test 2 failed");

  (* many1 *)
  (match many1 is_digit "123abc" with
   | Ok (digits, "abc") -> assert (digits = ['1'; '2'; '3'])
   | _ -> failwith "Test 3 failed");

  assert (Result.is_error (many1 is_digit "abc"));

  (* many_till *)
  let stop = satisfy (fun c -> c = '.') "dot" in
  (match many_till is_letter stop "abc.rest" with
   | Ok ((letters, '.'), "rest") -> assert (letters = ['a'; 'b'; 'c'])
   | _ -> failwith "Test 5 failed");

  print_endline "โœ“ All tests passed"

๐Ÿ“Š Detailed Comparison

Comparison: Example 155 โ€” Many Parser

many0

OCaml:

๐Ÿช Show OCaml equivalent
let many0 (p : 'a parser) : 'a list parser = fun input ->
let rec go acc remaining =
 match p remaining with
 | Ok (v, rest) -> go (v :: acc) rest
 | Error _ -> Ok (List.rev acc, remaining)
in
go [] input

Rust:

fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
 Box::new(move |mut input: &'a str| {
     let mut results = Vec::new();
     while let Ok((value, rest)) = parser(input) {
         results.push(value);
         input = rest;
     }
     Ok((results, input))
 })
}

many1

OCaml:

๐Ÿช Show OCaml equivalent
let many1 (p : 'a parser) : 'a list parser = fun input ->
match p input with
| Error e -> Error e
| Ok (first, rest) ->
 match many0 p rest with
 | Ok (others, remaining) -> Ok (first :: others, remaining)
 | Error e -> Error e

Rust:

fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
 Box::new(move |input: &'a str| {
     let (first, mut remaining) = parser(input)?;
     let mut results = vec![first];
     while let Ok((value, rest)) = parser(remaining) {
         results.push(value);
         remaining = rest;
     }
     Ok((results, remaining))
 })
}

many_till

OCaml:

๐Ÿช Show OCaml equivalent
let many_till (p : 'a parser) (stop : 'b parser) : ('a list * 'b) parser = fun input ->
let rec go acc remaining =
 match stop remaining with
 | Ok (term, rest) -> Ok ((List.rev acc, term), rest)
 | Error _ ->
   match p remaining with
   | Ok (v, rest) -> go (v :: acc) rest
   | Error e -> Error e
in
go [] input

Rust:

fn many_till<'a, T: 'a, U: 'a>(
 parser: Parser<'a, T>,
 stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
 Box::new(move |mut input: &'a str| {
     let mut results = Vec::new();
     loop {
         if let Ok((term, rest)) = stop(input) {
             return Ok(((results, term), rest));
         }
         let (value, rest) = parser(input)?;
         results.push(value);
         input = rest;
     }
 })
}