155: Many Parser
Difficulty: โญโญโญ Level: Foundations `many0` and `many1` repeat a parser until it fails โ the parser equivalent of regex `*` and `+`.The Problem This Solves
Single-shot parsers are useful, but most real tokens are sequences: a number is many digits, an identifier is many letters-and-underscores, a string is many non-quote characters. Every real parser needs "repeat this until it stops working." Without a repetition combinator, you'd need to manually write a loop every time: call the parser, check if it succeeded, if yes accumulate the result and call again, if no stop. That loop is identical for every repeating pattern โ it belongs in a combinator, not duplicated across your codebase. `many0` (zero or more) and `many1` (one or more) are those combinators. They're the difference between writing a number parser in one line versus ten.The Intuition
`many0` is like a greedy regex `*`: keep going as long as the parser succeeds, stop the moment it fails. Crucially, zero successes is still a success โ `many0` never returns an error. If the parser fails on the first try, you just get an empty `Vec`. `many1` is like regex `+`: same as `many0`, but requires at least one match. If the very first attempt fails, `many1` fails too. `many_till` adds a terminator: "keep parsing until this other parser succeeds." Useful for parsing strings ("take chars until you hit a `"`") or comments ("take chars until you hit `*/`").How It Works in Rust
`many0` โ zero or more:fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
while let Ok((value, rest)) = parser(input) {
results.push(value);
input = rest; // advance: next iteration starts where this one left off
}
Ok((results, input)) // always Ok โ even if results is empty
})
}
`while let Ok(...) = parser(input)` is the idiom: destructure the success, or break on error. The `mut input` variable acts as a cursor โ we rebind it to `rest` each iteration. No recursion, no `List.rev`, just a simple loop.
`many1` โ one or more:
fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |input: &'a str| {
let (first, mut remaining) = parser(input)?; // ? propagates failure immediately
let mut results = vec![first];
while let Ok((value, rest)) = parser(remaining) {
results.push(value);
remaining = rest;
}
Ok((results, remaining))
})
}
The `?` on the first parse does the work: if the parser fails even once, we return `Err` immediately. After that, it's `many0` logic.
`many_till` โ parse until a terminator:
fn many_till<'a, T: 'a, U: 'a>(
parser: Parser<'a, T>,
stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
loop {
if let Ok((term, rest)) = stop(input) {
// Terminator matched โ return everything collected plus the terminator
return Ok(((results, term), rest));
}
// Terminator didn't match โ try the main parser; fail if it can't advance either
let (value, rest) = parser(input)?;
results.push(value);
input = rest;
}
})
}
`many0_str` โ collect chars into a `String`:
fn many0_str<'a>(parser: Parser<'a, char>) -> Parser<'a, String> {
Box::new(move |mut input: &'a str| {
let mut s = String::new();
while let Ok((c, rest)) = parser(input) {
s.push(c);
input = rest;
}
Ok((s, input))
})
}
Usage:
let digits = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digits("123abc")); // Ok((vec!['1','2','3'], "abc"))
println!("{:?}", digits("abc")); // Ok((vec![], "abc")) โ zero, still Ok
let digits1 = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digits1("abc")); // Err โ at least one required
let digit_str = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("{:?}", digit_str("456xy")); // Ok(("456", "xy"))
What This Unlocks
- Practical token parsers โ numbers, identifiers, strings, whitespace โ all require "repeat until fail."
- `many1(satisfy(is_digit))` + `map` is how `parse_nat` (example 159) works: collect digit chars, then fold into a `u64`.
- `many_till` enables parsing balanced delimiters, comments, and string literals without scanning for the terminator manually.
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Loop style | Tail-recursive `go` helper | `while let Ok(...) = ...` loop |
| Result accumulation | Reverse a list at the end (`List.rev`) | `Vec::push` in order โ no reverse needed |
| Always succeeds | `many0` returns `Ok ([], input)` | `many0` returns `Ok((vec![], input))` |
| String collection | `String.concat` from char list | `String::push` directly in loop |
| Early exit (`many1`) | Pattern match on first call | `?` operator on first call |
// Example 155: Many Parser
// many0 and many1: parse zero or more / one or more
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where F: Fn(char) -> bool + 'a {
let desc = desc.to_string();
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
_ => Err(format!("Expected {}", desc)),
}
})
}
// ============================================================
// Approach 1: many0 โ zero or more, always succeeds
// ============================================================
fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
while let Ok((value, rest)) = parser(input) {
results.push(value);
input = rest;
}
Ok((results, input))
})
}
// ============================================================
// Approach 2: many1 โ one or more, fails if zero
// ============================================================
fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |input: &'a str| {
let (first, mut remaining) = parser(input)?;
let mut results = vec![first];
while let Ok((value, rest)) = parser(remaining) {
results.push(value);
remaining = rest;
}
Ok((results, remaining))
})
}
// ============================================================
// Approach 3: many_till โ parse until terminator succeeds
// ============================================================
fn many_till<'a, T: 'a, U: 'a>(
parser: Parser<'a, T>,
stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
loop {
if let Ok((term, rest)) = stop(input) {
return Ok(((results, term), rest));
}
let (value, rest) = parser(input)?;
results.push(value);
input = rest;
}
})
}
/// Collect many0 chars into a String
fn many0_str<'a>(parser: Parser<'a, char>) -> Parser<'a, String> {
Box::new(move |mut input: &'a str| {
let mut s = String::new();
while let Ok((c, rest)) = parser(input) {
s.push(c);
input = rest;
}
Ok((s, input))
})
}
fn main() {
let digits = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("many0 digits on '123abc': {:?}", digits("123abc"));
println!("many0 digits on 'abc': {:?}", digits("abc"));
let digits1 = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("many1 digits on '123abc': {:?}", digits1("123abc"));
println!("many1 digits on 'abc': {:?}", digits1("abc"));
let digit_str = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
println!("many0_str on '456xy': {:?}", digit_str("456xy"));
println!("\nโ All examples completed");
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_many0_some() {
let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
let (digits, rest) = p("123abc").unwrap();
assert_eq!(digits, vec!['1', '2', '3']);
assert_eq!(rest, "abc");
}
#[test]
fn test_many0_none() {
let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
let (digits, rest) = p("abc").unwrap();
assert!(digits.is_empty());
assert_eq!(rest, "abc");
}
#[test]
fn test_many0_empty_input() {
let p = many0(satisfy(|c| c.is_ascii_digit(), "digit"));
let (digits, rest) = p("").unwrap();
assert!(digits.is_empty());
assert_eq!(rest, "");
}
#[test]
fn test_many1_some() {
let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
let (digits, rest) = p("123abc").unwrap();
assert_eq!(digits, vec!['1', '2', '3']);
assert_eq!(rest, "abc");
}
#[test]
fn test_many1_none_fails() {
let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
assert!(p("abc").is_err());
}
#[test]
fn test_many_till() {
let letter = satisfy(|c| c.is_ascii_alphabetic(), "letter");
let dot = satisfy(|c| c == '.', "dot");
let p = many_till(letter, dot);
let ((letters, term), rest) = p("abc.rest").unwrap();
assert_eq!(letters, vec!['a', 'b', 'c']);
assert_eq!(term, '.');
assert_eq!(rest, "rest");
}
#[test]
fn test_many0_str() {
let p = many0_str(satisfy(|c| c.is_ascii_digit(), "digit"));
let (s, rest) = p("456xy").unwrap();
assert_eq!(s, "456");
assert_eq!(rest, "xy");
}
#[test]
fn test_many1_single() {
let p = many1(satisfy(|c| c.is_ascii_digit(), "digit"));
let (digits, rest) = p("5abc").unwrap();
assert_eq!(digits, vec!['5']);
assert_eq!(rest, "abc");
}
}
(* Example 155: Many Parser *)
(* many0 and many1: parse zero or more / one or more *)
type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result
let satisfy pred desc : char parser = fun input ->
if String.length input > 0 && pred input.[0] then
Ok (input.[0], String.sub input 1 (String.length input - 1))
else Error (Printf.sprintf "Expected %s" desc)
(* Approach 1: many0 โ zero or more, always succeeds *)
let many0 (p : 'a parser) : 'a list parser = fun input ->
let rec go acc remaining =
match p remaining with
| Ok (v, rest) -> go (v :: acc) rest
| Error _ -> Ok (List.rev acc, remaining)
in
go [] input
(* Approach 2: many1 โ one or more, fails if zero matches *)
let many1 (p : 'a parser) : 'a list parser = fun input ->
match p input with
| Error e -> Error e
| Ok (first, rest) ->
match many0 p rest with
| Ok (others, remaining) -> Ok (first :: others, remaining)
| Error e -> Error e
(* Approach 3: many_till โ parse until a terminator succeeds *)
let many_till (p : 'a parser) (stop : 'b parser) : ('a list * 'b) parser = fun input ->
let rec go acc remaining =
match stop remaining with
| Ok (term, rest) -> Ok ((List.rev acc, term), rest)
| Error _ ->
match p remaining with
| Ok (v, rest) -> go (v :: acc) rest
| Error e -> Error e
in
go [] input
(* Collect chars into string *)
let many0_str (p : char parser) : string parser = fun input ->
match many0 p input with
| Ok (chars, rest) -> Ok (String.init (List.length chars) (List.nth chars), rest)
| Error e -> Error e
let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"
(* Tests *)
let () =
(* many0 *)
(match many0 is_digit "123abc" with
| Ok (digits, "abc") -> assert (digits = ['1'; '2'; '3'])
| _ -> failwith "Test 1 failed");
(match many0 is_digit "abc" with
| Ok ([], "abc") -> ()
| _ -> failwith "Test 2 failed");
(* many1 *)
(match many1 is_digit "123abc" with
| Ok (digits, "abc") -> assert (digits = ['1'; '2'; '3'])
| _ -> failwith "Test 3 failed");
assert (Result.is_error (many1 is_digit "abc"));
(* many_till *)
let stop = satisfy (fun c -> c = '.') "dot" in
(match many_till is_letter stop "abc.rest" with
| Ok ((letters, '.'), "rest") -> assert (letters = ['a'; 'b'; 'c'])
| _ -> failwith "Test 5 failed");
print_endline "โ All tests passed"
๐ Detailed Comparison
Comparison: Example 155 โ Many Parser
many0
OCaml:
๐ช Show OCaml equivalent
let many0 (p : 'a parser) : 'a list parser = fun input ->
let rec go acc remaining =
match p remaining with
| Ok (v, rest) -> go (v :: acc) rest
| Error _ -> Ok (List.rev acc, remaining)
in
go [] input
Rust:
fn many0<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
while let Ok((value, rest)) = parser(input) {
results.push(value);
input = rest;
}
Ok((results, input))
})
}many1
OCaml:
๐ช Show OCaml equivalent
let many1 (p : 'a parser) : 'a list parser = fun input ->
match p input with
| Error e -> Error e
| Ok (first, rest) ->
match many0 p rest with
| Ok (others, remaining) -> Ok (first :: others, remaining)
| Error e -> Error e
Rust:
fn many1<'a, T: 'a>(parser: Parser<'a, T>) -> Parser<'a, Vec<T>> {
Box::new(move |input: &'a str| {
let (first, mut remaining) = parser(input)?;
let mut results = vec![first];
while let Ok((value, rest)) = parser(remaining) {
results.push(value);
remaining = rest;
}
Ok((results, remaining))
})
}many_till
OCaml:
๐ช Show OCaml equivalent
let many_till (p : 'a parser) (stop : 'b parser) : ('a list * 'b) parser = fun input ->
let rec go acc remaining =
match stop remaining with
| Ok (term, rest) -> Ok ((List.rev acc, term), rest)
| Error _ ->
match p remaining with
| Ok (v, rest) -> go (v :: acc) rest
| Error e -> Error e
in
go [] input
Rust:
fn many_till<'a, T: 'a, U: 'a>(
parser: Parser<'a, T>,
stop: Parser<'a, U>,
) -> Parser<'a, (Vec<T>, U)> {
Box::new(move |mut input: &'a str| {
let mut results = Vec::new();
loop {
if let Ok((term, rest)) = stop(input) {
return Ok(((results, term), rest));
}
let (value, rest) = parser(input)?;
results.push(value);
input = rest;
}
})
}