158: Sequence Parser
Difficulty: โญโญโญ Level: Foundations `pair`, `preceded`, `terminated`, and `delimited` run parsers one after another, keeping only the results you care about.The Problem This Solves
Real grammar elements are sequences: a key-value pair is `key ":" value`. A parenthesized expression is `"(" expr ")"`. A terminated statement is `statement ";"`. You need parsers that run in sequence, thread the remaining input through each step, and let you choose which results to keep. Without sequence combinators, you'd write nested pattern matches: run p1, get `(v1, r1)`, run p2 on `r1`, get `(v2, r2)`, run p3 on `r2`, etc. Each step manually threads the remainder. That's mechanical repetition โ exactly what a combinator should eliminate. `pair` keeps both results. `preceded` runs two parsers but discards the first (the "prefix"). `terminated` keeps the first and discards the second (the "suffix"). `delimited` keeps only the middle, discarding both open and close delimiters. These are the four combinations of "keep or discard" for two-sided sequencing.The Intuition
Think of `delimited(tag("("), expr, tag(")"))` as: "require a `(`, parse `expr`, require a `)`, give me `expr`." You care about the content, not the brackets. The brackets are purely syntactic scaffolding. `preceded` is "require X before Y, give me Y." `terminated` is "give me X, then require Y." Both are `delimited` in disguise, just with one side dropped. The Rust `?` operator makes sequencing elegant: each step either succeeds (binding the value and remainder) or immediately propagates the error up. No nested matches needed.How It Works in Rust
`pair` โ run two parsers, keep both:fn pair<'a, A: 'a, B: 'a>(p1: Parser<'a, A>, p2: Parser<'a, B>) -> Parser<'a, (A, B)> {
Box::new(move |input: &'a str| {
let (v1, rest) = p1(input)?; // run p1; ? propagates Err
let (v2, remaining) = p2(rest)?; // run p2 on what's left
Ok(((v1, v2), remaining))
})
}
`?` is the key โ it's equivalent to `match { Ok(x) => x, Err(e) => return Err(e) }`. Each step only runs if the previous one succeeded, and threading `rest` to `p2` ensures parsers advance sequentially.
`preceded` โ discard prefix, keep suffix:
fn preceded<'a, A: 'a, B: 'a>(prefix: Parser<'a, A>, p: Parser<'a, B>) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, rest) = prefix(input)?; // run prefix, discard its value (underscore)
p(rest) // run p on remainder, return its result
})
}
`terminated` โ keep value, discard suffix:
fn terminated<'a, A: 'a, B: 'a>(p: Parser<'a, A>, suffix: Parser<'a, B>) -> Parser<'a, A> {
Box::new(move |input: &'a str| {
let (value, rest) = p(input)?; // parse the thing we care about
let (_, remaining) = suffix(rest)?; // parse (and discard) the suffix
Ok((value, remaining)) // return the value, not the suffix
})
}
`delimited` โ keep middle, discard both sides:
fn delimited<'a, A: 'a, B: 'a, C: 'a>(
open: Parser<'a, A>,
p: Parser<'a, B>,
close: Parser<'a, C>,
) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, r1) = open(input)?; // consume open delimiter
let (value, r2) = p(r1)?; // parse content
let (_, r3) = close(r2)?; // consume close delimiter
Ok((value, r3)) // return only the content
})
}
Usage:
// Parse (x) and return x
let p = delimited(tag("("), satisfy(|c| c.is_ascii_lowercase(), "letter"), tag(")"));
println!("{:?}", p("(x)rest")); // Ok(('x', "rest"))
// Parse "key: value" and return both
let p = pair(tag("key"), preceded(tag(": "), tag("value")));
println!("{:?}", p("key: value")); // Ok((("key", "value"), ""))
// Parse "stmt;" and return "stmt"
let p = terminated(tag("return"), tag(";"));
println!("{:?}", p("return;")); // Ok(("return", ""))
What This Unlocks
- Bracketed expressions โ `delimited(tag("("), expr, tag(")"))` handles parentheses, brackets, braces universally.
- Token separation โ `terminated(key, tag(":"))` strips the colon; `preceded(tag("let "), ident)` strips the keyword.
- Structured data โ chain `pair` calls (or use `triple`) to parse tuples, records, and structured tokens.
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Error propagation | Nested `match` / `bind` | `?` operator โ cleaner, same semantics |
| Tuple return | `('a * 'b)` | `(A, B)` |
| Discarding values | `let _ = ...` | `let (_, rest) = ...` |
| Composability | Manual nesting in practice | `?` chains read top-to-bottom |
| Three-way sequence | Explicit nested pair | `triple(p1, p2, p3)` returning `(A, B, C)` |
// Example 158: Sequence Parser
// pair, preceded, terminated, delimited: sequence combinators
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where F: Fn(char) -> bool + 'a {
let desc = desc.to_string();
Box::new(move |input: &'a str| match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
_ => Err(format!("Expected {}", desc)),
})
}
fn tag<'a>(expected: &str) -> Parser<'a, &'a str> {
let exp = expected.to_string();
Box::new(move |input: &'a str| {
if input.starts_with(&exp) {
Ok((&input[..exp.len()], &input[exp.len()..]))
} else {
Err(format!("Expected \"{}\"", exp))
}
})
}
// ============================================================
// Approach 1: pair โ run two parsers, return both results
// ============================================================
fn pair<'a, A: 'a, B: 'a>(p1: Parser<'a, A>, p2: Parser<'a, B>) -> Parser<'a, (A, B)> {
Box::new(move |input: &'a str| {
let (v1, rest) = p1(input)?;
let (v2, remaining) = p2(rest)?;
Ok(((v1, v2), remaining))
})
}
// ============================================================
// Approach 2: preceded, terminated โ discard one side
// ============================================================
fn preceded<'a, A: 'a, B: 'a>(prefix: Parser<'a, A>, p: Parser<'a, B>) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, rest) = prefix(input)?;
p(rest)
})
}
fn terminated<'a, A: 'a, B: 'a>(p: Parser<'a, A>, suffix: Parser<'a, B>) -> Parser<'a, A> {
Box::new(move |input: &'a str| {
let (value, rest) = p(input)?;
let (_, remaining) = suffix(rest)?;
Ok((value, remaining))
})
}
// ============================================================
// Approach 3: delimited โ discard both sides, keep middle
// ============================================================
fn delimited<'a, A: 'a, B: 'a, C: 'a>(
open: Parser<'a, A>,
p: Parser<'a, B>,
close: Parser<'a, C>,
) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, r1) = open(input)?;
let (value, r2) = p(r1)?;
let (_, r3) = close(r2)?;
Ok((value, r3))
})
}
/// Triple โ three parsers in sequence
fn triple<'a, A: 'a, B: 'a, C: 'a>(
p1: Parser<'a, A>,
p2: Parser<'a, B>,
p3: Parser<'a, C>,
) -> Parser<'a, (A, B, C)> {
Box::new(move |input: &'a str| {
let (v1, r1) = p1(input)?;
let (v2, r2) = p2(r1)?;
let (v3, r3) = p3(r2)?;
Ok(((v1, v2, v3), r3))
})
}
fn main() {
let letter = satisfy(|c| c.is_ascii_lowercase(), "letter");
let digit = satisfy(|c| c.is_ascii_digit(), "digit");
println!("=== pair ===");
let p = pair(letter, digit);
println!("{:?}", p("a1rest")); // Ok((('a', '1'), "rest"))
let letter = satisfy(|c| c.is_ascii_lowercase(), "letter");
println!("\n=== preceded ===");
let p = preceded(tag("("), letter);
println!("{:?}", p("(a)")); // Ok(('a', ")"))
let letter = satisfy(|c| c.is_ascii_lowercase(), "letter");
println!("\n=== terminated ===");
let p = terminated(letter, tag(";"));
println!("{:?}", p("a;rest")); // Ok(('a', "rest"))
let letter = satisfy(|c| c.is_ascii_lowercase(), "letter");
println!("\n=== delimited ===");
let p = delimited(tag("("), letter, tag(")"));
println!("{:?}", p("(x)rest")); // Ok(('x', "rest"))
println!("\nโ All examples completed");
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_pair() {
let p = pair(
satisfy(|c| c.is_ascii_lowercase(), "letter"),
satisfy(|c| c.is_ascii_digit(), "digit"),
);
assert_eq!(p("a1"), Ok((('a', '1'), "")));
}
#[test]
fn test_pair_fail_first() {
let p = pair(
satisfy(|c| c.is_ascii_lowercase(), "letter"),
satisfy(|c| c.is_ascii_digit(), "digit"),
);
assert!(p("1a").is_err());
}
#[test]
fn test_preceded() {
let p = preceded(tag("("), satisfy(|c| c.is_ascii_lowercase(), "letter"));
assert_eq!(p("(a)"), Ok(('a', ")")));
}
#[test]
fn test_terminated() {
let p = terminated(satisfy(|c| c.is_ascii_lowercase(), "letter"), tag(";"));
assert_eq!(p("a;rest"), Ok(('a', "rest")));
}
#[test]
fn test_delimited() {
let p = delimited(
tag("("),
satisfy(|c| c.is_ascii_lowercase(), "letter"),
tag(")"),
);
assert_eq!(p("(x)rest"), Ok(('x', "rest")));
}
#[test]
fn test_delimited_fail_close() {
let p = delimited(
tag("("),
satisfy(|c| c.is_ascii_lowercase(), "letter"),
tag(")"),
);
assert!(p("(x]").is_err());
}
#[test]
fn test_triple() {
let p = triple(
satisfy(|c| c.is_ascii_lowercase(), "letter"),
satisfy(|c| c.is_ascii_digit(), "digit"),
satisfy(|c| c.is_ascii_lowercase(), "letter"),
);
assert_eq!(p("a1b"), Ok((('a', '1', 'b'), "")));
}
}
(* Example 158: Sequence Parser *)
(* pair, preceded, terminated, delimited: sequence combinators *)
type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result
let satisfy pred desc : char parser = fun input ->
if String.length input > 0 && pred input.[0] then
Ok (input.[0], String.sub input 1 (String.length input - 1))
else Error (Printf.sprintf "Expected %s" desc)
let tag expected : string parser = fun input ->
let len = String.length expected in
if String.length input >= len && String.sub input 0 len = expected then
Ok (expected, String.sub input len (String.length input - len))
else Error (Printf.sprintf "Expected \"%s\"" expected)
(* Approach 1: pair โ run two parsers in sequence, return both results *)
let pair (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser = fun input ->
match p1 input with
| Error e -> Error e
| Ok (v1, rest) ->
match p2 rest with
| Error e -> Error e
| Ok (v2, remaining) -> Ok ((v1, v2), remaining)
(* Approach 2: preceded, terminated โ discard one side *)
let preceded (prefix : 'a parser) (p : 'b parser) : 'b parser = fun input ->
match prefix input with
| Error e -> Error e
| Ok (_, rest) -> p rest
let terminated (p : 'a parser) (suffix : 'b parser) : 'a parser = fun input ->
match p input with
| Error e -> Error e
| Ok (v, rest) ->
match suffix rest with
| Error e -> Error e
| Ok (_, remaining) -> Ok (v, remaining)
(* Approach 3: delimited โ discard both sides *)
let delimited (open_p : 'a parser) (p : 'b parser) (close_p : 'c parser) : 'b parser =
fun input ->
preceded open_p (terminated p close_p) input
(* triple โ three in sequence *)
let triple (p1 : 'a parser) (p2 : 'b parser) (p3 : 'c parser) : ('a * 'b * 'c) parser =
fun input ->
match p1 input with
| Error e -> Error e
| Ok (v1, r1) ->
match p2 r1 with
| Error e -> Error e
| Ok (v2, r2) ->
match p3 r2 with
| Error e -> Error e
| Ok (v3, r3) -> Ok ((v1, v2, v3), r3)
(* Tests *)
let () =
let letter = satisfy (fun c -> (c >= 'a' && c <= 'z')) "letter" in
let digit = satisfy (fun c -> c >= '0' && c <= '9') "digit" in
(* pair *)
assert (pair letter digit "a1" = Ok (('a', '1'), ""));
assert (Result.is_error (pair letter digit "1a"));
(* preceded: skip prefix *)
assert (preceded (tag "(") letter "(a)" = Ok ('a', ")"));
(* terminated: skip suffix *)
assert (terminated letter (tag ";") "a;rest" = Ok ('a', "rest"));
(* delimited: extract middle *)
assert (delimited (tag "(") letter (tag ")") "(x)rest" = Ok ('x', "rest"));
(* triple *)
assert (triple letter digit letter "a1b" = Ok (('a', '1', 'b'), ""));
print_endline "โ All tests passed"
๐ Detailed Comparison
Comparison: Example 158 โ Sequence Parser
pair
OCaml:
๐ช Show OCaml equivalent
let pair (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser = fun input ->
match p1 input with
| Error e -> Error e
| Ok (v1, rest) ->
match p2 rest with
| Error e -> Error e
| Ok (v2, remaining) -> Ok ((v1, v2), remaining)
Rust:
fn pair<'a, A: 'a, B: 'a>(p1: Parser<'a, A>, p2: Parser<'a, B>) -> Parser<'a, (A, B)> {
Box::new(move |input: &'a str| {
let (v1, rest) = p1(input)?;
let (v2, remaining) = p2(rest)?;
Ok(((v1, v2), remaining))
})
}preceded / terminated
OCaml:
๐ช Show OCaml equivalent
let preceded (prefix : 'a parser) (p : 'b parser) : 'b parser = fun input ->
match prefix input with
| Error e -> Error e
| Ok (_, rest) -> p rest
Rust:
fn preceded<'a, A: 'a, B: 'a>(prefix: Parser<'a, A>, p: Parser<'a, B>) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, rest) = prefix(input)?;
p(rest)
})
}delimited
OCaml:
๐ช Show OCaml equivalent
let delimited open_p p close_p = fun input ->
preceded open_p (terminated p close_p) input
Rust:
fn delimited<'a, A: 'a, B: 'a, C: 'a>(
open: Parser<'a, A>, p: Parser<'a, B>, close: Parser<'a, C>,
) -> Parser<'a, B> {
Box::new(move |input: &'a str| {
let (_, r1) = open(input)?;
let (value, r2) = p(r1)?;
let (_, r3) = close(r2)?;
Ok((value, r3))
})
}