โข Option
โข Result
โข The ? operator propagates errors up the call stack concisely
โข Combinators like .map(), .and_then(), .unwrap_or() chain fallible operations
โข The compiler forces you to handle every error case โ no silent failures
โข Option
โข Result
โข The ? operator propagates errors up the call stack concisely
โข Combinators like .map(), .and_then(), .unwrap_or() chain fallible operations
โข The compiler forces you to handle every error case โ no silent failures
pub type Input<'a> = &'a str;
pub type PResult<'a, T> = Result<(Input<'a>, T), String>;
Primitives โ return closures (parsers):
pub fn tag<'a>(prefix: &'static str) -> impl Fn(Input<'a>) -> PResult<'a, &'a str> {
move |s| {
if s.starts_with(prefix) { Ok((&s[prefix.len()..], &s[..prefix.len()])) }
else { Err(format!("expected {prefix:?}")) }
}
}
pub fn take_while<F: Fn(char) -> bool>(pred: F) -> impl Fn(Input<'_>) -> PResult<'_, &str> {
move |s| {
let end = s.find(|c| !pred(c)).unwrap_or(s.len());
Ok((&s[end..], &s[..end]))
}
}
Combinators โ compose parsers:
pub fn pair<'a, A, B, P1, P2>(p1: P1, p2: P2) -> impl Fn(Input<'a>) -> PResult<'a, (A, B)>
where P1: Fn(Input<'a>) -> PResult<'a, A>, P2: Fn(Input<'a>) -> PResult<'a, B> {
move |s| {
let (rest, a) = p1(s)?;
let (rest, b) = p2(rest)?;
Ok((rest, (a, b)))
}
}
pub fn many0<'a, T, P: Fn(Input<'a>) -> PResult<'a, T>>(p: P)
-> impl Fn(Input<'a>) -> PResult<'a, Vec<T>>
{
move |mut s| {
let mut acc = Vec::new();
loop {
match p(s) {
Ok((rest, v)) => { acc.push(v); s = rest; }
Err(_) => break,
}
}
Ok((s, acc))
}
}
Building a real parser from combinators:
fn key_value(s: Input<'_>) -> PResult<'_, (&str, &str)> {
pair(
terminated(alpha, char_p('=')), // key before '='
take_while(|c| c != ',' && c != '\n'), // value until delimiter
)(s)
}
fn kv_list(s: Input<'_>) -> PResult<'_, Vec<(&str, &str)>> {
sep_by(key_value, char_p(','))(s)
}
// Usage:
let (rest, pairs) = kv_list("name=Alice,age=30,city=Berlin").unwrap();
| Concept | OCaml | Rust | ||
|---|---|---|---|---|
| Parser type | `string -> ('a * string)` or `angstrom` | `Fn(&str) -> Result<(&str, T), E>` | ||
| Higher-order combinators | `let ( >>= ) p f s = ...` | `fn pair<P1,P2>(p1: P1, p2: P2) -> impl Fn(...)` | ||
| Many/repeat | `many : 'a t -> 'a list t` | `many0(p)` returning `impl Fn โ Vec<T>` | ||
| Returning closures | `fun s -> ...` | `move | s | ...` returned as `impl Fn` |
//! # Parser Combinator Pattern
//!
//! Building parsers from small composable pieces.
/// Parser result
pub type ParseResult<'a, T> = Option<(T, &'a str)>;
/// Parser type alias
pub type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
// Primitive Parsers
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
/// Parse a specific character
pub fn char_p(c: char) -> impl Fn(&str) -> ParseResult<char> {
move |input| {
input
.chars()
.next()
.filter(|&ch| ch == c)
.map(|ch| (ch, &input[ch.len_utf8()..]))
}
}
/// Parse any character satisfying a predicate
pub fn satisfy<F>(pred: F) -> impl Fn(&str) -> ParseResult<char>
where
F: Fn(char) -> bool,
{
move |input| {
input
.chars()
.next()
.filter(|&c| pred(c))
.map(|c| (c, &input[c.len_utf8()..]))
}
}
/// Parse a specific string
pub fn string_p(s: &str) -> impl Fn(&str) -> ParseResult<&str> + '_ {
move |input| {
if input.starts_with(s) {
Some((&input[..s.len()], &input[s.len()..]))
} else {
None
}
}
}
/// Parse one or more digits
pub fn digits(input: &str) -> ParseResult<&str> {
let end = input
.char_indices()
.take_while(|(_, c)| c.is_ascii_digit())
.last()
.map(|(i, c)| i + c.len_utf8())
.unwrap_or(0);
if end > 0 {
Some((&input[..end], &input[end..]))
} else {
None
}
}
/// Parse whitespace
pub fn whitespace(input: &str) -> ParseResult<&str> {
let end = input
.char_indices()
.take_while(|(_, c)| c.is_whitespace())
.last()
.map(|(i, c)| i + c.len_utf8())
.unwrap_or(0);
Some((&input[..end], &input[end..]))
}
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
// Combinators
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
/// Map over parser result
pub fn map<'a, A, B, P, F>(parser: P, f: F) -> impl Fn(&'a str) -> ParseResult<'a, B>
where
P: Fn(&'a str) -> ParseResult<'a, A>,
F: Fn(A) -> B,
{
move |input| parser(input).map(|(a, rest)| (f(a), rest))
}
/// Sequence two parsers
pub fn then<'a, A, B, PA, PB>(p1: PA, p2: PB) -> impl Fn(&'a str) -> ParseResult<'a, (A, B)>
where
PA: Fn(&'a str) -> ParseResult<'a, A>,
PB: Fn(&'a str) -> ParseResult<'a, B>,
{
move |input| {
let (a, rest) = p1(input)?;
let (b, rest) = p2(rest)?;
Some(((a, b), rest))
}
}
/// Try first parser, if fails try second
pub fn or<'a, A, P1, P2>(p1: P1, p2: P2) -> impl Fn(&'a str) -> ParseResult<'a, A>
where
P1: Fn(&'a str) -> ParseResult<'a, A>,
P2: Fn(&'a str) -> ParseResult<'a, A>,
{
move |input| p1(input).or_else(|| p2(input))
}
/// Parse zero or more
pub fn many<'a, A, P>(parser: P) -> impl Fn(&'a str) -> ParseResult<'a, Vec<A>>
where
P: Fn(&'a str) -> ParseResult<'a, A>,
{
move |mut input| {
let mut results = Vec::new();
while let Some((item, rest)) = parser(input) {
results.push(item);
input = rest;
}
Some((results, input))
}
}
/// Parse one or more
pub fn many1<'a, A, P>(parser: P) -> impl Fn(&'a str) -> ParseResult<'a, Vec<A>>
where
P: Fn(&'a str) -> ParseResult<'a, A>,
{
move |input| {
let (first, mut rest) = parser(input)?;
let mut results = vec![first];
while let Some((item, new_rest)) = parser(rest) {
results.push(item);
rest = new_rest;
}
Some((results, rest))
}
}
/// Parse with separator
pub fn sep_by<'a, A, S, PA, PS>(parser: PA, sep: PS) -> impl Fn(&'a str) -> ParseResult<'a, Vec<A>>
where
PA: Fn(&'a str) -> ParseResult<'a, A>,
PS: Fn(&'a str) -> ParseResult<'a, S>,
{
move |input| {
let Some((first, mut rest)) = parser(input) else {
return Some((Vec::new(), input));
};
let mut results = vec![first];
while let Some((_, after_sep)) = sep(rest) {
if let Some((item, new_rest)) = parser(after_sep) {
results.push(item);
rest = new_rest;
} else {
break;
}
}
Some((results, rest))
}
}
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
// Example: Simple expression parser
// โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
pub fn parse_number(input: &str) -> ParseResult<i64> {
digits(input).and_then(|(s, rest)| s.parse().ok().map(|n| (n, rest)))
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_char_p() {
assert_eq!(char_p('a')("abc"), Some(('a', "bc")));
assert_eq!(char_p('a')("xyz"), None);
}
#[test]
fn test_string_p() {
assert_eq!(string_p("hello")("hello world"), Some(("hello", " world")));
assert_eq!(string_p("hello")("hi"), None);
}
#[test]
fn test_digits() {
assert_eq!(digits("123abc"), Some(("123", "abc")));
assert_eq!(digits("abc"), None);
}
#[test]
fn test_map() {
let num = map(digits, |s| s.parse::<i32>().unwrap());
assert_eq!(num("42x"), Some((42, "x")));
}
#[test]
fn test_then() {
let parser = then(char_p('a'), char_p('b'));
assert_eq!(parser("abc"), Some((('a', 'b'), "c")));
}
#[test]
fn test_or() {
let parser = or(char_p('a'), char_p('b'));
assert_eq!(parser("abc"), Some(('a', "bc")));
assert_eq!(parser("bcd"), Some(('b', "cd")));
}
#[test]
fn test_many() {
let parser = many(char_p('a'));
assert_eq!(parser("aaab"), Some((vec!['a', 'a', 'a'], "b")));
assert_eq!(parser("bbb"), Some((vec![], "bbb")));
}
#[test]
fn test_sep_by() {
let parser = sep_by(parse_number, char_p(','));
assert_eq!(parser("1,2,3"), Some((vec![1, 2, 3], "")));
}
}
(* Parser combinator in OCaml โ nom-style *)
(* โโ Core type โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ *)
(* A parser: input โ (value, rest) option *)
type 'a parser = string -> ('a * string) option
(* โโ Primitives โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ *)
let char_p c : char parser = fun s ->
if s = "" then None
else if s.[0] = c then Some (c, String.sub s 1 (String.length s - 1))
else None
let tag prefix : string parser = fun s ->
let n = String.length prefix in
if String.length s >= n && String.sub s 0 n = prefix
then Some (prefix, String.sub s n (String.length s - n))
else None
let take_while pred : string parser = fun s ->
let i = ref 0 in
let len = String.length s in
while !i < len && pred s.[!i] do incr i done;
Some (String.sub s 0 !i, String.sub s !i (len - !i))
let digit = take_while (fun c -> c >= '0' && c <= '9')
let alpha = take_while (fun c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
(* โโ Combinators โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ *)
let map p f : 'b parser = fun s ->
match p s with None -> None | Some (v, rest) -> Some (f v, rest)
let ( *> ) p1 p2 : 'b parser = fun s ->
match p1 s with None -> None | Some (_, rest) -> p2 rest
let ( <* ) p1 p2 : 'a parser = fun s ->
match p1 s with None -> None
| Some (v, rest) ->
match p2 rest with None -> None | Some (_, rest2) -> Some (v, rest2)
let pair p1 p2 : ('a * 'b) parser = fun s ->
match p1 s with None -> None
| Some (a, rest) ->
match p2 rest with None -> None | Some (b, rest2) -> Some ((a, b), rest2)
let sep_by p sep : 'a list parser = fun s ->
match p s with
| None -> Some ([], s)
| Some (first, rest) ->
let acc = ref [first] in
let current = ref rest in
let continue = ref true in
while !continue do
match (sep *> p) !current with
| None -> continue := false
| Some (v, rest2) -> acc := v :: !acc; current := rest2
done;
Some (List.rev !acc, !current)
(* โโ Key-value parser โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ *)
let key_value =
pair (alpha <* char_p '=') (take_while (fun c -> c <> ',' && c <> '\n'))
let kv_list = sep_by key_value (char_p ',')
let () =
let input = "name=Alice,age=30,city=Berlin" in
match kv_list input with
| Some (pairs, rest) ->
Printf.printf "Parsed %d pairs (rest=%S):\n" (List.length pairs) rest;
List.iter (fun (k, v) -> Printf.printf " %s => %s\n" k v) pairs
| None -> Printf.printf "Parse failed\n"