153: Satisfy Parser
Difficulty: โญโญโญ Level: Foundations `satisfy(pred)` accepts any character where a predicate returns `true` โ the most powerful atomic parser, from which all character-class parsers are built.The Problem This Solves
Example 152 showed `char_parser`, `any_char`, `none_of`, and `one_of`. But what if you want "any digit"? You'd need to either list all ten digit characters in `one_of`, or write a new function from scratch. That's not composable โ every new character class means a new function. `satisfy` solves this by abstracting the condition. Instead of hardcoding what the character must be, you pass a predicate: any function `char -> bool`. "Is it a digit?" is just `|c| c.is_ascii_digit()`. "Is it a hex digit?" is `|c| c.is_ascii_hexdigit()`. "Is it a lowercase letter or a dash?" is `|c| c.is_ascii_lowercase() || c == '-'`. You never need to write a new atomic parser โ just pass a different predicate. This is the essence of higher-order programming: instead of writing many specialized functions, you write one general function and parameterize the varying part.The Intuition
You already use predicates all the time. `Vec::retain(|x| x > 0)` keeps only positive elements. `Iterator::filter(|s| s.starts_with("http"))` keeps matching strings. `satisfy` is the same idea applied to character parsing: keep the character if the predicate says yes, reject it otherwise. Rust's `char` type has rich built-in predicates:- `c.is_ascii_digit()` โ `'0'`..`'9'`
- `c.is_ascii_alphabetic()` โ `'a'`..`'z'` and `'A'`..`'Z'`
- `c.is_ascii_alphanumeric()` โ both of the above
- `c.is_ascii_whitespace()` โ space, tab, newline, carriage return
- `c.is_ascii_uppercase()` / `c.is_ascii_lowercase()`
How It Works in Rust
The `satisfy` combinator:fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
F: Fn(char) -> bool + 'a, // pred must be a function from char to bool,
// and must live at least as long as 'a
{
let desc = desc.to_string(); // own the description for the closure
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])), // predicate passes
Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
None => Err(format!("Expected {}, got EOF", desc)),
}
})
}
The `where F: Fn(char) -> bool + 'a` bound means: any callable that takes a `char` and returns `bool`, as long as it doesn't hold references that outlive `'a`. In practice, closures with no captures satisfy this trivially. The `desc` string is for error messages โ "digit", "letter", etc.
Building specific parsers from `satisfy`:
fn is_digit<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_digit(), "digit")
}
fn is_letter<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_alphabetic(), "letter")
}
// Custom predicate inline โ no new function needed:
let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
let sign = satisfy(|c| c == '+' || c == '-', "sign");
let vowel = satisfy(|c| "aeiou".contains(c), "vowel");
`satisfy_or` for richer error messages:
fn satisfy_or<'a, F, E>(pred: F, on_fail: E) -> Parser<'a, char>
where
F: Fn(char) -> bool + 'a,
E: Fn(char) -> String + 'a, // error message depends on what was found
{
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
Some(c) => Err(on_fail(c)), // custom error using the actual character
None => Err("Unexpected EOF".to_string()),
}
})
}
What This Unlocks
- Every character class parser you'll ever need โ digits, letters, whitespace, hex, punctuation โ all expressible in one line with `satisfy`.
- Custom parsers for domain-specific needs โ "valid URL character", "valid identifier character", "printable ASCII" โ just write the predicate.
- The foundation of examples 155โ162 โ `many0(satisfy(...))`, `map(satisfy(...), ...)`, and every parser combinator in the series uses `satisfy` internally.
Key Differences
| Concept | OCaml | Rust | ||
|---|---|---|---|---|
| Predicate type | `char -> bool` (inferred) | `F: Fn(char) -> bool + 'a` (explicit bound) | ||
| Char classification | Manual: `c >= '0' && c <= '9'` | Built-in: `c.is_ascii_digit()` | ||
| Closure syntax | `fun c -> c >= '0' && c <= '9'` | `\ | c\ | c.is_ascii_digit()` |
| Type annotations | None (inference handles it) | Bound on type parameter `F` needed | ||
| Custom errors | Description string | Can be a closure `Fn(char) -> String` |
// Example 153: Satisfy Parser
// Parse a character matching a predicate
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;
// ============================================================
// Approach 1: satisfy with predicate and description
// ============================================================
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
F: Fn(char) -> bool + 'a,
{
let desc = desc.to_string();
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
None => Err(format!("Expected {}, got EOF", desc)),
}
})
}
// ============================================================
// Approach 2: Build specific parsers from satisfy
// ============================================================
fn is_digit<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_digit(), "digit")
}
fn is_letter<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_alphabetic(), "letter")
}
fn is_alphanumeric<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_alphanumeric(), "alphanumeric")
}
fn is_whitespace_char<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_whitespace(), "whitespace")
}
fn is_uppercase<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_uppercase(), "uppercase letter")
}
fn is_lowercase<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_lowercase(), "lowercase letter")
}
// ============================================================
// Approach 3: satisfy_or with custom error function
// ============================================================
fn satisfy_or<'a, F, E>(pred: F, on_fail: E) -> Parser<'a, char>
where
F: Fn(char) -> bool + 'a,
E: Fn(char) -> String + 'a,
{
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
Some(c) => Err(on_fail(c)),
None => Err("Unexpected EOF".to_string()),
}
})
}
fn main() {
println!("=== satisfy-based parsers ===");
let digit = is_digit();
println!("digit on '42': {:?}", digit("42"));
println!("digit on 'ab': {:?}", digit("ab"));
let letter = is_letter();
println!("letter on 'hi': {:?}", letter("hi"));
let upper = is_uppercase();
println!("upper on 'Hi': {:?}", upper("Hi"));
println!("upper on 'hi': {:?}", upper("hi"));
let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
println!("hex on 'ff': {:?}", hex("ff"));
println!("hex on 'zz': {:?}", hex("zz"));
println!("\nโ All examples completed");
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_digit_success() {
let p = is_digit();
assert_eq!(p("42"), Ok(('4', "2")));
}
#[test]
fn test_digit_failure() {
let p = is_digit();
assert!(p("abc").is_err());
}
#[test]
fn test_letter_success() {
let p = is_letter();
assert_eq!(p("hello"), Ok(('h', "ello")));
}
#[test]
fn test_letter_failure() {
let p = is_letter();
assert!(p("123").is_err());
}
#[test]
fn test_alphanumeric() {
let p = is_alphanumeric();
assert_eq!(p("a1"), Ok(('a', "1")));
assert_eq!(p("1a"), Ok(('1', "a")));
assert!(p("!x").is_err());
}
#[test]
fn test_whitespace() {
let p = is_whitespace_char();
assert_eq!(p(" x"), Ok((' ', "x")));
assert_eq!(p("\tx"), Ok(('\t', "x")));
assert!(p("x").is_err());
}
#[test]
fn test_uppercase() {
let p = is_uppercase();
assert_eq!(p("Hello"), Ok(('H', "ello")));
assert!(p("hello").is_err());
}
#[test]
fn test_lowercase() {
let p = is_lowercase();
assert_eq!(p("hello"), Ok(('h', "ello")));
assert!(p("Hello").is_err());
}
#[test]
fn test_custom_predicate() {
let hex = satisfy(|c| c.is_ascii_hexdigit(), "hex digit");
assert_eq!(hex("ff"), Ok(('f', "f")));
assert!(hex("zz").is_err());
}
#[test]
fn test_satisfy_or_custom_error() {
let p = satisfy_or(
|c| c == '@',
|c| format!("Expected '@', found '{}'", c),
);
assert_eq!(p("@hello"), Ok(('@', "hello")));
assert_eq!(p("hello"), Err("Expected '@', found 'h'".to_string()));
}
#[test]
fn test_empty_input() {
let p = is_digit();
assert!(p("").is_err());
}
}
(* Example 153: Satisfy Parser *)
(* Parse a character matching a predicate *)
type 'a parse_result = ('a * string, string) result
type 'a parser = string -> 'a parse_result
let advance input =
if String.length input > 0 then
Some (input.[0], String.sub input 1 (String.length input - 1))
else None
(* Approach 1: satisfy with a predicate *)
let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
match advance input with
| Some (ch, rest) when pred ch -> Ok (ch, rest)
| Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
| None -> Error (Printf.sprintf "Expected %s, got EOF" desc)
(* Approach 2: Build specific parsers from satisfy *)
let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c ->
(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"
let is_alphanumeric = satisfy (fun c ->
(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
(c >= '0' && c <= '9')) "alphanumeric"
let is_whitespace = satisfy (fun c -> c = ' ' || c = '\t' || c = '\n' || c = '\r') "whitespace"
(* Approach 3: Satisfy with custom error message *)
let satisfy_or (pred : char -> bool) (on_fail : char -> string) : char parser = fun input ->
match advance input with
| Some (ch, rest) when pred ch -> Ok (ch, rest)
| Some (ch, _) -> Error (on_fail ch)
| None -> Error "Unexpected EOF"
let is_uppercase = satisfy_or
(fun c -> c >= 'A' && c <= 'Z')
(fun c -> Printf.sprintf "'%c' is not uppercase" c)
(* Tests *)
let () =
assert (is_digit "42" = Ok ('4', "2"));
assert (Result.is_error (is_digit "abc"));
assert (is_letter "hello" = Ok ('h', "ello"));
assert (Result.is_error (is_letter "123"));
assert (is_alphanumeric "a1" = Ok ('a', "1"));
assert (is_alphanumeric "1a" = Ok ('1', "a"));
assert (is_whitespace " x" = Ok (' ', "x"));
assert (is_uppercase "Hello" = Ok ('H', "ello"));
assert (Result.is_error (is_uppercase "hello"));
assert (Result.is_error (is_digit ""));
print_endline "โ All tests passed"
๐ Detailed Comparison
Comparison: Example 153 โ Satisfy Parser
Core satisfy
OCaml:
๐ช Show OCaml equivalent
let satisfy (pred : char -> bool) (desc : string) : char parser = fun input ->
match advance input with
| Some (ch, rest) when pred ch -> Ok (ch, rest)
| Some (ch, _) -> Error (Printf.sprintf "Character '%c' does not satisfy %s" ch desc)
| None -> Error (Printf.sprintf "Expected %s, got EOF" desc)
Rust:
fn satisfy<'a, F>(pred: F, desc: &str) -> Parser<'a, char>
where
F: Fn(char) -> bool + 'a,
{
let desc = desc.to_string();
Box::new(move |input: &'a str| {
match input.chars().next() {
Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
Some(c) => Err(format!("'{}' does not satisfy {}", c, desc)),
None => Err(format!("Expected {}, got EOF", desc)),
}
})
}Building specific parsers
OCaml:
๐ช Show OCaml equivalent
let is_digit = satisfy (fun c -> c >= '0' && c <= '9') "digit"
let is_letter = satisfy (fun c ->
(c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) "letter"
Rust:
fn is_digit<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_digit(), "digit")
}
fn is_letter<'a>() -> Parser<'a, char> {
satisfy(|c| c.is_ascii_alphabetic(), "letter")
}