Number Parser
Functional Programming
Tutorial
The Problem
Floating-point numbers in text formats (JSON, CSV, scientific data) require parsing optional sign, integer digits, optional decimal point and fractional digits, and optional exponent notation (1.5e-10). Each component is optional or required in a specific combination. This example builds a full floating-point parser using combinators, demonstrating how complex lexical rules reduce to composed simple rules with clear, testable components.
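To make the grammar above concrete, here is a minimal hand-written scanner, offered as a sketch rather than the tutorial's own implementation. The names `scan_float` and `parse_float_prefix` are hypothetical. The sketch assumes that at least one integer digit is required, and that a `.` or `e` not followed by a digit is left unconsumed for the caller.

```rust
/// Hypothetical helper: split `input` into (float text, remaining input).
/// Assumed grammar: sign? digit+ ('.' digit+)? (('e' | 'E') sign? digit+)?
fn scan_float(input: &str) -> Result<(&str, &str), String> {
    let bytes = input.as_bytes();
    let mut pos = 0;

    // Optional sign.
    if pos < bytes.len() && (bytes[pos] == b'+' || bytes[pos] == b'-') {
        pos += 1;
    }

    // Integer digits: at least one is required in this sketch.
    let int_start = pos;
    while pos < bytes.len() && bytes[pos].is_ascii_digit() {
        pos += 1;
    }
    if pos == int_start {
        return Err(format!("expected a digit at byte {pos}"));
    }

    // Optional fraction: only consumed when '.' is followed by a digit.
    if pos + 1 < bytes.len() && bytes[pos] == b'.' && bytes[pos + 1].is_ascii_digit() {
        pos += 1;
        while pos < bytes.len() && bytes[pos].is_ascii_digit() {
            pos += 1;
        }
    }

    // Optional exponent: committed only if digits follow the marker and sign.
    if pos < bytes.len() && (bytes[pos] == b'e' || bytes[pos] == b'E') {
        let mut p = pos + 1;
        if p < bytes.len() && (bytes[p] == b'+' || bytes[p] == b'-') {
            p += 1;
        }
        let exp_start = p;
        while p < bytes.len() && bytes[p].is_ascii_digit() {
            p += 1;
        }
        if p > exp_start {
            pos = p;
        }
    }

    // `pos` only ever advanced over ASCII bytes, so it is a valid char boundary.
    Ok((&input[..pos], &input[pos..]))
}

/// Hypothetical wrapper: scan, then convert the matched text with `str::parse`.
fn parse_float_prefix(input: &str) -> Result<(f64, &str), String> {
    let (text, rest) = scan_float(input)?;
    let value = text.parse::<f64>().map_err(|e| format!("{e}: {text}"))?;
    Ok((value, rest))
}
```

Because the scanner validates the shape before conversion, an input such as `1.2.3` yields the value `1.2` with `.3` left over, instead of failing inside the conversion step.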
🎯 Learning Outcomes
opt and many1 combine to handle optional and required components
str::parse
Code Example
fn float_string<'a>() -> Parser<'a, &'a str> {
    Box::new(|input: &'a str| {
        let bytes = input.as_bytes();
        let mut pos = 0;
        // Optional leading sign.
        if pos < bytes.len() && (bytes[pos] == b'+' || bytes[pos] == b'-') {
            pos += 1;
        }
        // Run of integer digits.
        while pos < bytes.len() && bytes[pos].is_ascii_digit() {
            pos += 1;
        }
        // ... decimal, exponent ...
        // Return the matched prefix and the unconsumed remainder.
        Ok((&input[..pos], &input[pos..]))
    })
}
Key Differences
OCaml's take_while1 + float_of_string is concise but permissive; Rust's combinator parser is strict but verbose.
OCaml's float_of_string raises Failure on invalid input; Rust's str::parse::<f64>() returns Result, propagated via ?.
OCaml's take_while1 works directly on the buffer; Rust's combinator version collects Vec<char> before converting.
. as decimal separator); locale-aware parsing requires additional handling.
OCaml Approach
Angstrom provides a direct approach:
let number =
  take_while1 (fun c -> Char.is_digit c || c = '.' || c = 'e' || c = 'E'
                        || c = '+' || c = '-')
  >>| float_of_string
This is a common shortcut, but take_while1 consumes any run of these characters, including malformed input such as "1.2.3". The failure then surfaces only when float_of_string raises an exception, not as a parse error. A stricter combinator parser follows the BNF more closely.
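For comparison, a stricter parser that follows the grammar could look like the following Rust sketch. It is not the tutorial's implementation: the `Parser` alias and the helpers `satisfy`, `one_of`, `digits`, `pair`, `opt`, `many1`, and `float` are assumptions made for illustration, with `opt` and `many1` playing the roles named in the learning outcomes.

```rust
// Assumed parser shape: consume a prefix of the input, return (value, rest).
type ParseResult<'a, T> = Result<(T, &'a str), String>;
type Parser<'a, T> = Box<dyn Fn(&'a str) -> ParseResult<'a, T> + 'a>;

/// Match a single character accepted by `pred`.
fn satisfy<'a>(pred: impl Fn(char) -> bool + 'a) -> Parser<'a, char> {
    Box::new(move |input: &'a str| match input.chars().next() {
        Some(c) if pred(c) => Ok((c, &input[c.len_utf8()..])),
        _ => Err(format!("unexpected input at {input:?}")),
    })
}

/// Match any one character contained in `set`.
fn one_of<'a>(set: &'static str) -> Parser<'a, char> {
    satisfy(move |c: char| set.contains(c))
}

/// Run `a`, then `b` on what is left, keeping both results.
fn pair<'a, A: 'a, B: 'a>(a: Parser<'a, A>, b: Parser<'a, B>) -> Parser<'a, (A, B)> {
    Box::new(move |input: &'a str| {
        let (x, rest) = a(input)?;
        let (y, rest) = b(rest)?;
        Ok(((x, y), rest))
    })
}

/// Zero or one occurrence of `p`; on failure, nothing is consumed.
fn opt<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Option<T>> {
    Box::new(move |input: &'a str| match p(input) {
        Ok((v, rest)) => Ok((Some(v), rest)),
        Err(_) => Ok((None, input)),
    })
}

/// One or more occurrences of `p` (which must consume input on success).
fn many1<'a, T: 'a>(p: Parser<'a, T>) -> Parser<'a, Vec<T>> {
    Box::new(move |mut input: &'a str| {
        let mut out = Vec::new();
        while let Ok((v, rest)) = p(input) {
            out.push(v);
            input = rest;
        }
        if out.is_empty() {
            Err(format!("expected at least one match at {input:?}"))
        } else {
            Ok((out, input))
        }
    })
}

fn digits<'a>() -> Parser<'a, Vec<char>> {
    many1(satisfy(|c: char| c.is_ascii_digit()))
}

/// float := sign? digit+ ('.' digit+)? (('e' | 'E') sign? digit+)?
fn float<'a>() -> Parser<'a, f64> {
    let mantissa = pair(opt(one_of("+-")), digits());
    let fraction = opt(pair(one_of("."), digits()));
    let exponent = opt(pair(one_of("eE"), pair(opt(one_of("+-")), digits())));
    let grammar = pair(mantissa, pair(fraction, exponent));
    Box::new(move |input: &'a str| {
        let (_, rest) = grammar(input)?;
        // Convert exactly the text the grammar accepted.
        let text = &input[..input.len() - rest.len()];
        let value = text.parse::<f64>().map_err(|e| format!("{e}: {text}"))?;
        Ok((value, rest))
    })
}
```

Wrapping the fraction and the exponent in `opt(pair(...))` makes each one all-or-nothing: given `"1.2.3"`, the sketch stops after `1.2` and hands back `".3"` as unconsumed input.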
Full Source
//! # Number Parser
//!
//! Parse integers and floats from `&str` with validation and error handling.
//!
//! Four approaches mirror the OCaml source:
//! * [`parse_int_safe`] — delegate to the standard library's `str::parse`.
//! * [`parse_int_custom`] — scan digits ourselves, rejecting any non-digit.
//! * [`parse_int_with_sign`] — extend the custom scanner with an optional `+`/`-` prefix.
//! * [`parse_float_safe`] — standard-library float parsing, same shape as the integer version.
//!
//! Each function returns `Result<T, String>` so callers can distinguish success from
//! a malformed input and get a human-readable message back.
/// Parse a signed decimal integer using the standard library.
///
/// Returns `Err` with a human-readable message if the input is not a valid `i64`.
pub fn parse_int_safe(s: &str) -> Result<i64, String> {
    s.parse::<i64>()
        .map_err(|_| format!("Not a valid integer: {s}"))
}
/// Parse an unsigned decimal integer by scanning each character.
///
/// Any non-digit (including a leading sign or a trailing letter) is rejected.
/// An empty input is also rejected, matching the OCaml version.
/// A value too large for `i64` is reported as an overflow.
pub fn parse_int_custom(s: &str) -> Result<i64, String> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("Invalid characters: {s}"));
    }
    // Every byte is an ASCII digit here, so the fold can only fail on overflow.
    s.bytes()
        .try_fold(0i64, |acc, b| {
            acc.checked_mul(10)?.checked_add(i64::from(b - b'0'))
        })
        .ok_or_else(|| format!("Integer overflow: {s}"))
}
/// Parse a decimal integer with an optional leading `+` or `-`.
///
/// The magnitude is parsed before the sign is applied, so `i64::MIN`
/// (whose magnitude exceeds `i64::MAX`) is rejected.
pub fn parse_int_with_sign(s: &str) -> Result<i64, String> {
    let (sign, digits) = match s.as_bytes().first() {
        Some(b'-') => (-1, &s[1..]),
        Some(b'+') => (1, &s[1..]),
        _ => (1, s),
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(if sign == -1 {
            format!("Invalid negative number: {s}")
        } else {
            format!("Invalid positive number: {s}")
        });
    }
    parse_int_custom(digits).map(|n| sign * n).map_err(|_| {
        if sign == -1 {
            format!("Invalid negative number: {s}")
        } else {
            format!("Invalid positive number: {s}")
        }
    })
}
/// Parse a floating-point number using the standard library.
pub fn parse_float_safe(s: &str) -> Result<f64, String> {
    s.parse::<f64>()
        .map_err(|_| format!("Not a valid float: {s}"))
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn int_safe_accepts_valid_input() {
        assert_eq!(parse_int_safe("42"), Ok(42));
        assert_eq!(parse_int_safe("-17"), Ok(-17));
        assert_eq!(parse_int_safe("0"), Ok(0));
    }

    #[test]
    fn int_safe_rejects_invalid_input() {
        assert_eq!(
            parse_int_safe("abc"),
            Err("Not a valid integer: abc".to_string())
        );
        assert_eq!(parse_int_safe(""), Err("Not a valid integer: ".to_string()));
        assert!(parse_int_safe("1.5").is_err());
    }

    #[test]
    fn int_custom_accepts_only_digits() {
        assert_eq!(parse_int_custom("123"), Ok(123));
        assert_eq!(parse_int_custom("0"), Ok(0));
    }

    #[test]
    fn int_custom_rejects_non_digits() {
        assert_eq!(
            parse_int_custom("12a3"),
            Err("Invalid characters: 12a3".to_string())
        );
        assert_eq!(
            parse_int_custom(""),
            Err("Invalid characters: ".to_string())
        );
        assert!(
            parse_int_custom("-5").is_err(),
            "sign belongs to parse_int_with_sign"
        );
    }

    #[test]
    fn int_with_sign_handles_prefixes() {
        assert_eq!(parse_int_with_sign("+5"), Ok(5));
        assert_eq!(parse_int_with_sign("-5"), Ok(-5));
        assert_eq!(parse_int_with_sign("5"), Ok(5));
    }

    #[test]
    fn int_with_sign_reports_direction() {
        assert_eq!(
            parse_int_with_sign("-abc"),
            Err("Invalid negative number: -abc".to_string())
        );
        assert_eq!(
            parse_int_with_sign("+abc"),
            Err("Invalid positive number: +abc".to_string())
        );
        assert_eq!(
            parse_int_with_sign("abc"),
            Err("Invalid positive number: abc".to_string())
        );
    }

    #[test]
    fn float_safe_accepts_valid_input() {
        assert_eq!(parse_float_safe("2.5"), Ok(2.5));
        assert_eq!(parse_float_safe("-2.0"), Ok(-2.0));
        assert_eq!(parse_float_safe("1e10"), Ok(1e10));
    }

    #[test]
    fn float_safe_rejects_invalid_input() {
        assert_eq!(
            parse_float_safe("abc"),
            Err("Not a valid float: abc".to_string())
        );
    }

    #[test]
    fn int_custom_detects_overflow() {
        // 2^63 = 9223372036854775808, one past i64::MAX.
        assert!(parse_int_custom("9223372036854775808").is_err());
    }
}
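One edge case the listing does not cover: `parse_int_with_sign` parses the magnitude before applying the sign, so `"-9223372036854775808"` (`i64::MIN`) is rejected because its magnitude is one past `i64::MAX`. A possible variant, sketched here under the hypothetical name `parse_int_full_range`, accumulates in the direction of the sign so the whole `i64` range is reachable. Its error messages are placeholders and do not match the ones in the listing.

```rust
/// Hypothetical variant of the signed parser that accepts the full `i64` range.
pub fn parse_int_full_range(s: &str) -> Result<i64, String> {
    let (negative, digits) = match s.as_bytes().first() {
        Some(b'-') => (true, &s[1..]),
        Some(b'+') => (false, &s[1..]),
        _ => (false, s),
    };
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("Invalid integer: {s}"));
    }
    digits
        .bytes()
        .try_fold(0i64, |acc, b| {
            let digit = i64::from(b - b'0');
            let shifted = acc.checked_mul(10)?;
            // Subtract for negative inputs so i64::MIN never needs a positive magnitude.
            if negative {
                shifted.checked_sub(digit)
            } else {
                shifted.checked_add(digit)
            }
        })
        .ok_or_else(|| format!("Integer out of range: {s}"))
}
```

The checked arithmetic still catches both directions of overflow, so values one past either bound are rejected.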
Deep Comparison
Comparison: Example 164 — Number Parser
Imperative scanner
OCaml:
let float_string : string parser = fun input ->
  let buf = Buffer.create 16 in
  let pos = ref 0 in
  let len = String.length input in
  if !pos < len && (input.[!pos] = '+' || input.[!pos] = '-') then begin
    Buffer.add_char buf input.[!pos]; incr pos end;
  while !pos < len && is_digit input.[!pos] do
    Buffer.add_char buf input.[!pos]; incr pos done;
  (* ... decimal, exponent ... *)
  Ok (Buffer.contents buf, String.sub input !pos (len - !pos))
Rust:
fn float_string<'a>() -> Parser<'a, &'a str> {
    Box::new(|input: &'a str| {
        let bytes = input.as_bytes();
        let mut pos = 0;
        if pos < bytes.len() && (bytes[pos] == b'+' || bytes[pos] == b'-') {
            pos += 1;
        }
        while pos < bytes.len() && bytes[pos].is_ascii_digit() {
            pos += 1;
        }
        // ... decimal, exponent ...
        Ok((&input[..pos], &input[pos..]))
    })
}
String to float conversion
OCaml:
float_of_string "3.14" (* 3.14 *)
Rust:
"3.14".parse::<f64>() // Ok(3.14)
Exercises
"1.5e-10", "2.0E+3" should parse correctly.
"01" is invalid in JSON).
"3/4" → (3, 4) as a pair of integers.