Example 070: Hamming Distance
Difficulty: โญ Category: String Processing Concept: Counting positional differences between two equal-length strings using zip and filter. A clean demonstration of pairwise character comparison as a higher-order function pipeline. OCaml โ Rust insight: OCaml needs `List.combine` to zip two lists; Rust's `Iterator::zip` works lazily on any iterator โ including `chars()` on strings โ without intermediate allocation./// Hamming Distance
///
/// Count positional differences between two equal-length strings.
/// A clean example of zip + filter + count pattern in both languages.
/// Using zip + filter + count โ idiomatic Rust.
pub fn hamming_distance(s1: &str, s2: &str) -> Result<usize, String> {
if s1.len() != s2.len() {
return Err("strands must be of equal length".to_string());
}
Ok(s1.chars().zip(s2.chars()).filter(|(a, b)| a != b).count())
}
/// Using fold โ mirrors OCaml's fold_left2 approach.
pub fn hamming_fold(s1: &str, s2: &str) -> Result<usize, String> {
if s1.len() != s2.len() {
return Err("strands must be of equal length".to_string());
}
Ok(s1
.chars()
.zip(s2.chars())
.fold(0, |acc, (a, b)| if a != b { acc + 1 } else { acc }))
}
/// Byte-level comparison โ faster for ASCII strings.
pub fn hamming_bytes(s1: &[u8], s2: &[u8]) -> Result<usize, String> {
if s1.len() != s2.len() {
return Err("strands must be of equal length".to_string());
}
Ok(s1.iter().zip(s2.iter()).filter(|(a, b)| a != b).count())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_hamming_basic() {
assert_eq!(
hamming_distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"),
Ok(7)
);
}
#[test]
fn test_identical() {
assert_eq!(hamming_distance("AAAA", "AAAA"), Ok(0));
}
#[test]
fn test_completely_different() {
assert_eq!(hamming_distance("AAAA", "TTTT"), Ok(4));
}
#[test]
fn test_unequal_length() {
assert!(hamming_distance("AA", "AAA").is_err());
}
#[test]
fn test_empty() {
assert_eq!(hamming_distance("", ""), Ok(0));
}
#[test]
fn test_fold_variant() {
assert_eq!(hamming_fold("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"), Ok(7));
}
#[test]
fn test_single_char() {
assert_eq!(hamming_distance("A", "G"), Ok(1));
assert_eq!(hamming_distance("A", "A"), Ok(0));
}
}
fn main() {
println!("{:?}", hamming_distance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"),
Ok(7));
println!("{:?}", hamming_distance("AAAA", "AAAA"), Ok(0));
println!("{:?}", hamming_distance("AAAA", "TTTT"), Ok(4));
}
exception Invalid_argument of string
let chars_of_string s =
List.init (String.length s) (String.get s)
let hamming_distance s1 s2 =
if String.length s1 <> String.length s2 then
raise (Invalid_argument "strands must be of equal length");
List.combine (chars_of_string s1) (chars_of_string s2)
|> List.filter (fun (a, b) -> a <> b)
|> List.length
let hamming_fold s1 s2 =
if String.length s1 <> String.length s2 then
raise (Invalid_argument "strands must be of equal length");
List.fold_left2
(fun acc c1 c2 -> if c1 <> c2 then acc + 1 else acc)
0
(chars_of_string s1)
(chars_of_string s2)
let () =
assert (hamming_distance "GAGCCTACTAACGGGAT" "CATCGTAATGACGGCCT" = 7);
assert (hamming_distance "AAAA" "AAAA" = 0);
assert (hamming_fold "AAAA" "TTTT" = 4);
print_endline "All assertions passed."
๐ Detailed Comparison
Hamming Distance: OCaml vs Rust
The Core Insight
Hamming distance is a perfect showcase for the zip-filter-count pipeline pattern. Both languages express it elegantly, but Rust's lazy iterators avoid the intermediate list allocations that OCaml's eager `List.combine` creates.OCaml Approach
OCaml's approach first converts strings to character lists with `chars_of_string`, then uses `List.combine` to pair corresponding characters, `List.filter` to keep mismatches, and `List.length` to count them. Each step creates a new list. The `fold_left2` variant avoids intermediate lists by folding over two lists simultaneously. Error handling uses exceptions (`raise Invalid_argument`).Rust Approach
Rust's `s1.chars().zip(s2.chars()).filter(|(a, b)| a != b).count()` is a single lazy pipeline โ no intermediate collections are allocated. Each iterator adapter transforms the stream without materializing it. Error handling uses `Result<usize, String>`, forcing callers to handle the error case at compile time. The byte-level variant (`&[u8]`) avoids UTF-8 decoding overhead for ASCII strings.Side-by-Side
| Concept | OCaml | Rust |
|---|---|---|
| String to chars | `chars_of_string` (custom) | `.chars()` (built-in iterator) |
| Zip | `List.combine` (eager, allocates) | `.zip()` (lazy, zero allocation) |
| Filter | `List.filter` (eager) | `.filter()` (lazy) |
| Count | `List.length` | `.count()` |
| Errors | `raise (Invalid_argument ...)` | `Err("...".to_string())` |
| Fold variant | `List.fold_left2` | `.zip().fold(...)` |
What Rust Learners Should Notice
- Rust iterators are lazy โ `.zip().filter().count()` makes a single pass with zero allocation, compared to OCaml's multiple intermediate lists
- `Result<T, E>` in the return type forces callers to handle the unequal-length case โ no hidden exceptions
- `.chars()` iterates over Unicode scalar values; for ASCII-only strings, working with `.bytes()` or `&[u8]` is faster
- The tuple destructuring `|(a, b)|` in closures mirrors OCaml's `(fun (a, b) -> ...)`
- Pattern: `iter().zip().filter().count()` is one of Rust's most common and powerful iterator patterns
Further Reading
- [Rust Iterator::zip](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.zip)
- [Exercism OCaml Hamming](https://exercism.org/tracks/ocaml/exercises/hamming)