๐Ÿฆ€ Functional Rust

068: Frequency Counter

Difficulty: โญ Level: Foundations Count how many times each word (or any value) appears โ€” using `HashMap` and the `entry` API.

The Problem This Solves

You have a list of items and you want to know how often each one appears. Word counts in a document. Event types in a log file. HTTP status codes in a request trace. Every programmer needs this pattern constantly. The painful way: for each unique item, scan the whole list and count. O(nยฒ). The dictionary approach is O(n) and every modern language has it. Python: `collections.Counter`. JavaScript: build an object with `{}`. Java: `HashMap<String, Integer>` with a null-check. Rust's `HashMap` is the same idea, but the update pattern is more explicit โ€” and more expressive. The `entry()` API is idiomatic Rust: "get the entry for this key, inserting a default if it's absent, then give me a mutable reference to the value."

The Intuition

The key insight is `entry().or_insert(0)`. It means: "look up this key; if it's not there, insert 0; either way, give me a mutable reference to the value." Then you `+= 1` through that reference. In Python: `counts[word] = counts.get(word, 0) + 1`. In JavaScript: `counts[word] = (counts[word] || 0) + 1`. Rust's version is more structured but expresses the same idea, and it avoids two hash lookups (one to check, one to insert) by doing it in one step.

How It Works in Rust

use std::collections::HashMap;

pub fn word_freq(text: &str) -> HashMap<String, usize> {
 let mut freq = HashMap::new();
 for word in text.split_whitespace() {
     let lower = word.to_lowercase();
     // entry(): find or create the slot for this key
     // or_insert(0): if absent, insert 0 and return &mut 0
     *freq.entry(lower).or_insert(0) += 1;
 }
 freq
}
Functional style with `fold`:
pub fn word_freq_functional(text: &str) -> HashMap<String, usize> {
 text.split_whitespace()
     .map(|w| w.to_lowercase())
     .fold(HashMap::new(), |mut acc, word| {
         *acc.entry(word).or_insert(0) += 1;
         acc
     })
}
Use `BTreeMap` when you need sorted output (like OCaml's `Map`, which is always ordered):
use std::collections::BTreeMap;

pub fn word_freq_sorted(text: &str) -> BTreeMap<String, usize> {
 let mut freq = BTreeMap::new();
 for word in text.split_whitespace() {
     *freq.entry(word.to_lowercase()).or_insert(0) += 1;
 }
 freq
}

What This Unlocks

Key Differences

ConceptOCamlRust
Map type`Map.Make(String)` (ordered, immutable)`HashMap` (unordered) or `BTreeMap` (ordered)
Update pattern`find` + `add` (two steps, replace whole map)`entry().or_insert()` (one step, mutable ref)
Missing key`try โ€ฆ with Not_found -> 0``.or_insert(0)` handles it inline
Ordered outputDefault (always ordered)Use `BTreeMap` explicitly
/// # Map Module โ€” Frequency Counter
///
/// Word frequency counting using HashMap โ€” the Rust equivalent of OCaml's Map.Make.

use std::collections::HashMap;

/// Idiomatic Rust: split, lowercase, count with entry API.
/// The `entry().or_insert(0)` pattern is the standard way to update-or-insert.
pub fn word_freq(text: &str) -> HashMap<String, usize> {
    let mut freq = HashMap::new();
    for word in text.split_whitespace() {
        let lower = word.to_lowercase();
        // `entry` API: get mutable reference, inserting default if absent
        *freq.entry(lower).or_insert(0) += 1;
    }
    freq
}

/// Functional style using fold (iterator chain).
pub fn word_freq_functional(text: &str) -> HashMap<String, usize> {
    text.split_whitespace()
        .map(|w| w.to_lowercase())
        .fold(HashMap::new(), |mut acc, word| {
            *acc.entry(word).or_insert(0) += 1;
            acc
        })
}

/// Using BTreeMap for sorted output (like OCaml's Map which is ordered).
pub fn word_freq_sorted(text: &str) -> std::collections::BTreeMap<String, usize> {
    let mut freq = std::collections::BTreeMap::new();
    for word in text.split_whitespace() {
        *freq.entry(word.to_lowercase()).or_insert(0) += 1;
    }
    freq
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        let freq = word_freq("the cat sat on the mat the cat");
        assert_eq!(freq["the"], 3);
        assert_eq!(freq["cat"], 2);
        assert_eq!(freq["sat"], 1);
    }

    #[test]
    fn test_empty() {
        let freq = word_freq("");
        assert!(freq.is_empty());
    }

    #[test]
    fn test_single_word() {
        let freq = word_freq("hello");
        assert_eq!(freq["hello"], 1);
    }

    #[test]
    fn test_case_insensitive() {
        let freq = word_freq("Hello hello HELLO");
        assert_eq!(freq["hello"], 3);
    }

    #[test]
    fn test_functional_version() {
        let freq = word_freq_functional("a b a c b a");
        assert_eq!(freq["a"], 3);
        assert_eq!(freq["b"], 2);
    }

    #[test]
    fn test_sorted() {
        let freq = word_freq_sorted("b a c a b");
        let keys: Vec<&String> = freq.keys().collect();
        assert_eq!(keys, vec!["a", "b", "c"]); // sorted!
    }
}

fn main() {
    println!("{:?}", freq["the"], 3);
    println!("{:?}", freq["cat"], 2);
    println!("{:?}", freq["sat"], 1);
}
(* Map Module โ€” Frequency Counter *)

module SMap = Map.Make(String)

let word_freq text =
  text |> String.split_on_char ' '
  |> List.map String.lowercase_ascii
  |> List.fold_left (fun acc w ->
    let count = try SMap.find w acc with Not_found -> 0 in
    SMap.add w (count + 1) acc
  ) SMap.empty

let () =
  let freq = word_freq "the cat sat on the mat the cat" in
  SMap.iter (Printf.printf "%s: %d\n") freq;
  assert (SMap.find "the" freq = 3);
  assert (SMap.find "cat" freq = 2);
  Printf.printf "Frequency counter tests passed!\n"

๐Ÿ“Š Detailed Comparison

Frequency Counter โ€” OCaml vs Rust Comparison

Core Insight

Counting frequencies is the classic map use case. OCaml's `Map.Make` functor creates an immutable balanced tree map โ€” each `add` creates a new map. Rust's `HashMap` is mutable with an `entry` API that makes update-or-insert a one-liner. For sorted output, Rust offers `BTreeMap`.

OCaml Approach

`Map.Make(String)` creates a module with an ordered map backed by balanced binary trees. Each update via `SMap.add` returns a new immutable map (structural sharing keeps this efficient). The `find` + `Not_found` exception pattern is common but slightly awkward.

Rust Approach

`HashMap::entry()` returns an `Entry` enum (Occupied or Vacant) โ€” calling `.or_insert(0)` provides a mutable reference to the count, which can be incremented in place. No exception handling needed. For ordered iteration, use `BTreeMap` instead.

Comparison Table

AspectOCamlRust
MemoryImmutable tree (shared nodes)Mutable hash table
Null safety`Not_found` exception`entry` API (no missing key panic)
ErrorsException on missing key`Entry` enum handles presence
Iteration`SMap.iter` (sorted by key)Unordered (HashMap) or sorted (BTreeMap)
Update`find` + `add` (new map)`entry().or_insert()` (in-place)

Things Rust Learners Should Notice

1. `entry()` API โ€” the killer feature for maps; avoids double-lookup (check then insert) 2. `*freq.entry(k).or_insert(0) += 1` โ€” dereference the mutable ref, then increment 3. `HashMap` vs `BTreeMap` โ€” hash is O(1) average but unordered; BTree is O(log n) but sorted 4. OCaml's Map is immutable โ€” functional purity at the cost of creating new nodes on each update 5. `split_whitespace()` โ€” handles multiple spaces, tabs, newlines; better than `split(' ')`

Further Reading

  • [HashMap::entry](https://doc.rust-lang.org/std/collections/struct.HashMap.html#method.entry)
  • [BTreeMap](https://doc.rust-lang.org/std/collections/struct.BTreeMap.html)