🦀 Functional Rust
🎬 How Rust Iterators Work Lazy evaluation, chaining, collect(), and zero-cost abstractions.
📝 Text version (for readers / accessibility)

• Iterators are lazy — .map(), .filter(), .take() build a chain but do no work until consumed

• .collect() triggers evaluation, transforming the chain into a Vec, HashMap, or other collection

• Zero-cost abstraction: iterator chains compile to the same machine code as hand-written loops

• .iter() borrows, .into_iter() consumes, .iter_mut() borrows mutably

• Chaining replaces nested loops with a readable, composable pipeline

082: Nucleotide Count — Bioinformatics

Difficulty: Beginner Category: String Processing / Maps Concept: Counting character frequencies using maps and pattern matching Key Insight: OCaml uses persistent Map (immutable); Rust uses mutable HashMap with in-place updates via `get_mut`.
use std::collections::HashMap;

/// Nucleotide Count — counting character frequencies
///
/// Ownership: input DNA string is borrowed (&str).
/// Result HashMap is owned by the caller.

pub fn nucleotide_count(dna: &str) -> Result<HashMap<char, usize>, char> {
    let mut counts: HashMap<char, usize> =
        [('A', 0), ('C', 0), ('G', 0), ('T', 0)].into();

    for c in dna.chars() {
        match counts.get_mut(&c) {
            Some(n) => *n += 1,
            None => return Err(c),
        }
    }
    Ok(counts)
}

/// Version 2: Using fold
pub fn nucleotide_count_fold(dna: &str) -> Result<HashMap<char, usize>, char> {
    dna.chars().try_fold(
        [('A', 0), ('C', 0), ('G', 0), ('T', 0)].into_iter().collect::<HashMap<_, _>>(),
        |mut acc, c| {
            *acc.get_mut(&c).ok_or(c)? += 1;
            Ok(acc)
        },
    )
}

/// Version 3: Using array instead of HashMap for performance
pub fn nucleotide_count_array(dna: &str) -> Result<[usize; 4], char> {
    let mut counts = [0usize; 4];
    for c in dna.chars() {
        match c {
            'A' => counts[0] += 1,
            'C' => counts[1] += 1,
            'G' => counts[2] += 1,
            'T' => counts[3] += 1,
            _ => return Err(c),
        }
    }
    Ok(counts)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_gattaca() {
        let counts = nucleotide_count("GATTACA").unwrap();
        assert_eq!(counts[&'A'], 3);
        assert_eq!(counts[&'T'], 2);
        assert_eq!(counts[&'G'], 1);
        assert_eq!(counts[&'C'], 1);
    }

    #[test]
    fn test_empty() {
        let counts = nucleotide_count("").unwrap();
        assert!(counts.values().all(|&v| v == 0));
    }

    #[test]
    fn test_invalid() {
        assert_eq!(nucleotide_count("GATTXCA"), Err('X'));
    }

    #[test]
    fn test_fold_version() {
        let counts = nucleotide_count_fold("GATTACA").unwrap();
        assert_eq!(counts[&'A'], 3);
    }

    #[test]
    fn test_array_version() {
        let counts = nucleotide_count_array("GATTACA").unwrap();
        assert_eq!(counts, [3, 1, 1, 2]); // A, C, G, T
    }
}

fn main() {
    println!("{:?}", counts[&'A'], 3);
    println!("{:?}", counts[&'T'], 2);
    println!("{:?}", counts[&'G'], 1);
}
(* Nucleotide Count — Bioinformatics *)

module CMap = Map.Make(Char)

(* Version 1: Using Map for counting *)
let nucleotide_count dna =
  let init = List.fold_left (fun m c -> CMap.add c 0 m)
    CMap.empty ['A';'C';'G';'T'] in
  String.fold_left (fun m c ->
    match CMap.find_opt c m with
    | Some n -> CMap.add c (n + 1) m
    | None -> failwith ("invalid nucleotide: " ^ String.make 1 c)
  ) init dna

(* Version 2: Using a simple assoc list *)
let nucleotide_count_simple dna =
  let counts = [('A', ref 0); ('C', ref 0); ('G', ref 0); ('T', ref 0)] in
  String.iter (fun c ->
    (List.assoc c counts) := !(List.assoc c counts) + 1
  ) dna;
  List.map (fun (c, r) -> (c, !r)) counts

let () =
  let counts = nucleotide_count "GATTACA" in
  assert (CMap.find 'A' counts = 3);
  assert (CMap.find 'T' counts = 2)

📊 Detailed Comparison

Nucleotide Count — Comparison

Core Insight

Character frequency counting shows the difference between immutable maps (OCaml) and mutable maps (Rust). OCaml rebuilds the map on each update; Rust mutates in place. Both handle invalid input, but with different error mechanisms.

OCaml Approach

  • `Map.Make(Char)` — persistent (immutable) map, O(log n) per update
  • `String.fold_left` to iterate and accumulate
  • `failwith` for invalid nucleotides
  • Each update creates a new map (old one still valid)

Rust Approach

  • `HashMap<char, usize>` — mutable, O(1) amortized per update
  • `get_mut(&c)` for in-place mutation
  • `Result<HashMap, char>` for error handling
  • Array variant `[usize; 4]` avoids allocation entirely

Comparison Table

AspectOCamlRust
Map type`Map.Make(Char)` persistent`HashMap<char, usize>` mutable
UpdateNew map per insertIn-place `*n += 1`
Error`failwith` exception`Result<T, char>`
PerformanceO(log n) per updateO(1) amortized
Zero-alloc optionNoYes (array variant)

Learner Notes

  • `HashMap::from([...])` initializes from array of tuples
  • `get_mut` returns `Option<&mut V>` — the mutable reference pattern
  • `try_fold` is the Rust equivalent of OCaml fold with early exit
  • Array variant shows Rust's strength: zero-allocation when structure is known