🦀 Functional Rust

844: Greedy Algorithm Patterns — Activity Selection and Huffman Coding

Difficulty: 3 Level: Intermediate Make the locally optimal choice at each step — and prove it gives the global optimum for the right problems.

The Problem This Solves

A greedy algorithm makes the locally optimal decision at each step without revisiting previous choices. It's correct when the problem has two properties: the greedy choice property (a globally optimal solution can always be built by making the locally optimal choice) and optimal substructure (the optimal solution to the whole problem contains optimal solutions to subproblems). When these hold, greedy is far simpler and faster than dynamic programming. This example implements two canonical greedy problems: Both problems have O(n log n) greedy solutions — orders of magnitude faster than their DP counterparts would be if you tried them on large inputs.

The Intuition

Activity selection: sort by finish time. Greedily pick each activity if it doesn't overlap the last selected one. Why does earliest-finish work? Any activity that finishes earlier leaves more room for future activities — swapping any optimal choice with an earlier-finishing one can only help. Huffman coding: build a priority queue (min-heap) of characters by frequency. Repeatedly extract the two least-frequent nodes, create a new internal node with their combined frequency, and re-insert. The final tree assigns shorter codes to frequent characters and longer codes to rare ones — provably minimising total encoded length. Activity selection: O(n log n) to sort, O(n) to scan. Huffman: O(n log n) with a binary heap. Both are elegant proofs by exchange argument: assume a better solution exists, swap its first differing choice with the greedy choice, and show the result is no worse.

How It Works in Rust

// Activity Selection: maximum non-overlapping intervals
fn activity_selection(intervals: &[(usize, usize)]) -> Vec<usize> {
 let mut sorted: Vec<(usize, usize, usize)> = intervals.iter()
     .enumerate()
     .map(|(i, &(s, e))| (e, s, i)) // (end, start, id) for sort-by-end
     .collect();
 sorted.sort_unstable(); // sort by end time (tuple sort: first key wins)

 let mut selected = vec![];
 let mut last_end = 0;

 for (end, start, id) in sorted {
     if start >= last_end {
         selected.push(id);
         last_end = end;
     }
 }
 selected
}

// Huffman Coding: optimal prefix-free code from character frequencies
use std::collections::BinaryHeap;
use std::cmp::Reverse;

#[derive(Debug)]
enum HuffTree {
 Leaf { ch: char, freq: usize },
 Node { freq: usize, left: Box<HuffTree>, right: Box<HuffTree> },
}

impl HuffTree {
 fn freq(&self) -> usize {
     match self { HuffTree::Leaf { freq, .. } | HuffTree::Node { freq, .. } => *freq }
 }
}

fn huffman(freqs: &[(char, usize)]) -> Option<HuffTree> {
 // Min-heap via Reverse wrapper: Reverse(freq) gives min at top
 let mut heap: BinaryHeap<(Reverse<usize>, usize)> = BinaryHeap::new();
 let mut trees: Vec<HuffTree> = freqs.iter()
     .map(|&(ch, freq)| HuffTree::Leaf { ch, freq })
     .collect();

 // Push indices into heap sorted by frequency
 for (i, t) in trees.iter().enumerate() {
     heap.push((Reverse(t.freq()), i));
 }

 // Repeatedly merge two minimum-frequency trees
 while heap.len() > 1 {
     let (_, i) = heap.pop().unwrap();
     let (_, j) = heap.pop().unwrap();
     // Move out of trees vec using swap with placeholder (avoids clone)
     let left  = std::mem::replace(&mut trees[i], HuffTree::Leaf { ch: '\0', freq: 0 });
     let right = std::mem::replace(&mut trees[j], HuffTree::Leaf { ch: '\0', freq: 0 });
     let combined_freq = left.freq() + right.freq();
     let new_idx = trees.len();
     trees.push(HuffTree::Node {
         freq: combined_freq,
         left: Box::new(left),
         right: Box::new(right),
     });
     heap.push((Reverse(combined_freq), new_idx));
 }

 heap.pop().map(|(_, i)| std::mem::replace(
     &mut trees[i], HuffTree::Leaf { ch: '\0', freq: 0 }
 ))
}

// Extract codes: left = '0', right = '1'
fn huffman_codes(tree: &HuffTree) -> Vec<(char, String)> {
 let mut codes = vec![];
 fn walk(node: &HuffTree, prefix: String, codes: &mut Vec<(char, String)>) {
     match node {
         HuffTree::Leaf { ch, .. } => codes.push((*ch, prefix)),
         HuffTree::Node { left, right, .. } => {
             walk(left,  prefix.clone() + "0", codes);
             walk(right, prefix        + "1", codes);
         }
     }
 }
 walk(tree, String::new(), &mut codes);
 codes
}
`Reverse<usize>` wraps a value to invert `BinaryHeap`'s default max-heap behavior — `Reverse(3) < Reverse(5)` is true, so the heap pops smallest frequency first. This is the idiomatic Rust min-heap pattern. `std::mem::replace` moves values out of a Vec without cloning — essential when the type doesn't implement `Copy`. The placeholder `HuffTree::Leaf { ch: '\0', freq: 0 }` is never used; it's just dropped. For activity selection, packing `(end, start, id)` as a tuple sorts by `end` first naturally — Rust's derived `Ord` on tuples compares lexicographically.

What This Unlocks

Key Differences

ConceptOCamlRust
Min-heap`priority_queue` library or manual`BinaryHeap` + `Reverse<T>` wrapper — standard library
Recursive tree ADT`type tree = Leaf of ... \Node of ...``enum HuffTree { Leaf { .. }, Node { .. } }`
Move out of VecPattern match (GC handles)`std::mem::replace` — explicit move with placeholder
Sort by end time`List.sort (fun (s1,e1,_) (s2,e2,_) -> compare e1 e2)`Tuple sort `(end, start, id)` — lexicographic by default
Greedy terminationPattern match on empty list`while heap.len() > 1` loop
/// Greedy Algorithms: Activity Selection and Huffman Coding.
///
/// Activity selection: sort by finish time, greedily pick non-overlapping.
/// Huffman: repeatedly merge two lowest-frequency nodes.

use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap};

// ─── Activity Selection ───────────────────────────────────────────────────────

#[derive(Debug, Clone)]
struct Activity {
    id: usize,
    start: u64,
    finish: u64,
}

/// Return maximum set of non-overlapping activities (greedy by earliest finish).
fn activity_selection(activities: &[Activity]) -> Vec<&Activity> {
    let mut sorted: Vec<&Activity> = activities.iter().collect();
    sorted.sort_by_key(|a| a.finish);

    let mut selected = Vec::new();
    let mut last_finish = 0u64;

    for a in sorted {
        if a.start >= last_finish {
            selected.push(a);
            last_finish = a.finish;
        }
    }
    selected
}

// ─── Huffman Coding ──────────────────────────────────────────────────────────

#[derive(Debug)]
enum HuffmanNode {
    Leaf { symbol: char, freq: u64 },
    Internal { freq: u64, left: Box<HuffmanNode>, right: Box<HuffmanNode> },
}

impl HuffmanNode {
    fn freq(&self) -> u64 {
        match self {
            HuffmanNode::Leaf { freq, .. } => *freq,
            HuffmanNode::Internal { freq, .. } => *freq,
        }
    }
}

// Wrapper for BinaryHeap (min-heap via Reverse)
struct HeapItem(Box<HuffmanNode>);

impl PartialEq for HeapItem {
    fn eq(&self, other: &Self) -> bool { self.0.freq() == other.0.freq() }
}
impl Eq for HeapItem {}
impl PartialOrd for HeapItem {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> { Some(self.cmp(other)) }
}
impl Ord for HeapItem {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        // Min-heap: invert comparison
        other.0.freq().cmp(&self.0.freq())
    }
}

/// Build Huffman tree from (symbol, frequency) pairs.
fn huffman_tree(symbols: &[(char, u64)]) -> Option<Box<HuffmanNode>> {
    if symbols.is_empty() { return None; }

    let mut heap: BinaryHeap<HeapItem> = symbols
        .iter()
        .map(|&(symbol, freq)| HeapItem(Box::new(HuffmanNode::Leaf { symbol, freq })))
        .collect();

    while heap.len() > 1 {
        let HeapItem(left) = heap.pop().unwrap();
        let HeapItem(right) = heap.pop().unwrap();
        let freq = left.freq() + right.freq();
        let merged = Box::new(HuffmanNode::Internal { freq, left, right });
        heap.push(HeapItem(merged));
    }

    heap.pop().map(|HeapItem(node)| node)
}

/// Extract code table: symbol → binary string.
fn huffman_codes(node: &HuffmanNode) -> HashMap<char, String> {
    let mut codes = HashMap::new();
    traverse(node, String::new(), &mut codes);
    codes
}

fn traverse(node: &HuffmanNode, prefix: String, codes: &mut HashMap<char, String>) {
    match node {
        HuffmanNode::Leaf { symbol, .. } => {
            codes.insert(*symbol, if prefix.is_empty() { "0".to_string() } else { prefix });
        }
        HuffmanNode::Internal { left, right, .. } => {
            traverse(left, format!("{prefix}0"), codes);
            traverse(right, format!("{prefix}1"), codes);
        }
    }
}

/// Compute total encoded bits for a message.
fn encoded_bits(text: &str, codes: &HashMap<char, String>) -> usize {
    text.chars().map(|c| codes[&c].len()).sum()
}

fn main() {
    // Activity selection
    let activities = vec![
        Activity { id: 1, start: 1, finish: 4 },
        Activity { id: 2, start: 3, finish: 5 },
        Activity { id: 3, start: 0, finish: 6 },
        Activity { id: 4, start: 5, finish: 7 },
        Activity { id: 5, start: 3, finish: 9 },
        Activity { id: 6, start: 5, finish: 9 },
        Activity { id: 7, start: 6, finish: 10 },
        Activity { id: 8, start: 8, finish: 11 },
    ];
    let selected = activity_selection(&activities);
    let ids: Vec<usize> = selected.iter().map(|a| a.id).collect();
    println!("Activity selection: {} selected, ids={ids:?}", selected.len());

    // Huffman
    let freqs = [('a', 5u64), ('b', 2), ('c', 1), ('d', 3), ('e', 4), ('f', 7)];
    let tree = huffman_tree(&freqs).unwrap();
    let mut codes: Vec<(char, String)> = huffman_codes(&tree).into_iter().collect();
    codes.sort();
    println!("\nHuffman codes:");
    for (c, code) in &codes {
        let freq = freqs.iter().find(|&&(ch, _)| ch == *c).map(|&(_, f)| f).unwrap();
        println!("  '{c}' (freq={freq}): {code}");
    }
    let total_bits: u64 = freqs.iter().map(|&(c, f)| {
        let code = codes.iter().find(|(ch, _)| *ch == c).map(|(_, code)| code.len()).unwrap();
        f * code as u64
    }).sum();
    println!("Total weighted bits: {total_bits}");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_activity_selection_count() {
        let acts = vec![
            Activity { id: 1, start: 1, finish: 4 },
            Activity { id: 2, start: 3, finish: 5 },
            Activity { id: 3, start: 0, finish: 6 },
            Activity { id: 4, start: 5, finish: 7 },
            Activity { id: 5, start: 8, finish: 11 },
        ];
        let sel = activity_selection(&acts);
        assert_eq!(sel.len(), 3);
    }

    #[test]
    fn test_activity_selection_non_overlapping() {
        let acts = vec![
            Activity { id: 1, start: 0, finish: 2 },
            Activity { id: 2, start: 1, finish: 3 },
            Activity { id: 3, start: 2, finish: 4 },
        ];
        let sel = activity_selection(&acts);
        // Non-overlapping: check no two overlap
        for i in 0..sel.len() {
            for j in i + 1..sel.len() {
                assert!(sel[i].finish <= sel[j].start || sel[j].finish <= sel[i].start);
            }
        }
    }

    #[test]
    fn test_huffman_prefix_free() {
        let freqs = [('a', 5u64), ('b', 2), ('c', 1), ('d', 3)];
        let tree = huffman_tree(&freqs).unwrap();
        let codes = huffman_codes(&tree);
        // Prefix-free: no code is a prefix of another
        let code_list: Vec<String> = codes.values().cloned().collect();
        for (i, a) in code_list.iter().enumerate() {
            for (j, b) in code_list.iter().enumerate() {
                if i != j {
                    assert!(!b.starts_with(a.as_str()), "{b} starts with {a}");
                }
            }
        }
    }

    #[test]
    fn test_huffman_all_symbols_covered() {
        let freqs = [('a', 1u64), ('b', 2), ('c', 3), ('d', 4)];
        let tree = huffman_tree(&freqs).unwrap();
        let codes = huffman_codes(&tree);
        assert_eq!(codes.len(), 4);
        for (c, _) in &freqs {
            assert!(codes.contains_key(c), "'{c}' missing from codes");
        }
    }
}
(* Greedy Algorithms: Activity Selection + Huffman Coding in OCaml *)

(* Activity Selection Problem *)
type activity = { id: int; start: int; finish: int }

(* Greedy: sort by finish time, pick non-overlapping activities *)
let activity_selection (activities : activity list) : activity list =
  let sorted = List.sort (fun a b -> compare a.finish b.finish) activities in
  let rec select last_finish = function
    | [] -> []
    | a :: rest ->
      if a.start >= last_finish then
        a :: select a.finish rest
      else
        select last_finish rest
  in
  select 0 sorted

(* Huffman Coding *)
type huffman =
  | Leaf of { symbol: char; freq: int }
  | Node of { freq: int; left: huffman; right: huffman }

let freq_of = function Leaf l -> l.freq | Node n -> n.freq

(* Simple priority queue using sorted list *)
let insert_sorted tree lst =
  let rec ins = function
    | [] -> [tree]
    | h :: t ->
      if freq_of tree <= freq_of h then tree :: h :: t
      else h :: ins t
  in
  ins lst

let huffman_tree (symbols : (char * int) list) : huffman option =
  match symbols with
  | [] -> None
  | _ ->
    let initial = List.sort (fun (_, a) (_, b) -> compare a b)
      (List.map (fun (c, f) -> Leaf { symbol=c; freq=f }) symbols)
    in
    let rec build = function
      | [tree] -> tree
      | a :: b :: rest ->
        let merged = Node { freq = freq_of a + freq_of b; left = a; right = b } in
        build (insert_sorted merged rest)
      | [] -> failwith "empty"
    in
    Some (build initial)

(* Extract code table: symbol -> binary string *)
let huffman_codes (tree : huffman) : (char * string) list =
  let codes = ref [] in
  let rec traverse prefix = function
    | Leaf l -> codes := (l.symbol, if prefix = "" then "0" else prefix) :: !codes
    | Node n ->
      traverse (prefix ^ "0") n.left;
      traverse (prefix ^ "1") n.right
  in
  traverse "" tree;
  List.sort compare !codes

let () =
  (* Activity selection *)
  let acts = [
    {id=1; start=1; finish=4};
    {id=2; start=3; finish=5};
    {id=3; start=0; finish=6};
    {id=4; start=5; finish=7};
    {id=5; start=3; finish=9};
    {id=6; start=5; finish=9};
    {id=7; start=6; finish=10};
    {id=8; start=8; finish=11};
    {id=9; start=8; finish=12};
    {id=10; start=2; finish=14};
  ] in
  let selected = activity_selection acts in
  Printf.printf "Activity selection: %d activities selected (ids: %s)\n"
    (List.length selected)
    (String.concat "," (List.map (fun a -> string_of_int a.id) selected));

  (* Huffman coding *)
  let freqs = [('a', 5); ('b', 2); ('c', 1); ('d', 3); ('e', 4); ('f', 7)] in
  match huffman_tree freqs with
  | None -> ()
  | Some tree ->
    let codes = huffman_codes tree in
    Printf.printf "\nHuffman codes:\n";
    List.iter (fun (c, code) ->
      Printf.printf "  '%c' (freq=%d): %s\n" c
        (List.assoc c freqs) code
    ) codes