๐Ÿฆ€ Functional Rust

848: Big-O Reasoning and Complexity Analysis in Rust

Difficulty: 3 Level: Intermediate A practical reference for algorithmic complexity in Rust: every complexity class illustrated with idiomatic code, plus Rust-specific constants that make O() analysis misleading.

The Problem This Solves

Knowing an algorithm is O(n log n) is necessary but not sufficient. Two O(n log n) algorithms can differ by 10ร— in practice because of cache behavior, branch prediction, allocator overhead, and SIMD potential. Rust's zero-cost abstractions mean high-level iterator code is often as fast as hand-written C โ€” but that "often" has precise conditions you need to understand. This guide answers: when does O(nยฒ) actually beat O(n log n)? (Answer: for n < ~32, insertion sort beats merge sort because of cache and branch predictor advantages.) When does O(1) HashMap::get actually degrade? (Answer: under adversarial keys with SipHash collision, though this is rare by design.) When should you use BTreeMap over HashMap? (When you need ordered iteration โ€” BTreeMap's O(log n) with cache-friendly B-tree nodes can beat HashMap's O(1) for small maps.) For Rust specifically: iterator chains are lazy and fuse into a single pass โ€” `v.iter().filter(...).map(...).collect()` is O(n) with no intermediate allocations. Understanding this prevents unnecessary micro-optimizations.

The Intuition

Complexity classes form a hierarchy: O(1) < O(log n) < O(n) < O(n log n) < O(nยฒ) < O(2^n) < O(n!). For n = 10^6: O(log n) โ‰ˆ 20 ops, O(n) = 10^6 ops, O(n log n) โ‰ˆ 2ร—10^7, O(nยฒ) = 10^12 โ€” effectively impossible. The boundary between "fast" and "slow" is roughly O(n log n) for n up to 10^7 in a 1-second time limit. The master theorem handles D&C recurrences: T(n) = aร—T(n/b) + f(n). Three cases: when f dominates (Case 3), when log dominates (Case 1), when they're equal (Case 2, gives the n log n result for merge sort).

How It Works in Rust

// O(1): Direct array access, HashMap lookup (amortized)
fn constant_access(v: &[i32], i: usize) -> i32 { v[i] }

// O(log n): Binary search โ€” halves search space each step
fn binary_search(arr: &[i32], target: i32) -> Option<usize> {
 arr.binary_search(&target).ok()  // std library: use this, not your own
}

// O(n): Single linear scan โ€” iterate lazily, no intermediate alloc
fn linear_max(v: &[i32]) -> Option<i32> { v.iter().copied().max() }

// O(n log n): Sort โ€” Rust uses pdqsort (unstable) or timsort (stable)
// pdqsort is O(n) for nearly-sorted, O(n log n) worst case
fn sort_demo(mut v: Vec<i32>) -> Vec<i32> {
 v.sort_unstable();  // Faster than sort() when stability not needed
 v
}

// O(nยฒ): Insertion sort โ€” optimal for n < ~32 due to cache/branch predictor
fn insertion_sort(v: &mut [i32]) {
 for i in 1..v.len() {
     let key = v[i];
     let mut j = i;
     while j > 0 && v[j - 1] > key { v[j] = v[j - 1]; j -= 1; }
     v[j] = key;
 }
}

// Rust-specific: iterator chains are O(n), NOT O(kร—n) for k operations
// This is one pass, zero intermediate allocations:
fn process(v: &[i32]) -> Vec<i32> {
 v.iter()
     .filter(|&&x| x > 0)    // Lazy: no allocation
     .map(|&x| x * 2)         // Lazy: no allocation
     .collect()               // One allocation for final Vec
}

What This Unlocks

Key Differences

ConceptOCamlRust
Sort stability`List.sort` is stable`sort` stable (timsort), `sort_unstable` faster (pdqsort)
HashMap`Hashtbl` โ€” open addressing`HashMap` โ€” SipHash by default, swap for `FxHashMap` if speed needed
Iterator chainsEager by default; use `Seq` for lazyLazy by default โ€” fuse without allocation
`Vec::push``Array.append` โ€” O(1) amortizedSame amortized O(1); doubling strategy
BTreeMap`Map` module (AVL tree)`BTreeMap` โ€” B-tree, better cache behavior than AVL for sequential access
/// Algorithm Complexity Guide โ€” Rust Reference.
///
/// Demonstrates each complexity class with idiomatic Rust code.
/// Includes empirical timing and Rust-specific complexity notes.

use std::collections::{BTreeMap, HashMap};
use std::time::Instant;

// โ”€โ”€โ”€ O(1) โ€” Constant time โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(1): Direct array access, HashMap lookup.
fn constant_access(v: &[i32], i: usize) -> i32 { v[i] }

fn hashmap_lookup(map: &HashMap<&str, i32>, key: &str) -> Option<i32> {
    map.get(key).copied()
}

// โ”€โ”€โ”€ O(log n) โ€” Logarithmic โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(log n): Binary search. Each step halves the search space.
fn binary_search(arr: &[i32], target: i32) -> Option<usize> {
    let (mut lo, mut hi) = (0usize, arr.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        match arr[mid].cmp(&target) {
            std::cmp::Ordering::Equal => return Some(mid),
            std::cmp::Ordering::Less => lo = mid + 1,
            std::cmp::Ordering::Greater => hi = mid,
        }
    }
    None
}

// โ”€โ”€โ”€ O(n) โ€” Linear โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(n): Single pass through data.
fn linear_max(v: &[i32]) -> Option<i32> { v.iter().copied().max() }

fn linear_contains(v: &[i32], target: i32) -> bool { v.contains(&target) }

// โ”€โ”€โ”€ O(n log n) โ€” Linearithmic โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(n log n): Standard sort. Rust's sort is TimSort (stable) or pdqsort (unstable).
fn sort_demo(mut v: Vec<i32>) -> Vec<i32> {
    v.sort_unstable(); // slightly faster when stability not needed
    v
}

/// O(n log n): Count distinct elements via BTreeMap (sorted order).
fn count_distinct_sorted(v: &[i32]) -> BTreeMap<i32, usize> {
    let mut map = BTreeMap::new();
    for &x in v { *map.entry(x).or_insert(0) += 1; }
    map
}

// โ”€โ”€โ”€ O(nยฒ) โ€” Quadratic โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(nยฒ): Insertion sort โ€” optimal for nearly-sorted or tiny arrays (n < ~32).
fn insertion_sort(v: &mut [i32]) {
    for i in 1..v.len() {
        let key = v[i];
        let mut j = i;
        while j > 0 && v[j - 1] > key {
            v[j] = v[j - 1];
            j -= 1;
        }
        v[j] = key;
    }
}

/// O(nยฒ): All pairs โ€” useful when n is small or pairs are sparse.
fn all_pairs_sum(v: &[i32]) -> Vec<i32> {
    v.iter().flat_map(|&a| v.iter().map(move |&b| a + b)).collect()
}

// โ”€โ”€โ”€ O(2โฟ) โ€” Exponential โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// O(2โฟ): Generate all subsets.
fn all_subsets(v: &[i32]) -> Vec<Vec<i32>> {
    let n = v.len();
    (0u64..1 << n)
        .map(|mask| (0..n).filter(|&i| mask >> i & 1 == 1).map(|i| v[i]).collect())
        .collect()
}

// โ”€โ”€โ”€ Complexity class summary โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€

/// Demonstrate master theorem recurrences:
/// T(n) = aยทT(n/b) + f(n)
///
/// Case 1: f(n) = O(n^(log_b a - ฮต)) โ†’ T(n) = ฮ˜(n^(log_b a))
/// Case 2: f(n) = ฮ˜(n^(log_b a))      โ†’ T(n) = ฮ˜(n^(log_b a) ยท log n)
/// Case 3: f(n) = ฮฉ(n^(log_b a + ฮต)) โ†’ T(n) = ฮ˜(f(n))
///
/// Examples:
///   Merge sort:     a=2, b=2, f=O(n)    โ†’ Case 2 โ†’ O(n log n)
///   Binary search:  a=1, b=2, f=O(1)    โ†’ Case 2 โ†’ O(log n)
///   Strassen:       a=7, b=2, f=O(nยฒ)   โ†’ Case 1 โ†’ O(n^2.807)
///   Linear search:  a=1, b=2, f=O(n)    โ†’ Case 3 โ†’ O(n)

fn print_complexity_table() {
    println!("\nโ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฆโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฆโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—");
    println!("โ•‘ Class                โ•‘ n=1000    โ•‘ Rust context              โ•‘");
    println!("โ• โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฌโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฌโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฃ");
    println!("โ•‘ O(1)                 โ•‘ 1         โ•‘ arr[i], HashMap::get      โ•‘");
    println!("โ•‘ O(log n)             โ•‘ 10        โ•‘ binary_search, BTreeMap   โ•‘");
    println!("โ•‘ O(n)                 โ•‘ 1,000     โ•‘ iter, contains, max       โ•‘");
    println!("โ•‘ O(n log n)           โ•‘ 10,000    โ•‘ sort, sort_unstable        โ•‘");
    println!("โ•‘ O(nยฒ)               โ•‘ 10โถ      โ•‘ nested loops (small n ok) โ•‘");
    println!("โ•‘ O(nยณ)               โ•‘ 10โน      โ•‘ Floyd-Warshall (n<500)    โ•‘");
    println!("โ•‘ O(2โฟ)               โ•‘ 10ยณโฐโฐ    โ•‘ subsets (only n<25)       โ•‘");
    println!("โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฉโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•ฉโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•");
}

/// Empirical timing comparison: O(n log n) sort vs O(nยฒ) insertion sort.
fn timing_demo() {
    let sizes = [100usize, 1000, 5000];
    println!("\nโ”€โ”€ Empirical timing โ”€โ”€");
    for &n in &sizes {
        let data: Vec<i32> = (0..n as i32).rev().collect(); // worst case: reverse sorted

        let mut v1 = data.clone();
        let t1 = Instant::now();
        insertion_sort(&mut v1);
        let d1 = t1.elapsed();

        let t2 = Instant::now();
        let _v2 = sort_demo(data);
        let d2 = t2.elapsed();

        println!("  n={n:>5}: insertion_sort={d1:>10?}  std::sort={d2:>10?}  ratio={:.1}x",
            d1.as_nanos() as f64 / d2.as_nanos().max(1) as f64);
    }
}

fn main() {
    // O(1)
    let arr = vec![10, 20, 30, 40, 50];
    println!("O(1) arr[2] = {}", constant_access(&arr, 2));

    // O(log n)
    let sorted = vec![1, 3, 5, 7, 9, 11, 13];
    println!("O(log n) binary_search(7) = {:?}", binary_search(&sorted, 7));

    // O(n)
    println!("O(n) max = {:?}", linear_max(&arr));

    // O(n log n)
    let unsorted = vec![5, 3, 8, 1, 9, 2];
    println!("O(n log n) sort = {:?}", sort_demo(unsorted));

    // O(nยฒ)
    let mut v = vec![5, 3, 8, 1, 9, 2];
    insertion_sort(&mut v);
    println!("O(nยฒ) insertion_sort = {v:?}");

    // O(2โฟ) โ€” only for tiny n!
    let small = vec![1, 2, 3];
    println!("O(2โฟ) all_subsets([1,2,3]) = {} subsets", all_subsets(&small).len());

    print_complexity_table();
    timing_demo();

    println!("\nโ”€โ”€ Rust complexity notes โ”€โ”€");
    println!("  Vec::push:      amortised O(1) โ€” doubles capacity when full");
    println!("  HashMap::get:   O(1) average (SipHash), O(n) worst case");
    println!("  BTreeMap::get:  O(log n) always โ€” ordered, cache-friendlier for iteration");
    println!("  Iterator chain: lazy O(n), no intermediate allocations");
    println!("  sort_unstable:  O(n log n) worst case, faster than sort in practice");
    println!("  Advice: profile first, then optimise โ€” LLVM often surprises you");
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_constant() {
        assert_eq!(constant_access(&[10, 20, 30], 1), 20);
    }

    #[test]
    fn test_binary_search() {
        let arr = vec![1, 3, 5, 7, 9];
        assert_eq!(binary_search(&arr, 5), Some(2));
        assert_eq!(binary_search(&arr, 6), None);
        assert_eq!(binary_search(&arr, 1), Some(0));
        assert_eq!(binary_search(&arr, 9), Some(4));
    }

    #[test]
    fn test_linear_max() {
        assert_eq!(linear_max(&[3, 1, 4, 1, 5, 9, 2, 6]), Some(9));
        assert_eq!(linear_max(&[]), None);
    }

    #[test]
    fn test_sort() {
        assert_eq!(sort_demo(vec![5, 3, 8, 1, 9]), vec![1, 3, 5, 8, 9]);
    }

    #[test]
    fn test_insertion_sort() {
        let mut v = vec![5, 3, 8, 1, 9, 2];
        insertion_sort(&mut v);
        assert_eq!(v, vec![1, 2, 3, 5, 8, 9]);
    }

    #[test]
    fn test_all_subsets_count() {
        assert_eq!(all_subsets(&[1, 2, 3]).len(), 8); // 2ยณ
        assert_eq!(all_subsets(&[1, 2, 3, 4]).len(), 16); // 2โด
    }

    #[test]
    fn test_insertion_sort_matches_std() {
        let data = vec![7i32, 3, 9, 1, 5, 8, 2, 6, 4];
        let mut v1 = data.clone();
        let v2 = sort_demo(data);
        insertion_sort(&mut v1);
        assert_eq!(v1, v2);
    }

    #[test]
    fn test_complexity_empirical() {
        // Verify that sort is actually O(n log n) in practice: measure ratio
        let n = 1000;
        let data: Vec<i32> = (0..n as i32).rev().collect();

        let mut v_ins = data.clone();
        let t1 = Instant::now();
        insertion_sort(&mut v_ins);
        let ins_ns = t1.elapsed().as_nanos();

        let t2 = Instant::now();
        let _v_std = sort_demo(data);
        let std_ns = t2.elapsed().as_nanos();

        // std sort should be faster for n=1000
        // (Not guaranteed in tests, but overwhelmingly true)
        println!("insertion_sort: {ins_ns}ns, std::sort: {std_ns}ns");
        // Don't assert timing โ€” flaky in CI; just verify correctness
        assert!(v_ins.windows(2).all(|w| w[0] <= w[1]));
    }
}
(* Algorithm Complexity Guide in OCaml โ€” reference implementations *)

(* O(1) โ€” Constant time: direct access *)
let constant_example arr i = Array.get arr i

(* O(log n) โ€” Logarithmic: binary search *)
let rec binary_search arr target lo hi =
  if lo > hi then None
  else
    let mid = (lo + hi) / 2 in
    if arr.(mid) = target then Some mid
    else if arr.(mid) < target then binary_search arr target (mid + 1) hi
    else binary_search arr target lo (mid - 1)

(* O(n) โ€” Linear: find maximum *)
let linear_max arr =
  Array.fold_left max min_int arr

(* O(n log n) โ€” Linearithmic: merge sort *)
let rec merge_sort = function
  | [] | [_] as xs -> xs
  | xs ->
    let n = List.length xs in
    let l = List.filteri (fun i _ -> i < n/2) xs in
    let r = List.filteri (fun i _ -> i >= n/2) xs in
    let rec merge a b = match a, b with
      | [], b -> b | a, [] -> a
      | x::xs, y::ys -> if x <= y then x :: merge xs b else y :: merge a ys
    in
    merge (merge_sort l) (merge_sort r)

(* O(nยฒ) โ€” Quadratic: insertion sort *)
let insertion_sort arr =
  let n = Array.length arr in
  for i = 1 to n - 1 do
    let key = arr.(i) in
    let j = ref (i - 1) in
    while !j >= 0 && arr.(!j) > key do
      arr.(!j + 1) <- arr.(!j);
      decr j
    done;
    arr.(!j + 1) <- key
  done

(* Master theorem examples *)
(* T(n) = 2T(n/2) + O(n)  โ†’ O(n log n)  [merge sort] *)
(* T(n) = 1T(n/2) + O(1)  โ†’ O(log n)    [binary search] *)
(* T(n) = 2T(n/2) + O(1)  โ†’ O(n)        [tree traversal] *)
(* T(n) = 2T(n-1) + O(1)  โ†’ O(2^n)      [naive Fibonacci] *)

let () =
  let arr = Array.init 20 (fun i -> i * 2) in
  Printf.printf "O(1) โ€” arr[5] = %d\n" (constant_example arr 5);
  Printf.printf "O(log n) โ€” binary_search(10) = %s\n"
    (match binary_search arr 10 0 19 with Some i -> string_of_int i | None -> "None");
  Printf.printf "O(n) โ€” max of [0,2,..,38] = %d\n" (linear_max arr);
  Printf.printf "O(n log n) โ€” merge_sort([3,1,4,1,5,9]) = [%s]\n"
    (String.concat "," (List.map string_of_int (merge_sort [3;1;4;1;5;9])));

  (* Empirical timing example: show nยฒ vs n log n difference *)
  let sizes = [100; 1000; 10000] in
  List.iter (fun n ->
    let a = Array.init n (fun i -> n - i) in
    insertion_sort a;
    Printf.printf "insertion_sort(%d): sorted = %b\n" n
      (Array.for_all2 (<=) (Array.sub a 0 (n-1)) (Array.sub a 1 (n-1)))
  ) sizes