🦀 Functional Rust

824: Sieve of Eratosthenes

Difficulty: 3 Level: Intermediate Generate all primes up to N in O(n log log n) by iteratively marking composites — the oldest and most cache-friendly batch prime algorithm.

The Problem This Solves

When you need many primes — RSA key generation, primality sieves in cryptographic libraries, competitive programming precomputation, prime-counting functions — you need a batch algorithm, not per-number testing. Trial division checks one number in O(√n); the sieve amortizes this across all numbers up to N and achieves O(n log log n) total, which in practice is nearly linear. The sieve is also the canonical example of a simple, cache-friendly computation that benefits enormously from modern CPU memory hierarchies. The inner marking loop is a stride-p write over a contiguous boolean array — exactly the access pattern that hardware prefetchers handle well. Using `Vec<bool>` (or a bit-vector) over a `HashSet` can be 10-50× faster in practice because of this. The segmented sieve variant (also shown) processes the range `[lo, hi]` using only O(√hi) base primes — essential when N is too large for a single array (e.g., primes up to 10^12 in a problem).

The Intuition

Mark every multiple of p starting from p² as composite. Start from p² because all smaller multiples of p have already been marked by smaller primes. Only sieve up to √N because any composite ≤ N has a prime factor ≤ √N. The inner loop runs `N/p` times per prime p; summing over all primes gives Σ N/p ≈ N × log log N by Mertens' theorem — the source of the O(n log log n) complexity. OCaml uses `Array.make n true` with a for loop; Rust's `.step_by(p)` range is identical in behavior, slightly more idiomatic.

How It Works in Rust

fn sieve(limit: usize) -> Vec<usize> {
 if limit < 2 { return vec![]; }
 let mut is_prime = vec![true; limit + 1];
 is_prime[0] = false;
 is_prime[1] = false;

 let sqrt_limit = (limit as f64).sqrt() as usize;
 for p in 2..=sqrt_limit {
     if is_prime[p] {
         // Start at p² — all smaller multiples already marked
         // step_by(p): stride-p access is cache-prefetcher-friendly
         for j in (p * p..=limit).step_by(p) {
             is_prime[j] = false;  // O(1) per mark, N/p total
         }
     }
 }

 // Collect: enumerate gives (index, value) — filter_map keeps only primes
 is_prime.iter().enumerate()
     .filter_map(|(i, &b)| if b { Some(i) } else { None })
     .collect()
}

// Returns the sieve itself for O(1) primality queries
fn prime_sieve(limit: usize) -> Vec<bool> {
 // Same construction; returns the boolean array
 // O(1) per query after O(n log log n) build — the key practical advantage
 // ...
}
For memory-critical applications: replace `Vec<bool>` with a `Vec<u64>` bit-packed manually or the `bit-vec` crate — reduces memory by 8×, improving L1/L2 cache utilization dramatically for large N.

What This Unlocks

Key Differences

ConceptOCamlRust
Boolean array`Array.make (n+1) true``vec![true; n + 1]`
Mark multiples`for j = pp to n step p do``(pp..=n).step_by(p)` — lazy range
Integer sqrt`int_of_float (sqrt (float_of_int n))``(n as f64).sqrt() as usize`
Collect primes`Array.to_seqi> Seq.filter_map``.enumerate().filter_map(...)`
Bit-packingExternal library neededManual `Vec<u64>` or `bit-vec` crate
/// Sieve of Eratosthenes — generate all primes up to N in O(n log log n).

/// Returns all primes ≤ limit.
fn sieve(limit: usize) -> Vec<usize> {
    if limit < 2 {
        return vec![];
    }
    let mut is_prime = vec![true; limit + 1];
    is_prime[0] = false;
    is_prime[1] = false;
    let sqrt_limit = (limit as f64).sqrt() as usize;
    for p in 2..=sqrt_limit {
        if is_prime[p] {
            // Mark multiples of p starting from p²
            for j in (p * p..=limit).step_by(p) {
                is_prime[j] = false;
            }
        }
    }
    is_prime
        .iter()
        .enumerate()
        .filter_map(|(i, &b)| if b { Some(i) } else { None })
        .collect()
}

/// Returns a boolean sieve (useful for O(1) primality queries after O(n) build).
fn prime_sieve(limit: usize) -> Vec<bool> {
    if limit < 2 {
        return vec![false; limit + 1];
    }
    let mut is_prime = vec![true; limit + 1];
    is_prime[0] = false;
    is_prime[1] = false;
    let sqrt_limit = (limit as f64).sqrt() as usize;
    for p in 2..=sqrt_limit {
        if is_prime[p] {
            for j in (p * p..=limit).step_by(p) {
                is_prime[j] = false;
            }
        }
    }
    is_prime
}

/// Prime-counting function π(n): number of primes ≤ n.
fn prime_pi(n: usize) -> usize {
    prime_sieve(n).iter().filter(|&&b| b).count()
}

/// Segmented sieve: all primes in [lo, hi].
fn segmented_sieve(lo: usize, hi: usize) -> Vec<usize> {
    let base_primes = sieve((hi as f64).sqrt() as usize + 1);
    let size = hi - lo + 1;
    let mut is_prime = vec![true; size];
    // 0 and 1 are not prime
    if lo == 0 { is_prime[0] = false; }
    if lo <= 1 && 1 <= hi { is_prime[1 - lo] = false; }

    for p in base_primes {
        // First multiple of p in [lo, hi]
        let start = (lo + p - 1) / p * p;
        let start = if start == p { p + p } else { start };
        if start > hi { continue; }
        for j in (start..=hi).step_by(p) {
            is_prime[j - lo] = false;
        }
    }
    is_prime
        .iter()
        .enumerate()
        .filter_map(|(i, &b)| if b { Some(lo + i) } else { None })
        .collect()
}

fn main() {
    let primes = sieve(50);
    println!("Primes ≤ 50: {:?}", primes);
    println!("π(100) = {} (expected 25)", prime_pi(100));
    println!("π(1000) = {} (expected 168)", prime_pi(1000));

    let seg = segmented_sieve(10, 50);
    println!("Primes in [10,50]: {:?}", seg);

    // Quick primality check using sieve
    let sieve_100 = prime_sieve(100);
    println!("Is 97 prime? {}", sieve_100[97]);
    println!("Is 100 prime? {}", sieve_100[100]);
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_primes_to_30() {
        assert_eq!(sieve(30), vec![2, 3, 5, 7, 11, 13, 17, 19, 23, 29]);
    }

    #[test]
    fn test_prime_pi_25() {
        assert_eq!(prime_pi(100), 25);
    }

    #[test]
    fn test_prime_pi_168() {
        assert_eq!(prime_pi(1000), 168);
    }

    #[test]
    fn test_no_primes_below_2() {
        assert_eq!(sieve(1), vec![]);
        assert_eq!(sieve(0), vec![]);
    }

    #[test]
    fn test_sieve_2() {
        assert_eq!(sieve(2), vec![2]);
    }

    #[test]
    fn test_segmented_matches_basic() {
        let full = sieve(100);
        let seg: Vec<usize> = full.into_iter().filter(|&p| p >= 10).collect();
        assert_eq!(segmented_sieve(10, 100), seg);
    }

    #[test]
    fn test_97_is_prime() {
        let s = prime_sieve(100);
        assert!(s[97]);
        assert!(!s[98]);
        assert!(!s[99]);
        assert!(!s[100]);
    }

    #[test]
    fn test_twin_primes() {
        // 11,13 — 17,19 — 29,31 all twin primes ≤ 32
        let s = prime_sieve(32);
        assert!(s[11] && s[13]);
        assert!(s[17] && s[19]);
        assert!(s[29] && s[31]);
    }
}
(* Sieve of Eratosthenes in OCaml *)

(* Return all primes up to and including limit *)
let sieve (limit : int) : int list =
  if limit < 2 then []
  else begin
    let is_prime = Array.make (limit + 1) true in
    is_prime.(0) <- false;
    is_prime.(1) <- false;
    let sqrt_limit = int_of_float (sqrt (float_of_int limit)) in
    for p = 2 to sqrt_limit do
      if is_prime.(p) then begin
        let j = ref (p * p) in
        while !j <= limit do
          is_prime.(!j) <- false;
          j := !j + p
        done
      end
    done;
    (* Collect all primes functionally *)
    Array.to_seq is_prime
    |> Seq.filter_mapi (fun i b -> if b then Some i else None)
    |> List.of_seq
  end

(* Count primes up to n (prime-counting function π(n)) *)
let prime_pi (n : int) : int =
  List.length (sieve n)

(* Segmented sieve for large ranges — sieve [lo, hi] *)
let segmented_sieve (lo : int) (hi : int) : int list =
  let base_primes = sieve (int_of_float (sqrt (float_of_int hi)) + 1) in
  let size = hi - lo + 1 in
  let is_prime = Array.make size true in
  (* 0 and 1 are not prime *)
  if lo <= 1 then
    for i = lo to min 1 hi do is_prime.(i - lo) <- false done;
  List.iter (fun p ->
    let start = max (p * p) (((lo + p - 1) / p) * p) in
    let j = ref start in
    while !j <= hi do
      is_prime.(!j - lo) <- false;
      j := !j + p
    done
  ) base_primes;
  Array.to_seq is_prime
  |> Seq.filter_mapi (fun i b -> if b then Some (lo + i) else None)
  |> List.of_seq

let () =
  let primes_30 = sieve 30 in
  Printf.printf "Primes up to 30: [%s]\n"
    (String.concat "; " (List.map string_of_int primes_30));
  Printf.printf "π(100) = %d (expected 25)\n" (prime_pi 100);
  Printf.printf "Primes in [10, 50]: [%s]\n"
    (String.concat "; " (List.map string_of_int (segmented_sieve 10 50)))