๐Ÿฆ€ Functional Rust

845: Randomized Quickselect

Difficulty: 3 Level: Intermediate Find the k-th smallest element in O(n) average time without sorting โ€” the linear-time order statistic algorithm that powers percentile computation, median finding, and partial sorts.

The Problem This Solves

Finding the minimum or maximum is O(n). Finding the k-th smallest (a general order statistic) by sorting is O(n log n). Quickselect cuts this to O(n) average by partitioning around a pivot โ€” like quicksort, but only recursing into the partition that contains the k-th element, discarding the other half. Real-world use: finding the median (k = n/2) for statistical algorithms, computing the 99th percentile latency for SLO monitoring, partial sort (return the top-k elements), database ORDER BY LIMIT k without full sort, and the median-of-medians algorithm (which gives O(n) worst case). Without randomization, quickselect degrades to O(nยฒ) on adversarially sorted input (always picking the smallest or largest element as pivot). Randomized pivot selection โ€” choosing the pivot uniformly at random โ€” makes worst case negligibly unlikely in practice and reduces expected case to O(n).

The Intuition

Partition the array around a pivot: all elements less than the pivot go left, all greater go right. The pivot lands at some index `p`. If `p == k`, done. If `p > k`, the k-th element is in the left partition โ€” recurse there only. If `p < k`, it's in the right partition โ€” recurse there only. Each step discards at least one element (the pivot). On average, the partition roughly halves the search space: T(n) = T(n/2) + O(n) โ†’ O(n). Lomuto partition scheme: swap a random pivot to the end, sweep left-to-right moving elements smaller than pivot to the left partition, swap pivot to its final position. Rust's `&mut [T]` slice allows in-place swaps without pointer arithmetic.

How It Works in Rust

// XorShift PRNG: fast, no external crates, good enough for randomized pivot
struct Rng(u64);
impl Rng {
 fn new(seed: u64) -> Self { Rng(seed) }
 fn next_usize(&mut self, n: usize) -> usize {
     self.0 ^= self.0 << 13;
     self.0 ^= self.0 >> 7;
     self.0 ^= self.0 << 17;
     (self.0 as usize) % n
 }
}

// Lomuto partition: rearrange arr[lo..=hi] around a random pivot
// Returns the pivot's final position
fn partition<T: Ord>(arr: &mut [T], lo: usize, hi: usize, rng: &mut Rng) -> usize {
 let pivot_idx = lo + rng.next_usize(hi - lo + 1);  // Random pivot
 arr.swap(pivot_idx, hi);   // Move pivot to end
 let mut store = lo;
 for i in lo..hi {
     if arr[i] < arr[hi] {  // arr[hi] is the pivot
         arr.swap(store, i);
         store += 1;
     }
 }
 arr.swap(store, hi);       // Place pivot at its final position
 store
}

// Quickselect: find the k-th smallest (0-indexed) in O(n) average
pub fn quickselect<T: Ord>(arr: &mut [T], mut lo: usize, mut hi: usize, k: usize) -> &T {
 let mut rng = Rng::new(0xdeadbeef);
 loop {
     if lo == hi { return &arr[lo]; }
     let p = partition(arr, lo, hi, &mut rng);
     match p.cmp(&k) {
         std::cmp::Ordering::Equal   => return &arr[p],   // Found it
         std::cmp::Ordering::Greater => hi = p - 1,       // k is in left partition
         std::cmp::Ordering::Less    => lo = p + 1,       // k is in right partition
     }
 }
}

// 1-indexed wrapper that doesn't modify the original
pub fn kth_smallest<T: Ord + Clone>(arr: &[T], k: usize) -> T {
 assert!(k >= 1 && k <= arr.len());
 let mut copy = arr.to_vec();
 let n = copy.len();
 quickselect(&mut copy, 0, n - 1, k - 1).clone()
}
Rust's `&mut [T]` slice is the natural representation for in-place algorithms: `arr.swap(i, j)` is safe (bounds-checked in debug, elided in release) and communicates intent clearly.

What This Unlocks

Key Differences

ConceptOCamlRust
Random pivot`Random.int (hi - lo + 1) + lo`XorShift PRNG โ€” no external crates
In-place partition`Array.blit` or `Array.swap``arr.swap(i, j)` โ€” bounds-checked
Recursive vs iterativeNatural tail-recursive styleIterative `loop` โ€” no stack frames
Return by referenceNot idiomatic`&arr[p]` โ€” zero-copy, borrow-checked
Copy for non-destructive`Array.copy arr``.to_vec()` then `quickselect` on copy
/// Randomised Quickselect โ€” O(n) average time.
///
/// Find the k-th smallest element without fully sorting.
/// Random pivot avoids worst-case O(nยฒ) on sorted input.

/// Simple XorShift PRNG (no external crates needed).
struct Rng(u64);

impl Rng {
    fn new(seed: u64) -> Self { Rng(seed) }
    fn next_usize(&mut self, n: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 as usize) % n
    }
}

/// Lomuto partition: rearrange arr[lo..=hi] around a random pivot.
/// Returns the final pivot index.
fn partition<T: Ord>(arr: &mut [T], lo: usize, hi: usize, rng: &mut Rng) -> usize {
    let pivot_idx = lo + rng.next_usize(hi - lo + 1);
    arr.swap(pivot_idx, hi);

    let mut store = lo;
    for i in lo..hi {
        if arr[i] < arr[hi] {
            arr.swap(store, i);
            store += 1;
        }
    }
    arr.swap(store, hi);
    store
}

/// Quickselect: return the k-th smallest element (0-indexed k).
/// Mutates the slice but does NOT fully sort it.
pub fn quickselect<T: Ord>(arr: &mut [T], mut lo: usize, mut hi: usize, k: usize) -> &T {
    let mut rng = Rng::new(0xdeadbeef);
    loop {
        if lo == hi { return &arr[lo]; }
        let p = partition(arr, lo, hi, &mut rng);
        match p.cmp(&k) {
            std::cmp::Ordering::Equal => return &arr[p],
            std::cmp::Ordering::Greater => hi = p - 1,
            std::cmp::Ordering::Less => lo = p + 1,
        }
    }
}

/// k-th smallest (1-indexed). Does not modify original array.
pub fn kth_smallest<T: Ord + Clone>(arr: &[T], k: usize) -> T {
    assert!(k >= 1 && k <= arr.len(), "k out of range");
    let mut copy = arr.to_vec();
    let n = copy.len();
    quickselect(&mut copy, 0, n - 1, k - 1).clone()
}

/// Median (1-indexed midpoint for odd, average for even).
fn median_f64(arr: &[f64]) -> f64 {
    let n = arr.len();
    if n % 2 == 1 {
        kth_smallest(arr, (n + 1) / 2)
    } else {
        (kth_smallest(arr, n / 2) + kth_smallest(arr, n / 2 + 1)) / 2.0
    }
}

fn main() {
    let arr = [7i32, 10, 4, 3, 20, 15];
    println!("Array: {arr:?}");
    for k in 1..=arr.len() {
        println!("  {k}-th smallest: {}", kth_smallest(&arr, k));
    }

    let flarr: Vec<f64> = arr.iter().map(|&x| x as f64).collect();
    println!("Median: {}", median_f64(&flarr));

    // Large array: verify k-th smallest matches sorted
    let big: Vec<i32> = (0..100).rev().collect(); // [99, 98, ..., 0]
    println!("\n50th smallest of 0..99: {}", kth_smallest(&big, 50)); // 49
    println!("1st smallest of 0..99:  {}", kth_smallest(&big, 1));   // 0
    println!("100th smallest of 0..99: {}", kth_smallest(&big, 100)); // 99
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_kth_sorted_input() {
        let arr: Vec<i32> = (1..=10).collect();
        for k in 1..=10 {
            assert_eq!(kth_smallest(&arr, k), k as i32);
        }
    }

    #[test]
    fn test_kth_reverse_input() {
        let arr: Vec<i32> = (1..=10).rev().collect();
        for k in 1..=10 {
            assert_eq!(kth_smallest(&arr, k), k as i32);
        }
    }

    #[test]
    fn test_kth_random_input() {
        let arr = vec![7i32, 10, 4, 3, 20, 15];
        let mut sorted = arr.clone();
        sorted.sort();
        for k in 1..=arr.len() {
            assert_eq!(kth_smallest(&arr, k), sorted[k - 1]);
        }
    }

    #[test]
    fn test_kth_duplicates() {
        let arr = vec![3i32, 1, 4, 1, 5, 9, 2, 6, 5, 3];
        let mut sorted = arr.clone();
        sorted.sort();
        for k in 1..=arr.len() {
            assert_eq!(kth_smallest(&arr, k), sorted[k - 1]);
        }
    }

    #[test]
    fn test_single_element() {
        assert_eq!(kth_smallest(&[42i32], 1), 42);
    }

    #[test]
    fn test_median() {
        // Odd length: exact median
        assert_eq!(median_f64(&[1.0, 5.0, 3.0, 2.0, 4.0]), 3.0);
        // Even length: average of two middle
        assert_eq!(median_f64(&[1.0, 2.0, 3.0, 4.0]), 2.5);
    }
}
(* Randomised Quickselect in OCaml *)

(* Simple LCG pseudo-random number generator (no dependencies) *)
let state = ref 42

let rand_int n =
  state := (!state * 1664525 + 1013904223) land 0x7fffffff;
  !state mod n

(* Partition: rearrange arr[lo..hi] around pivot, return pivot index *)
let partition (arr : int array) (lo hi : int) : int =
  let pivot_idx = lo + rand_int (hi - lo + 1) in
  let pivot = arr.(pivot_idx) in
  (* Move pivot to end *)
  let tmp = arr.(pivot_idx) in arr.(pivot_idx) <- arr.(hi); arr.(hi) <- tmp;
  let store = ref lo in
  for i = lo to hi - 1 do
    if arr.(i) < pivot then begin
      let tmp = arr.(!store) in arr.(!store) <- arr.(i); arr.(i) <- tmp;
      incr store
    end
  done;
  (* Move pivot to final position *)
  let tmp = arr.(!store) in arr.(!store) <- arr.(hi); arr.(hi) <- tmp;
  !store

(* Quickselect: find the k-th smallest (0-indexed) in arr[lo..hi] *)
let rec quickselect (arr : int array) (lo hi k : int) : int =
  if lo = hi then arr.(lo)
  else
    let p = partition arr lo hi in
    if p = k then arr.(p)
    else if p > k then quickselect arr lo (p - 1) k
    else quickselect arr (p + 1) hi k

(* Public API: k-th smallest (1-indexed) in the array *)
let kth_smallest (arr : int array) (k : int) : int =
  let copy = Array.copy arr in
  quickselect copy 0 (Array.length copy - 1) (k - 1)

(* Median of array *)
let median (arr : int array) : float =
  let n = Array.length arr in
  if n mod 2 = 1 then
    float_of_int (kth_smallest arr ((n + 1) / 2))
  else
    (float_of_int (kth_smallest arr (n / 2)) +.
     float_of_int (kth_smallest arr (n / 2 + 1))) /. 2.0

let () =
  let arr = [| 7; 10; 4; 3; 20; 15 |] in
  Printf.printf "Array: [7, 10, 4, 3, 20, 15]\n";
  for k = 1 to Array.length arr do
    Printf.printf "  %d-th smallest: %d\n" k (kth_smallest arr k)
  done;
  Printf.printf "Median: %.1f\n" (median arr);

  let arr2 = [| 1; 2; 3; 4; 5; 6; 7; 8; 9; 10 |] in
  Printf.printf "\n5th smallest of 1..10: %d (expected 5)\n" (kth_smallest arr2 5)