🦀 Functional Rust
🎬 Fearless Concurrency — Threads, `Arc<Mutex<T>>`, channels — safe parallelism enforced by the compiler.
📝 Text version (for readers / accessibility)

• `std::thread::spawn` creates OS threads — closures must be `Send + 'static`

• `Arc<Mutex<T>>` provides shared mutable state across threads safely

• Channels (`mpsc`) enable message passing — multiple producers, single consumer

• `Send` and `Sync` marker traits enforce thread safety at compile time

• Data races are impossible — the type system prevents them before your code runs

448: Rayon Parallel Iterators

Difficulty: 3 | Level: Intermediate
`.par_iter()` — swap one word and your loop runs on all CPU cores.

The Problem This Solves

Sequential iterators are elegant but single-threaded. When you're processing millions of items — image pixels, log lines, simulation steps — you're leaving N−1 cores idle. Manually splitting work across threads, joining them, and merging results is tedious and error-prone. Rayon's `par_iter()` parallelises iterator pipelines automatically. It uses a global work-stealing thread pool tuned to your hardware. You write the same functional chain you'd write for serial code; Rayon decides how to split and schedule it. Because ownership and `Send`/`Sync` bounds are enforced at compile time, data races are impossible. The compiler rejects parallel code that would be unsafe.

The Intuition

You have a conveyor belt (your iterator) processing items one at a time. With `.par_iter()` you're replacing it with N parallel conveyor belts, each handling a slice of items, feeding results into a single final collection. You don't manage the belts — you just describe what to do with each item.

How It Works in Rust

1. Switch to parallel — call `.par_iter()` instead of `.iter()` (requires `use rayon::prelude::*;`).
2. Same API — `.map()`, `.filter()`, `.flat_map()`, `.fold()`, `.reduce()` all work identically.
3. Collect results — `.collect::<Vec<_>>()` merges partial results from all threads.
4. Automatic chunking — Rayon splits the input adaptively using its work-stealing scheduler. You never size chunks manually.
use rayon::prelude::*;

let sum: i64 = (0..1_000_000_i64)
    .into_par_iter()
    .filter(|n| n % 2 == 0)
    .map(|n| n * n)
    .sum();
5. Custom thread pool — `rayon::ThreadPoolBuilder::new().num_threads(4).build_global()` if defaults don't suit.

What This Unlocks

Key Differences

Concept            | OCaml                                     | Rust
Parallel iteration | `Parmap` / `Domainslib.Task.parallel_for` | `.par_iter()` (rayon)
Thread pool        | `Domainslib` pool, explicit               | Implicit global pool
Safety             | Runtime checks                            | Compile-time `Send`/`Sync` bounds
Chunk sizing       | Manual or library heuristic               | Adaptive work-stealing
// 448. Rayon parallel iterators — concept via std threads
use std::thread;

fn parallel_map<T: Sync, U: Send + Default + Clone, F: Fn(&T) -> U + Sync>(data: &[T], f: F) -> Vec<U> {
    let n = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
    let chunk = (data.len() / n).max(1);
    let mut out = vec![U::default(); data.len()];
    let f = &f; // share the closure by reference so every thread can call it
    thread::scope(|s| {
        // Pair each input chunk with its output chunk; scoped threads may
        // borrow local data, and disjoint &mut chunks need no locking.
        for (ci, co) in data.chunks(chunk).zip(out.chunks_mut(chunk)) {
            s.spawn(move || {
                for (d, r) in ci.iter().zip(co.iter_mut()) {
                    *r = f(d);
                }
            });
        }
    });
    out
}

fn parallel_sum(data: &[f64]) -> f64 {
    let n = 4usize; // fixed thread count for this sketch
    let chunk = (data.len() / n).max(1);
    let partials: Vec<f64> = thread::scope(|s| {
        // Spawn one summing thread per chunk, then join them all; collecting
        // the handles first ensures every spawn happens before any join.
        data.chunks(chunk)
            .map(|c| s.spawn(move || c.iter().sum::<f64>()))
            .collect::<Vec<_>>()
            .into_iter()
            .map(|h| h.join().unwrap())
            .collect()
    });
    partials.iter().sum()
}

fn main() {
    let data: Vec<f64> = (1..=1000).map(|x| x as f64).collect();
    let sq = parallel_map(&data, |x| x*x);
    println!("sum squares = {:.0}", sq.iter().sum::<f64>());
    println!("parallel sum = {:.0}", parallel_sum(&data));
}

#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn test_map() {
        let d: Vec<f64> = (1..=5).map(|x| x as f64).collect();
        assert_eq!(parallel_map(&d, |x| x * x), vec![1., 4., 9., 16., 25.]);
    }

    #[test]
    fn test_sum() {
        let d: Vec<f64> = (1..=100).map(|x| x as f64).collect();
        assert!((parallel_sum(&d) - 5050.).abs() < 1e-9);
    }
}
(* 448. Parallel map – OCaml, manual threads.
   Note: OCaml's Thread module interleaves on a single runtime lock;
   for true multicore parallelism use Domain / Domainslib (OCaml 5+). *)
let parallel_map f arr =
  let n = Array.length arr in
  if n = 0 then [||]
  else begin
    let res = Array.make n (f arr.(0)) in
    let nt = 4 in
    let chunk = (n + nt - 1) / nt in
    let ts = Array.init nt (fun t ->
      let lo = t * chunk and hi = min n ((t + 1) * chunk) in
      Thread.create (fun () ->
        for i = lo to hi - 1 do res.(i) <- f arr.(i) done) ()
    ) in
    Array.iter Thread.join ts;
    res
  end

let () =
  let data = Array.init 1000 (fun i -> float_of_int (i+1)) in
  let sq = parallel_map (fun x -> x*.x) data in
  Printf.printf "sum of squares = %.0f\n" (Array.fold_left (+.) 0. sq)