🦀 Functional Rust
🎬 Fearless Concurrency — Threads, `Arc<Mutex<T>>`, channels — safe parallelism enforced by the compiler.
๐Ÿ“ Text version (for readers / accessibility)

• std::thread::spawn creates OS threads — closures must be Send + 'static

• Arc<Mutex<T>> provides shared mutable state across threads safely

• Channels (mpsc) enable message passing — multiple producers, single consumer

• Send and Sync marker traits enforce thread safety at compile time

• Data races are impossible — the type system prevents them before your code runs

988: Thread-Local Storage

Difficulty: Intermediate
Category: Async / Concurrency, FP Patterns
Concept: Per-thread state that doesn't need synchronization
Key Insight: `thread_local!` creates a separate instance per thread — use with `RefCell` for interior mutability; no `Arc`/`Mutex` needed since no sharing occurs

Versions

| Directory | Description |
| --- | --- |
| `std/` | Standard library version using `std::sync`, `std::thread` |
| `tokio/` | Tokio async runtime version using `tokio::sync`, `tokio::spawn` |

Running

# Standard library version
cd std && cargo test

# Tokio version
cd tokio && cargo test
// 988: Thread-Local Storage
// Rust: thread_local! macro — each thread gets its own instance

use std::cell::RefCell;
use std::sync::{Arc, Mutex};
use std::thread;

// --- Approach 1: thread_local! with RefCell (simple counter) ---
thread_local! {
    static COUNTER: RefCell<i32> = RefCell::new(0);
}

fn thread_local_counter() -> Vec<i32> {
    let results = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..5i32).map(|i| {
        let results = Arc::clone(&results);
        thread::spawn(move || {
            // Each thread has its own COUNTER โ€” no sharing
            COUNTER.with(|c| *c.borrow_mut() = i * 10);
            thread::yield_now();
            let v = COUNTER.with(|c| *c.borrow());
            results.lock().unwrap().push(v);
        })
    }).collect();

    for h in handles { h.join().unwrap(); }
    let mut v = results.lock().unwrap().clone();
    v.sort();
    v
}

// --- Approach 2: Thread-local accumulator (no shared state needed) ---
thread_local! {
    static LOCAL_SUM: RefCell<i64> = RefCell::new(0);
}

fn thread_local_sum(id: i64) -> i64 {
    LOCAL_SUM.with(|s| {
        *s.borrow_mut() = 0; // reset for this thread
        for i in 1..=10 {
            *s.borrow_mut() += i * id;
        }
        *s.borrow()
    })
}

fn parallel_sums() -> i64 {
    let results = Arc::new(Mutex::new(Vec::new()));

    let handles: Vec<_> = (0..4i64).map(|id| {
        let results = Arc::clone(&results);
        thread::spawn(move || {
            let s = thread_local_sum(id);
            results.lock().unwrap().push(s);
        })
    }).collect();

    for h in handles { h.join().unwrap(); }
    results.lock().unwrap().iter().sum()
}

// --- Approach 3: Thread-local cache (computed once per thread) ---
thread_local! {
    static THREAD_ID_CACHE: RefCell<Option<String>> = RefCell::new(None);
}

fn get_thread_name(name: &str) -> String {
    THREAD_ID_CACHE.with(|cache| {
        let mut c = cache.borrow_mut();
        if c.is_none() {
            *c = Some(format!("thread-{}", name));
        }
        c.clone().unwrap()
    })
}


#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_thread_local_isolation() {
        let counts = thread_local_counter();
        assert_eq!(counts, vec![0, 10, 20, 30, 40]);
    }

    #[test]
    fn test_parallel_sums() {
        // 0 + 55 + 110 + 165 = 330
        assert_eq!(parallel_sums(), 330);
    }

    #[test]
    fn test_thread_local_doesnt_leak_across_threads() {
        COUNTER.with(|c| *c.borrow_mut() = 999);
        let val_in_new_thread = thread::spawn(|| {
            COUNTER.with(|c| *c.borrow()) // should be 0, not 999
        }).join().unwrap();
        assert_eq!(val_in_new_thread, 0);
    }

    #[test]
    fn test_thread_name_cached() {
        let n1 = get_thread_name("x");
        let n2 = get_thread_name("y"); // returns cached value, not "thread-y"
        assert_eq!(n1, n2); // same thread โ€” cached
    }
}
(* 988: Thread-Local Storage *)
(* OCaml 5: Domain.DLS (domain-local storage). OCaml < 5: Thread.self() map *)

(* --- Approach 1: Simulate thread-local via Thread.self() hash table --- *)

let tls : (int, int ref) Hashtbl.t = Hashtbl.create 16
let tls_m = Mutex.create ()

let get_tls () =
  let id = Thread.id (Thread.self ()) in
  Mutex.lock tls_m;
  let v = match Hashtbl.find_opt tls id with
    | Some r -> r
    | None -> let r = ref 0 in Hashtbl.add tls id r; r
  in
  Mutex.unlock tls_m;
  v

let set_tls v =
  let cell = get_tls () in
  cell := v

let read_tls () = !(get_tls ())

let () =
  let results = ref [] in
  let m = Mutex.create () in
  let threads = List.init 5 (fun i ->
    Thread.create (fun () ->
      set_tls (i * 10);
      (* Other threads' changes don't affect ours *)
      Thread.yield ();
      let v = read_tls () in
      Mutex.lock m;
      results := v :: !results;
      Mutex.unlock m
    ) ()
  ) in
  List.iter Thread.join threads;
  let sorted = List.sort compare !results in
  assert (sorted = [0; 10; 20; 30; 40]);
  Printf.printf "Approach 1 (thread-local): [%s]\n"
    (String.concat "; " (List.map string_of_int sorted))

(* --- Approach 2: Per-thread accumulator (independent state) --- *)

let () =
  let all_sums = ref [] in
  let m = Mutex.create () in
  let threads = List.init 4 (fun id ->
    Thread.create (fun () ->
      (* Each thread accumulates independently *)
      let local_sum = ref 0 in
      for i = 1 to 10 do
        local_sum := !local_sum + i * id
      done;
      Mutex.lock m;
      all_sums := !local_sum :: !all_sums;
      Mutex.unlock m
    ) ()
  ) in
  List.iter Thread.join threads;
  (* sum of: 0, 55, 110, 165 = 330 *)
  let total = List.fold_left (+) 0 !all_sums in
  assert (total = 330);
  Printf.printf "Approach 2 (per-thread sum): total=%d\n" total

let () = Printf.printf "✓ All tests passed\n"

📊 Detailed Comparison

Thread-Local Storage โ€” Comparison

Core Insight

Thread-local storage is the answer to "I want mutable state but don't want synchronization overhead." Each thread has its own private copy โ€” no races possible, no locks needed.

OCaml Approach

  • OCaml 5: `Domain.DLS.new_key` / `Domain.DLS.get` / `Domain.DLS.set` (domain-local)
  • OCaml < 5: Simulate with `Thread.id` → `Hashtbl` (requires a mutex for the table itself)
  • Domains ≠ threads in OCaml 5 — one domain can run many lightweight threads
  • Typical use: per-domain RNG seeds, error buffers, caches

Rust Approach

  • `thread_local! { static NAME: Type = init; }` declares the variable
  • `.with(|v| ...)` is the only access method — ensures a scoped lifetime
  • Usually paired with `Cell<T>` (copy types) or `RefCell<T>` (arbitrary types)
  • Initialized lazily on first access per thread
  • Dropped when thread exits

Comparison Table

| Concept | OCaml | Rust |
| --- | --- | --- |
| Declare | `Domain.DLS.new_key (fun () -> init)` | `thread_local! { static X: T = init; }` |
| Read | `Domain.DLS.get key` | `X.with(\|v\| *v.borrow())` |
| Write | `Domain.DLS.set key v` | `X.with(\|v\| *v.borrow_mut() = x)` |
| Interior mutability | Mutable by nature | `Cell<T>` or `RefCell<T>` |
| Initialization | Closure passed at creation | Expression in macro |
| Isolation | Per-domain (not per-thread in OCaml 5) | Per-OS-thread |
| No sync needed | Yes | Yes — the whole point |

std vs tokio

| Aspect | std version | tokio version |
| --- | --- | --- |
| Runtime | OS threads via `std::thread` | Async tasks on the tokio runtime |
| Synchronization | `std::sync::Mutex`, `Condvar` | `tokio::sync::Mutex`, channels |
| Channels | `std::sync::mpsc` (unbounded) | `tokio::sync::mpsc` (bounded, async) |
| Blocking | Thread blocks on lock/recv | Task yields; runtime switches tasks |
| Overhead | One OS thread per task | Many tasks per thread (M:N) |
| Best for | CPU-bound, simple concurrency | I/O-bound, high-concurrency servers |