948 Intermediate

948 Merge Sort

Functional Programming

Tutorial

The Problem

Implement functional merge sort in Rust: recursively split a slice in half, sort each half, then merge two sorted slices. Implement two merge variants — one with explicit index cursors and one using peekable iterators — to demonstrate the functional style. Compare with OCaml's list-based merge sort that uses pattern matching.

🎯 Learning Outcomes

• Implement recursive merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> using slice splitting

• Implement merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> with two-pointer merge

• Implement merge_iter using Peekable iterators for a more functional-style merge

• Understand that Clone is required because merge sort produces new Vec allocations at each level

• Connect to OCaml's pattern-matched list merge sort: [] | [_] base case, List.init split

Code Example

#![allow(clippy::all)]
/// Merge Sort — Functional Divide and Conquer
///
/// Pure functional merge sort: split the list, sort each half, merge.
/// OCaml uses pattern matching on lists; Rust uses slices and Vec.

/// Merge two sorted slices into a new sorted Vec.
pub fn merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i].clone());
            i += 1;
        } else {
            result.push(right[j].clone());
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

/// Functional merge sort — recursive, creates new Vecs at each level.
pub fn merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> {
    if list.len() <= 1 {
        return list.to_vec();
    }
    let mid = list.len() / 2;
    let left = merge_sort(&list[..mid]);
    let right = merge_sort(&list[mid..]);
    merge(&left, &right)
}

/// Merge using iterators — more functional style.
pub fn merge_iter<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let mut li = left.iter().peekable();
    let mut ri = right.iter().peekable();
    loop {
        match (li.peek(), ri.peek()) {
            (Some(l), Some(r)) => {
                if l <= r {
                    result.push((*li.next().unwrap()).clone());
                } else {
                    result.push((*ri.next().unwrap()).clone());
                }
            }
            (Some(_), None) => {
                result.extend(li.cloned());
                break;
            }
            (None, Some(_)) => {
                result.extend(ri.cloned());
                break;
            }
            (None, None) => break,
        }
    }
    result
}

/// Custom comparator version (like OCaml's `cmp` parameter).
pub fn merge_sort_by<T: Clone>(list: &[T], cmp: &impl Fn(&T, &T) -> std::cmp::Ordering) -> Vec<T> {
    if list.len() <= 1 {
        return list.to_vec();
    }
    let mid = list.len() / 2;
    let left = merge_sort_by(&list[..mid], cmp);
    let right = merge_sort_by(&list[mid..], cmp);
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if cmp(&left[i], &right[j]) != std::cmp::Ordering::Greater {
            result.push(left[i].clone());
            i += 1;
        } else {
            result.push(right[j].clone());
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
    }

    #[test]
    fn test_empty() {
        assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
    }

    #[test]
    fn test_single() {
        assert_eq!(merge_sort(&[42]), vec![42]);
    }

    #[test]
    fn test_already_sorted() {
        assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_reverse_sorted() {
        assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_duplicates() {
        assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
    }

    #[test]
    fn test_strings() {
        assert_eq!(
            merge_sort(&["banana", "apple", "cherry"]),
            vec!["apple", "banana", "cherry"]
        );
    }

    #[test]
    fn test_custom_cmp() {
        // Sort descending
        let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
        assert_eq!(result, vec![5, 4, 3, 2, 1]);
    }
}

let rec merge cmp l1 l2 = match l1, l2 with
  | [], l | l, [] -> l
  | h1 :: t1, h2 :: t2 ->
    if cmp h1 h2 <= 0 then h1 :: merge cmp t1 l2
    else h2 :: merge cmp l1 t2

let rec split = function
  | [] -> [], []
  | [x] -> [x], []
  | a :: b :: rest ->
    let l, r = split rest in
    a :: l, b :: r

let rec merge_sort cmp = function
  | ([] | [_]) as l -> l
  | l ->
    let left, right = split l in
    merge cmp (merge_sort cmp left) (merge_sort cmp right)

let () =
  let sorted = merge_sort compare [5; 2; 8; 1; 9; 3] in
  assert (sorted = [1; 2; 3; 5; 8; 9]);
  assert (merge_sort compare [] = []);
  assert (merge_sort compare [42] = [42]);
  assert (merge_sort compare [3; 1; 2; 1; 3] = [1; 1; 2; 3; 3]);
  print_endline "All assertions passed."

Key Differences

Aspect	Rust	OCaml
Input type	`&[T]` slice — zero-copy split at mid	`'a list` — must copy for split
Clone requirement	`T: Clone` — explicit	GC handles sharing transparently
Base case	`list.len() <= 1`	Pattern match `[] \\| [_]`
Merge style	Index cursor or `Peekable`	Pattern match on pairs
Memory	One `Vec` per merge step	One list cons per element

Merge sort is naturally recursive and allocation-heavy. For production use, Rust's slice::sort (pdqsort) is significantly faster. Functional merge sort here is for algorithmic clarity.

OCaml Approach

let rec merge xs ys = match xs, ys with
  | [], ys -> ys
  | xs, [] -> xs
  | x :: xs', y :: ys' ->
    if x <= y then x :: merge xs' ys
    else y :: merge xs ys'

let split xs =
  let n = List.length xs in
  let k = n / 2 in
  (List.filteri (fun i _ -> i < k) xs,
   List.filteri (fun i _ -> i >= k) xs)

let rec merge_sort = function
  | [] | [_] as xs -> xs
  | xs ->
    let (left, right) = split xs in
    merge (merge_sort left) (merge_sort right)

OCaml's pattern match on (xs, ys) pairs makes the merge function read as three cases: one side empty, other side empty, or both non-empty. The recursive merge in OCaml consumes the lists directly — no index tracking needed.

Full Source

#![allow(clippy::all)]
/// Merge Sort — Functional Divide and Conquer
///
/// Pure functional merge sort: split the list, sort each half, merge.
/// OCaml uses pattern matching on lists; Rust uses slices and Vec.

/// Merge two sorted slices into a new sorted Vec.
pub fn merge<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if left[i] <= right[j] {
            result.push(left[i].clone());
            i += 1;
        } else {
            result.push(right[j].clone());
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

/// Functional merge sort — recursive, creates new Vecs at each level.
pub fn merge_sort<T: Ord + Clone>(list: &[T]) -> Vec<T> {
    if list.len() <= 1 {
        return list.to_vec();
    }
    let mid = list.len() / 2;
    let left = merge_sort(&list[..mid]);
    let right = merge_sort(&list[mid..]);
    merge(&left, &right)
}

/// Merge using iterators — more functional style.
pub fn merge_iter<T: Ord + Clone>(left: &[T], right: &[T]) -> Vec<T> {
    let mut result = Vec::with_capacity(left.len() + right.len());
    let mut li = left.iter().peekable();
    let mut ri = right.iter().peekable();
    loop {
        match (li.peek(), ri.peek()) {
            (Some(l), Some(r)) => {
                if l <= r {
                    result.push((*li.next().unwrap()).clone());
                } else {
                    result.push((*ri.next().unwrap()).clone());
                }
            }
            (Some(_), None) => {
                result.extend(li.cloned());
                break;
            }
            (None, Some(_)) => {
                result.extend(ri.cloned());
                break;
            }
            (None, None) => break,
        }
    }
    result
}

/// Custom comparator version (like OCaml's `cmp` parameter).
pub fn merge_sort_by<T: Clone>(list: &[T], cmp: &impl Fn(&T, &T) -> std::cmp::Ordering) -> Vec<T> {
    if list.len() <= 1 {
        return list.to_vec();
    }
    let mid = list.len() / 2;
    let left = merge_sort_by(&list[..mid], cmp);
    let right = merge_sort_by(&list[mid..], cmp);
    let mut result = Vec::with_capacity(left.len() + right.len());
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        if cmp(&left[i], &right[j]) != std::cmp::Ordering::Greater {
            result.push(left[i].clone());
            i += 1;
        } else {
            result.push(right[j].clone());
            j += 1;
        }
    }
    result.extend_from_slice(&left[i..]);
    result.extend_from_slice(&right[j..]);
    result
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
    }

    #[test]
    fn test_empty() {
        assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
    }

    #[test]
    fn test_single() {
        assert_eq!(merge_sort(&[42]), vec![42]);
    }

    #[test]
    fn test_already_sorted() {
        assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_reverse_sorted() {
        assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_duplicates() {
        assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
    }

    #[test]
    fn test_strings() {
        assert_eq!(
            merge_sort(&["banana", "apple", "cherry"]),
            vec!["apple", "banana", "cherry"]
        );
    }

    #[test]
    fn test_custom_cmp() {
        // Sort descending
        let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
        assert_eq!(result, vec![5, 4, 3, 2, 1]);
    }
}

let rec merge cmp l1 l2 = match l1, l2 with
  | [], l | l, [] -> l
  | h1 :: t1, h2 :: t2 ->
    if cmp h1 h2 <= 0 then h1 :: merge cmp t1 l2
    else h2 :: merge cmp l1 t2

let rec split = function
  | [] -> [], []
  | [x] -> [x], []
  | a :: b :: rest ->
    let l, r = split rest in
    a :: l, b :: r

let rec merge_sort cmp = function
  | ([] | [_]) as l -> l
  | l ->
    let left, right = split l in
    merge cmp (merge_sort cmp left) (merge_sort cmp right)

let () =
  let sorted = merge_sort compare [5; 2; 8; 1; 9; 3] in
  assert (sorted = [1; 2; 3; 5; 8; 9]);
  assert (merge_sort compare [] = []);
  assert (merge_sort compare [42] = [42]);
  assert (merge_sort compare [3; 1; 2; 1; 3] = [1; 1; 2; 3; 3]);
  print_endline "All assertions passed."

✓ Tests Rust test suite

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_basic() {
        assert_eq!(merge_sort(&[5, 2, 8, 1, 9, 3]), vec![1, 2, 3, 5, 8, 9]);
    }

    #[test]
    fn test_empty() {
        assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
    }

    #[test]
    fn test_single() {
        assert_eq!(merge_sort(&[42]), vec![42]);
    }

    #[test]
    fn test_already_sorted() {
        assert_eq!(merge_sort(&[1, 2, 3, 4, 5]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_reverse_sorted() {
        assert_eq!(merge_sort(&[5, 4, 3, 2, 1]), vec![1, 2, 3, 4, 5]);
    }

    #[test]
    fn test_duplicates() {
        assert_eq!(merge_sort(&[3, 1, 2, 1, 3]), vec![1, 1, 2, 3, 3]);
    }

    #[test]
    fn test_strings() {
        assert_eq!(
            merge_sort(&["banana", "apple", "cherry"]),
            vec!["apple", "banana", "cherry"]
        );
    }

    #[test]
    fn test_custom_cmp() {
        // Sort descending
        let result = merge_sort_by(&[1, 5, 3, 2, 4], &|a: &i32, b: &i32| b.cmp(a));
        assert_eq!(result, vec![5, 4, 3, 2, 1]);
    }
}

Deep Comparison

Merge Sort — Functional Divide and Conquer: OCaml vs Rust

The Core Insight

Merge sort is the quintessential functional sorting algorithm because it's naturally recursive and doesn't require mutation. The comparison reveals how OCaml's linked lists and Rust's contiguous Vecs lead to different implementation strategies with the same O(n log n) complexity but different constant factors.

OCaml Approach

OCaml's merge sort is textbook elegant. split alternates elements into two lists using pattern matching on pairs (a :: b :: rest). merge recursively cons-es the smaller head onto the merged tail. The compare function is passed as a first-class value for custom ordering. All intermediate lists share structure through the GC — no copying needed. The ([] | [_]) as l -> l or-pattern cleanly handles base cases.

Rust Approach

Rust uses slices (&[T]) for input and Vec<T> for output. Splitting is trivial with slice indexing: &list[..mid] and &list[mid..]. Merging uses index-based iteration with push and extend_from_slice for efficient bulk copying. Vec::with_capacity pre-allocates the exact needed size, avoiding reallocation. The Ord trait bound replaces OCaml's compare function parameter. A peekable iterator variant shows the more functional style.

Side-by-Side

Concept	OCaml	Rust
Split	Pattern match `a :: b :: rest`	Slice indexing `&list[..mid]`
Merge	Recursive cons `h1 :: merge ...`	`Vec::push` + `extend_from_slice`
Base case	`([] \\| [_]) as l -> l`	`if list.len() <= 1`
Comparator	`cmp` function parameter	`Ord` trait bound or closure
Memory	GC, structural sharing	`Vec` allocation, contiguous
Stability	Stable (preserves order)	Stable (preserves order)

What Rust Learners Should Notice

• Vec::with_capacity(n) is a crucial optimization — it avoids O(log n) reallocations during the merge phase

• Slice references (&[T]) give you zero-cost views into data without copying — similar to how OCaml lists share tails

• extend_from_slice copies a contiguous block efficiently, much faster than pushing elements one by one

• The Ord trait bound means any type that implements comparison can be sorted — no need to pass a comparator function (though merge_sort_by with a closure also works)

• In production Rust, use slice::sort() or slice::sort_unstable() — they use Timsort / pdqsort respectively, both highly optimized

Exercises

Implement merge_sort_by<T, F: Fn(&T, &T) -> Ordering>(list, cmp) with a custom comparator.

Make merge_sort work on Vec<T> in-place with O(n log n) time and O(n) auxiliary space.

Implement a natural_merge_sort that detects existing sorted runs and merges them (Timsort inspiration).

Benchmark merge_sort vs slice::sort on a 10,000-element Vec<i32>.

Implement merge_k_sorted(lists: Vec<Vec<T>>) that merges k sorted lists in O(n log k) using a BinaryHeap.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.

Rust

Tutorial

The Problem

🎯 Learning Outcomes

Code Example

Key Differences

OCaml Approach

Full Source

Deep Comparison

Merge Sort — Functional Divide and Conquer: OCaml vs Rust

The Core Insight

OCaml Approach

Rust Approach

Side-by-Side

What Rust Learners Should Notice

Further Reading

Exercises

Open Source Repos