075 — Merge Sort
Tutorial
The Problem
Merge sort (John von Neumann, 1945) is the canonical divide-and-conquer sorting algorithm: split the list in half, sort each half recursively, merge the sorted halves. It guarantees O(n log n) in all cases — unlike quicksort which degrades to O(n²) on sorted input. It is the algorithm used in Rust's Vec::sort (TimSort) and Java's Arrays.sort for objects.
Merge sort is naturally recursive and maps directly to functional programming patterns: the merge function is a fold over two sorted lists, and the sort function is a structural recursion on list halves. Understanding merge sort deeply unlocks intuition for all divide-and-conquer algorithms.
🎯 Learning Outcomes
merge function for combining two sorted slices in O(n+m)F: Fn(&T, &T) -> Orderinglen <= 1 (a list of 0 or 1 elements is already sorted)sort (TimSort)Code Example
//! 075: Merge Sort
//!
//! Three variations on divide-and-conquer sorting:
//! 1. Classic recursive merge sort on `Ord` slices.
//! 2. Generic version parameterised by a comparator closure.
//! 3. Bottom-up fold-style sort that pairwise-merges singletons.
//!
//! The implementations mirror the OCaml originals by returning fresh `Vec`s
//! rather than sorting in place — this keeps the code purely functional and
//! easy to compare with the OCaml source.
use std::cmp::Ordering;
/// Merge two already-sorted slices into a new sorted `Vec`.
pub fn merge<T: Ord + Clone>(l1: &[T], l2: &[T]) -> Vec<T> {
merge_by(l1, l2, |a, b| a.cmp(b))
}
/// Classic recursive merge sort: split in half, recurse, merge.
pub fn merge_sort<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
merge_sort_by(xs, |a, b| a.cmp(b))
}
/// Merge two sorted slices using a custom comparator.
pub fn merge_by<T, F>(l1: &[T], l2: &[T], cmp: F) -> Vec<T>
where
T: Clone,
F: Fn(&T, &T) -> Ordering,
{
let mut out = Vec::with_capacity(l1.len() + l2.len());
let (mut i, mut j) = (0, 0);
while i < l1.len() && j < l2.len() {
if cmp(&l1[i], &l2[j]) != Ordering::Greater {
out.push(l1[i].clone());
i += 1;
} else {
out.push(l2[j].clone());
j += 1;
}
}
out.extend_from_slice(&l1[i..]);
out.extend_from_slice(&l2[j..]);
out
}
/// Generic merge sort with a custom comparator.
pub fn merge_sort_by<T, F>(xs: &[T], cmp: F) -> Vec<T>
where
T: Clone,
F: Fn(&T, &T) -> Ordering + Copy,
{
if xs.len() <= 1 {
return xs.to_vec();
}
let (left, right) = xs.split_at(xs.len() / 2);
merge_by(&merge_sort_by(left, cmp), &merge_sort_by(right, cmp), cmp)
}
/// Bottom-up merge sort: start with singletons and repeatedly pair-merge.
pub fn merge_sort_fold<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
if xs.is_empty() {
return Vec::new();
}
let mut runs: Vec<Vec<T>> = xs.iter().map(|x| vec![x.clone()]).collect();
while runs.len() > 1 {
runs = runs
.chunks(2)
.map(|pair| match pair {
[a, b] => merge(a, b),
[a] => a.clone(),
_ => unreachable!(),
})
.collect();
}
runs.pop().unwrap_or_default()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn sorts_typical_list() {
assert_eq!(
merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
vec![1, 2, 3, 5, 7, 8, 9]
);
}
#[test]
fn handles_edge_cases() {
assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort(&[1]), vec![1]);
assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
}
#[test]
fn preserves_duplicates() {
assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
}
#[test]
fn merge_combines_two_sorted_slices() {
assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
}
#[test]
fn ascending_with_default_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
vec![1, 3, 5, 8]
);
}
#[test]
fn descending_with_reversed_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
vec![8, 5, 3, 1]
);
}
#[test]
fn sorts_strings_by_length() {
let words = ["elephant", "cat", "bumblebee", "dog"];
let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
}
#[test]
fn bottom_up_matches_recursive() {
let xs = [5, 3, 8, 1, 9, 2, 7];
assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort_fold(&[42]), vec![42]);
}
}Key Differences
&[i32]) and produces new Vec<i32>. OCaml sorts list and produces new lists. List-based merge sort is purely functional; slice-based requires cloning.Vec::split_at_mut and rotate_left. OCaml's immutable lists cannot be sorted in place.sort is stable (preserves relative order of equal elements). sort_unstable is faster but not stable. OCaml's List.sort is not stable; List.stable_sort is.split_at**: Rust's v.split_at(mid) is O(1). OCaml must traverse to the midpoint: let rec split_at n = function ... | x :: t -> let (l, r) = split_at (n-1) t in (x :: l, r) — O(n/2).OCaml Approach
OCaml's merge sort uses recursive pattern matching on lists:
let rec merge l1 l2 = match l1, l2 with
| [], l | l, [] -> l
| x :: t1, y :: t2 ->
if x <= y then x :: merge t1 l2
else y :: merge l1 t2
let rec merge_sort = function
| ([] | [_]) as lst -> lst
| lst ->
let mid = List.length lst / 2 in
let left = List.filteri (fun i _ -> i < mid) lst in
let right = List.filteri (fun i _ -> i >= mid) lst in
merge (merge_sort left) (merge_sort right)
The list-based version is purely functional with no mutation. List.stable_sort in OCaml's stdlib is an optimized merge sort.
Full Source
//! 075: Merge Sort
//!
//! Three variations on divide-and-conquer sorting:
//! 1. Classic recursive merge sort on `Ord` slices.
//! 2. Generic version parameterised by a comparator closure.
//! 3. Bottom-up fold-style sort that pairwise-merges singletons.
//!
//! The implementations mirror the OCaml originals by returning fresh `Vec`s
//! rather than sorting in place — this keeps the code purely functional and
//! easy to compare with the OCaml source.
use std::cmp::Ordering;
/// Merge two already-sorted slices into a new sorted `Vec`.
pub fn merge<T: Ord + Clone>(l1: &[T], l2: &[T]) -> Vec<T> {
merge_by(l1, l2, |a, b| a.cmp(b))
}
/// Classic recursive merge sort: split in half, recurse, merge.
pub fn merge_sort<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
merge_sort_by(xs, |a, b| a.cmp(b))
}
/// Merge two sorted slices using a custom comparator.
pub fn merge_by<T, F>(l1: &[T], l2: &[T], cmp: F) -> Vec<T>
where
T: Clone,
F: Fn(&T, &T) -> Ordering,
{
let mut out = Vec::with_capacity(l1.len() + l2.len());
let (mut i, mut j) = (0, 0);
while i < l1.len() && j < l2.len() {
if cmp(&l1[i], &l2[j]) != Ordering::Greater {
out.push(l1[i].clone());
i += 1;
} else {
out.push(l2[j].clone());
j += 1;
}
}
out.extend_from_slice(&l1[i..]);
out.extend_from_slice(&l2[j..]);
out
}
/// Generic merge sort with a custom comparator.
pub fn merge_sort_by<T, F>(xs: &[T], cmp: F) -> Vec<T>
where
T: Clone,
F: Fn(&T, &T) -> Ordering + Copy,
{
if xs.len() <= 1 {
return xs.to_vec();
}
let (left, right) = xs.split_at(xs.len() / 2);
merge_by(&merge_sort_by(left, cmp), &merge_sort_by(right, cmp), cmp)
}
/// Bottom-up merge sort: start with singletons and repeatedly pair-merge.
pub fn merge_sort_fold<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
if xs.is_empty() {
return Vec::new();
}
let mut runs: Vec<Vec<T>> = xs.iter().map(|x| vec![x.clone()]).collect();
while runs.len() > 1 {
runs = runs
.chunks(2)
.map(|pair| match pair {
[a, b] => merge(a, b),
[a] => a.clone(),
_ => unreachable!(),
})
.collect();
}
runs.pop().unwrap_or_default()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn sorts_typical_list() {
assert_eq!(
merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
vec![1, 2, 3, 5, 7, 8, 9]
);
}
#[test]
fn handles_edge_cases() {
assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort(&[1]), vec![1]);
assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
}
#[test]
fn preserves_duplicates() {
assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
}
#[test]
fn merge_combines_two_sorted_slices() {
assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
}
#[test]
fn ascending_with_default_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
vec![1, 3, 5, 8]
);
}
#[test]
fn descending_with_reversed_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
vec![8, 5, 3, 1]
);
}
#[test]
fn sorts_strings_by_length() {
let words = ["elephant", "cat", "bumblebee", "dog"];
let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
}
#[test]
fn bottom_up_matches_recursive() {
let xs = [5, 3, 8, 1, 9, 2, 7];
assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort_fold(&[42]), vec![42]);
}
}#[cfg(test)]
mod tests {
use super::*;
#[test]
fn sorts_typical_list() {
assert_eq!(
merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
vec![1, 2, 3, 5, 7, 8, 9]
);
}
#[test]
fn handles_edge_cases() {
assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort(&[1]), vec![1]);
assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
}
#[test]
fn preserves_duplicates() {
assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
}
#[test]
fn merge_combines_two_sorted_slices() {
assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
}
#[test]
fn ascending_with_default_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
vec![1, 3, 5, 8]
);
}
#[test]
fn descending_with_reversed_comparator() {
assert_eq!(
merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
vec![8, 5, 3, 1]
);
}
#[test]
fn sorts_strings_by_length() {
let words = ["elephant", "cat", "bumblebee", "dog"];
let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
}
#[test]
fn bottom_up_matches_recursive() {
let xs = [5, 3, 8, 1, 9, 2, 7];
assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
assert_eq!(merge_sort_fold(&[42]), vec![42]);
}
}
Deep Comparison
Core Insight
Merge sort: split into halves, recursively sort each half, merge sorted halves. O(n log n) guaranteed. The functional version is particularly elegant because merging sorted lists is a natural recursive operation.
OCaml Approach
Rust Approach
sort() (introsort)Comparison Table
| Feature | OCaml | Rust |
|---|---|---|
| Split | Pattern match / List.nth | split_at(mid) |
| Merge | Recursive cons | Push to Vec |
| In-place | No (functional) | Possible but complex |
| Built-in | List.sort | .sort() (introsort) |
Exercises
&mut Vec<i32> using rotate_left for the merge step. This avoids cloning but is O(n² log n) due to rotation cost.timsort_lite that detects ascending runs and merges them.