ExamplesBy LevelBy TopicLearning Paths
075 Intermediate

075 — Merge Sort

Functional Programming

Tutorial

The Problem

Merge sort (John von Neumann, 1945) is the canonical divide-and-conquer sorting algorithm: split the list in half, sort each half recursively, merge the sorted halves. It guarantees O(n log n) in all cases — unlike quicksort which degrades to O(n²) on sorted input. It is the algorithm used in Rust's Vec::sort (TimSort) and Java's Arrays.sort for objects.

Merge sort is naturally recursive and maps directly to functional programming patterns: the merge function is a fold over two sorted lists, and the sort function is a structural recursion on list halves. Understanding merge sort deeply unlocks intuition for all divide-and-conquer algorithms.

🎯 Learning Outcomes

  • • Implement the two-phase merge sort: split, recurse, merge
  • • Write a merge function for combining two sorted slices in O(n+m)
  • • Implement the generic version with a custom comparator F: Fn(&T, &T) -> Ordering
  • • Understand why the base case is len <= 1 (a list of 0 or 1 elements is already sorted)
  • • Compare recursive merge sort with Rust's built-in sort (TimSort)
  • Code Example

    //! 075: Merge Sort
    //!
    //! Three variations on divide-and-conquer sorting:
    //! 1. Classic recursive merge sort on `Ord` slices.
    //! 2. Generic version parameterised by a comparator closure.
    //! 3. Bottom-up fold-style sort that pairwise-merges singletons.
    //!
    //! The implementations mirror the OCaml originals by returning fresh `Vec`s
    //! rather than sorting in place — this keeps the code purely functional and
    //! easy to compare with the OCaml source.
    
    use std::cmp::Ordering;
    
    /// Merge two already-sorted slices into a new sorted `Vec`.
    pub fn merge<T: Ord + Clone>(l1: &[T], l2: &[T]) -> Vec<T> {
        merge_by(l1, l2, |a, b| a.cmp(b))
    }
    
    /// Classic recursive merge sort: split in half, recurse, merge.
    pub fn merge_sort<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
        merge_sort_by(xs, |a, b| a.cmp(b))
    }
    
    /// Merge two sorted slices using a custom comparator.
    pub fn merge_by<T, F>(l1: &[T], l2: &[T], cmp: F) -> Vec<T>
    where
        T: Clone,
        F: Fn(&T, &T) -> Ordering,
    {
        let mut out = Vec::with_capacity(l1.len() + l2.len());
        let (mut i, mut j) = (0, 0);
        while i < l1.len() && j < l2.len() {
            if cmp(&l1[i], &l2[j]) != Ordering::Greater {
                out.push(l1[i].clone());
                i += 1;
            } else {
                out.push(l2[j].clone());
                j += 1;
            }
        }
        out.extend_from_slice(&l1[i..]);
        out.extend_from_slice(&l2[j..]);
        out
    }
    
    /// Generic merge sort with a custom comparator.
    pub fn merge_sort_by<T, F>(xs: &[T], cmp: F) -> Vec<T>
    where
        T: Clone,
        F: Fn(&T, &T) -> Ordering + Copy,
    {
        if xs.len() <= 1 {
            return xs.to_vec();
        }
        let (left, right) = xs.split_at(xs.len() / 2);
        merge_by(&merge_sort_by(left, cmp), &merge_sort_by(right, cmp), cmp)
    }
    
    /// Bottom-up merge sort: start with singletons and repeatedly pair-merge.
    pub fn merge_sort_fold<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
        if xs.is_empty() {
            return Vec::new();
        }
        let mut runs: Vec<Vec<T>> = xs.iter().map(|x| vec![x.clone()]).collect();
        while runs.len() > 1 {
            runs = runs
                .chunks(2)
                .map(|pair| match pair {
                    [a, b] => merge(a, b),
                    [a] => a.clone(),
                    _ => unreachable!(),
                })
                .collect();
        }
        runs.pop().unwrap_or_default()
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn sorts_typical_list() {
            assert_eq!(
                merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
                vec![1, 2, 3, 5, 7, 8, 9]
            );
        }
    
        #[test]
        fn handles_edge_cases() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort(&[1]), vec![1]);
            assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
        }
    
        #[test]
        fn preserves_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
        }
    
        #[test]
        fn merge_combines_two_sorted_slices() {
            assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
            assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
            assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
        }
    
        #[test]
        fn ascending_with_default_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
                vec![1, 3, 5, 8]
            );
        }
    
        #[test]
        fn descending_with_reversed_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
                vec![8, 5, 3, 1]
            );
        }
    
        #[test]
        fn sorts_strings_by_length() {
            let words = ["elephant", "cat", "bumblebee", "dog"];
            let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
            assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
        }
    
        #[test]
        fn bottom_up_matches_recursive() {
            let xs = [5, 3, 8, 1, 9, 2, 7];
            assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
            assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort_fold(&[42]), vec![42]);
        }
    }

    Key Differences

  • Slice vs list: Rust sorts slices (&[i32]) and produces new Vec<i32>. OCaml sorts list and produces new lists. List-based merge sort is purely functional; slice-based requires cloning.
  • In-place option: Rust can implement in-place merge sort using Vec::split_at_mut and rotate_left. OCaml's immutable lists cannot be sorted in place.
  • Stability: Rust's sort is stable (preserves relative order of equal elements). sort_unstable is faster but not stable. OCaml's List.sort is not stable; List.stable_sort is.
  • **split_at**: Rust's v.split_at(mid) is O(1). OCaml must traverse to the midpoint: let rec split_at n = function ... | x :: t -> let (l, r) = split_at (n-1) t in (x :: l, r) — O(n/2).
  • OCaml Approach

    OCaml's merge sort uses recursive pattern matching on lists:

    let rec merge l1 l2 = match l1, l2 with
      | [], l | l, [] -> l
      | x :: t1, y :: t2 ->
        if x <= y then x :: merge t1 l2
        else y :: merge l1 t2
    
    let rec merge_sort = function
      | ([] | [_]) as lst -> lst
      | lst ->
        let mid = List.length lst / 2 in
        let left = List.filteri (fun i _ -> i < mid) lst in
        let right = List.filteri (fun i _ -> i >= mid) lst in
        merge (merge_sort left) (merge_sort right)
    

    The list-based version is purely functional with no mutation. List.stable_sort in OCaml's stdlib is an optimized merge sort.

    Full Source

    //! 075: Merge Sort
    //!
    //! Three variations on divide-and-conquer sorting:
    //! 1. Classic recursive merge sort on `Ord` slices.
    //! 2. Generic version parameterised by a comparator closure.
    //! 3. Bottom-up fold-style sort that pairwise-merges singletons.
    //!
    //! The implementations mirror the OCaml originals by returning fresh `Vec`s
    //! rather than sorting in place — this keeps the code purely functional and
    //! easy to compare with the OCaml source.
    
    use std::cmp::Ordering;
    
    /// Merge two already-sorted slices into a new sorted `Vec`.
    pub fn merge<T: Ord + Clone>(l1: &[T], l2: &[T]) -> Vec<T> {
        merge_by(l1, l2, |a, b| a.cmp(b))
    }
    
    /// Classic recursive merge sort: split in half, recurse, merge.
    pub fn merge_sort<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
        merge_sort_by(xs, |a, b| a.cmp(b))
    }
    
    /// Merge two sorted slices using a custom comparator.
    pub fn merge_by<T, F>(l1: &[T], l2: &[T], cmp: F) -> Vec<T>
    where
        T: Clone,
        F: Fn(&T, &T) -> Ordering,
    {
        let mut out = Vec::with_capacity(l1.len() + l2.len());
        let (mut i, mut j) = (0, 0);
        while i < l1.len() && j < l2.len() {
            if cmp(&l1[i], &l2[j]) != Ordering::Greater {
                out.push(l1[i].clone());
                i += 1;
            } else {
                out.push(l2[j].clone());
                j += 1;
            }
        }
        out.extend_from_slice(&l1[i..]);
        out.extend_from_slice(&l2[j..]);
        out
    }
    
    /// Generic merge sort with a custom comparator.
    pub fn merge_sort_by<T, F>(xs: &[T], cmp: F) -> Vec<T>
    where
        T: Clone,
        F: Fn(&T, &T) -> Ordering + Copy,
    {
        if xs.len() <= 1 {
            return xs.to_vec();
        }
        let (left, right) = xs.split_at(xs.len() / 2);
        merge_by(&merge_sort_by(left, cmp), &merge_sort_by(right, cmp), cmp)
    }
    
    /// Bottom-up merge sort: start with singletons and repeatedly pair-merge.
    pub fn merge_sort_fold<T: Ord + Clone>(xs: &[T]) -> Vec<T> {
        if xs.is_empty() {
            return Vec::new();
        }
        let mut runs: Vec<Vec<T>> = xs.iter().map(|x| vec![x.clone()]).collect();
        while runs.len() > 1 {
            runs = runs
                .chunks(2)
                .map(|pair| match pair {
                    [a, b] => merge(a, b),
                    [a] => a.clone(),
                    _ => unreachable!(),
                })
                .collect();
        }
        runs.pop().unwrap_or_default()
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn sorts_typical_list() {
            assert_eq!(
                merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
                vec![1, 2, 3, 5, 7, 8, 9]
            );
        }
    
        #[test]
        fn handles_edge_cases() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort(&[1]), vec![1]);
            assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
        }
    
        #[test]
        fn preserves_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
        }
    
        #[test]
        fn merge_combines_two_sorted_slices() {
            assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
            assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
            assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
        }
    
        #[test]
        fn ascending_with_default_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
                vec![1, 3, 5, 8]
            );
        }
    
        #[test]
        fn descending_with_reversed_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
                vec![8, 5, 3, 1]
            );
        }
    
        #[test]
        fn sorts_strings_by_length() {
            let words = ["elephant", "cat", "bumblebee", "dog"];
            let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
            assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
        }
    
        #[test]
        fn bottom_up_matches_recursive() {
            let xs = [5, 3, 8, 1, 9, 2, 7];
            assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
            assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort_fold(&[42]), vec![42]);
        }
    }
    ✓ Tests Rust test suite
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn sorts_typical_list() {
            assert_eq!(
                merge_sort(&[5, 3, 8, 1, 9, 2, 7]),
                vec![1, 2, 3, 5, 7, 8, 9]
            );
        }
    
        #[test]
        fn handles_edge_cases() {
            assert_eq!(merge_sort::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort(&[1]), vec![1]);
            assert_eq!(merge_sort(&[2, 1]), vec![1, 2]);
        }
    
        #[test]
        fn preserves_duplicates() {
            assert_eq!(merge_sort(&[3, 1, 3, 2, 1, 2]), vec![1, 1, 2, 2, 3, 3]);
        }
    
        #[test]
        fn merge_combines_two_sorted_slices() {
            assert_eq!(merge(&[1, 4, 7], &[2, 3, 8]), vec![1, 2, 3, 4, 7, 8]);
            assert_eq!(merge::<i32>(&[], &[1, 2]), vec![1, 2]);
            assert_eq!(merge(&[1, 2], &[]), vec![1, 2]);
        }
    
        #[test]
        fn ascending_with_default_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| a.cmp(b)),
                vec![1, 3, 5, 8]
            );
        }
    
        #[test]
        fn descending_with_reversed_comparator() {
            assert_eq!(
                merge_sort_by(&[5, 3, 8, 1], |a: &i32, b: &i32| b.cmp(a)),
                vec![8, 5, 3, 1]
            );
        }
    
        #[test]
        fn sorts_strings_by_length() {
            let words = ["elephant", "cat", "bumblebee", "dog"];
            let sorted = merge_sort_by(&words, |a, b| a.len().cmp(&b.len()));
            assert_eq!(sorted, vec!["cat", "dog", "elephant", "bumblebee"]);
        }
    
        #[test]
        fn bottom_up_matches_recursive() {
            let xs = [5, 3, 8, 1, 9, 2, 7];
            assert_eq!(merge_sort_fold(&xs), vec![1, 2, 3, 5, 7, 8, 9]);
            assert_eq!(merge_sort_fold::<i32>(&[]), Vec::<i32>::new());
            assert_eq!(merge_sort_fold(&[42]), vec![42]);
        }
    }

    Deep Comparison

    Core Insight

    Merge sort: split into halves, recursively sort each half, merge sorted halves. O(n log n) guaranteed. The functional version is particularly elegant because merging sorted lists is a natural recursive operation.

    OCaml Approach

  • • Pattern match on list to split
  • • Recursive merge of two sorted lists
  • • No mutation — creates new lists at each step
  • Rust Approach

  • • Split slice at midpoint
  • • Recursive sort on sub-slices
  • • Merge into new Vec — or use sort() (introsort)
  • Comparison Table

    FeatureOCamlRust
    SplitPattern match / List.nthsplit_at(mid)
    MergeRecursive consPush to Vec
    In-placeNo (functional)Possible but complex
    Built-inList.sort.sort() (introsort)

    Exercises

  • In-place merge sort: Implement an in-place merge sort for &mut Vec<i32> using rotate_left for the merge step. This avoids cloning but is O(n² log n) due to rotation cost.
  • External sort: Merge sort naturally extends to external sorting (data too large for RAM). Design an algorithm that reads sorted chunks from disk and merges them. What data structure manages the merge heap?
  • Natural merge sort: Detect pre-sorted runs and use them as the base for merging (this is the core of TimSort). Implement timsort_lite that detects ascending runs and merges them.
  • Open Source Repos