010: Run-Length Encoding
Difficulty: โญโญ Level: Intermediate Compress a list by replacing repeated runs with a count + value pair.The Problem This Solves
Run-length encoding (RLE) is one of the simplest compression techniques. Instead of storing `["a", "a", "a", "b", "c", "c", "c"]`, you store `[(3, "a"), (1, "b"), (3, "c")]` โ the run length and the value. Older image formats, fax machines, and simple protocols all use RLE. Building it from scratch in a loop is straightforward but tedious: maintain a counter, track the "current element," compare, flush when it changes. Rust's iterator model lets you build this cleanly, either by composing smaller operations (pack-then-map) or in a single efficient fold. This example also introduces function composition: the `encode` function builds on a `pack` helper, chaining transformations rather than computing everything at once.The Intuition
Think of it in two steps: 1. Pack: group consecutive equal elements together `["a","a","b","c","c","c"]` โ `[["a","a"], ["b"], ["c","c","c"]]` 2. Encode: count each group `[["a","a"], ["b"], ["c","c","c"]]` โ `[(2,"a"), (1,"b"), (3,"c")]` In Python you'd reach for `itertools.groupby`:from itertools import groupby
result = [(sum(1 for _ in g), k) for k, g in groupby(lst)]
Rust does the same with `.fold()` or by composing a pack function with `.map()`. The end result is a `Vec<(usize, T)>` โ a vector of (count, value) tuples.
How It Works in Rust
// The composed approach: pack first, then map each group to (count, first_element)
fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}
The single-pass fold version (no intermediate groups, most efficient):
fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
list.iter().fold(Vec::new(), |mut acc, x| {
match acc.last_mut() {
Some((count, val)) if val == x => *count += 1, // extend current run
_ => acc.push((1, x.clone())), // start new run
}
acc
})
}
The `last_mut()` call gets a mutable reference to the last element โ if it matches `x`, we increment the counter in place. Otherwise we start a new `(1, x)` run. One pass, no intermediate allocations.
Using the result:
let encoded = encode(&["a","a","b","c","c","c"]);
// [(2, "a"), (1, "b"), (3, "c")]
// Decoding: reverse the process
let decoded: Vec<&str> = encoded.iter()
.flat_map(|(count, val)| std::iter::repeat(*val).take(*count))
.collect();
// ["a", "a", "b", "c", "c", "c"]
What This Unlocks
- Simple compression โ image data, network packets, any repetitive sequence
- Sequence summarization โ "3 consecutive failures, then 1 success" in logs
- Decoding โ `flat_map` with `repeat().take()` reverses the encoding in one expression
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Tuple type | `int * 'a` | `(usize, T)` |
| Count type | `int` (signed) | `usize` (unsigned) |
| Composing pack + encode | `List.map` after `pack` | `.into_iter().map()` |
| Single-pass fold | `List.fold_left` | `.fold()` with `last_mut()` |
| Intermediate groups | `Vec<Vec<T>>` then map | Optional โ fold avoids it |
// Run-length Encoding
// Rust translation from OCaml 99 Problems #10
// Run-length encoding
fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}
// Helper: pack function (from example 009)
fn pack<T: PartialEq + Clone>(list: &[T]) -> Vec<Vec<T>> {
if list.is_empty() {
return vec![];
}
let mut result = vec![];
let mut current = vec![list[0].clone()];
for i in 1..list.len() {
if list[i] == list[i - 1] {
current.push(list[i].clone());
} else {
result.push(current);
current = vec![list[i].clone()];
}
}
result.push(current);
result
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_encode() {
assert_eq!(
encode(&["a","a","b","c","c","c"]),
vec![(2,"a"), (1,"b"), (3,"c")]
);
assert_eq!(encode::<i32>(&[]), Vec::<(usize, i32)>::new());
assert_eq!(encode(&[1]), vec![(1, 1)]);
}
}
fn main() {
let list = vec!["a","a","b","c","c","c"];
println!("encode({:?}) = {:?}", list, encode(&list));
println!("โ Rust tests passed");
}
(* Run-length Encoding *)
(* OCaml 99 Problems #10 *)
(* First, pack consecutive elements *)
let pack lst =
let rec aux current acc = function
| [] -> []
| [x] -> (x :: current) :: acc
| h1 :: (h2 :: _ as t) ->
if h1 = h2 then aux (h1 :: current) acc t
else aux [] ((h1 :: current) :: acc) t
in
List.rev (aux [] [] lst)
(* Then encode by counting *)
let encode lst =
let rec count n = function
| [] -> []
| [x] -> [(n + 1, x)]
| h1 :: (h2 :: _ as t) -> (n + 1, h1) :: count 0 t
in
count 0 (pack lst)
(* Tests *)
let () =
assert (encode ["a";"a";"b";"c";"c";"c"] = [(2,"a");(1,"b");(3,"c")]);
assert (encode [] = []);
assert (encode [1] = [(1,1)]);
print_endline "โ OCaml tests passed"
๐ Detailed Comparison
Comparison: Run-Length Encoding
OCaml โ Pack then encode
๐ช Show OCaml equivalent
let encode lst =
List.map (fun group -> (List.length group, List.hd group)) (pack lst)
Rust โ Idiomatic (compose with pack)
pub fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
pack(list)
.into_iter()
.map(|group| (group.len(), group[0].clone()))
.collect()
}Rust โ Functional (single-pass fold)
pub fn encode_fold<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)> {
list.iter().fold(Vec::new(), |mut acc, item| {
match acc.last_mut() {
Some((count, ref val)) if val == item => *count += 1,
_ => acc.push((1, item.clone())),
}
acc
})
}Comparison Table
| Aspect | OCaml | Rust |
|---|---|---|
| Tuple type | `int * 'a` | `(usize, T)` |
| Count type | `int` (signed) | `usize` (unsigned) |
| Composition | `List.map f (pack lst)` | `pack().into_iter().map().collect()` |
| Single-pass | Separate recursive function | `fold` with `last_mut()` |
| Ownership | GC handles pack result | `into_iter()` consumes packed groups |
Type Signatures
- OCaml: `val encode : 'a list -> (int * 'a) list`
- Rust: `fn encode<T: PartialEq + Clone>(list: &[T]) -> Vec<(usize, T)>`
Takeaways
1. Both languages naturally compose: pack first, then transform โ functional composition at work 2. Rust's `into_iter()` consumes the intermediate packed groups โ no waste 3. The fold version shows Rust can match OCaml's single-pass elegance with `last_mut()` 4. `usize` for counts is more semantically correct (counts can't be negative) than OCaml's `int` 5. The pattern of `match acc.last_mut()` is a Rust idiom for stateful folds โ learn it well