๐Ÿฆ€ Functional Rust

374: Radix Tree / Patricia Trie

Difficulty: 4 Level: Expert Compress a trie by merging single-child chains into multi-character edge labels โ€” the data structure behind IP routing and autocomplete.

The Problem This Solves

A standard trie stores one character per node. For a dictionary of English words, most internal nodes have exactly one child โ€” a long chain spelling out a common prefix like "inter-". This wastes memory and makes lookups traverse many nodes for no branching benefit. A Radix tree (also called Patricia Trie or Compressed Trie) solves this by collapsing single-child chains into a single edge with a multi-character label. "international" and "internet" share the prefix "intern" โ€” that prefix becomes one edge label, and only the node after it branches. This structure is ideal for routing tables (IP prefixes), DNS lookups, and filesystem path trees. Real-world routers use Patricia Tries because longest-prefix matching is O(key length) regardless of table size.

The Intuition

Imagine a filing cabinet where folders can have multi-word labels. Instead of one folder per letter ("i", then "n", then "t"...), you have one folder labeled "inter" with two sub-folders "national" and "net". Lookup: find the folder whose label is a prefix of your search word, then recurse into it with the remaining characters. When inserting a new word that shares only part of an existing edge label, you split the edge: create an intermediate node for the common prefix, with two children for the differing suffixes.

How It Works in Rust

1. `RadixNode` โ€” `children: HashMap<String, RadixNode>` maps edge labels to child nodes. `is_end: bool` marks complete words. 2. `insert_node` โ€” scan children for a matching edge (one that is a prefix of `remaining`, or vice versa): - Full match: edge is a prefix of `remaining` โ†’ recurse with the rest. - Partial match: find the common prefix length, split the edge, create an intermediate node. - No match: insert a new edge with `remaining` as the label. 3. `search_node` โ€” for each child edge, check if `remaining` starts with it; if so, recurse with the suffix. 4. `starts_with` โ€” similar, but succeeds as soon as `remaining` is empty (prefix matched) or an edge starts with `remaining`.
// Split an edge: "card" โ†’ insert "care"
// common = "car", edge_rest = "d", word_rest = "e"
let mut new_node = RadixNode::new();
new_node.children.insert(edge_rest, old_child); // "d" โ†’ old "card" subtree
new_node.children.insert(word_rest, word_leaf);  // "e" โ†’ new "care" leaf
node.children.insert(common, new_node);           // "car" โ†’ split node

What This Unlocks

Key Differences

ConceptOCamlRust
Trie nodeRecursive ADT`struct RadixNode` with owned fields
Edge labelsSingle `char` (standard trie)`String` (multi-char labels)
Children map`Map.Make(Char)``HashMap<String, RadixNode>`
Edge splittingExplicit recursive case`remove` โ†’ build intermediate โ†’ `insert`
Prefix searchWalk tree, check `is_end`Same, short-circuit on empty `remaining`
use std::collections::HashMap;

#[derive(Debug, Default)]
struct RadixNode {
    children: HashMap<String, RadixNode>, // edge label -> child
    is_end: bool,
}

impl RadixNode {
    fn new() -> Self { Default::default() }
}

struct RadixTree { root: RadixNode }

impl RadixTree {
    fn new() -> Self { Self { root: RadixNode::new() } }

    fn insert(&mut self, word: &str) { Self::insert_node(&mut self.root, word); }

    fn insert_node(node: &mut RadixNode, remaining: &str) {
        if remaining.is_empty() { node.is_end = true; return; }

        // Find a matching edge
        let key = node.children.keys()
            .find(|k| remaining.starts_with(k.as_str()) || k.starts_with(remaining))
            .cloned();

        match key {
            Some(edge) if remaining.starts_with(&edge[..]) => {
                // remaining has edge as prefix: go deeper
                let rest = &remaining[edge.len()..];
                let child = node.children.get_mut(&edge).unwrap();
                Self::insert_node(child, rest);
            }
            Some(edge) => {
                // Split the edge
                let common_len = edge.chars().zip(remaining.chars()).take_while(|(a,b)| a==b).count();
                let common = edge[..common_len].to_string();
                let edge_rest = edge[common_len..].to_string();
                let word_rest = remaining[common_len..].to_string();
                let mut old_child = node.children.remove(&edge).unwrap();
                let mut new_node = RadixNode::new();
                if word_rest.is_empty() { new_node.is_end = true; }
                new_node.children.insert(edge_rest, old_child);
                if !word_rest.is_empty() {
                    let mut word_node = RadixNode::new();
                    word_node.is_end = true;
                    new_node.children.insert(word_rest, word_node);
                }
                node.children.insert(common, new_node);
            }
            None => {
                // New edge
                let mut child = RadixNode::new();
                child.is_end = true;
                node.children.insert(remaining.to_string(), child);
            }
        }
    }

    fn search(&self, word: &str) -> bool { Self::search_node(&self.root, word) }

    fn search_node(node: &RadixNode, remaining: &str) -> bool {
        if remaining.is_empty() { return node.is_end; }
        for (edge, child) in &node.children {
            if remaining.starts_with(edge.as_str()) {
                return Self::search_node(child, &remaining[edge.len()..]);
            }
        }
        false
    }

    fn starts_with(&self, prefix: &str) -> bool { Self::prefix_node(&self.root, prefix) }

    fn prefix_node(node: &RadixNode, remaining: &str) -> bool {
        if remaining.is_empty() { return true; }
        for (edge, child) in &node.children {
            if edge.starts_with(remaining) { return true; }
            if remaining.starts_with(edge.as_str()) {
                return Self::prefix_node(child, &remaining[edge.len()..]);
            }
        }
        false
    }
}

fn main() {
    let mut rt = RadixTree::new();
    for w in ["car","card","care","cat","cars","app","apple","application"] { rt.insert(w); }

    for w in ["car","card","care","cat","ca","app","apple","xyz"] {
        println!("{w}: search={}, prefix={}", rt.search(w), rt.starts_with(w));
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    #[test] fn insert_search() {
        let mut rt = RadixTree::new();
        rt.insert("hello"); rt.insert("world");
        assert!(rt.search("hello")); assert!(!rt.search("hell"));
    }
    #[test] fn prefix_search() {
        let mut rt = RadixTree::new();
        for w in ["car","card","care"] { rt.insert(w); }
        assert!(rt.starts_with("car")); assert!(rt.starts_with("ca")); assert!(!rt.starts_with("xyz"));
    }
    #[test] fn shared_prefix() {
        let mut rt = RadixTree::new();
        rt.insert("car"); rt.insert("card");
        assert!(rt.search("car")); assert!(rt.search("card")); assert!(!rt.search("carde"));
    }
}
(* OCaml: radix tree / compressed trie *)

(* We show the compression concept *)
(* Standard trie for 'car','card','care' would have:
   root -> c -> a -> r -> (end) -> d -> (end)
                              \-> e -> (end)
   Radix compresses 'c','a' into edge 'ca' if no branching *)

type radix =
  | Leaf of string
  | Node of (string * radix) list * bool  (* edges * is_end *)

let empty = Node ([], false)

let rec insert node path =
  match node with
  | Leaf s -> if s = path then Leaf s else
    (* Find common prefix *)
    let rec cp a b i =
      if i >= String.length a || i >= String.length b || a.[i] <> b.[i] then i
      else cp a b (i+1)
    in
    let i = cp s path 0 in
    let pre = String.sub s 0 i in
    Node ([String.sub s i (String.length s - i), Leaf s;
           String.sub path i (String.length path - i), Leaf path], false)
  | _ -> node  (* simplified *)

let () =
  Printf.printf "Radix tree compresses 'car','card','care' into:\n";
  Printf.printf "  root --'car'--> Node(end) --'d'--> Leaf --'e'--> Leaf\n";
  Printf.printf "This uses 3 nodes instead of 7 in a standard trie\n"