1194 Fundamental

String.sub and String.concat — Substring and Join

stdlib-string

Tutorial

The Problem

Extract substrings by position and length, and join a list of strings with a separator — the two most fundamental string assembly operations in OCaml's String module.

🎯 Learning Outcomes

  • How OCaml's String.sub maps to Rust's slice syntax &s[start..end]
  • Why Rust slices are zero-cost borrows while OCaml String.sub allocates a new string
  • How String.concat sep list becomes parts.join(sep) in Rust
  • The Option-based safe variant vs OCaml's exception-based error handling

🦀 The Rust Way

    Rust's slice syntax &s[start..start+len] borrows a view into the original string with no allocation. The indices are byte offsets and must fall on UTF-8 character boundaries. str::get performs the same operation but returns an Option, so an invalid range yields None instead of a panic. Joining is parts.join(sep), a single method call on a slice, which allocates exactly once.

    Code Example

    let s = "Hello, World!";
    let hello = &s[0..5];    // "Hello" — zero-cost borrow, no allocation
    let world = &s[7..12];   // "World" — zero-cost borrow, no allocation
    
    let parts = ["one", "two", "three"];
    let joined = parts.join(" | ");  // "one | two | three"
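
The ranges above are byte offsets, which coincide with character positions only for ASCII text. The sketch below shows how str::get reacts when a range splits a multi-byte character, and one possible way to index by character instead. The helper substring_chars is a hypothetical name introduced for illustration; it is not part of the original example.

```rust
/// Hypothetical helper: returns `len` characters starting at character
/// index `start`, or `None` if the string is too short.
pub fn substring_chars(s: &str, start: usize, len: usize) -> Option<&str> {
    // Byte offset of every character boundary, including the end of the string.
    let mut boundaries = s
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(s.len()));
    // Advance to the boundary where the substring begins.
    let begin = boundaries.nth(start)?;
    // Advance `len` more boundaries to find where it ends.
    let end = if len == 0 { begin } else { boundaries.nth(len - 1)? };
    s.get(begin..end)
}
```

With this helper, substring_chars("héllo", 1, 2) yields Some("él"), whereas "héllo".get(0..2) yields None because byte 2 lies inside the two-byte encoding of é. Walking char_indices costs time proportional to the prefix scanned, so byte-range slicing remains the cheaper choice when the offsets are already known to be valid.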

    Key Differences

  • Allocation: OCaml String.sub always allocates; Rust &s[..] is a zero-cost borrow.
  • Error model: OCaml raises Invalid_argument on bad bounds; Rust panics with &s[..] or returns None via .get().
  • Join: OCaml String.concat sep list takes a list; Rust parts.join(sep) works on any slice.
  • Mutability: OCaml strings are immutable; Rust has both &str (immutable borrow) and String (owned, mutable).

OCaml Approach

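    OCaml's String.sub s start len extracts len bytes (one per character in ASCII text) starting at offset start, always returning a freshly allocated string. String.concat sep list joins a list into one string, inserting the separator between adjacent elements. String.sub raises Invalid_argument when start and len do not describe a valid range of s; String.concat raises it only if the result would exceed Sys.max_string_length.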
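
When the copying behavior of String.sub is actually wanted, for instance because the substring must outlive the source string, a Rust version can allocate explicitly. The following is a minimal sketch; substring_owned is a hypothetical name, not something from the original source.

```rust
/// Hypothetical helper: copies the selected bytes into a fresh `String`,
/// roughly what OCaml's allocating `String.sub` does.
pub fn substring_owned(s: &str, start: usize, len: usize) -> Option<String> {
    // checked_add returns None if start + len would overflow usize.
    let end = start.checked_add(len)?;
    // get returns None for an out-of-range or non-boundary range;
    // to_owned performs the single allocation and copy.
    s.get(start..end).map(str::to_owned)
}
```

The returned String owns its bytes, so it carries no lifetime tie to s. The trade-off is one allocation and one copy per call, which the borrowed &str version avoids.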

    Full Source

    #![allow(dead_code)]
    
    // Solution 1: Idiomatic Rust, using slice syntax for substring extraction
    // &s[start..end] borrows a substring with no allocation; it panics if the
    // byte range is out of bounds or splits a UTF-8 character
    pub fn substring(s: &str, start: usize, len: usize) -> &str {
        &s[start..start + len]
    }
    
    // Solution 2: Safe variant that returns None instead of panicking on bad bounds
    // OCaml raises Invalid_argument; Rust idiom is Option
    pub fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> {
        // checked_add keeps start + len from overflowing (a panic in debug builds)
        let end = start.checked_add(len)?;
        s.get(start..end)
    }
    
    // Solution 3: Idiomatic join, using the slice method join as the direct equivalent of String.concat
    pub fn join(parts: &[&str], sep: &str) -> String {
        parts.join(sep)
    }
    
    // Solution 4: Functional fold-based join — mirrors OCaml's List.fold_left pattern
    pub fn join_fold(parts: &[&str], sep: &str) -> String {
        parts
            .iter()
            .enumerate()
            .fold(String::new(), |mut acc, (i, part)| {
                if i > 0 {
                    acc.push_str(sep);
                }
                acc.push_str(part);
                acc
            })
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn test_substring_from_start() {
            assert_eq!(substring("Hello, World!", 0, 5), "Hello");
        }
    
        #[test]
        fn test_substring_mid() {
            assert_eq!(substring("Hello, World!", 7, 5), "World");
        }
    
        #[test]
        fn test_substring_single_char() {
            assert_eq!(substring("Hello", 1, 1), "e");
        }
    
        #[test]
        fn test_substring_full() {
            let s = "Rust";
            assert_eq!(substring(s, 0, s.len()), "Rust");
        }
    
        #[test]
        fn test_substring_safe_valid() {
            assert_eq!(substring_safe("Hello, World!", 0, 5), Some("Hello"));
        }
    
        #[test]
        fn test_substring_safe_out_of_bounds() {
            assert_eq!(substring_safe("Hello", 3, 10), None);
        }
    
        #[test]
        fn test_substring_safe_empty() {
            assert_eq!(substring_safe("Hello", 0, 0), Some(""));
        }
    
        #[test]
        fn test_join_typical() {
            assert_eq!(join(&["one", "two", "three"], " | "), "one | two | three");
        }
    
        #[test]
        fn test_join_empty_list() {
            assert_eq!(join(&[], ", "), "");
        }
    
        #[test]
        fn test_join_single() {
            assert_eq!(join(&["only"], " | "), "only");
        }
    
        #[test]
        fn test_join_empty_sep() {
            assert_eq!(join(&["hello", "world"], ""), "helloworld");
        }
    
        #[test]
        fn test_join_fold_matches_join() {
            let parts = &["a", "b", "c"];
            let sep = "-";
            assert_eq!(join(parts, sep), join_fold(parts, sep));
        }
    
        #[test]
        fn test_join_fold_empty() {
            assert_eq!(join_fold(&[], ", "), "");
        }
    }

    Deep Comparison

    OCaml vs Rust: String.sub and String.concat

    Side-by-Side Code

    OCaml

    let s = "Hello, World!"
    let hello = String.sub s 0 5       (* "Hello" — allocates new string *)
    let world = String.sub s 7 5       (* "World" — allocates new string *)
    
    let parts = ["one"; "two"; "three"]
    let joined = String.concat " | " parts  (* "one | two | three" *)
    

    Rust (idiomatic)

    let s = "Hello, World!";
    let hello = &s[0..5];    // "Hello" — zero-cost borrow, no allocation
    let world = &s[7..12];   // "World" — zero-cost borrow, no allocation
    
    let parts = ["one", "two", "three"];
    let joined = parts.join(" | ");  // "one | two | three"
    

    Rust (safe / Option-based)

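    pub fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> {
        // checked_add keeps start + len from overflowing (a panic in debug builds)
        let end = start.checked_add(len)?;
        s.get(start..end)
    }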
    
    // Instead of catching Invalid_argument, callers pattern-match on None:
    match substring_safe("Hello", 3, 10) {
        Some(sub) => println!("{}", sub),
        None      => println!("out of bounds"),
    }
    

    Rust (fold-based join, mirrors OCaml's List.fold_left)

    pub fn join_fold(parts: &[&str], sep: &str) -> String {
        parts
            .iter()
            .enumerate()
            .fold(String::new(), |mut acc, (i, part)| {
                if i > 0 { acc.push_str(sep); }
                acc.push_str(part);
                acc
            })
    }
    

    Type Signatures

    | Concept | OCaml | Rust |
    |---|---|---|
    | Substring (panicking) | val sub : string -> int -> int -> string | fn substring(s: &str, start: usize, len: usize) -> &str |
    | Substring (safe) | no Option variant (raises Invalid_argument) | fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> |
    | Join | val concat : string -> string list -> string | fn join(parts: &[&str], sep: &str) -> String |
    | Slice view | string (always owned) | &str (borrowed, zero-cost) |
    | Owned string | string (always) | String (heap-allocated, mutable) |

    Key Insights

  • Zero-cost slicing: OCaml String.sub always allocates a new string. Rust &s[start..end] produces a &str that points into the original buffer — no allocation, no copy. This is one of Rust's biggest advantages for string processing.
  • Error discipline: OCaml raises Invalid_argument on out-of-bounds access; you either catch it with try ... with or let it crash. Rust gives you a choice: panic immediately (&s[..]) or handle gracefully (s.get(..) returning Option<&str>). With the safe variant, the compiler will not let callers use the result as a &str until they have dealt with the None case.
  • join vs concat: OCaml's String.concat sep list takes the separator first, then the list. Rust's .join(sep) is a method on &[&str]; it reads left-to-right and is often more ergonomic in a pipeline. Both allocate exactly once.
  • &str vs String: OCaml has one string type (immutable, managed by the garbage collector). Rust has two: &str (borrowed, read-only view) and String (owned, growable). Substring operations return &str; join returns String because it must allocate new memory to combine parts.
  • Fold-based join mirrors OCaml style: OCaml programmers often reach for List.fold_left to build strings. The Rust equivalent with .fold(String::new(), ...) works but is less efficient than .join() (which can pre-compute the total length). Use .join() in production; use fold when teaching functional patterns.
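
To make the "pre-compute the total length" point concrete, here is a sketch of a join that sizes its buffer before copying anything. It illustrates the idea only and makes no claim about how the standard library implements .join(); join_prealloc is a hypothetical name.

```rust
/// Hypothetical sketch: a join that reserves the full output size up front,
/// so the pushes below never need to grow the buffer.
pub fn join_prealloc(parts: &[&str], sep: &str) -> String {
    // Total bytes = all parts plus one separator between each adjacent pair.
    let parts_len: usize = parts.iter().map(|p| p.len()).sum();
    let sep_len = sep.len() * parts.len().saturating_sub(1);
    let mut out = String::with_capacity(parts_len + sep_len);
    for (i, part) in parts.iter().enumerate() {
        if i > 0 {
            out.push_str(sep);
        }
        out.push_str(part);
    }
    out
}
```

A fold that starts from String::new() has no such size hint, so the buffer may be reallocated several times as it grows. That is the efficiency gap described in the last bullet above.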

    When to Use Each Style

    Use idiomatic Rust (&s[..] and .join()) when: you want maximum performance with clear, readable code. Slicing allocates nothing, and .join() allocates only the result string.

    Use s.get(start..end) when: the range may be invalid and you want to handle it gracefully without panicking, for example when parsing untrusted input.
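
As an illustration of the untrusted-input case, the sketch below parses a made-up fixed-width record in which bytes 0..4 hold a code and bytes 4..12 hold a space-padded name. The record layout and the name parse_record are assumptions invented for this example.

```rust
/// Hypothetical fixed-width record: bytes 0..4 are a code, bytes 4..12 a
/// space-padded name. Returns an error message instead of panicking.
pub fn parse_record(line: &str) -> Result<(&str, &str), String> {
    // get yields None when the line is too short or the range would split
    // a multi-byte character; ok_or_else turns that into an Err.
    let code = line
        .get(0..4)
        .ok_or_else(|| format!("missing or malformed code field: {:?}", line))?;
    let name = line
        .get(4..12)
        .ok_or_else(|| format!("missing or malformed name field: {:?}", line))?;
    Ok((code, name.trim_end()))
}
```

Calling parse_record("A001Widget  ") returns Ok(("A001", "Widget")), while a truncated line such as "A0" returns an Err that the caller can log or report. The same lines written with &line[0..4] would abort the program on the first short input.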

    Use fold-based join when: you're building a string incrementally with non-uniform separators or conditionally including elements, where .join() does not fit.
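
One case where .join() does not fit is a human-readable list whose last separator differs from the others. The following sketch adapts the fold pattern to that case; join_human and its fixed ", " and " and " separators are hypothetical choices made for illustration.

```rust
/// Hypothetical helper: joins with ", " between items, except that the
/// final pair is joined with " and ".
pub fn join_human(parts: &[&str]) -> String {
    // Index of the last element (0 for an empty or single-element slice).
    let last = parts.len().saturating_sub(1);
    parts
        .iter()
        .enumerate()
        .fold(String::new(), |mut acc, (i, part)| {
            if i > 0 {
                // Pick the separator based on the position of the item.
                acc.push_str(if i == last { " and " } else { ", " });
            }
            acc.push_str(part);
            acc
        })
}
```

For example, join_human(&["a", "b", "c"]) produces "a, b and c" and join_human(&["a", "b"]) produces "a and b". The separator depends on the index, which a single-separator .join() cannot express.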
