String.sub and String.concat — Substring and Join
Tutorial
The Problem
Extract substrings by position and length, and join a list of strings with a separator — the two most fundamental string assembly operations in OCaml's String module.
🎯 Learning Outcomes
- String.sub maps to Rust's slice syntax &s[start..end]
- String.sub allocates a new string, while a Rust slice borrows from the original
- String.concat sep list becomes parts.join(sep) in Rust
- Option-based safe variant vs OCaml's exception-based error handling
🦀 The Rust Way
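Rust's slice syntax &s[start..start+len] borrows a view into the original string, with no allocation. The indices are byte offsets and must fall on UTF-8 character boundaries; otherwise the slice expression panics. str::get performs the same operation but returns an Option, yielding None instead of panicking. Joining is parts.join(sep), a single method call on a slice, which allocates the result string once.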
Code Example
let s = "Hello, World!";
let hello = &s[0..5]; // "Hello" — zero-cost borrow, no allocation
let world = &s[7..12]; // "World" — zero-cost borrow, no allocation
let parts = ["one", "two", "three"];
let joined = parts.join(" | "); // "one | two | three"
Key Differences
- OCaml's String.sub always allocates; Rust &s[..] is a zero-cost borrow.
- OCaml raises Invalid_argument on bad bounds; Rust panics with &s[..] or returns None via .get().
- String.concat sep list takes a list; Rust parts.join(sep) works on any slice.
- OCaml has a single string type; Rust distinguishes &str (immutable borrow) and String (owned, growable).
OCaml Approach
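OCaml's String.sub s start len extracts len bytes starting at byte offset start, always returning a freshly allocated string. String.concat sep list joins a list into one string with the separator between each adjacent pair. String.sub raises Invalid_argument when the requested range falls outside the string; String.concat raises it only if the result would exceed the maximum string length.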
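To make the contrast concrete, here is a minimal sketch of what OCaml-style semantics would look like if written in Rust. The names sub_owned and concat are hypothetical helpers, not part of the original source, and the error is modelled as a Result rather than an exception. One difference remains: Rust also rejects ranges that split a multi-byte UTF-8 character, whereas OCaml's String.sub works on raw bytes.

```rust
/// Hypothetical helper (an assumption for illustration): mimics OCaml's
/// `String.sub s start len` by returning a freshly allocated `String`.
/// Where OCaml raises `Invalid_argument`, this sketch returns `Err`.
pub fn sub_owned(s: &str, start: usize, len: usize) -> Result<String, String> {
    // checked_add guards against overflow when computing the end offset.
    let end = start
        .checked_add(len)
        .ok_or_else(|| "sub_owned: range overflows usize".to_string())?;
    s.get(start..end)
        .map(str::to_owned) // copy the bytes, as String.sub does
        .ok_or_else(|| format!("sub_owned: invalid range {}..{}", start, end))
}

/// Hypothetical helper: mimics `String.concat sep list`, taking the
/// separator first as OCaml does.
pub fn concat(sep: &str, parts: &[&str]) -> String {
    parts.join(sep)
}
```

Calling sub_owned("Hello, World!", 7, 5) yields an owned "World", at the cost of one allocation that the borrowing version avoids.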
Full Source
#![allow(dead_code)]

// Solution 1: Idiomatic Rust, using slice syntax for substring extraction.
// &s[start..end] borrows a substring with no allocation. Offsets are bytes;
// it panics on an out-of-range or non-character-boundary range.
pub fn substring(s: &str, start: usize, len: usize) -> &str {
    &s[start..start + len]
}

// Solution 2: Safe variant, returning None instead of panicking on bad bounds.
// OCaml raises Invalid_argument; the Rust idiom is Option.
pub fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> {
    // checked_add keeps a huge `len` from overflowing `start + len`.
    s.get(start..start.checked_add(len)?)
}

// Solution 3: Idiomatic join. The slice method join is the direct
// equivalent of String.concat.
pub fn join(parts: &[&str], sep: &str) -> String {
    parts.join(sep)
}

// Solution 4: Functional fold-based join, mirroring OCaml's List.fold_left pattern.
pub fn join_fold(parts: &[&str], sep: &str) -> String {
    parts
        .iter()
        .enumerate()
        .fold(String::new(), |mut acc, (i, part)| {
            // Emit the separator before every element except the first.
            if i > 0 {
                acc.push_str(sep);
            }
            acc.push_str(part);
            acc
        })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_substring_from_start() {
        assert_eq!(substring("Hello, World!", 0, 5), "Hello");
    }

    #[test]
    fn test_substring_mid() {
        assert_eq!(substring("Hello, World!", 7, 5), "World");
    }

    #[test]
    fn test_substring_single_char() {
        assert_eq!(substring("Hello", 1, 1), "e");
    }

    #[test]
    fn test_substring_full() {
        let s = "Rust";
        assert_eq!(substring(s, 0, s.len()), "Rust");
    }

    #[test]
    fn test_substring_safe_valid() {
        assert_eq!(substring_safe("Hello, World!", 0, 5), Some("Hello"));
    }

    #[test]
    fn test_substring_safe_out_of_bounds() {
        assert_eq!(substring_safe("Hello", 3, 10), None);
    }

    #[test]
    fn test_substring_safe_empty() {
        assert_eq!(substring_safe("Hello", 0, 0), Some(""));
    }

    #[test]
    fn test_join_typical() {
        assert_eq!(join(&["one", "two", "three"], " | "), "one | two | three");
    }

    #[test]
    fn test_join_empty_list() {
        assert_eq!(join(&[], ", "), "");
    }

    #[test]
    fn test_join_single() {
        assert_eq!(join(&["only"], " | "), "only");
    }

    #[test]
    fn test_join_empty_sep() {
        assert_eq!(join(&["hello", "world"], ""), "helloworld");
    }

    #[test]
    fn test_join_fold_matches_join() {
        let parts = &["a", "b", "c"];
        let sep = "-";
        assert_eq!(join(parts, sep), join_fold(parts, sep));
    }

    #[test]
    fn test_join_fold_empty() {
        assert_eq!(join_fold(&[], ", "), "");
    }
}
Deep Comparison
OCaml vs Rust: String.sub and String.concat
Side-by-Side Code
OCaml
let s = "Hello, World!"
let hello = String.sub s 0 5 (* "Hello" — allocates new string *)
let world = String.sub s 7 5 (* "World" — allocates new string *)
let parts = ["one"; "two"; "three"]
let joined = String.concat " | " parts (* "one | two | three" *)
Rust (idiomatic)
let s = "Hello, World!";
let hello = &s[0..5]; // "Hello" — zero-cost borrow, no allocation
let world = &s[7..12]; // "World" — zero-cost borrow, no allocation
let parts = ["one", "two", "three"];
let joined = parts.join(" | "); // "one | two | three"
Rust (safe / Option-based)
pub fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> {
    // checked_add keeps a huge `len` from overflowing `start + len`.
    s.get(start..start.checked_add(len)?)
}

fn main() {
    // Instead of catching Invalid_argument, callers pattern-match on None:
    match substring_safe("Hello", 3, 10) {
        Some(sub) => println!("{}", sub),
        None => println!("out of bounds"),
    }
}
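The byte-based s.get shown above also returns None when a range would cut through a multi-byte UTF-8 character, which matters as soon as the input is not pure ASCII. The sketch below illustrates one way to index by character count instead. The name substring_chars is a hypothetical helper, not something from the original source, and it walks the string from the start, so it is O(n) rather than O(1).

```rust
/// Hypothetical helper (an assumption for illustration): like `substring_safe`,
/// but `start` and `len` count Unicode scalar values (chars), not bytes.
/// Still returns a borrowed slice, so nothing is allocated.
pub fn substring_chars(s: &str, start: usize, len: usize) -> Option<&str> {
    // Byte offset of every character boundary, including the end of the string.
    let mut bounds = s
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(s.len()));

    // Byte offset where the `start`-th character begins.
    let from = bounds.nth(start)?;
    // Advance `len` more boundaries to find the end offset.
    let to = if len == 0 { from } else { bounds.nth(len - 1)? };
    s.get(from..to)
}
```

With "héllo", where é occupies two bytes, "héllo".get(0..2) is None because byte 2 falls inside é, while substring_chars("héllo", 0, 2) gives Some("hé").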
Rust (fold-based join, mirrors OCaml's List.fold_left)
pub fn join_fold(parts: &[&str], sep: &str) -> String {
    parts
        .iter()
        .enumerate()
        .fold(String::new(), |mut acc, (i, part)| {
            if i > 0 { acc.push_str(sep); }
            acc.push_str(part);
            acc
        })
}
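The fold above starts from an empty String and may reallocate as it grows. As a rough illustration of why a join that knows the total length up front can avoid that, here is a sketch that sizes the buffer before copying. The name join_presized is hypothetical, and this is not a description of how the standard library implements join internally.

```rust
/// Hypothetical helper (an assumption for illustration): a manual join that
/// computes the final length first and reserves the buffer once.
pub fn join_presized(parts: &[&str], sep: &str) -> String {
    if parts.is_empty() {
        return String::new();
    }
    // Total bytes = all parts plus one separator between each adjacent pair.
    let total = parts.iter().map(|p| p.len()).sum::<usize>()
        + sep.len() * (parts.len() - 1);

    let mut out = String::with_capacity(total);
    for (i, part) in parts.iter().enumerate() {
        if i > 0 {
            out.push_str(sep);
        }
        out.push_str(part);
    }
    out
}
```

In everyday code parts.join(sep) remains the simpler choice; the sketch only makes the single-allocation idea visible.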
Type Signatures
| Concept | OCaml | Rust |
|---|---|---|
| Substring (panics or raises on bad range) | val sub : string -> int -> int -> string | fn substring(s: &str, start: usize, len: usize) -> &str |
| Substring (safe) | (String.sub raises Invalid_argument) | fn substring_safe(s: &str, start: usize, len: usize) -> Option<&str> |
| Join | val concat : string -> string list -> string | fn join(parts: &[&str], sep: &str) -> String |
| Slice view | string (always owned) | &str (borrowed, zero-cost) |
| Owned string | string (always) | String (heap-allocated, mutable) |
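The last two rows of the table can be seen side by side in a short sketch. The function names prefix and prefix_owned are hypothetical and exist only to show the borrowed and owned return types; they are not part of the original source.

```rust
/// Borrowed result: the returned `&str` points into `s` and cannot outlive it.
pub fn prefix(s: &str, len: usize) -> Option<&str> {
    s.get(..len)
}

/// Owned result: `to_owned` copies the bytes into a new `String`,
/// so the value stays valid after `s` is gone.
pub fn prefix_owned(s: &str, len: usize) -> Option<String> {
    s.get(..len).map(str::to_owned)
}
```

Returning &str is free but ties the result to the input's lifetime; returning String costs one allocation and removes that tie, which is the situation every OCaml String.sub result is in.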
Key Insights
- **Allocation vs borrowing:** OCaml's String.sub always allocates a new string. Rust &s[start..end] produces a &str that points into the original buffer: no allocation, no copy. This is a significant advantage for string-heavy processing.
- **Error handling:** OCaml raises Invalid_argument on out-of-bounds access; you either catch it with try ... with or let it crash. Rust gives you a choice: panic immediately (&s[..]) or handle the failure gracefully (s.get(..) returning Option<&str>). Because the failure case is part of the return type, the compiler makes callers deal with None before they can use the slice.
- **join vs concat:** OCaml's String.concat sep list takes the separator first, then the list. Rust's .join(sep) is a method on &[&str]; it reads left-to-right and is often more ergonomic in a pipeline. Both allocate exactly once.
- **&str vs String:** OCaml has one string type (immutable by default and managed by the garbage collector). Rust has two: &str (borrowed, read-only view) and String (owned, growable). Substring operations return &str; join returns String because it must allocate new memory to combine the parts.
- **Fold-based join:** OCaml code often uses List.fold_left to build strings. The Rust equivalent with .fold(String::new(), ...) works but is less efficient than .join() (which can pre-compute the total length). Use .join() in production; use fold when teaching functional patterns.
When to Use Each Style
**Use idiomatic Rust (&s[..] and .join()) when:** you want maximum performance with clear, readable code. Slicing allocates nothing, and .join() allocates only the result string.
**Use s.get(start..end) when:** the range may be invalid and you want to handle it gracefully without panicking — e.g., parsing untrusted input.
**Use fold-based join when:** you're building a string incrementally with non-uniform separators or conditionally including elements, where .join() does not fit.
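The two fold-based cases just mentioned can be sketched as follows. Both function names are hypothetical and chosen only for illustration: join_non_empty conditionally skips elements, and join_with_and uses a different separator before the last element.

```rust
/// Hypothetical helper: joins only the non-empty parts.
pub fn join_non_empty(parts: &[&str], sep: &str) -> String {
    parts
        .iter()
        .filter(|part| !part.is_empty())
        .fold(String::new(), |mut acc, part| {
            // Every kept part is non-empty, so a non-empty accumulator
            // means at least one part has already been written.
            if !acc.is_empty() {
                acc.push_str(sep);
            }
            acc.push_str(part);
            acc
        })
}

/// Hypothetical helper: ", " between elements, " and " before the last one.
pub fn join_with_and(parts: &[&str]) -> String {
    let last = parts.len().saturating_sub(1);
    parts
        .iter()
        .enumerate()
        .fold(String::new(), |mut acc, (i, part)| {
            if i > 0 {
                acc.push_str(if i == last { " and " } else { ", " });
            }
            acc.push_str(part);
            acc
        })
}
```

For example, join_with_and(&["a", "b", "c"]) produces "a, b and c", which a single-separator .join() cannot express directly.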