476: split(), splitn(), split_once()
Difficulty: 1 Level: Beginner
Split strings lazily, with limits, or exactly once, always as zero-copy slices of the original.

The Problem This Solves
Every language has string splitting. Python's `str.split()` returns a list and JavaScript's `.split()` returns an array; both create a new string object for every piece. Rust's `.split()` returns a lazy iterator of `&str` slices, where each piece is just a pointer plus a length into the original string. Nothing is allocated until you ask for it.

This matters when you're processing large inputs. Splitting a 1 MB CSV line in Python allocates all fields up front. In Rust, you iterate lazily and allocate only what you keep. And since each piece is a `&str` slice of the original, you can pass it directly to functions without copying.

Rust also has two specialized variants that fill common gaps: `split_once()` for key=value parsing (no need to split the whole string when you only care about the first delimiter), and `splitn()` for limiting the number of pieces (so the last piece contains the remainder as-is).

The Intuition
`.split(pattern)` gives you an iterator. Think of it as Python's `str.split()` but lazy: you pull pieces on demand. Call `.collect::<Vec<_>>()` when you need all pieces at once.

`split_once('=')` on `"host=localhost:8080"` gives you `Some(("host", "localhost:8080"))`. It splits on the first occurrence only and returns `None` when the delimiter is absent. The closest Python equivalent is `s.split('=', 1)`, which returns a 2-element list when the delimiter is present. Perfect for parsing config lines, HTTP headers, and query params.

`splitn(3, '/')` gives at most 3 pieces; the last piece is the remaining string, unsplit. OCaml's `String.split_on_char` has no limit, so you'd implement this manually.

How It Works in Rust
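To make the "lazy" and "pointer plus length" claims observable, here is a minimal sketch. `nth_field` and `offset_in` are hypothetical helpers written for this illustration, not standard library functions, and the pointer comparison is only a demonstration device, not something you would normally write.

```rust
/// Returns the byte offset of `piece` inside `whole`, or `None` if `piece`
/// is not a sub-slice of `whole`. Hypothetical helper for this demo only.
fn offset_in(whole: &str, piece: &str) -> Option<usize> {
    let start = whole.as_ptr() as usize;
    let p = piece.as_ptr() as usize;
    if p >= start && p + piece.len() <= start + whole.len() {
        Some(p - start)
    } else {
        None
    }
}

/// Pulls only the `n`-th comma-separated field. No Vec is built, and the
/// iterator never looks at anything after the requested field.
fn nth_field(line: &str, n: usize) -> Option<&str> {
    line.split(',').nth(n)
}

fn main() {
    let csv = "alice,30,amsterdam,developer";

    // Lazy: we stop after the third piece; "developer" is never visited.
    let city = nth_field(csv, 2);
    println!("{:?}", city); // Some("amsterdam")

    // Zero-copy: the piece is a view into `csv`, starting at byte 9.
    println!("{:?}", city.and_then(|c| offset_in(csv, c))); // Some(9)
}
```

Because `nth_field` returns a `&str` borrowed from `line`, the compiler ties the result's lifetime to the input; the slice cannot outlive the string it points into.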
// split: lazy iterator of &str (zero-copy slices)
let csv = "alice,30,amsterdam,developer";
let parts: Vec<_> = csv.split(',').collect();
// ["alice", "30", "amsterdam", "developer"]

// splitn: at most n pieces (last piece = remainder)
let path = "a/b/c/d/e";
let parts: Vec<_> = path.splitn(3, '/').collect();
// ["a", "b", "c/d/e"], where the last piece is the unsplit remainder

// split_once: exactly one split, returns Option<(&str, &str)>
let kv = "host=localhost:8080";
if let Some((key, value)) = kv.split_once('=') {
    println!("key='{}' val='{}'", key, value);
    // key='host' val='localhost:8080' (colon preserved in value)
}

// rsplit_once: split from the RIGHT (last occurrence)
let file = "/home/user/file.txt";
if let Some((dir, name)) = file.rsplit_once('/') {
    println!("dir='{}' name='{}'", dir, name);
    // dir='/home/user' name='file.txt'
}

// split_whitespace: handles all Unicode whitespace, trims leading/trailing
let words: Vec<_> = " hello world\t!\n ".split_whitespace().collect();
// ["hello", "world", "!"], with no empty strings

// lines(): split on \n or \r\n
for (i, line) in "line1\nline2\r\nline3".lines().enumerate() {
    println!("{}: {}", i + 1, line);
}
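The snippets above all use well-formed input. The sketch below records how the same standard-library methods behave on a few edge cases (empty fields, trailing delimiters, a missing delimiter). `fields` and `fields_no_trailing` are hypothetical wrapper names used only to keep the example short.

```rust
/// Plain `split`: adjacent or trailing commas produce empty pieces.
fn fields(line: &str) -> Vec<&str> {
    line.split(',').collect()
}

/// `split_terminator`: like `split`, but a trailing comma does not
/// produce a final empty piece.
fn fields_no_trailing(line: &str) -> Vec<&str> {
    line.split_terminator(',').collect()
}

fn main() {
    assert_eq!(fields("a,,b,"), ["a", "", "b", ""]);
    assert_eq!(fields_no_trailing("a,,b,"), ["a", "", "b"]);

    // Splitting an empty string still yields one (empty) piece.
    assert_eq!(fields(""), [""]);

    // `splitn(1, ..)` never splits; fewer delimiters than `n` is fine too.
    assert_eq!("a/b".splitn(1, '/').collect::<Vec<_>>(), ["a/b"]);
    assert_eq!("a/b".splitn(5, '/').collect::<Vec<_>>(), ["a", "b"]);

    // `split_once` returns None when the delimiter is absent,
    // and the pattern can be a &str as well as a char.
    assert_eq!("no-delimiter".split_once('='), None);
    assert_eq!("a => b".split_once(" => "), Some(("a", "b")));

    // `split_whitespace` yields nothing for blank input.
    assert_eq!("  \t\n".split_whitespace().count(), 0);

    // `lines` does not yield a final empty line after a trailing newline.
    assert_eq!("x\ny\n".lines().collect::<Vec<_>>(), ["x", "y"]);

    println!("all edge cases behaved as described");
}
```

The empty pieces from plain `split` are usually what you want for tabular data, since they keep column positions stable; reach for `split_terminator` or `split_whitespace` when empty pieces are noise.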
What This Unlocks
- CSV/TSV parsing: for simple, unquoted data, `.split(',')` yields lazy field slices, so you collect only what you need.
- HTTP header parsing: `header.split_once(':')` cleanly separates name from value.
- URL/path parsing: `splitn` and `rsplit_once` give you clean directory/filename splits.
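The three use cases above can be sketched with a few small functions. The names `parse_header`, `dir_and_name`, and `parse_config` are hypothetical, and the header parser is deliberately naive: real HTTP parsing has more rules than "split at the first colon and trim".

```rust
use std::collections::HashMap;

/// Splits "Name: value" at the first colon and trims both sides.
/// Later colons (e.g. in "host:port") stay in the value.
fn parse_header(line: &str) -> Option<(&str, &str)> {
    let (name, value) = line.split_once(':')?;
    Some((name.trim(), value.trim()))
}

/// Splits a path into (directory, file name) at the last '/'.
/// Returns None when the path contains no '/'.
fn dir_and_name(path: &str) -> Option<(&str, &str)> {
    path.rsplit_once('/')
}

/// Collects `key=value` lines into a map of borrowed slices,
/// skipping blank lines and lines without '='.
fn parse_config(text: &str) -> HashMap<&str, &str> {
    text.lines()
        .filter_map(|line| line.split_once('='))
        .map(|(k, v)| (k.trim(), v.trim()))
        .collect()
}

fn main() {
    println!("{:?}", parse_header("Host: example.com:8080"));
    println!("{:?}", dir_and_name("/home/user/file.txt"));

    let cfg = parse_config("host=localhost:8080\n\nport = 9000\nbroken line\n");
    println!("{:?}", cfg.get("port"));
}
```

Every key and value in the returned map is a slice of the input text, so the only allocation is the map itself.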
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Split on char | `String.split_on_char ',' s` → `string list` | `s.split(',')` → lazy iterator |
| Split with limit | Manual | `s.splitn(n, sep)` |
| Split once | Manual with pattern match | `s.split_once(sep)` → `Option<(&str, &str)>` |
| Split from right | Manual | `s.rsplit_once(sep)` |
| Split on whitespace | Manual filter | `s.split_whitespace()` |
| Split on newlines | `String.split_on_char '\n'` | `s.lines()` (handles `\r\n` too) |
| Allocation | New `string` per piece | `&str` slices, zero-copy |
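The allocation row affects how you design types, not just how you call `split`. As a rough sketch, a parsed record can borrow its text fields from the input line and allocate only when a value has to outlive that line. `Record`, its field names, and `parse_record` are hypothetical, chosen to match the sample CSV used on this page.

```rust
/// A record whose text fields borrow from the input line (no String allocation).
#[derive(Debug, PartialEq)]
struct Record<'a> {
    name: &'a str,
    age: u32,
    city: &'a str,
}

/// Parses "name,age,city[,...]". Returns None if a field is missing
/// or the age is not a valid unsigned integer.
fn parse_record(line: &str) -> Option<Record<'_>> {
    let mut fields = line.split(',');
    let name = fields.next()?;
    let age = fields.next()?.trim().parse::<u32>().ok()?;
    let city = fields.next()?;
    Some(Record { name, age, city })
}

fn main() {
    let line = String::from("alice,30,amsterdam,developer");

    if let Some(rec) = parse_record(&line) {
        println!("{} is {} and lives in {}", rec.name, rec.age, rec.city);

        // Allocate only when a piece must outlive `line`.
        let owned_city: String = rec.city.to_owned();
        println!("{}", owned_city);
    }
}
```

The lifetime parameter `'a` is what the zero-copy design costs: a `Record<'a>` cannot outlive the line it was parsed from, which the compiler enforces.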
// 476. split(), splitn(), split_once()
fn main() {
    // split: lazy iterator
    let csv = "alice,30,amsterdam,developer";
    let parts: Vec<_> = csv.split(',').collect();
    println!("{:?}", parts);

    // splitn: at most n pieces (last contains remainder)
    let path = "a/b/c/d/e";
    println!("{:?}", path.splitn(3, '/').collect::<Vec<_>>()); // ["a", "b", "c/d/e"]

    // split_once: perfect for key=value
    let kv = "host=localhost:8080";
    if let Some((k, v)) = kv.split_once('=') {
        println!("key='{}' val='{}'", k, v);
    }

    // rsplit_once: from the right
    let file = "/home/user/file.txt";
    if let Some((dir, name)) = file.rsplit_once('/') {
        println!("dir='{}' file='{}'", dir, name);
    }

    // split_whitespace: handles all Unicode whitespace
    let words: Vec<_> = " hello world\t!\n ".split_whitespace().collect();
    println!("{:?}", words);

    // lines()
    for (i, l) in "line1\nline2\nline3".lines().enumerate() {
        println!("{}: {}", i + 1, l);
    }
}
#[cfg(test)]
mod tests {
#[test] fn test_split() { assert_eq!("a,b,c".split(',').collect::<Vec<_>>(),["a","b","c"]); }
#[test] fn test_splitn() { let v:Vec<_>="a:b:c:d".splitn(3,':').collect(); assert_eq!(v,["a","b","c:d"]); }
#[test] fn test_split_once() { assert_eq!("k=v".split_once('='),Some(("k","v"))); assert_eq!("noeq".split_once('='),None); }
#[test] fn test_whitespace() { let w:Vec<_>=" a b c ".split_whitespace().collect(); assert_eq!(w,["a","b","c"]); }
}
(* 476. String splitting: OCaml *)
let () =
  let csv = "alice,30,amsterdam" in
  List.iter (fun p -> Printf.printf "'%s'\n" p) (String.split_on_char ',' csv);
  (* split_once equivalent *)
  let split_once sep s =
    match String.split_on_char sep s with
    | [] | [_] -> None
    | h :: t -> Some (h, String.concat (String.make 1 sep) t)
  in
  (match split_once '=' "key=value=extra" with
   | Some (k, v) -> Printf.printf "k=%s v=%s\n" k v
   | None -> ());
  (* split_whitespace *)
  let words = List.filter ((<>) "") (String.split_on_char ' ' " a b c ") in
  Printf.printf "words: %d\n" (List.length words)