766: Config File Parsing (INI/TOML-Like)
Difficulty: 3
Level: Intermediate
Parse a human-editable config file with sections, key=value pairs, and inline comments, returning a two-level `HashMap` with typed accessors.
The Problem This Solves
Almost every application needs configuration: database host, port, feature flags, timeout values. Hard-coding these means recompiling for every environment. Environment variables work but don't support structure. Full JSON or TOML parsers add dependencies and require exact syntax from non-technical operators. INI-style files hit a practical sweet spot: operators understand them without documentation, they support comments, they're human-editable in any text editor, and the parser is short enough to audit.
When you don't want to add `toml` or `config` to your dependency tree (embedded systems, CLI tools, proprietary software with restricted dependencies), you write this yourself.
The parsing challenge is modest but realistic: strip comments (text after `#`), detect section headers (`[name]`), split key=value pairs, and handle missing keys with sensible defaults. This is also a good example of Rust's string-handling patterns: `split_once`, `trim`, `starts_with`, `ends_with`.
The Intuition
Python's `configparser` is the direct equivalent: `config['server']['host']` returns the value, and `config.get('server', 'host', fallback='localhost')` returns a default. Rust's version is a plain `HashMap<String, HashMap<String, String>>`, which is explicit about the two-level structure.
The typed accessors (`get_str`, `get_int`, `get_bool`) avoid pushing the parsing burden onto callers. Instead of `cfg["database"]["port"].parse::<u16>().unwrap_or(5432)` at every call site, callers write `get_int(&cfg, "database", "port", 5432)`. The default is explicit, and a missing or unparseable value silently falls back to it; that fallback is the documented behavior rather than an error.
How It Works in Rust
use std::collections::HashMap;
pub type Config = HashMap<String, HashMap<String, String>>;
pub fn parse_config(text: &str) -> Config {
let mut cfg: Config = HashMap::new();
let mut current_section = "global".to_string();
cfg.entry("global".to_string()).or_default();
for raw_line in text.lines() {
// Strip inline comment: "host = db.example.com # production" -> "host = db.example.com"
let line = raw_line.split_once('#').map(|(l, _)| l).unwrap_or(raw_line);
let line = line.trim();
if line.is_empty() { continue; }
if line.starts_with('[') && line.ends_with(']') {
// [server] -> section "server"
current_section = line[1..line.len() - 1].trim().to_string();
cfg.entry(current_section.clone()).or_default();
} else if let Some((key, value)) = line.split_once('=') {
cfg.entry(current_section.clone())
.or_default()
.insert(key.trim().to_string(), value.trim().to_string());
}
// malformed lines are silently skipped
}
cfg
}
// Typed accessors with defaults
pub fn get_str<'a>(cfg: &'a Config, section: &str, key: &str, default: &'a str) -> &'a str {
cfg.get(section).and_then(|s| s.get(key)).map(|s| s.as_str()).unwrap_or(default)
}
pub fn get_int(cfg: &Config, section: &str, key: &str, default: i64) -> i64 {
get_str(cfg, section, key, "").parse().unwrap_or(default)
}
pub fn get_bool(cfg: &Config, section: &str, key: &str, default: bool) -> bool {
match get_str(cfg, section, key, "") {
"true" | "yes" | "1" | "on" => true,
"false"| "no" | "0" | "off" => false,
_ => default,
}
}
// Usage
let cfg = parse_config(include_str!("config.ini"));
let host = get_str(&cfg, "server", "host", "localhost");
let port = get_int(&cfg, "server", "port", 8080) as u16;
let debug = get_bool(&cfg, "server", "debug", false);
Input:
# Main config
[server]
host = localhost
port = 8080
debug = true
[database]
host = db.example.com # production DB
port = 5432
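The usage snippet narrows the port with `as u16`, which wraps out-of-range values silently (a configured `70000` would become `4464`). One possible alternative, sketched here rather than taken from the original code, is to parse straight into the target type through `FromStr`, so that an out-of-range or non-numeric value falls back to the default. The names `get_parsed` and `single` are hypothetical and exist only for this illustration.

```rust
use std::collections::HashMap;
use std::str::FromStr;

pub type Config = HashMap<String, HashMap<String, String>>;

/// Hypothetical generic accessor (not part of the original listing).
/// Parses `section.key` into `T`; returns `default` when the key is missing
/// or the stored text is not a valid `T`.
pub fn get_parsed<T: FromStr>(cfg: &Config, section: &str, key: &str, default: T) -> T {
    cfg.get(section)
        .and_then(|s| s.get(key))
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

/// Hypothetical helper: a config holding a single `section.key = value`,
/// built by hand so this sketch does not depend on the parser.
pub fn single(section: &str, key: &str, value: &str) -> Config {
    let mut cfg = Config::new();
    cfg.entry(section.to_string())
        .or_default()
        .insert(key.to_string(), value.to_string());
    cfg
}

#[test]
fn out_of_range_port_falls_back() {
    // 70000 does not fit in u16, so parsing fails and the default is used.
    let cfg = single("server", "port", "70000");
    assert_eq!(get_parsed::<u16>(&cfg, "server", "port", 8080), 8080);
}
```

With this shape, a call such as `get_parsed::<u16>(&cfg, "server", "port", 8080)` needs no cast at the call site. The trade-off is the same as for `get_int`: a bad value is indistinguishable from a missing one.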
Key points:
- `split_once('#')` strips inline comments in a single expression, with no allocation
- `.or_default()` on `HashMap::entry` creates the section if it doesn't exist
- `get_bool` accepts `true/yes/1/on` and `false/no/0/off`, since operators use different conventions
- `include_str!("config.ini")` embeds the config file at compile time, which is useful for shipping defaults
- Malformed lines (neither a section header nor a `key = value` pair) are silently skipped, so stray text in a real-world file does not abort parsing
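A few consequences of these rules are easy to overlook, so the following sketch pins them down as a test. It repeats the parser from this section unchanged so that it compiles on its own; the sample input and the test name are made up for illustration.

```rust
use std::collections::HashMap;

pub type Config = HashMap<String, HashMap<String, String>>;

// Same parser as in this section, repeated so the sketch compiles alone.
pub fn parse_config(text: &str) -> Config {
    let mut cfg: Config = HashMap::new();
    let mut current_section = "global".to_string();
    cfg.entry("global".to_string()).or_default();
    for raw_line in text.lines() {
        let line = raw_line.split_once('#').map(|(l, _)| l).unwrap_or(raw_line);
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if line.starts_with('[') && line.ends_with(']') {
            current_section = line[1..line.len() - 1].trim().to_string();
            cfg.entry(current_section.clone()).or_default();
        } else if let Some((key, value)) = line.split_once('=') {
            cfg.entry(current_section.clone())
                .or_default()
                .insert(key.trim().to_string(), value.trim().to_string());
        }
    }
    cfg
}

#[test]
fn parser_edge_cases() {
    let cfg = parse_config(
        "name = top\n[net]\nurl = http://example.com/#frag\nfilter = a=b\nport = 1\nport = 2\n",
    );
    // Keys before the first header land in the implicit "global" section.
    assert_eq!(cfg["global"]["name"], "top");
    // Everything from the first '#' is treated as a comment, even inside a value.
    assert_eq!(cfg["net"]["url"], "http://example.com/");
    // Only the first '=' splits key from value.
    assert_eq!(cfg["net"]["filter"], "a=b");
    // A repeated key keeps the last value.
    assert_eq!(cfg["net"]["port"], "2");
}
```

The second assertion is the one most likely to surprise an operator: a value that legitimately contains `#` (a URL fragment, a colour code) is truncated, because the format has no quoting or escaping.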
What This Unlocks
- Zero-dependency config loading: add configuration to a library or CLI tool without `serde`, `toml`, or `config`, using just this 50-line parser
- Operator-friendly format: non-developers can edit `[database]\nhost = prod-db` without JSON syntax errors
- Layered configuration: parse multiple config files and merge the `HashMap`s, with later files overriding earlier ones
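The layering idea in the last bullet can be sketched as a key-by-key merge. This is one possible shape under the assumption that an override replaces individual keys rather than whole sections; `merge_config` and `from_triples` are hypothetical names that do not appear in the original code.

```rust
use std::collections::HashMap;

pub type Config = HashMap<String, HashMap<String, String>>;

/// Hypothetical helper: merges `overlay` into `base`, key by key.
/// Keys present in both take the overlay's value; everything else is kept.
pub fn merge_config(mut base: Config, overlay: Config) -> Config {
    for (section, entries) in overlay {
        // `extend` overwrites keys that already exist and adds the rest.
        base.entry(section).or_default().extend(entries);
    }
    base
}

/// Hypothetical helper: builds a Config from (section, key, value) triples,
/// standing in for two parsed files.
pub fn from_triples(triples: &[(&str, &str, &str)]) -> Config {
    let mut cfg = Config::new();
    for (section, key, value) in triples {
        cfg.entry(section.to_string())
            .or_default()
            .insert(key.to_string(), value.to_string());
    }
    cfg
}

#[test]
fn later_layer_overrides_earlier() {
    let defaults = from_triples(&[("db", "host", "localhost"), ("db", "port", "5432")]);
    let site = from_triples(&[("db", "host", "prod-db"), ("cache", "ttl", "60")]);
    let merged = merge_config(defaults, site);
    assert_eq!(merged["db"]["host"], "prod-db"); // overridden by the later layer
    assert_eq!(merged["db"]["port"], "5432"); // untouched default survives
    assert_eq!(merged["cache"]["ttl"], "60"); // new section is added
}
```

In practice each layer would come from `parse_config`, for example compiled-in defaults from `include_str!` followed by a file read at startup.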
Key Differences
| Concept | OCaml | Rust |
|---|---|---|
| Data structure | `Hashtbl` or association list | `HashMap<String, HashMap<String, String>>` |
| Comment stripping | `String.index_opt` + `String.sub` | `split_once('#')` returns `Option<(&str, &str)>` |
| Section detection | `String.get` + char comparison | `starts_with('[') && ends_with(']')` |
| Default values | `Hashtbl.find_opt` + `Option.value` | `cfg.get(s).and_then(...).unwrap_or(default)` |
| Typed access | Manual `int_of_string_opt` | `get_int`, `get_bool` helper functions |
| Production library | N/A | `toml` crate, `config` crate, `figment` |
// 766. Config File Parsing (INI/TOML-Like)
// Sections, key=value, # comments; std-only
use std::collections::HashMap;
pub type Config = HashMap<String, HashMap<String, String>>;
pub fn parse_config(text: &str) -> Config {
let mut cfg: Config = HashMap::new();
let mut current_section = "global".to_string();
cfg.entry("global".to_string()).or_default();
for raw_line in text.lines() {
// Strip inline comment
let line = raw_line.split_once('#').map(|(l, _)| l).unwrap_or(raw_line);
let line = line.trim();
if line.is_empty() { continue; }
if line.starts_with('[') && line.ends_with(']') {
// Section header
current_section = line[1..line.len() - 1].trim().to_string();
cfg.entry(current_section.clone()).or_default();
} else if let Some((key, value)) = line.split_once('=') {
let key = key.trim().to_string();
let value = value.trim().to_string();
cfg.entry(current_section.clone())
.or_default()
.insert(key, value);
}
// else: malformed line, skip
}
cfg
}
pub fn get_str<'a>(cfg: &'a Config, section: &str, key: &str, default: &'a str) -> &'a str {
cfg.get(section)
.and_then(|s| s.get(key))
.map(|s| s.as_str())
.unwrap_or(default)
}
pub fn get_int(cfg: &Config, section: &str, key: &str, default: i64) -> i64 {
get_str(cfg, section, key, "")
.parse()
.unwrap_or(default)
}
pub fn get_bool(cfg: &Config, section: &str, key: &str, default: bool) -> bool {
match get_str(cfg, section, key, "") {
"true" | "yes" | "1" | "on" => true,
"false"| "no" | "0" | "off" => false,
_ => default,
}
}
fn main() {
let text = r#"
# Main config
[server]
host = localhost
port = 8080
debug = true
[database]
host = db.example.com
port = 5432
name = mydb # production DB
max_connections = 10
"#;
let cfg = parse_config(text);
println!("server.host = {}", get_str(&cfg, "server", "host", "(none)"));
println!("server.port = {}", get_int(&cfg, "server", "port", 80));
println!("server.debug = {}", get_bool(&cfg, "server", "debug", false));
println!("db.host = {}", get_str(&cfg, "database", "host", "(none)"));
println!("db.maxconn = {}", get_int(&cfg, "database", "max_connections", 5));
// Dump all sections
println!("\nAll sections:");
let mut sections: Vec<&String> = cfg.keys().collect();
sections.sort();
for section in sections {
println!(" [{section}]");
let mut keys: Vec<&String> = cfg[section].keys().collect();
keys.sort();
for key in keys {
println!(" {key} = {}", cfg[section][key]);
}
}
}
#[cfg(test)]
mod tests {
use super::*;
const SAMPLE: &str = r#"
[app]
name = myapp
version = 2
enabled = true
"#;
#[test]
fn parse_string_value() {
let cfg = parse_config(SAMPLE);
assert_eq!(get_str(&cfg, "app", "name", ""), "myapp");
}
#[test]
fn parse_int_value() {
let cfg = parse_config(SAMPLE);
assert_eq!(get_int(&cfg, "app", "version", 0), 2);
}
#[test]
fn parse_bool_value() {
let cfg = parse_config(SAMPLE);
assert!(get_bool(&cfg, "app", "enabled", false));
}
#[test]
fn missing_key_returns_default() {
let cfg = parse_config(SAMPLE);
assert_eq!(get_str(&cfg, "app", "missing", "default"), "default");
}
#[test]
fn comment_stripped() {
let cfg = parse_config("[s]\nkey = value # comment\n");
assert_eq!(get_str(&cfg, "s", "key", ""), "value");
}
}
(* Config file parsing in OCaml: INI/TOML-like *)
type config = (string, (string, string) Hashtbl.t) Hashtbl.t
let parse_config text : config =
let cfg = Hashtbl.create 8 in
let current_section = ref "global" in
Hashtbl.replace cfg "global" (Hashtbl.create 4);
List.iter (fun raw_line ->
(* Strip comment *)
let line =
match String.index_opt raw_line '#' with
| Some i -> String.sub raw_line 0 i
| None -> raw_line
in
let line = String.trim line in
if String.length line = 0 then () (* empty *)
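(* Require both brackets, matching the Rust version; a lone "[" would otherwise make String.sub raise *)
else if String.length line >= 2 && line.[0] = '[' && line.[String.length line - 1] = ']' then begin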
(* Section header [name] *)
let name = String.sub line 1 (String.length line - 2) |> String.trim in
current_section := name;
if not (Hashtbl.mem cfg name) then
Hashtbl.replace cfg name (Hashtbl.create 4)
end else begin
(* key = value *)
match String.index_opt line '=' with
| Some eq ->
let key = String.trim (String.sub line 0 eq) in
let value = String.trim (String.sub line (eq + 1) (String.length line - eq - 1)) in
let section_tbl = Hashtbl.find cfg !current_section in
Hashtbl.replace section_tbl key value
| None -> () (* malformed line, skip *)
end
) (String.split_on_char '\n' text);
cfg
let get_str cfg section key default =
match Hashtbl.find_opt cfg section with
| None -> default
| Some tbl -> Option.value ~default (Hashtbl.find_opt tbl key)
let get_int cfg section key default =
match int_of_string_opt (get_str cfg section key "") with
| Some n -> n
| None -> default
let () =
let text = {|
# Main config
[server]
host = localhost
port = 8080
[database]
host = db.example.com
port = 5432
name = mydb # production DB
max_connections = 10
|} in
let cfg = parse_config text in
Printf.printf "server.host = %s\n" (get_str cfg "server" "host" "");
Printf.printf "server.port = %d\n" (get_int cfg "server" "port" 80);
Printf.printf "db.host = %s\n" (get_str cfg "database" "host" "");
Printf.printf "db.maxconn = %d\n" (get_int cfg "database" "max_connections" 5)