CString and CStr for FFI
Tutorial
The Problem
C strings are null-terminated byte arrays: the string ends at the first \0. Rust strings (str) store an explicit length, have no terminator, and can contain \0 bytes. Passing a Rust string's bytes directly to a C function expecting char * would either make C read past the end of the buffer (there is no null terminator), which is undefined behavior, or silently truncate the string at an interior null. CString::new(s) validates that s contains no interior nulls and appends a terminating \0, producing a value that can be passed to an extern "C" function via .as_ptr() for as long as the CString is alive.
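The conversion described above can be sketched as follows. This is a minimal illustration, not part of the original listing: count_bytes is a hypothetical stand-in for a C function, written in Rust with the C ABI so that no C toolchain is needed, and bytes_seen_by_c is likewise an illustrative name.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function such as
/// `size_t count_bytes(const char *s)`, implemented in Rust with the C ABI.
///
/// # Safety
/// `ptr` must point to a live, null-terminated byte sequence.
pub unsafe extern "C" fn count_bytes(ptr: *const c_char) -> usize {
    // Walk the buffer until the first `\0`, as C's `strlen` would.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    c_str.to_bytes().len()
}

/// Convert `s` to a `CString`, call the C-ABI function, and report how many
/// bytes it saw. Returns `None` when `s` contains an interior null and so
/// cannot be handed to C intact.
pub fn bytes_seen_by_c(s: &str) -> Option<usize> {
    // Validation happens here, before any pointer crosses the boundary.
    let owned = CString::new(s).ok()?;
    // SAFETY: `owned` is null-terminated and stays alive across the call.
    let seen = unsafe { count_bytes(owned.as_ptr()) };
    Some(seen)
}
```

Binding the CString to a variable matters here: calling .as_ptr() on a temporary, as in CString::new(s).unwrap().as_ptr(), would drop the buffer at the end of the statement and leave the pointer dangling.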
🎯 Learning Outcomes
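- Create a CString from a &str with CString::new(s), which returns a Result
- Understand why interior null bytes cause CString::new to fail
- Pass the pointer from .as_ptr() to FFI calls
- Inspect the raw bytes, terminator included, with .as_bytes_with_nul()
- Convert a CStr back to &str with .to_str() for reading C function output
Code Example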
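As a warm-up for the last outcome, here is a small sketch of borrowing a null-terminated buffer as &str. The helper name c_buffer_as_str is hypothetical, and the byte slice stands in for memory that a C function would have filled.

```rust
use std::ffi::CStr;

/// Hypothetical helper: interpret `buf` as a C string and borrow it as `&str`.
/// Returns `None` if `buf` does not end in exactly one `\0` (with no earlier
/// `\0`), or if the bytes before the terminator are not valid UTF-8.
pub fn c_buffer_as_str(buf: &[u8]) -> Option<&str> {
    // `from_bytes_with_nul` requires a single `\0`, at the very end.
    let c_str = CStr::from_bytes_with_nul(buf).ok()?;
    // `to_str` validates UTF-8 and borrows; nothing is copied.
    c_str.to_str().ok()
}
```

Both checks are needed because C makes no promise about either property: the terminator check guards the length, and the UTF-8 check guards the contents.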
// src/lib.rs content
//! 493: `CString` for FFI, null-terminated strings across the C ABI.
//!
//! Rust's `str`/`String` are UTF-8 with an explicitly stored length, *not*
//! null-terminated, and they may contain interior nulls. C, by contrast,
//! expects `char *` buffers terminated by a single `'\0'` byte and forbids
//! any embedded nulls: the first null is the end of the string.
//!
//! `std::ffi::CString` bridges the gap:
//!
//! * It owns a heap buffer guaranteed to end in exactly one `'\0'`.
//! * Its constructor **rejects interior nulls** with `NulError`; this is
//!   how Rust prevents truncation bugs at the FFI boundary.
//! * `as_ptr()` hands C a `*const c_char` that it can read until the null.
//!
//! The borrowed counterpart `CStr` plays the same role for `CString` that
//! `&str` plays for `String`: a non-owning view over a null-terminated
//! byte sequence.

use std::ffi::{CStr, CString, NulError};
use std::os::raw::c_char;

/// Build a null-terminated C string from a Rust string slice.
///
/// Returns `Err(NulError)` if `s` contains an interior `'\0'` byte.
/// Such a string cannot be represented in C without silent truncation,
/// so `CString` refuses to construct one.
///
/// # Examples
///
/// ```
/// # use example_493_cstring_ffi::to_cstring;
/// let cs = to_cstring("hello").unwrap();
/// assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
/// assert!(to_cstring("hel\0lo").is_err());
/// ```
pub fn to_cstring(s: &str) -> Result<CString, NulError> {
    CString::new(s)
}

/// Return the length of `s` as it would appear in C, **including** the
/// trailing null byte. Errors on interior nulls for the same reason as
/// [`to_cstring`].
pub fn c_string_len(s: &str) -> Result<usize, NulError> {
    Ok(to_cstring(s)?.as_bytes_with_nul().len())
}

/// Check whether `s` contains any `'\0'` byte.
///
/// This is the condition that would make `CString::new` fail; exposing it
/// as a plain predicate is useful when you want to report the problem
/// without constructing the `CString`.
pub fn has_interior_null(s: &str) -> bool {
    s.as_bytes().contains(&0)
}

/// Compute the length of a null-terminated C string pointed to by `ptr`,
/// *not* counting the terminating null (the C `strlen` semantics).
///
/// # Safety
///
/// `ptr` must point to a valid, null-terminated sequence of bytes that
/// remains live for the duration of the call. The caller is responsible
/// for the usual FFI invariants.
pub unsafe fn c_strlen(ptr: *const c_char) -> usize {
    // `CStr::from_ptr` walks until the null, giving us the borrowed view.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

/// Round-trip a Rust string through a `CString` and back to an owned
/// `String`, exercising both directions of the FFI conversion.
///
/// Returns `Err` if `s` contains an interior null (it cannot become a
/// `CString`) or if the round-tripped bytes are not valid UTF-8. The
/// latter cannot happen here since `s` was UTF-8 to start with, but the
/// API is written to mirror the real-world case where C hands you
/// arbitrary bytes.
pub fn round_trip(s: &str) -> Result<String, RoundTripError> {
    let owned = CString::new(s).map_err(RoundTripError::InteriorNul)?;
    let borrowed: &CStr = owned.as_c_str();
    borrowed
        .to_str()
        .map(str::to_owned)
        .map_err(RoundTripError::NotUtf8)
}

/// Error type for [`round_trip`].
#[derive(Debug)]
pub enum RoundTripError {
    /// The input contained an interior `'\0'`.
    InteriorNul(NulError),
    /// The bytes were not valid UTF-8.
    NotUtf8(std::str::Utf8Error),
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn to_cstring_hello_has_null_terminator() {
        let cs = to_cstring("hello").expect("no interior null");
        // `as_bytes` omits the trailing null; `as_bytes_with_nul` includes it.
        assert_eq!(cs.as_bytes(), b"hello");
        assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
    }

    #[test]
    fn to_cstring_rejects_interior_null() {
        let err = to_cstring("hel\0lo").expect_err("interior null must fail");
        // The error records the position of the offending byte.
        assert_eq!(err.nul_position(), 3);
    }

    #[test]
    fn to_cstring_empty_is_just_the_null_byte() {
        let cs = to_cstring("").expect("empty is fine");
        assert_eq!(cs.as_bytes(), b"");
        assert_eq!(cs.as_bytes_with_nul(), b"\0");
    }

    #[test]
    fn c_string_len_counts_the_null() {
        assert_eq!(c_string_len("hello").unwrap(), 6); // 5 + NUL
        assert_eq!(c_string_len("").unwrap(), 1); // just NUL
        assert_eq!(c_string_len("a").unwrap(), 2);
    }

    #[test]
    fn c_string_len_propagates_interior_null_error() {
        assert!(c_string_len("ab\0cd").is_err());
    }

    #[test]
    fn has_interior_null_detects_embedded_zero() {
        assert!(!has_interior_null("hello"));
        assert!(!has_interior_null(""));
        assert!(has_interior_null("hel\0lo"));
        assert!(has_interior_null("\0"));
        assert!(has_interior_null("trailing\0"));
    }

    #[test]
    fn has_interior_null_handles_utf8() {
        // Multi-byte UTF-8 must not produce false positives: none of the
        // continuation bytes are 0x00.
        assert!(!has_interior_null("héllo"));
        assert!(!has_interior_null("日本語"));
        assert!(has_interior_null("日本\0語"));
    }

    #[test]
    fn c_strlen_matches_cstring_as_bytes_len() {
        let cs = to_cstring("hello world").unwrap();
        // SAFETY: `cs` owns a valid null-terminated buffer that lives to
        // the end of this scope; `as_ptr` gives us a read-only pointer
        // into it, which is all `c_strlen` needs.
        let len = unsafe { c_strlen(cs.as_ptr()) };
        assert_eq!(len, 11);
        assert_eq!(len, cs.as_bytes().len());
    }

    #[test]
    fn c_strlen_empty_c_string() {
        let cs = to_cstring("").unwrap();
        // SAFETY: same as above; `cs` is a valid null-terminated buffer.
        let len = unsafe { c_strlen(cs.as_ptr()) };
        assert_eq!(len, 0);
    }

    #[test]
    fn round_trip_preserves_content() {
        assert_eq!(round_trip("hello").unwrap(), "hello");
        assert_eq!(round_trip("").unwrap(), "");
        assert_eq!(round_trip("日本語").unwrap(), "日本語");
    }

    #[test]
    fn round_trip_rejects_interior_null() {
        assert!(matches!(
            round_trip("bad\0string"),
            Err(RoundTripError::InteriorNul(_))
        ));
    }
}
Key Differences
- CString::new explicitly validates and rejects interior nulls; OCaml's ctypes library silently truncates at the first NUL when coercing to C strings.
- CString owns the null-terminated buffer; CStr borrows one. OCaml's GC manages string lifetime automatically, but the C caller must not hold the pointer after the GC moves the string.
- Rust has String/OsString/CString as three distinct types with explicit, fallible conversions between them; OCaml uses string everywhere with runtime checks in FFI layers.
- The pointer returned by CString::as_ptr() is only valid while the CString is alive; using it after the CString has been dropped is a use-after-free. OCaml's GC-managed strings can move, requiring pinning for long-lived C pointers.
OCaml Approach
OCaml's C FFI uses string directly — the C bindings layer handles the null-termination:
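(* Byte length as stored by the OCaml runtime. It counts embedded NUL bytes, so it is not C's strlen. *)
external c_strlen : string -> int = "caml_ml_string_length"
(* ocaml-ctypes uses Ctypes.string for null-terminated C strings; C.Functions is assumed to be a generated bindings module. *)
let strlen s = Ctypes.(coerce string (ptr char) s |> C.Functions.strlen)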
The ctypes library provides Ctypes.CArray, Ctypes.string, and Ctypes.ocaml_string to manage the boundary between OCaml and C strings. OCaml strings can contain NUL bytes — passing them to C functions expecting null-terminated strings would truncate silently.
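The silent truncation described above is not specific to OCaml; it is what any reader of null-terminated data does. A minimal Rust sketch of the effect, using a hypothetical helper named as_c_would_read, might look like this:

```rust
use std::ffi::CStr;

/// Return the prefix that a C function would treat as the whole string:
/// everything before the first `\0`, or `None` if `bytes` has no `\0` at all.
pub fn as_c_would_read(bytes: &[u8]) -> Option<&[u8]> {
    // `from_bytes_until_nul` stops at the first `\0`, like `strlen` does.
    CStr::from_bytes_until_nul(bytes)
        .ok()
        .map(|c_str| c_str.to_bytes())
}
```

Applied to b"hel\0lo", the helper yields only b"hel": the bytes after the interior null are invisible to the C side, which is the loss that CString::new refuses to allow.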
Exercises
1. strlen wrapper: Write fn safe_strlen(s: &str) -> Result<usize, NulError> that creates a CString and calls a hypothetical extern "C" fn strlen(*const c_char) -> usize.
2. Given a *const c_char returned by a C function, wrap it in unsafe { CStr::from_ptr(ptr) } and convert it to a String with .to_string_lossy().into_owned().
3. Pass b"hel\x00lo".to_vec() to CString::from_vec_unchecked (unsafe), then re-read the buffer the way C would, with CStr::from_ptr(cs.as_ptr()), and observe that only "hel" is visible, demonstrating the truncation hazard.