493 Fundamental

CString and CStr for FFI

The Problem

C strings are null-terminated byte arrays: the string ends at the first \0. Rust strings (str) carry an explicit length, have no terminator, and may contain \0 bytes. Passing a Rust string's pointer directly to a C function expecting char * would either read past the end of the buffer (no null terminator, which is undefined behavior) or be silently truncated at an interior null. CString::new(s) validates that s contains no interior nulls and appends a terminating \0, producing a value whose .as_ptr() can be passed to an extern "C" function that expects a read-only, null-terminated string, for as long as the CString is alive.
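The two failure modes above can be sketched with the standard library alone. This is a minimal illustration, not part of the original example: the helper name view_as_c is hypothetical, and it simulates how a C function would read a buffer by stopping at the first \0.

```rust
use std::ffi::{CStr, CString};

/// Hypothetical helper: view `bytes` the way C would, stopping at the first NUL.
/// Returns `None` when there is no NUL at all (real C code would keep reading
/// past the end of the buffer).
fn view_as_c(bytes: &[u8]) -> Option<&[u8]> {
    CStr::from_bytes_until_nul(bytes).ok().map(CStr::to_bytes)
}

fn main() {
    // Interior NUL: C sees only the prefix before it.
    assert_eq!(view_as_c(b"hel\0lo\0"), Some(&b"hel"[..]));
    // No terminator: there is no valid C string in this buffer.
    assert_eq!(view_as_c(b"hello"), None);

    // CString::new refuses the interior NUL and reports where it is.
    let err = CString::new("hel\0lo").unwrap_err();
    assert_eq!(err.nul_position(), 3);

    // A valid input gets exactly one trailing NUL appended.
    let cs = CString::new("hello").unwrap();
    assert_eq!(cs.as_bytes_with_nul(), b"hello\0");

    // SAFETY: `cs` is alive for the whole block, so the pointer is valid
    // and the buffer it points to is NUL-terminated.
    let seen = unsafe { CStr::from_ptr(cs.as_ptr()) };
    assert_eq!(seen.to_bytes(), b"hello");
}
```

CStr::from_bytes_until_nul requires a reasonably recent toolchain (Rust 1.69 or later); on older compilers the same idea can be expressed by searching the slice for the first zero byte.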

🎯 Learning Outcomes

• Create a CString from a &str with CString::new(s) returning Result

• Understand that interior null bytes cause CString::new to fail

• Retrieve the raw pointer with .as_ptr() for FFI calls

• Inspect the bytes including the null terminator with .as_bytes_with_nul()

• Convert a CStr back to &str with .to_str() for reading C function output
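The last outcomes, inspecting the terminated bytes and reading a C string back into Rust, can be sketched as follows. The helper c_to_string_lossy is a hypothetical name introduced for illustration, and the byte string standing in for C output is an assumption; real C code may hand back bytes in any encoding.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper: copy a C string into an owned `String`, replacing
/// invalid UTF-8 sequences with U+FFFD.
///
/// # Safety
/// `ptr` must be non-null and point to a NUL-terminated buffer that stays
/// valid and unmodified for the duration of the call.
unsafe fn c_to_string_lossy(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

fn main() {
    // "héllo" is 6 bytes of UTF-8 ('é' takes two), plus the terminator.
    let cs = CString::new("héllo").unwrap();
    assert_eq!(cs.as_bytes_with_nul().len(), 7);

    // SAFETY: `cs` outlives the call and owns a NUL-terminated buffer.
    let s = unsafe { c_to_string_lossy(cs.as_ptr()) };
    assert_eq!(s, "héllo");

    // Bytes that are not UTF-8 (here a Latin-1 'é', 0xE9) make the strict
    // conversion fail, while the lossy one substitutes U+FFFD.
    let latin1 = CStr::from_bytes_with_nul(b"caf\xE9\0").unwrap();
    assert!(latin1.to_str().is_err());
    assert_eq!(latin1.to_string_lossy(), "caf\u{FFFD}");
}
```

Use .to_str() when invalid UTF-8 should be treated as an error, and .to_string_lossy() when a best-effort display string is enough.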

Code Example

// src/lib.rs content
//! 493 — `CString` for FFI: null-terminated strings across the C ABI.
//!
//! Rust's `str`/`String` are UTF-8 with an explicitly stored length (kept
//! alongside the pointer, not inside the buffer), *not* null-terminated,
//! and they may contain interior nulls.  C, by contrast, expects
//! `char *` buffers terminated by a single `'\0'` byte and cannot represent
//! embedded nulls: the first null is the end of the string.
//!
//! `std::ffi::CString` bridges the gap:
//!
//! * It owns a heap buffer guaranteed to end in exactly one `'\0'`.
//! * Its constructor **rejects interior nulls** with `NulError` — this is
//!   how Rust prevents truncation bugs at the FFI boundary.
//! * `as_ptr()` hands C a `*const c_char` that it can read until the null.
//!
//! The borrowed counterpart `CStr` plays the same role for `CString` that
//! `&str` plays for `String`: a non-owning view over a null-terminated
//! byte sequence.

use std::ffi::{CStr, CString, NulError};
use std::os::raw::c_char;

/// Build a null-terminated C string from a Rust string slice.
///
/// Returns `Err(NulError)` if `s` contains an interior `'\0'` byte —
/// such a string cannot be represented in C without silent truncation,
/// so `CString` refuses to construct one.
///
/// # Examples
///
/// ```
/// # use example_493_cstring_ffi::to_cstring;
/// let cs = to_cstring("hello").unwrap();
/// assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
/// assert!(to_cstring("hel\0lo").is_err());
/// ```
pub fn to_cstring(s: &str) -> Result<CString, NulError> {
    CString::new(s)
}

/// Return the length of `s` as it would appear in C, **including** the
/// trailing null byte.  Errors on interior nulls for the same reason as
/// [`to_cstring`].
pub fn c_string_len(s: &str) -> Result<usize, NulError> {
    Ok(to_cstring(s)?.as_bytes_with_nul().len())
}

/// Check whether `s` contains any interior `'\0'` byte.
///
/// This is the condition that would make `CString::new` fail; exposing it
/// as a plain predicate is useful when you want to report the problem
/// without constructing the `CString`.
pub fn has_interior_null(s: &str) -> bool {
    s.as_bytes().contains(&0)
}

/// Compute the length of a null-terminated C string pointed to by `ptr`,
/// *not* counting the terminating null — the C `strlen` semantics.
///
/// # Safety
///
/// `ptr` must be non-null and point to a valid, null-terminated sequence
/// of bytes that remains live and unmodified for the duration of the call.
pub unsafe fn c_strlen(ptr: *const c_char) -> usize {
    // `CStr::from_ptr` walks until the null, giving us the borrowed view.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

/// Round-trip a Rust string through a `CString` and back to an owned
/// `String`, exercising both directions of the FFI conversion.
///
/// Returns `Err` if `s` contains an interior null (it cannot become a
/// `CString`) or if the round-tripped bytes are not valid UTF-8 — which
/// cannot happen here since `s` was UTF-8 to start with, but the API is
/// written to mirror the real-world case where C hands you arbitrary bytes.
pub fn round_trip(s: &str) -> Result<String, RoundTripError> {
    let owned = CString::new(s).map_err(RoundTripError::InteriorNul)?;
    let borrowed: &CStr = owned.as_c_str();
    borrowed
        .to_str()
        .map(str::to_owned)
        .map_err(RoundTripError::NotUtf8)
}

/// Error type for [`round_trip`].
#[derive(Debug)]
pub enum RoundTripError {
    /// The input contained an interior `'\0'`.
    InteriorNul(NulError),
    /// The bytes were not valid UTF-8.
    NotUtf8(std::str::Utf8Error),
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn to_cstring_hello_has_null_terminator() {
        let cs = to_cstring("hello").expect("no interior null");
        // `as_bytes` omits the trailing null; `as_bytes_with_nul` includes it.
        assert_eq!(cs.as_bytes(), b"hello");
        assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
    }

    #[test]
    fn to_cstring_rejects_interior_null() {
        let err = to_cstring("hel\0lo").expect_err("interior null must fail");
        // The error records the position of the offending byte.
        assert_eq!(err.nul_position(), 3);
    }

    #[test]
    fn to_cstring_empty_is_just_the_null_byte() {
        let cs = to_cstring("").expect("empty is fine");
        assert_eq!(cs.as_bytes(), b"");
        assert_eq!(cs.as_bytes_with_nul(), b"\0");
    }

    #[test]
    fn c_string_len_counts_the_null() {
        assert_eq!(c_string_len("hello").unwrap(), 6); // 5 + NUL
        assert_eq!(c_string_len("").unwrap(), 1); // just NUL
        assert_eq!(c_string_len("a").unwrap(), 2);
    }

    #[test]
    fn c_string_len_propagates_interior_null_error() {
        assert!(c_string_len("ab\0cd").is_err());
    }

    #[test]
    fn has_interior_null_detects_embedded_zero() {
        assert!(!has_interior_null("hello"));
        assert!(!has_interior_null(""));
        assert!(has_interior_null("hel\0lo"));
        assert!(has_interior_null("\0"));
        assert!(has_interior_null("trailing\0"));
    }

    #[test]
    fn has_interior_null_handles_utf8() {
        // Multi-byte UTF-8 must not produce false positives — none of the
        // continuation bytes are 0x00.
        assert!(!has_interior_null("héllo"));
        assert!(!has_interior_null("日本語"));
        assert!(has_interior_null("日本\0語"));
    }

    #[test]
    fn c_strlen_matches_cstring_as_bytes_len() {
        let cs = to_cstring("hello world").unwrap();
        // SAFETY: `cs` owns a valid null-terminated buffer that lives to
        // the end of this scope; `as_ptr` gives us a read-only pointer
        // into it, which is all `c_strlen` needs.
        let len = unsafe { c_strlen(cs.as_ptr()) };
        assert_eq!(len, 11);
        assert_eq!(len, cs.as_bytes().len());
    }

    #[test]
    fn c_strlen_empty_c_string() {
        let cs = to_cstring("").unwrap();
        // SAFETY: same as above — `cs` is a valid null-terminated buffer.
        let len = unsafe { c_strlen(cs.as_ptr()) };
        assert_eq!(len, 0);
    }

    #[test]
    fn round_trip_preserves_content() {
        assert_eq!(round_trip("hello").unwrap(), "hello");
        assert_eq!(round_trip("").unwrap(), "");
        assert_eq!(round_trip("日本語").unwrap(), "日本語");
    }

    #[test]
    fn round_trip_rejects_interior_null() {
        assert!(matches!(
            round_trip("bad\0string"),
            Err(RoundTripError::InteriorNul(_))
        ));
    }
}

(* 493. CString for FFI – OCaml *)
(* OCaml uses Ctypes for FFI; strings need null termination *)
let () =
  (* The runtime pads every OCaml string with a trailing zero byte, but that
     byte is not part of the string as seen from OCaml, and the contents may
     include '\000', so this example builds the C-style buffer explicitly *)
  let s = "hello" in
  (* Create null-terminated version: *)
  let cs = Bytes.create (String.length s + 1) in
  Bytes.blit_string s 0 cs 0 (String.length s);
  Bytes.set cs (String.length s) '\000';
  Printf.printf "c-string len=%d (including null)\n" (Bytes.length cs);

  (* Check for interior nulls *)
  let has_null s =
    String.exists ((=) '\000') s
  in
  Printf.printf "has_null 'hello': %b\n" (has_null s);
  Printf.printf "has_null 'hel\000lo': %b\n" (has_null "hel\000lo")

Key Differences

Null validation: Rust's CString::new explicitly validates and rejects interior nulls; OCaml's ctypes library silently truncates at the first NUL when coercing to C strings.

Ownership: CString owns the null-terminated buffer; CStr borrows one. OCaml's GC manages string lifetime automatically but the C caller must not hold the pointer after the GC moves the string.

Type separation: Rust has String/OsString/CString as three distinct types with compile-time checked conversions; OCaml uses string everywhere with runtime checks in FFI layers.

Safety: The pointer returned by Rust's CString::as_ptr() is only valid while the CString is alive; using it after the CString has been dropped is a use-after-free. OCaml's GC-managed strings can move, so long-lived C pointers require pinning or a copy into memory the GC does not manage.
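The ownership and lifetime points above can be sketched on the Rust side as follows. The function fake_c_len is a hypothetical stand-in for a real C function (a real binding would be declared in an extern "C" block); it only reads its argument.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function that reads a string argument.
///
/// # Safety
/// `ptr` must be non-null and point to a live, NUL-terminated buffer.
unsafe fn fake_c_len(ptr: *const c_char) -> usize {
    unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
}

fn main() {
    // WRONG (left commented out): the temporary CString is dropped at the end
    // of the statement, so `ptr` would dangle before it is ever used.
    // let ptr = CString::new("hello").unwrap().as_ptr();

    // RIGHT: bind the CString so it outlives every use of the pointer.
    let owned = CString::new("hello").unwrap();
    // SAFETY: `owned` is still alive here.
    let len = unsafe { fake_c_len(owned.as_ptr()) };
    assert_eq!(len, 5);

    // Handing ownership across the boundary: `into_raw` gives up the buffer
    // without freeing it, so the pointer stays valid after this statement.
    let raw: *mut c_char = CString::new("handoff").unwrap().into_raw();
    // SAFETY: `raw` came from `into_raw` and has not been reclaimed yet.
    assert_eq!(unsafe { fake_c_len(raw) }, 7);

    // `from_raw` takes the buffer back so that Rust frees it.
    // SAFETY: `raw` was produced by `CString::into_raw` and is reclaimed once.
    let reclaimed = unsafe { CString::from_raw(raw) };
    assert_eq!(reclaimed.to_str().unwrap(), "handoff");
}
```

A pointer obtained from into_raw should only be released through CString::from_raw, not through C's free, because the buffer was allocated by Rust.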

OCaml Approach

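OCaml's C FFI passes string values directly, and the bindings layer handles the null termination:

(* Binds the runtime primitive that returns the OCaml length of a string.
   Unlike C's strlen, this length counts any embedded NUL bytes. *)
external c_strlen : string -> int = "caml_ml_string_length"

(* ocaml-ctypes uses Ctypes.string for null-terminated C strings.
   C.Functions.strlen is assumed to be a generated binding to C's strlen. *)
let strlen s = Ctypes.(coerce string (ptr char) s |> C.Functions.strlen)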

The ctypes library provides Ctypes.CArray, Ctypes.string, and Ctypes.ocaml_string to manage the boundary between OCaml and C strings. OCaml strings can contain NUL bytes — passing them to C functions expecting null-terminated strings would truncate silently.

Exercises

Safe strlen wrapper: Write fn safe_strlen(s: &str) -> Result<usize, NulError> that creates a CString and calls a hypothetical extern "C" fn strlen(s: *const c_char) -> usize.

Read C output: Given a *const c_char returned by a C function, wrap it in unsafe { CStr::from_ptr(ptr) } and convert it to a String with .to_string_lossy().into_owned().

Null in the middle: Write a test that passes b"hel\x00lo".to_vec() to CString::from_vec_unchecked (unsafe), then reads the buffer back through CStr::from_ptr(cs.as_ptr()) and observes that the resulting view contains only "hel", demonstrating the truncation hazard. Calling .to_str() on the CString itself does not show this, because CString tracks its own length.

Open Source Repos

functional-rust

View the source for this example on GitHub — OCaml and Rust side by side in the repo.
