493 Fundamental

CString and CStr for FFI

Functional Programming

Tutorial

The Problem

C strings are null-terminated byte arrays: the string ends at the first \0. Rust strings (str) carry an explicit length, have no terminator, and may contain \0 bytes. Passing a Rust string's pointer directly to a C function expecting char * would either make C read past the end of the buffer (no null terminator) or silently truncate at the first interior null. CString::new(s) validates that s contains no interior nulls and appends a terminating \0, producing a value whose .as_ptr() can be passed to an extern "C" function for as long as the CString is alive.
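
As a minimal sketch of that flow, the snippet below validates a &str, appends the terminator, and passes the pointer to a C-style consumer. The names c_side_len and visible_len are hypothetical, and c_side_len is a pure-Rust stand-in for a real extern "C" function so that the example runs without linking any C code.

```rust
use std::ffi::{CString, NulError};
use std::os::raw::c_char;

/// Hypothetical stand-in for a C function taking `const char *`:
/// walks the buffer until the first `\0`, as C's `strlen` would.
///
/// # Safety
/// `ptr` must point to a live, null-terminated buffer.
unsafe fn c_side_len(ptr: *const c_char) -> usize {
    let mut n = 0;
    // SAFETY: the caller guarantees a terminator exists, so every
    // offset read here lies inside the buffer.
    while unsafe { *ptr.add(n) } != 0 {
        n += 1;
    }
    n
}

/// Validate `s`, append the terminator, and pass the pointer across.
fn visible_len(s: &str) -> Result<usize, NulError> {
    let c = CString::new(s)?; // fails on an interior `\0`
    // SAFETY: `c` is alive for the whole call and ends in `\0`.
    Ok(unsafe { c_side_len(c.as_ptr()) })
}
```

Here visible_len("hello") returns Ok(5), while visible_len("hel\0lo") returns an Err whose nul_position() is 3, so the truncation never reaches the C side.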

🎯 Learning Outcomes

  • Create a CString from a &str with CString::new(s) returning Result
  • Understand that interior null bytes cause CString::new to fail
  • Retrieve the raw pointer with .as_ptr() for FFI calls
  • Inspect the bytes including the null terminator with .as_bytes_with_nul()
  • Convert a CStr back to &str with .to_str() for reading C function output

Code Example

    // src/lib.rs content
    //! 493 — `CString` for FFI: null-terminated strings across the C ABI.
    //!
    //! Rust's `str`/`String` are UTF-8 with an explicit length stored next to
    //! the pointer, *not* null-terminated, and they may contain interior
    //! nulls.  C, by contrast, expects `char *` buffers terminated by a `'\0'`
    //! byte and cannot represent embedded nulls: the first null ends the string.
    //!
    //! `std::ffi::CString` bridges the gap:
    //!
    //! * It owns a heap buffer guaranteed to end in exactly one `'\0'`.
    //! * Its constructor **rejects interior nulls** with `NulError` — this is
    //!   how Rust prevents truncation bugs at the FFI boundary.
    //! * `as_ptr()` hands C a `*const c_char` that it can read until the null.
    //!
    //! The borrowed counterpart `CStr` plays the same role for `CString` that
    //! `&str` plays for `String`: a non-owning view over a null-terminated
    //! byte sequence.
    
    use std::ffi::{CStr, CString, NulError};
    use std::os::raw::c_char;
    
    /// Build a null-terminated C string from a Rust string slice.
    ///
    /// Returns `Err(NulError)` if `s` contains an interior `'\0'` byte —
    /// such a string cannot be represented in C without silent truncation,
    /// so `CString` refuses to construct one.
    ///
    /// # Examples
    ///
    /// ```
    /// # use example_493_cstring_ffi::to_cstring;
    /// let cs = to_cstring("hello").unwrap();
    /// assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
    /// assert!(to_cstring("hel\0lo").is_err());
    /// ```
    pub fn to_cstring(s: &str) -> Result<CString, NulError> {
        CString::new(s)
    }
    
    /// Return the length of `s` as it would appear in C, **including** the
    /// trailing null byte.  Errors on interior nulls for the same reason as
    /// [`to_cstring`].
    pub fn c_string_len(s: &str) -> Result<usize, NulError> {
        Ok(to_cstring(s)?.as_bytes_with_nul().len())
    }
    
    /// Check whether `s` contains any interior `'\0'` byte.
    ///
    /// This is the condition that would make `CString::new` fail; exposing it
    /// as a plain predicate is useful when you want to report the problem
    /// without constructing the `CString`.
    pub fn has_interior_null(s: &str) -> bool {
        s.as_bytes().contains(&0)
    }
    
    /// Compute the length of a null-terminated C string pointed to by `ptr`,
    /// *not* counting the terminating null — the C `strlen` semantics.
    ///
    /// # Safety
    ///
    /// `ptr` must point to a valid, null-terminated sequence of bytes that
    /// remains live for the duration of the call.  The caller is responsible
    /// for the usual FFI invariants.
    pub unsafe fn c_strlen(ptr: *const c_char) -> usize {
        // `CStr::from_ptr` walks until the null, giving us the borrowed view.
        unsafe { CStr::from_ptr(ptr) }.to_bytes().len()
    }
    
    /// Round-trip a Rust string through a `CString` and back to an owned
    /// `String`, exercising both directions of the FFI conversion.
    ///
    /// Returns `Err` if `s` contains an interior null (it cannot become a
    /// `CString`) or if the round-tripped bytes are not valid UTF-8 — which
    /// cannot happen here since `s` was UTF-8 to start with, but the API is
    /// written to mirror the real-world case where C hands you arbitrary bytes.
    pub fn round_trip(s: &str) -> Result<String, RoundTripError> {
        let owned = CString::new(s).map_err(RoundTripError::InteriorNul)?;
        let borrowed: &CStr = owned.as_c_str();
        borrowed
            .to_str()
            .map(str::to_owned)
            .map_err(RoundTripError::NotUtf8)
    }
    
    /// Error type for [`round_trip`].
    #[derive(Debug)]
    pub enum RoundTripError {
        /// The input contained an interior `'\0'`.
        InteriorNul(NulError),
        /// The bytes were not valid UTF-8.
        NotUtf8(std::str::Utf8Error),
    }
    
    #[cfg(test)]
    mod tests {
        use super::*;
    
        #[test]
        fn to_cstring_hello_has_null_terminator() {
            let cs = to_cstring("hello").expect("no interior null");
            // `as_bytes` omits the trailing null; `as_bytes_with_nul` includes it.
            assert_eq!(cs.as_bytes(), b"hello");
            assert_eq!(cs.as_bytes_with_nul(), b"hello\0");
        }
    
        #[test]
        fn to_cstring_rejects_interior_null() {
            let err = to_cstring("hel\0lo").expect_err("interior null must fail");
            // The error records the position of the offending byte.
            assert_eq!(err.nul_position(), 3);
        }
    
        #[test]
        fn to_cstring_empty_is_just_the_null_byte() {
            let cs = to_cstring("").expect("empty is fine");
            assert_eq!(cs.as_bytes(), b"");
            assert_eq!(cs.as_bytes_with_nul(), b"\0");
        }
    
        #[test]
        fn c_string_len_counts_the_null() {
            assert_eq!(c_string_len("hello").unwrap(), 6); // 5 + NUL
            assert_eq!(c_string_len("").unwrap(), 1); // just NUL
            assert_eq!(c_string_len("a").unwrap(), 2);
        }
    
        #[test]
        fn c_string_len_propagates_interior_null_error() {
            assert!(c_string_len("ab\0cd").is_err());
        }
    
        #[test]
        fn has_interior_null_detects_embedded_zero() {
            assert!(!has_interior_null("hello"));
            assert!(!has_interior_null(""));
            assert!(has_interior_null("hel\0lo"));
            assert!(has_interior_null("\0"));
            assert!(has_interior_null("trailing\0"));
        }
    
        #[test]
        fn has_interior_null_handles_utf8() {
            // Multi-byte UTF-8 must not produce false positives — none of the
            // continuation bytes are 0x00.
            assert!(!has_interior_null("héllo"));
            assert!(!has_interior_null("日本語"));
            assert!(has_interior_null("日本\0語"));
        }
    
        #[test]
        fn c_strlen_matches_cstring_as_bytes_len() {
            let cs = to_cstring("hello world").unwrap();
            // SAFETY: `cs` owns a valid null-terminated buffer that lives to
            // the end of this scope; `as_ptr` gives us a read-only pointer
            // into it, which is all `c_strlen` needs.
            let len = unsafe { c_strlen(cs.as_ptr()) };
            assert_eq!(len, 11);
            assert_eq!(len, cs.as_bytes().len());
        }
    
        #[test]
        fn c_strlen_empty_c_string() {
            let cs = to_cstring("").unwrap();
            // SAFETY: same as above — `cs` is a valid null-terminated buffer.
            let len = unsafe { c_strlen(cs.as_ptr()) };
            assert_eq!(len, 0);
        }
    
        #[test]
        fn round_trip_preserves_content() {
            assert_eq!(round_trip("hello").unwrap(), "hello");
            assert_eq!(round_trip("").unwrap(), "");
            assert_eq!(round_trip("日本語").unwrap(), "日本語");
        }
    
        #[test]
        fn round_trip_rejects_interior_null() {
            assert!(matches!(
                round_trip("bad\0string"),
                Err(RoundTripError::InteriorNul(_))
            ));
        }
    }
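
The doc comment on round_trip notes that, in real FFI code, C may hand back arbitrary bytes. The sketch below contrasts the strict and lossy ways of reading such a buffer. The function names read_strict and read_lossy are hypothetical, and a byte slice stands in for memory returned by a C function.

```rust
use std::borrow::Cow;
use std::ffi::CStr;

/// Strict conversion: `None` if the buffer is not a single
/// null-terminated string or its bytes are not valid UTF-8.
fn read_strict(bytes_with_nul: &[u8]) -> Option<&str> {
    let c = CStr::from_bytes_with_nul(bytes_with_nul).ok()?;
    c.to_str().ok()
}

/// Lossy conversion: invalid sequences become U+FFFD instead of
/// failing, borrowing when the bytes were already valid UTF-8.
fn read_lossy(bytes_with_nul: &[u8]) -> Option<Cow<'_, str>> {
    let c = CStr::from_bytes_with_nul(bytes_with_nul).ok()?;
    Some(c.to_string_lossy())
}
```

With the Latin-1 bytes b"caf\xE9\0", read_strict returns None while read_lossy yields "caf" followed by the replacement character. Which one fits depends on whether the caller would rather fail or display something approximate.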

Key Differences

  • Null validation: Rust's CString::new explicitly validates and rejects interior nulls; with OCaml's ctypes library, a string coerced to a C string is not rejected, and the C side sees it end at the first NUL.
  • Ownership: CString owns the null-terminated buffer; CStr borrows one. OCaml's GC manages string lifetime automatically, but the C caller must not hold the pointer after the GC moves the string.
  • Type separation: Rust has String, OsString, and CString as three distinct types, so mixing them up is a compile error and each conversion is explicit (fallible ones return a Result at run time); OCaml uses string everywhere with runtime checks in FFI layers.
  • Safety: The pointer returned by CString::as_ptr() is only valid while the CString is alive; if the CString is dropped while C still uses the pointer, that is a use-after-free. OCaml's GC-managed strings can move, requiring pinning for long-lived C pointers.
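
The lifetime rule in the Safety point can be sketched as follows. The function names borrowed_len and hand_over_and_reclaim are hypothetical; the second one assumes the C side neither frees nor resizes the buffer it is given.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Borrowing: keep the `CString` in a named binding so the buffer
/// outlives every use of the pointer.
fn borrowed_len(s: &str) -> usize {
    let owned = CString::new(s).expect("no interior null");
    let ptr: *const c_char = owned.as_ptr();
    // SAFETY: `owned` is still alive, so `ptr` is valid here.
    let len = unsafe { CStr::from_ptr(ptr) }.to_bytes().len();
    drop(owned); // after this line `ptr` dangles and must not be read
    len
}

// Anti-pattern: the temporary `CString` is dropped at the end of the
// statement, so the pointer dangles immediately.
// let ptr = CString::new("hi").unwrap().as_ptr();

/// Transferring ownership: `into_raw` gives the buffer away, and
/// `from_raw` must later reclaim it so Rust frees the allocation.
fn hand_over_and_reclaim(s: &str) -> String {
    let raw: *mut c_char = CString::new(s).expect("no interior null").into_raw();
    // ... C code would hold and use `raw` here ...
    // SAFETY: `raw` came from `CString::into_raw` and was not freed
    // or resized in between.
    let back = unsafe { CString::from_raw(raw) };
    back.into_string().expect("input was valid UTF-8")
}
```

as_ptr() suits calls where C only reads the string during the call; into_raw()/from_raw() suits cases where C keeps the pointer for longer and later returns it.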

OCaml Approach

OCaml's C FFI uses string directly, and the C bindings layer handles the null-termination:

    (* Sketch with ocaml-ctypes and its Foreign (libffi) backend. The
       [string] view copies the OCaml string into a NUL-terminated C
       buffer before the call. *)
    open Ctypes
    open Foreign

    let strlen : string -> Unsigned.size_t =
      foreign "strlen" (string @-> returning size_t)
    

The ctypes library provides Ctypes.CArray, Ctypes.string, and Ctypes.ocaml_string to manage the boundary between OCaml and C strings. OCaml strings can contain NUL bytes, so passing them to C functions that expect null-terminated strings silently truncates them at the first NUL.

Exercises

  • Safe strlen wrapper: Write fn safe_strlen(s: &str) -> Result<usize, NulError> that creates a CString and calls a hypothetical extern "C" fn strlen(*const c_char) -> usize.
  • Read C output: Given a *const c_char returned by a C function, wrap it in unsafe { CStr::from_ptr(ptr) } and convert it to a String with .to_string_lossy().into_owned().
  • Null in the middle: Write a test that passes b"hel\x00lo".to_vec() to CString::from_vec_unchecked (unsafe) and observe that reading the pointer back with CStr::from_ptr(cs.as_ptr()) yields only "hel", demonstrating the truncation hazard.