🦀 Functional Rust

717: Volatile Memory Reads and Writes

Difficulty: 4 Level: Expert

Use `read_volatile`/`write_volatile` to prevent the compiler from optimising away memory-mapped I/O register accesses.

The Problem This Solves

Modern compilers are aggressive optimisers. If you write to a memory address and then write to it again without an intervening read, the compiler may eliminate the first write because it looks dead. If you read the same address twice in a loop and nothing in the program could have changed the value, the compiler may keep it in a register and skip the second read. For normal memory, these optimisations are correct. For memory-mapped I/O registers, they are catastrophically wrong.

An MMIO register is a hardware register masquerading as a memory address. Writing to the UART data register sends a byte to the serial port, even if you never read the result. Reading the status register queries the hardware, even if the compiler thinks the value is unchanged. Every access to an MMIO address has observable side effects the compiler cannot see, so the compiler must not merge or eliminate these accesses, or reorder them relative to one another.

`std::ptr::read_volatile` and `std::ptr::write_volatile` are the solution. They tell the compiler: "this read/write is observable; treat it as if the entire outside world can see it." This is not the same as atomic operations: volatile says nothing about thread ordering, and it does not order volatile accesses against ordinary non-volatile ones. It is purely about suppressing compiler optimisations on a specific memory location.
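The two hazards described above (a "dead" first write and a cached read in a loop) can be sketched on ordinary memory. This is a minimal illustration, not driver code: `send_two` and `wait_ready` are hypothetical helper names, and plain local variables stand in for the device registers.

```rust
use std::ptr;

/// Writes two bytes to the same location, one after the other.
///
/// # Safety
/// `reg` must be valid, aligned, and writable.
pub unsafe fn send_two(reg: *mut u8) {
    unsafe {
        // With plain stores the compiler could drop the first write as dead;
        // volatile stores must both be emitted, in this order.
        ptr::write_volatile(reg, b'H');
        ptr::write_volatile(reg, b'i');
    }
}

/// Re-reads `status` until bit 0 is set or `max_spins` reads have been made.
///
/// # Safety
/// `status` must be valid and aligned for reads.
pub unsafe fn wait_ready(status: *const u32, max_spins: u32) -> bool {
    for _ in 0..max_spins {
        // A plain load could be hoisted out of the loop; a volatile load
        // is performed on every iteration.
        let value = unsafe { ptr::read_volatile(status) };
        if value & 1 != 0 {
            return true;
        }
    }
    false
}

fn main() {
    // Ordinary variables stand in for the device registers.
    let mut data: u8 = 0;
    let status: u32 = 1; // "ready" bit already set

    unsafe {
        send_two(&mut data);
        assert!(wait_ready(&status, 1_000));
    }
    println!("last byte written: {}", data as char);
}
```

On real hardware the first byte would have been transmitted before the second overwrote it; in this simulation only the final value remains visible.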

The Intuition

Think of volatile access as a "do not touch" label for the compiler. Normally, the compiler silently rearranges your code for efficiency, confident that programs can't tell the difference. Volatile says "I can tell the difference: every one of these reads and writes matters, in order, exactly as written." Hardware registers, `mmap`-ed files, and shared memory with other processes all have this property.

The canonical abstraction is a `VolatileCell<T>` wrapper that exposes only `read()` and `write()` methods, both implemented with `ptr::read_volatile` and `ptr::write_volatile`. Because the inner value is private, users of the wrapper can't accidentally use regular field access.

How It Works in Rust

use std::ptr;

pub struct VolatileCell<T>(std::cell::UnsafeCell<T>);

impl<T: Copy> VolatileCell<T> {
 pub fn read(&self) -> T {
     unsafe {
         // SAFETY: The pointer is valid and aligned; T: Copy.
         // Volatile prevents the compiler from caching or eliminating this read.
         ptr::read_volatile(self.0.get())
     }
 }

 pub fn write(&self, val: T) {
     unsafe {
         // SAFETY: Same as above; write is not eliminated even if val is "unused".
         ptr::write_volatile(self.0.get(), val);
     }
 }
}

// MMIO register map; would live at a fixed physical address in real firmware.
// #[repr(C)] keeps the fields in declaration order so offsets match the hardware.
#[repr(C)]
pub struct UartRegisters {
 pub status: VolatileCell<u32>,
 pub data:   VolatileCell<u32>,
}
In real embedded code, the register map sits at a fixed, linker-defined address, and you obtain a reference by dereferencing a raw pointer to it: `let uart = unsafe { &*(0x4000_1000 as *const UartRegisters) };`.
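Since a fixed physical address cannot be dereferenced on a desktop machine, the same cast can be tried against ordinary memory. The sketch below assumes a two-word register block; the `new` constructor, the `TX_READY` bit, and `try_send` are hypothetical additions for the simulation, and the `repr` attributes pin the layout so that the pointer cast is meaningful.

```rust
use std::cell::UnsafeCell;
use std::ptr;

/// Volatile wrapper with a constructor added for the simulation.
#[repr(transparent)]
pub struct VolatileCell<T>(UnsafeCell<T>);

impl<T: Copy> VolatileCell<T> {
    pub fn new(val: T) -> Self {
        Self(UnsafeCell::new(val))
    }
    pub fn read(&self) -> T {
        // SAFETY: the pointer comes from a live UnsafeCell, so it is valid and aligned.
        unsafe { ptr::read_volatile(self.0.get()) }
    }
    pub fn write(&self, val: T) {
        // SAFETY: as above; UnsafeCell permits mutation through &self.
        unsafe { ptr::write_volatile(self.0.get(), val) }
    }
}

#[repr(C)]
pub struct UartRegisters {
    pub status: VolatileCell<u32>,
    pub data: VolatileCell<u32>,
}

/// Hypothetical bit layout: bit 0 of `status` means "transmitter ready".
pub const TX_READY: u32 = 0x01;

/// Writes `byte` to the data register if the transmitter reports ready.
pub fn try_send(uart: &UartRegisters, byte: u8) -> bool {
    if uart.status.read() & TX_READY == 0 {
        return false;
    }
    uart.data.write(u32::from(byte));
    true
}

fn main() {
    // Two ordinary words stand in for the device's register block.
    let mut backing: [u32; 2] = [TX_READY, 0];

    // SAFETY: `backing` is live, aligned for u32, and exactly as large as
    // UartRegisters (two u32 fields, repr(C)); it is not touched directly
    // while `uart` is in use.
    let uart: &UartRegisters =
        unsafe { &*(backing.as_mut_ptr() as *const UartRegisters) };

    assert!(try_send(uart, b'A'));
    println!("data register = {:#x}", uart.data.read());
}
```

In firmware, the only change would be the source of the pointer: a fixed address instead of `backing.as_mut_ptr()`.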

What This Unlocks

Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| Volatile reads | `Bigarray` (partial workaround) | `ptr::read_volatile` |
| Volatile writes | Not available directly | `ptr::write_volatile` |
| Memory-mapped regions | `Unix.map_file` | `mmap` via libc or `/dev/mem` |
| Register abstraction | Record of functions | `struct VolatileCell<T>` |
| Compiler reordering | Not controllable | `core::sync::atomic::compiler_fence` |
| Embedded bare-metal | Not typical | `#![no_std]` + linker sections |
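The "Compiler reordering" row deserves a short illustration. Volatile accesses are ordered only with respect to other volatile accesses, so embedded Rust code commonly places a `compiler_fence` between plain memory writes and the volatile write that hands them to a device. The sketch below is a simulation under stated assumptions: `FakeDevice`, its `doorbell` field, and `submit` are hypothetical, and real hardware may additionally need a CPU or bus barrier, which `compiler_fence` does not provide.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical device model: a payload buffer in ordinary memory and a
/// "doorbell" register that tells the device the buffer is ready.
pub struct FakeDevice {
    pub buffer: [u8; 4],
    pub doorbell: u32,
}

/// Fills the buffer, then rings the doorbell.
pub fn submit(dev: &mut FakeDevice, payload: [u8; 4]) {
    // Plain (non-volatile) stores: on their own, the compiler may move these.
    dev.buffer = payload;

    // Ask the compiler not to sink the stores above past this point.
    // This emits no instruction; it is not a hardware barrier.
    compiler_fence(Ordering::Release);

    // SAFETY: the pointer is derived from a live &mut u32.
    unsafe { ptr::write_volatile(&mut dev.doorbell, 1) };
}

fn main() {
    let mut dev = FakeDevice { buffer: [0; 4], doorbell: 0 };
    submit(&mut dev, *b"ping");
    println!("buffer = {:?}, doorbell = {}", dev.buffer, dev.doorbell);
}
```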

//! 717: Volatile Reads/Writes for Memory-Mapped I/O
//! read_volatile / write_volatile: prevent compiler elision of I/O accesses.

use std::ptr;

/// Simulated MMIO device with 8 x u32 registers.
/// Uses &mut self for writes, modelling exclusive hardware register ownership.
pub struct MmioDevice {
    regs: [u32; 8],
}

impl MmioDevice {
    pub fn new() -> Self { Self { regs: [0u32; 8] } }

    /// Volatile write: every write reaches the hardware, no elision.
    pub fn write(&mut self, reg: usize, val: u32) {
        assert!(reg < 8);
        unsafe {
            // SAFETY: reg < 8 == regs.len(); pointer is valid and aligned for u32.
            // Volatile prevents the compiler from merging or reordering writes,
            // which is critical for hardware registers with side effects on every write.
            ptr::write_volatile(&mut self.regs[reg] as *mut u32, val);
        }
    }

    /// Volatile read: every read goes to hardware, no cached value.
    pub fn read(&self, reg: usize) -> u32 {
        assert!(reg < 8);
        unsafe {
            // SAFETY: reg < 8; pointer is valid and aligned.
            // Hardware may change this register at any time; volatile
            // prevents the compiler from using a prior cached read.
            ptr::read_volatile(&self.regs[reg] as *const u32)
        }
    }

    /// Read-modify-write: read current, apply f, write back.
    pub fn rmw(&mut self, reg: usize, f: impl FnOnce(u32) -> u32) {
        let v = self.read(reg);
        self.write(reg, f(v));
    }

    /// Set specific bits in a register.
    pub fn set_bits(&mut self, reg: usize, mask: u32) {
        self.rmw(reg, |v| v | mask);
    }

    /// Clear specific bits in a register.
    pub fn clear_bits(&mut self, reg: usize, mask: u32) {
        self.rmw(reg, |v| v & !mask);
    }

    /// Check once whether `bit` is set in `reg` (a single volatile read; callers loop to poll).
    pub fn poll_bit(&self, reg: usize, bit: u32) -> bool {
        (self.read(reg) & (1 << bit)) != 0
    }
}

/// Simulated device transaction protocol.
fn device_transaction(dev: &mut MmioDevice) -> u32 {
    const CMD:    usize = 0;
    const DATA:   usize = 1;
    const STATUS: usize = 2;
    const RESULT: usize = 3;

    dev.write(CMD, 0x01);            // start command
    dev.write(DATA, 0xABCD_1234);    // payload
    dev.write(STATUS, 0x01);         // simulate HW: set ready bit
    dev.write(RESULT, 0xDEAD_BEEF);  // simulate HW: store result

    if dev.poll_bit(STATUS, 0) {
        dev.read(RESULT)
    } else {
        u32::MAX // ready bit not set; reported as a timeout (no retry loop here)
    }
}

fn main() {
    let mut dev = MmioDevice::new();

    dev.write(0, 0xFF);
    println!("reg[0] = {:#010x}", dev.read(0));

    let result = device_transaction(&mut dev);
    println!("Transaction result = {:#010x}", result);

    // Demonstrate volatile write+read round-trip
    for i in 0..8usize {
        dev.write(i, i as u32 * 0x1111);
    }
    for i in 0..8usize {
        println!("reg[{i}] = {:#010x}", dev.read(i));
    }

    // bit manipulation
    dev.write(0, 0x0000_0000);
    dev.set_bits(0, 0b0101);
    println!("After set_bits(0b0101): {:#010b}", dev.read(0));
    dev.clear_bits(0, 0b0001);
    println!("After clear_bits(0b0001): {:#010b}", dev.read(0));
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_write_read() {
        let mut d = MmioDevice::new();
        d.write(0, 0xCAFE_BABE);
        assert_eq!(d.read(0), 0xCAFE_BABE);
    }

    #[test]
    fn test_independent_regs() {
        let mut d = MmioDevice::new();
        for i in 0..8usize { d.write(i, i as u32 * 10); }
        for i in 0..8usize { assert_eq!(d.read(i), i as u32 * 10); }
    }

    #[test]
    fn test_poll_bit() {
        let mut d = MmioDevice::new();
        d.write(0, 0b0000_0100); // bit 2 set
        assert!(d.poll_bit(0, 2));
        assert!(!d.poll_bit(0, 0));
    }

    #[test]
    fn test_set_clear_bits() {
        let mut d = MmioDevice::new();
        d.write(0, 0b1010);
        d.set_bits(0, 0b0101);
        assert_eq!(d.read(0), 0b1111);
        d.clear_bits(0, 0b1100);
        assert_eq!(d.read(0), 0b0011);
    }
}
(* OCaml: Volatile-like access using Bigarray (closest standard library analog).
   Bigarray operations are not reordered by the OCaml compiler in the same way
   that GC'd values can be, but OCaml has no true "volatile" keyword. *)

open Bigarray

(* Create a Bigarray backed by a raw byte region, analogous to mmap'd MMIO *)
type reg32 = (int32, int32_elt, c_layout) Array1.t

let make_mmio_region size_words : reg32 =
  Array1.create int32 c_layout size_words

(* Register offsets (word indices) *)
let status_reg = 0
let data_reg   = 1
let ctrl_reg   = 2

(* Status register bits *)
let tx_ready = Int32.of_int 0x01
let rx_ready = Int32.of_int 0x02

(* Volatile-style read: a Bigarray access reads the backing store directly,
   but OCaml gives no guarantee that repeated reads are never merged *)
let mmio_read (regs : reg32) offset =
  regs.{offset}  (* Bigarray read: closest analog to a volatile load *)

let mmio_write (regs : reg32) offset value =
  regs.{offset} <- value

(* Wait until TX ready bit is set *)
let wait_tx_ready regs =
  let rec loop () =
    let status = mmio_read regs status_reg in
    if Int32.logand status tx_ready = tx_ready then ()
    else loop ()
  in
  loop ()

(* Simulate writing a byte to a UART-like device *)
let uart_send regs byte =
  wait_tx_ready regs;
  mmio_write regs data_reg (Int32.of_int byte)

(* Demo driver: the status register is pre-set, so the wait loop returns immediately *)
let () =
  let regs = make_mmio_region 8 in
  (* Hardware simulation: pre-set status *)
  mmio_write regs status_reg tx_ready;
  mmio_write regs data_reg Int32.zero;

  uart_send regs (Char.code 'H');
  uart_send regs (Char.code 'i');

  Printf.printf "Status: 0x%08lx\n" (mmio_read regs status_reg);
  Printf.printf "Last data written: %d\n" (Int32.to_int (mmio_read regs data_reg));

  (* Show why volatile matters: without it, a compiler could optimise
     the loop body to `if false then …` after seeing the first read. *)
  Printf.printf "MMIO demo complete\n"