🦀 Functional Rust

716: Custom Global Allocator with #[global_allocator]

Difficulty: 5 | Level: Master

Replace Rust's default system allocator with a custom implementation, for embedded targets, arena allocation, or memory accounting.

The Problem This Solves

Every `Box::new`, `Vec::push`, and `String::from` goes through Rust's global allocator. By default, that's the system allocator (glibc `malloc` on Linux; before Rust 1.32, jemalloc was the default on many targets). For most programs, this is fine. For embedded systems, production servers, or diagnostic tooling, you need more control.

`#[global_allocator]` lets you replace the global allocator entirely. Implement `GlobalAlloc`, whose two required methods are `alloc` and `dealloc`, and annotate your static instance. From that point on, every heap allocation in your program goes through your code. You can track live bytes, count allocations, enforce a per-request budget, use a pre-allocated arena, or route all allocation to a region of physical memory you've mapped yourself.

The contract is strict. `alloc` must return either null (on failure) or a pointer to memory of at least `layout.size()` bytes, aligned to `layout.align()`. `dealloc` must receive the same pointer that `alloc` returned, together with the same layout that was passed to `alloc`. The compiler cannot verify any of this. You own these invariants completely, which is why `GlobalAlloc` is an `unsafe trait`.
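The per-request-budget idea can be sketched as an allocator that refuses to grow past a fixed cap. The sketch below is hypothetical (the `CappedAllocator` name and the cap value are invented for illustration); it is deliberately not registered with `#[global_allocator]` so its methods can be exercised directly.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical sketch: reject any allocation that would push total live
/// bytes past `cap`. Returning null is how GlobalAlloc signals failure.
pub struct CappedAllocator {
    cap: usize,
    live: AtomicUsize,
}

unsafe impl GlobalAlloc for CappedAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Optimistically reserve the bytes, then roll back if over the cap.
        let prev = self.live.fetch_add(layout.size(), Ordering::Relaxed);
        if prev + layout.size() > self.cap {
            self.live.fetch_sub(layout.size(), Ordering::Relaxed);
            return std::ptr::null_mut();
        }
        let ptr = unsafe { System.alloc(layout) };
        if ptr.is_null() {
            // The system allocator itself failed; undo the reservation.
            self.live.fetch_sub(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.live.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

fn main() {
    let a = CappedAllocator { cap: 1024, live: AtomicUsize::new(0) };
    let small = Layout::from_size_align(512, 8).unwrap();
    let big = Layout::from_size_align(1024, 8).unwrap();
    unsafe {
        let p = a.alloc(small);
        assert!(!p.is_null());
        let q = a.alloc(big); // 512 + 1024 > 1024: rejected
        assert!(q.is_null());
        a.dealloc(p, small);
    }
    println!("budget enforced");
}
```

Note the reserve-then-roll-back pattern: with plain atomics, checking and updating in two steps would race with other threads, so the reservation is made first and undone on failure.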

The Intuition

The global allocator is the lowest layer of Rust's memory model. Every `Box`, `Vec`, `String`, `Arc`, and `HashMap` eventually calls `alloc` and `dealloc`. Replacing it means you intercept every heap allocation in the program. A tracking allocator wraps the system allocator and maintains atomic counters around each call: the same approach heap profilers use, but with no external tooling. A bump allocator skips the system allocator entirely: allocation is just a pointer increment, and reset is setting the offset back to zero.

How It Works in Rust

use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

pub struct TrackingAllocator { inner: System }

static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);
static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = self.inner.alloc(layout);
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
            ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        self.inner.dealloc(ptr, layout);
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static ALLOCATOR: TrackingAllocator = TrackingAllocator { inner: System };

For embedded or `#![no_std]` targets, a bump allocator replaces the `System` inner with a `static mut [u8; N]` slab and a single `AtomicUsize` offset. Allocation: compare-exchange the offset forward. Reset: store zero.
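A minimal sketch of that shape, implementing `GlobalAlloc` itself (here with `UnsafeCell` rather than `static mut`, and an illustrative slab size). One subtlety worth showing: a `[u8; N]` buffer only guarantees alignment 1, so the absolute address must be aligned, not just the offset.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

const SLAB: usize = 4096; // illustrative slab size

/// Bump allocator as a GlobalAlloc: `alloc` bumps an offset,
/// `dealloc` is a no-op (the whole slab is reclaimed at once).
struct BumpHeap {
    slab: UnsafeCell<[u8; SLAB]>,
    offset: AtomicUsize,
}

// SAFETY: the CAS in `alloc` hands out non-overlapping regions.
unsafe impl Sync for BumpHeap {}

unsafe impl GlobalAlloc for BumpHeap {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.slab.get() as usize;
        let mut cur = self.offset.load(Ordering::Relaxed);
        loop {
            // Align the absolute address: the slab itself is only 1-aligned.
            let addr = (base + cur + layout.align() - 1) & !(layout.align() - 1);
            let next = addr - base + layout.size();
            if next > SLAB {
                return std::ptr::null_mut(); // out of arena space
            }
            match self.offset.compare_exchange_weak(
                cur, next, Ordering::AcqRel, Ordering::Relaxed)
            {
                Ok(_) => return addr as *mut u8,
                Err(seen) => cur = seen, // lost the race; retry
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {}
}

// In a real no_std build this would carry #[global_allocator]; here it is
// left unregistered so the methods can be called directly.
static HEAP: BumpHeap = BumpHeap {
    slab: UnsafeCell::new([0; SLAB]),
    offset: AtomicUsize::new(0),
};

fn main() {
    let l = Layout::from_size_align(100, 16).unwrap();
    let p = unsafe { HEAP.alloc(l) };
    assert!(!p.is_null());
    assert_eq!(p as usize % 16, 0); // absolute alignment honored
    println!("used {} bytes", HEAP.offset.load(Ordering::Relaxed));
}
```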

What This Unlocks

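One concrete payoff is making hidden allocations measurable. The sketch below installs a minimal counting allocator globally and wraps it in `count_allocs`, a helper name invented here; the counts it reports include any allocations the closure performs internally.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);

struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_COUNT.fetch_add(1, Ordering::Relaxed); // count every attempt
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static A: CountingAlloc = CountingAlloc;

/// Hypothetical helper: run `f` and report how many allocations it made.
fn count_allocs<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOC_COUNT.load(Ordering::Relaxed);
    let out = f();
    (out, ALLOC_COUNT.load(Ordering::Relaxed) - before)
}

fn main() {
    let (v, n) = count_allocs(|| {
        let mut v = Vec::with_capacity(100); // one allocation up front
        for i in 0..100 {
            v.push(i); // never reallocates within capacity
        }
        v
    });
    assert_eq!(v.len(), 100);
    assert!(n >= 1);
    println!("closure performed {} allocation(s)", n);
}
```

Run on a hot path, a helper like this exposes allocation churn that a profiler would otherwise be needed to find, e.g. a `Vec` built without `with_capacity` reallocating repeatedly.
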
Key Differences

| Concept | OCaml | Rust |
|---|---|---|
| Custom allocator | Not supported (GC-managed heap) | `impl GlobalAlloc` + `#[global_allocator]` |
| Allocation tracking | `Gc.stat ()` for GC stats only | Atomic counters in your allocator |
| Fixed backing store | Not possible in standard OCaml | `static mut [u8; N]` + bump pointer |
| Thread-safety | GC handles it | `AtomicUsize` for counters |
| Dealloc contract | GC finalizes | Must match exact pointer + layout from `alloc` |
| Per-allocation cost | GC amortizes | Custom: can be as low as one `fetch_add` |

// 716. Custom GlobalAlloc and #[global_allocator]
//
// Demonstrates:
//  1. A tracking allocator wrapping the system allocator
//  2. A minimal bump allocator over a fixed static buffer

use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// ── Part 1: Tracking Allocator ──────────────────────────────────────────────

/// Wraps the system allocator and tracks live bytes + allocation count.
pub struct TrackingAllocator {
    inner: System,
}

static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);
static ALLOC_COUNT: AtomicUsize = AtomicUsize::new(0);

impl TrackingAllocator {
    pub const fn new() -> Self {
        Self { inner: System }
    }

    pub fn live_bytes() -> usize {
        LIVE_BYTES.load(Ordering::Relaxed)
    }

    pub fn alloc_count() -> usize {
        ALLOC_COUNT.load(Ordering::Relaxed)
    }
}

// SAFETY: We forward every call to `System`, which upholds all GlobalAlloc
// contracts (alignment, non-null on success, dealloc with same layout).
// Our only addition is atomic counter updates, which cannot cause UB.
unsafe impl GlobalAlloc for TrackingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // SAFETY: Caller guarantees `layout` is non-zero-sized and well-formed.
        let ptr = unsafe { self.inner.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
            ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // SAFETY: Caller guarantees `ptr` was returned by a previous `alloc`
        // with the same `layout`, and has not been freed yet.
        unsafe { self.inner.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        // SAFETY: Same contract as `alloc`.
        let ptr = unsafe { self.inner.alloc_zeroed(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
            ALLOC_COUNT.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        // SAFETY: `ptr` was previously allocated with `layout`; `new_size` is
        // a valid non-zero size that is compatible with `layout.align()`.
        let new_ptr = unsafe { self.inner.realloc(ptr, layout, new_size) };
        if !new_ptr.is_null() {
            // Adjust live byte count by the delta.
            if new_size > layout.size() {
                LIVE_BYTES.fetch_add(new_size - layout.size(), Ordering::Relaxed);
            } else {
                LIVE_BYTES.fetch_sub(layout.size() - new_size, Ordering::Relaxed);
            }
        }
        new_ptr
    }
}

#[global_allocator]
static ALLOCATOR: TrackingAllocator = TrackingAllocator::new();

// ── Part 2: Bump Allocator over a fixed buffer ──────────────────────────────

use std::cell::UnsafeCell;
use std::sync::atomic::AtomicU32;

/// A simple single-threaded bump allocator.
/// All allocations are freed at once by calling `reset()`.
///
/// This is *not* set as the global allocator (you can only have one), but
/// demonstrates the `GlobalAlloc` contract and how arenas are built.
pub struct BumpAllocator<const N: usize> {
    buf: UnsafeCell<[u8; N]>,
    pos: AtomicU32,
}

// SAFETY: We only hand out non-overlapping slices from `buf`, and `reset()`
// must only be called when no live slices from this arena exist.
unsafe impl<const N: usize> Sync for BumpAllocator<N> {}

impl<const N: usize> BumpAllocator<N> {
    pub const fn new() -> Self {
        Self {
            buf: UnsafeCell::new([0u8; N]),
            pos: AtomicU32::new(0),
        }
    }

    /// Allocate `layout.size()` bytes, properly aligned.
    /// Returns `None` if the arena is exhausted.
    pub fn try_alloc(&self, layout: Layout) -> Option<&mut [u8]> {
        let align = layout.align();
        let size  = layout.size();

        // Busy-loop with compare-exchange to claim the next slot.
        loop {
            let current = self.pos.load(Ordering::Relaxed) as usize;
            // Round up to alignment.
            let aligned = current.checked_add(align - 1)? & !(align - 1);
            let next    = aligned.checked_add(size)?;
            if next > N {
                return None; // OOM
            }
            if self.pos
                .compare_exchange(current as u32, next as u32,
                    Ordering::AcqRel, Ordering::Relaxed)
                .is_ok()
            {
                // SAFETY: `aligned..next` is within the buffer and was just
                // exclusively claimed by this thread via the CAS above.
                let slice = unsafe {
                    let base = (*self.buf.get()).as_mut_ptr();
                    std::slice::from_raw_parts_mut(base.add(aligned), size)
                };
                return Some(slice);
            }
        }
    }

    /// Reset the bump pointer: all previously returned slices become invalid.
    ///
    /// # Safety
    /// Caller must guarantee no live references into this arena exist.
    pub unsafe fn reset(&self) {
        self.pos.store(0, Ordering::Release);
    }

    pub fn used_bytes(&self) -> usize {
        self.pos.load(Ordering::Relaxed) as usize
    }
}

// ── main ────────────────────────────────────────────────────────────────────

fn main() {
    println!("=== Tracking Allocator ===");
    let before = TrackingAllocator::live_bytes();
    let v: Vec<u8> = vec![1u8; 1024];
    let after = TrackingAllocator::live_bytes();
    println!("Before vec:  {} live bytes", before);
    println!("After vec:   {} live bytes", after);
    println!("Total allocs so far: {}", TrackingAllocator::alloc_count());
    drop(v);
    println!("After drop:  {} live bytes", TrackingAllocator::live_bytes());

    println!("\n=== Bump Allocator ===");
    static ARENA: BumpAllocator<4096> = BumpAllocator::new();

    let a = ARENA.try_alloc(Layout::from_size_align(128, 8).unwrap())
        .expect("alloc 128");
    a[0] = 0xFF;
    println!("Allocated 128 bytes, buf[0]=0x{:02X}", a[0]);
    println!("Used: {} / 4096 bytes", ARENA.used_bytes());

    let b = ARENA.try_alloc(Layout::from_size_align(3900, 1).unwrap())
        .expect("alloc 3900");
    println!("Allocated {} more bytes", b.len());
    println!("Used: {} / 4096 bytes", ARENA.used_bytes());

    let c = ARENA.try_alloc(Layout::from_size_align(200, 1).unwrap());
    println!("Alloc 200 when ~full: {:?}", c.map(|s| s.len()));

    // SAFETY: `a` and `b` are not used after this point.
    unsafe { ARENA.reset(); }
    println!("After reset, used: {}", ARENA.used_bytes());
}

// ── Tests ───────────────────────────────────────────────────────────────────

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn tracking_counts_bytes() {
        // Tests run on multiple threads and all share the global allocator,
        // so assert only that the counter moves in the right direction
        // rather than demanding an exact final value.
        let before = TrackingAllocator::live_bytes();
        let v = vec![0u8; 512];
        assert!(TrackingAllocator::live_bytes() >= before + 512);
        drop(v);
        assert!(TrackingAllocator::live_bytes() < before + 512);
    }

    #[test]
    fn bump_basic() {
        static ARENA: BumpAllocator<256> = BumpAllocator::new();
        let s = ARENA.try_alloc(Layout::from_size_align(64, 8).unwrap()).unwrap();
        assert_eq!(s.len(), 64);
        assert!(ARENA.used_bytes() >= 64);
    }

    #[test]
    fn bump_oom() {
        static ARENA: BumpAllocator<32> = BumpAllocator::new();
        let _ = ARENA.try_alloc(Layout::from_size_align(20, 1).unwrap()).unwrap();
        let r = ARENA.try_alloc(Layout::from_size_align(20, 1).unwrap());
        assert!(r.is_none(), "should be OOM");
    }

    #[test]
    fn bump_reset_reuse() {
        static ARENA: BumpAllocator<64> = BumpAllocator::new();
        let a = ARENA.try_alloc(Layout::from_size_align(60, 1).unwrap()).unwrap();
        drop(a); // conceptually release the slice: no use of it past this point
        // SAFETY: No live references into the arena.
        unsafe { ARENA.reset(); }
        let b = ARENA.try_alloc(Layout::from_size_align(60, 1).unwrap());
        assert!(b.is_some(), "should succeed after reset");
    }
}
(* OCaml: closest concept to a custom allocator is tracking GC stats.
   OCaml doesn't expose a custom-allocator API in the standard library,
   but we can demonstrate the *intent*: count allocations and bytes. *)

(* GC statistics (built-in) *)
let print_gc_stats label =
  let s = Gc.stat () in
  Printf.printf "[%s] minor_words=%.0f major_words=%.0f live_words=%d\n"
    label s.Gc.minor_words s.Gc.major_words s.Gc.live_words

(* Simulate "tracked allocation" with a ref-counted wrapper *)
let total_allocated = ref 0

let tracked_alloc n =
  total_allocated := !total_allocated + n;
  Bytes.create n   (* OCaml allocation โ€” goes through GC *)

let tracked_free _buf n =
  (* GC manages lifetime; this just tracks intent *)
  total_allocated := !total_allocated - n

let () =
  print_gc_stats "start";
  let bufs = Array.init 1000 (fun _ -> tracked_alloc 64) in
  print_gc_stats "after alloc";
  Printf.printf "Tracked live bytes: %d\n" !total_allocated;
  Array.iter (fun b -> tracked_free b 64) bufs;
  Printf.printf "After free: tracked=%d\n" !total_allocated;
  print_gc_stats "end";

  (* OCaml note: real deallocation happens whenever GC collects;
     a custom allocator in Rust would immediately reclaim on dealloc. *)
  ()

(* --- Conceptual bump allocator (as a pure functional model) --- *)

type bump_state = { buf: bytes; mutable pos: int; capacity: int }

let make_bump cap = { buf = Bytes.create cap; pos = 0; capacity = cap }

let bump_alloc state size =
  if state.pos + size > state.capacity then None
  else begin
    let ptr = state.pos in
    state.pos <- state.pos + size;
    Some ptr   (* return "offset" as a pointer handle *)
  end

let bump_reset state = state.pos <- 0

let () =
  let arena = make_bump 256 in
  (match bump_alloc arena 100 with
  | Some p -> Printf.printf "Bump alloc at offset %d\n" p
  | None   -> print_endline "OOM");
  (match bump_alloc arena 200 with
  | Some p -> Printf.printf "Bump alloc at offset %d\n" p
  | None   -> print_endline "OOM (expected: 100 + 200 > 256)");
  bump_reset arena;
  (match bump_alloc arena 50 with
  | Some p -> Printf.printf "After reset, bump alloc at offset %d\n" p
  | None   -> print_endline "OOM")