"Zero Sized Reference" (ZSR) sounds like an impossible thing given that
mem::size_of
returns a non-zero value for references to Zero Sized Types
(ZST) like &()
but ZSRs can in fact be constructed and they can improve both
the performance and correctness of your embedded application.
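To see the sizes involved, here's a quick standalone check (our own sketch, not from any crate mentioned below) you can run on a hosted target:

```rust
use std::mem::size_of;

fn main() {
    // `()` is a Zero Sized Type: it occupies no memory ...
    assert_eq!(size_of::<()>(), 0);
    // ... but a reference to it is still a full pointer
    assert_eq!(size_of::<&()>(), size_of::<usize>());
    // a unit struct, the building block of ZSRs, is also zero sized
    struct Zsr;
    assert_eq!(size_of::<Zsr>(), 0);
}
```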
In this post, we'll introduce you to this pattern, which is used in many embedded crates even though many developers may not have given it much attention.
Let's start by motivating the need for zero sized references.
Automatic deallocation for memory pools
Some embedded applications are simple enough that a memory pool that manages memory blocks of the same size (e.g. 128 bytes) – instead of a full-blown general-purpose memory allocator – is sufficient to satisfy their dynamic memory management needs.
The heapless crate provides a lock-free memory pool for such cases. The core of the Pool API is shown below:
// module: heapless::pool
/// Lock-free memory pool that manages blocks of the same
/// size (`size_of::<T>`)
pub struct Pool<T> { /* .. */ }
/// An owning pointer into a block managed by `Pool<T>`
pub struct Box<T> { data: NonNull<Node<T>> }
/// LIFO linked list (AKA stack) node
struct Node<T> { /* .. */ }
impl<T> Pool<T> {
/// Creates an empty memory pool
pub const fn empty() -> Self { /* .. */ }
/// Returns `None` when the pool is observed as exhausted
// (if you are wondering why there's no lifetime relationship
// between `self` and the returned value, that's because
// `Pool` manages "statically allocated" (`&'static mut`)
// memory)
pub fn alloc(&self) -> Option<Box<T>> { /* .. */ }
/// Returns the memory `block` to the pool
pub fn dealloc(&self, block: Box<T>) { /* .. */ }
// omitted: API to give an initial chunk of memory to the pool
}
One thing that's notably missing is that Box<T> does not implement the Drop trait. This means a block is not automatically returned to the memory pool when it goes out of scope.
static P: Pool<[u8; 128]> = Pool::empty();
fn main() {
let x = P.alloc().expect("OOM");
// do stuff with `x`
// oops, this leaks memory -- the memory block is gone forever
drop(x);
// to avoid leaking memory you have to call this:
// P.dealloc(x);
}
Let's try to fix that!
Implementing Drop
One may come up with this solution:
pub struct Box<T> {
data: NonNull<Node<T>>,
// added this ...
pool: &'static Pool<T>,
}
impl<T> Pool<T> {
// omitted: empty, alloc and dealloc methods
// ... and this ...
unsafe fn dealloc_raw(&self, p: NonNull<Node<T>>) { /* .. */ }
}
// ... so we can implement this
impl<T> Drop for Box<T> {
fn drop(&mut self) {
unsafe {
// run T's destructor
ptr::drop_in_place(self.data.as_ptr());
// dealloc memory block
self.pool.dealloc_raw(self.data);
}
}
}
This gets the job done: Boxes will be returned to their Pool when they go out of scope. But this solution also doubles the size of pool::Box<T> (e.g. from 4B to 8B on a 32-bit architecture like ARMv7-M), which is going to regress the performance of moving Boxes around.
If doubling the size of Box doesn't sound like a big issue to you, consider that most error handling libraries go out of their way (even tapping into the unsafe arts) to make their Error type a thin pointer instead of a fat pointer (e.g. Box<dyn std::error::Error>), because the perf gains are significant.
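The thin vs fat pointer difference is easy to check for yourself; this standalone sketch (ours, not taken from any error handling library) compares the two on a hosted target:

```rust
use std::error::Error;
use std::mem::size_of;

fn main() {
    // a boxed trait object is a fat pointer: data pointer + vtable pointer
    assert_eq!(size_of::<Box<dyn Error>>(), 2 * size_of::<usize>());
    // a box of a concrete (sized) type is a thin pointer: one word
    assert_eq!(size_of::<Box<std::io::Error>>(), size_of::<usize>());
}
```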
Going back to the task at hand: can we implement Drop without increasing the size of Box? Yes!
Why use references when you can use ZST?!
We can actually keep the two-field Box implementation from the last section and simply shrink the pool field from 4 (or 8) bytes to 0 bytes by using a Zero Sized Type instead of a reference. We'll need to introduce a Pool trait into the mix to make things work out. Here's the revised version:
// module: heapless::pool
// put the revised API in a new module
pub mod singleton {
/// A memory pool
pub trait Pool {
// the type of the memory block managed by this pool
type Data;
// ^ this is the `T` in the original `Box<T>` version
/* Public API */
fn alloc() -> Option<Box<Self>>;
/* Implementation details */
#[doc(hidden)]
unsafe fn __dealloc_raw(p: NonNull<Node<Self::Data>>);
}
pub struct Box<P: Pool> {
data: NonNull<Node<P::Data>>,
_pool: PhantomData<P>, // zero sized type, not a reference
}
impl<P: Pool> Drop for Box<P> {
fn drop(&mut self) {
unsafe {
ptr::drop_in_place(self.data.as_ptr());
// NOTE static method call: no receiver (`&self`)
P::__dealloc_raw(self.data);
}
}
}
}
// the original `Pool<T>` + `Box<T>` implementation still lives here
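The payoff of the PhantomData field is that Box<P> stays one word in size. Here's a standalone size check with simplified stand-ins (our own types, not the actual heapless ones) that contrasts the two layouts:

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

#[allow(dead_code)]
struct Node<T>(T);

// the `Drop` solution from the previous section: pointer + reference
#[allow(dead_code)]
struct FatBox<T: 'static> {
    data: NonNull<Node<T>>,
    pool: &'static T, // stand-in for `&'static Pool<T>`
}

// the ZST solution: pointer + zero sized marker
#[allow(dead_code)]
struct ThinBox<P> {
    data: NonNull<Node<u8>>,
    _pool: PhantomData<P>,
}

fn main() {
    let word = size_of::<usize>();
    assert_eq!(size_of::<FatBox<u8>>(), 2 * word); // two words
    assert_eq!(size_of::<ThinBox<()>>(), word); // back to one word
}
```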
This may compile, but how does one even use this new API? How do you implement the Pool trait? The answer is: you don't implement the Pool trait yourself; you use the pool! macro provided by the heapless crate:
pool!(P: [u8; 128]);
fn main() {
// note that `alloc` is a static method (no receiver: `&self`)
let x: Box<P> = P::alloc().expect("OOM");
// ^ NOTE `P`, not `[u8; 128]`
// do stuff with `x`
drop(x); // <- this returns the memory block to the pool :tada:
}
The pool! macro will expand to something like this:
// expansion of pool!(P: [u8; 128]);
use heapless::pool;
pub struct P; // <- the memory pool is a unit struct
impl pool::singleton::Pool for P {
type Data = [u8; 128];
fn alloc() -> Option<Box<P>> {
// omitted: conversion from `pool::Box` to
// `pool::singleton::Box`
P::__impl().alloc().map(/* something */)
}
unsafe fn __dealloc_raw(p: NonNull<Node<[u8; 128]>>) {
P::__impl().__dealloc_raw(p)
}
}
impl P {
fn __impl() -> &'static pool::Pool<[u8; 128]> {
static POOL: pool::Pool<[u8; 128]> = pool::Pool::empty();
&POOL
}
}
Singleton-ish
What the pool! macro is actually doing is creating some sort of global singleton where all instances of, for example, P are handles to the same static variable POOL, which is hidden from the user. In other words, P is a Zero Sized version of a &'static Pool reference that points into the hidden POOL variable.
That's the basic idea behind Zero Sized References (ZSR): a zero sized proxy type that's equivalent to a shared (&'static) reference. ZSRs come in different flavors: pool! creates ZSRs of the "shared" kind (&'static T), but there's also an "owned" variant, which we'll look into later.
But first, let's look into the properties of the pool! abstraction.
Different types, same interface
In your embedded application you may end up using different memory pools, each one associated to a different part of your HAL, like the radio interface or the USB interface. Some of these memory pools may end up managing memory blocks of the same size. If you are using the non-pool! version of Pool you may end up in a situation where it's not obvious to which pool you should return a memory block. See below:
// two pools that manage blocks of the same size
static A: Pool<[u8; 128]> = Pool::empty();
static B: Pool<[u8; 128]> = Pool::empty();
fn do_stuff(boxed: Box<[u8; 128]>) {
// ..
// which one to call? (note that only one of these can
// appear: the first call moves `boxed`)
A.dealloc(boxed);
// B.dealloc(boxed);
}
The boxed argument could come from either pool A or B. If you pick A.dealloc above you may end up exhausting pool B (all of B's blocks get transferred to pool A); if you pick B.dealloc you may exhaust pool A.
With the pool! version it's simply not possible to return a memory block to the wrong pool, because boxes that belong to a pool are uniquely typed. Each pool! invocation creates a new static variable and gives that static variable a different type. See below:
pool!(A: [u8; 128]);
pool!(B: [u8; 128]);
fn do_stuff(boxed: Box<A>) { // <- box managed by pool A
// ..
// `dealloc` is not a method of a `Pool` trait
// but if it were this would be rejected at compile time
B::dealloc(boxed); //~ error: expected type `B`; found type `A`
}
Even though both pools, A and B, are proxies to static variables with the same type (Pool<[u8; 128]>), each pool is exposed to the user as a different type. This lets you track, in the type system, the association between a memory block and a pool.
If for some reason you need to write a function that must work with boxes managed by different pools, you can write generic code:
// generic function ...
fn zero_before_dealloc<P, T>(boxed: Box<P>)
where
P: Pool<Data = T>,
T: Zeroable, // unsafe marker: implies no destructor, etc.
{
let p: NonNull<Node<T>> = Box::into_raw(boxed);
// zeroes the block, even in presence of compiler optimizations
T::zero(p);
unsafe {
P::__dealloc_raw(p); // return it to the pool
}
}
fn discard(a: Box<A>, b: Box<B>) {
// ... that works with boxes from pool A and B
zero_before_dealloc(a);
zero_before_dealloc(b);
}
Zero Sized "Owned" References
The pool! macro creates a static variable proxy with global (static variable like) visibility, but this is not a strict requirement of the ZSR pattern. Here we present an "owned" variant that does not have global visibility:
/* Public API */
pub struct Proxy { // NOTE Zero Sized Type
_marker: PhantomData<&'static mut Impl>,
}
impl Proxy {
/// Returns the `Some` variant only once
pub fn claim() -> Option<Proxy> {
static CLAIMED: AtomicBool = AtomicBool::new(false);
if CLAIMED
.compare_exchange(false, true, SeqCst, SeqCst)
.is_ok()
{
// not yet claimed
// NOTE(unsafe) this branch is executed at most once
unsafe { Self::__impl().write(Impl::new()) }
// do not move the previous 'write' beyond this fence
atomic::compiler_fence(SeqCst);
Some(Self { _marker: PhantomData })
} else {
// already claimed
None
}
}
/// Frobs `arg`
pub fn frob(&mut self, arg: SomeArg) {
// NOTE(unsafe) `frob` can only be called on a claimed `Proxy`,
// so `IMPL` has been initialized; `&mut self` grants exclusive access
unsafe { (*Self::__impl()).frob(arg) }
}
/* Implementation detail */
fn __impl() -> *mut Impl {
static mut IMPL: MaybeUninit<Impl> = MaybeUninit::uninit();
unsafe { IMPL.as_mut_ptr() }
}
}
/* Private API */
struct Impl { /* data */ }
impl Impl {
// NOTE constructor does not need to be `const`
fn new() -> Self { /* .. */ }
fn frob(&mut self, arg: SomeArg) { /* .. */ }
}
Usage looks like this:
fn main() {
// before claim: `IMPL` static variable is *un*initialized
let mut proxy = Proxy::claim().expect("already claimed");
// after claim: `IMPL` static variable is initialized
// further calls to `claim` (from any thread) will return None
assert!(Proxy::claim().is_none());
// all methods will operate on an initialized static variable
proxy.frob(SomeArg);
}
Proxy is a Zero Sized Reference that behaves like a normal variable and is subject to the usual ownership semantics. We say this is an "owned" variant because Proxy is equivalent to a mutable reference with static lifetime (&'static mut T), which has the same move semantics as alloc::Box<T>.
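Those move semantics can be seen with a toy non-Copy token (a hypothetical type of our own, just to illustrate):

```rust
use std::mem::size_of;

/// Hypothetical owned ZSR: zero sized but deliberately NOT `Copy`,
/// so it moves like `alloc::Box<T>` (or `&'static mut T`) would
struct Owned {
    _not_copy: (),
}

fn consume(_token: Owned) {}

fn main() {
    assert_eq!(size_of::<Owned>(), 0);
    let token = Owned { _not_copy: () };
    consume(token); // moves `token`
    // calling `consume(token)` a second time would be rejected
    // at compile time: "use of moved value"
}
```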
So, where would you use this "owned" variant?
Spotted in the wild: MMIO registers
This "owned" variant of the ZSR pattern is widely used in the embedded Rust
ecosystem. The API generated by svd2rust
uses this pattern to:
-
make references to registers zero sized, for the same perf reasons as wanting to keep
pool::Box
pointer sized. -
to let you control access to peripherals via ownership, which gives you better encapsulation than C where peripherals can be accessed from pretty much anywhere.
For reference, the svd2rust API as of version 0.17.0 looks like this:
#![no_std]
fn main() {
// `take` is the same as `claim`: it returns
// the `Some` variant only once
let peripherals = nrf51::Peripherals::take().expect("already claimed");
// ^^^^^ Peripheral Access Crate (PAC)
// generated by svd2rust
// "the GPIO peripheral"
// this handle grants exclusive access to the peripheral
let gpio = peripherals.GPIO;
}
Unlike the general version of claim, Peripherals::take performs no initialization of static variables. These Zero Sized References do not point into static variables (RAM) but into Memory Mapped I/O registers (which have known, fixed addresses).
In recent work: filesystems
Recently, we came up with a Filesystem API based on the ZSR pattern for our friends at Iqlusion, who are building armistice – hardware private key storage for next-generation cryptography – as part of the development of a Hardware Abstraction Layer for the USB Armory Mk II development board. We used the Zero Sized References pattern to implement "close on drop" semantics for files without storing a pointer to the filesystem in the file handle.
Like with memory pools, you'll want a trait (Filesystem) that provides a common interface to different filesystem types.
/// NOTE do NOT implement this yourself;
/// use the `filesystem!` macro
pub unsafe trait Filesystem: Copy {
/// Where does the filesystem live? RAM? on-chip FLASH?
type StorageDevice: Storage;
// ^^^^^^^
// interface between the FS and the storage device
/* Public API */
fn mount(
storage: Self::StorageDevice,
format_before_mounting: bool,
) -> io::Result<Self>;
/* Implementation details */
#[doc(hidden)]
unsafe fn __close_in_place(
&self,
f: &mut File<Self>, // not consumed by the method
) -> io::Result<()>;
}
/// A handle to an open file that lives in filesystem `FS`
pub struct File<FS: Filesystem> {
// omitted fields: state, buffers, caches, etc.
fs: FS, // Zero Sized Reference to the filesystem
}
impl<FS: Filesystem> File<FS> {
/// NOTE must present a handle to the FS to "prove"
/// it has been mounted
pub fn create(
fs: FS,
path: impl AsRef<Path>,
) -> io::Result<Self> {
// ..
}
pub fn write_all(
&mut self,
data: &[u8],
) -> io::Result<()> { /* .. */ }
pub fn close(self) -> io::Result<()> { /* .. */ }
}
/// NOTE this panics if I/O errors occur while closing the file.
/// Use the `File.close` method, which returns a `Result`, to
/// handle those I/O errors
impl<FS: Filesystem> Drop for File<FS> {
fn drop(&mut self) {
let fs = self.fs; // make a copy (`FS` is `Copy` and zero sized)
// NOTE(unsafe) the file is not used again after this call
unsafe { fs.__close_in_place(self).expect("I/O error") };
// omitted: deallocate buffers
}
}
You would use the API like this:
filesystem!(F, StorageDevice = uSD);
// get handle to micro SD card subsystem
let usd = uSD::claim()?;
let format = true;
let fs = F::mount(usd, format)?;
let mut f: File<F> = File::create(fs, "foo.txt")?;
f.write_all(b"Hello, file!")?;
// file will be closed automatically at the end of the scope
// (implicit `drop`)
// drop(f);
// OR you can close the file manually to handle errors
f.close()?;
Note how the API also prevents operations like "creating a file on a filesystem that has not yet been mounted" at compile time: File::create requires a handle to the filesystem, and that handle can only exist after the filesystem has been mounted.
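This "proof by handle" idea can be boiled down to a tiny sketch (hypothetical names of our own, not the actual filesystem API): a value that can only be obtained from mount acts as evidence that mounting happened:

```rust
/// Proof token: the only way to obtain one is to call `mount`
pub struct Mounted {
    _private: (), // cannot be constructed outside this module
}

pub fn mount() -> Mounted {
    // omitted: actually mount the filesystem
    Mounted { _private: () }
}

/// Requires proof that `mount` ran first
pub fn create_file(_fs: &Mounted, name: &str) -> String {
    format!("created {}", name)
}

fn main() {
    // `create_file` cannot be called before `mount`:
    // no `Mounted` value exists yet at that point
    let fs = mount();
    assert_eq!(create_file(&fs, "foo.txt"), "created foo.txt");
}
```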
If you use the filesystem! macro to create one more filesystem, say backed by RAM (so tmpfs-like), then that filesystem gets its own type, let's say R. All files include the filesystem they belong to in their type, so File<F> lives in FS F whereas File<R> lives in FS R. Having different types means that you can avoid, at compile time, operations like closing (committing) a file to the wrong filesystem:
filesystem!(F, StorageDevice = uSD); // non-volatile storage
filesystem!(R, StorageDevice = RAM); // volatile storage
// ..
let f = F::mount(usd, format)?;
let r = R::mount(ram_block, format)?;
let mut file: File<R> = File::create(r, "foo.txt")?;
file.write_all(b"Hello, file!")?;
// wrong filesystem! (`close` written as a static method
// here for illustration)
F::close(file)?; //~ error: expected type `F`; found type `R`
This Filesystem implementation uses the littlefs C library under the hood to do the bulk of the work. Creating a safe Rust wrapper around that C library proved rather challenging – and probably deserves its own blog post – but at the end of the day Rust features let us defuse the traps in the C API so that consumers of the Rust wrapper won't run into them.
Food for thought: capabilities
If you are writing single-process embedded applications, the above Filesystem API lets you restrict, at compile time, which parts of your application can use the FS. Consider the following functions. You can tell which ones use the filesystem, and which filesystem, just by looking at their signatures. Or, in other words, the function signatures reflect the capabilities of the subroutine.
// filesystem stored in internal (on-chip) Flash memory
filesystem!(Internal, StorageDevice = FLASH);
// filesystem stored in external (on-board) SPI-NOR Flash
filesystem!(External, StorageDevice = SPI_NOR);
// can use either filesystem
fn create_lockfile(f: impl Filesystem) -> io::Result<()> {
// ..
}
fn uses_the_internal_fs(f: Internal) -> io::Result<()> {
// ..
}
fn uses_the_external_fs(f: External) -> io::Result<()> {
// ..
}
fn cannot_use_any_fs(/* because no arguments */) { /* .. */ }
If you are structuring your embedded application as a set of tasks, either async tasks or reactive tasks, you can similarly control the capabilities of each task by moving, or not, a Filesystem handle into the task. With async tasks that may look like this:
#![no_std]
fn main() -> ! {
let fs = Fs::mount(/* .. */);
// tasks
let first = uses_the_fs(fs);
let second = cannot_use_the_fs();
// run asynchronous tasks concurrently
executor::run!(first, second)
}
async fn uses_the_fs(fs: Fs) -> ! { /* .. */ }
async fn cannot_use_the_fs() -> ! { /* .. */ }
If we had opted for a globally available fs API, like the one provided in the Rust standard library, it would be much harder to identify which parts of the application may perform FS operations.
Conclusion
This is one of the patterns we use to build correct-by-construction software for our clients. In this post we have covered some of the possible uses of this pattern but, for brevity, have left out (important) details like the concurrency (thread / interrupt) safety of the ZSR proxies and the extra mileage you can get out of the "runs once" section in the ZSR::claim constructor. Still, we hope that this post motivates you to try out this pattern in your embedded code!
Ferrous Systems is a Rust consultancy based in Berlin. Need to strengthen your Rust project with external development support? Looking for advice on how to get the best out of Rust's features? Get in touch with us! Want to get up to speed with the language and its tooling? We also offer remote training on basic Rust, advanced topics and embedded Rust – to receive our updated training program, subscribe to our newsletter.