Background
At Embedded World 2025, we are presenting two demos based around Rust on Arm.
The first is our Rust for Microcontrollers demonstration, showing the power of SEGGER's J-Trace system in analysing the performance of the Knurling Project's defmt deferred logging framework.
The second is our Rust for Real Time Systems demonstration, based around our new framework for Rust on Arm Cortex-R, the NXP S32Z2 Automotive SoC, and Lauterbach's TRACE32 PowerView debugger.
Rust for Microcontrollers
Our demonstration uses SEGGER's J-Trace PRO Cortex debug probe, and Ozone
debugger - both of which were kindly supplied by SEGGER. Our hardware platform
is the SEGGER Cortex-M Trace Reference Board, featuring an STM32F407
microcontroller. We have developed two simple Rust applications for this
Cortex-M4 based MCU, using the RTIC framework. Both applications simply wake
up periodically using a timer interrupt, and print a short message to the debug
log. One version of this application uses SEGGER's Real Time Transfer (RTT)
system to send strings to the debugger, whilst the other uses the Knurling
Project's defmt-rtt library to send highly compressed binary logs over RTT
to the debugger. These logs cannot be printed (yet!) by Ozone, but can be
easily extracted from the running debugger by using the defmt-print command
line tool.
Our application looks like:
#![no_main]
#![no_std]

use embedded_hal::digital::OutputPin;
use stm32f4xx_hal::{gpio, pac, prelude::*, timer};
use rtic_time::Monotonic;

use defmt_rtt as _;
use panic_probe as _;

const TIMER_RATE: u32 = 1_000_000;

struct Led<T>(T);

impl<T> Led<T>
where
    T: OutputPin,
{
    /// Turn LED on
    fn on(&mut self) {
        _ = self.0.set_low();
    }

    /// Turn LED off
    fn off(&mut self) {
        _ = self.0.set_high();
    }
}

type Mono = timer::MonoTimer<pac::TIM2, TIMER_RATE>;

#[rtic::app(device = pac, peripherals = true, dispatchers=[EXTI0, EXTI1, EXTI2])]
mod app {
    use super::*;

    /// The resources used by only a single task
    #[local]
    struct MyLocalResources {
        led1: Led<gpio::Pin<'A', 8, gpio::Output>>,
        led2: Led<gpio::Pin<'A', 9, gpio::Output>>,
        led3: Led<gpio::Pin<'A', 10, gpio::Output>>,
        led1_counter: f32,
    }

    /// We have no shared resources
    #[shared]
    struct MySharedResources {}

    /// Called once at start-up with interrupts disabled
    #[init]
    fn init(mut cx: init::Context) -> (MySharedResources, MyLocalResources) {
        // Set up PLL
        let rcc = cx.device.RCC.constrain();
        let clocks = rcc
            .cfgr
            .use_hse(12.MHz())
            .require_pll48clk()
            .sysclk(168.MHz())
            .freeze();

        // WFI enters sleep, not stop
        cx.device.PWR.cr().modify(|_r, w| {
            w.pdds().enter_standby();
            w
        });

        // Set up LEDs
        let gpioa = cx.device.GPIOA.split();
        let led1 = gpioa
            .pa8
            .into_push_pull_output_in_state(gpio::PinState::High);
        let led2 = gpioa
            .pa9
            .into_push_pull_output_in_state(gpio::PinState::High);
        let led3 = gpioa
            .pa10
            .into_push_pull_output_in_state(gpio::PinState::High);

        // Set up monotonic timer
        cx.device
            .TIM2
            .monotonic::<TIMER_RATE>(&mut cx.core.NVIC, &clocks);

        blink_one::spawn().expect("spawning blink_one");
        blink_two::spawn().expect("spawning blink_two");
        blink_three::spawn().expect("spawning blink_three");

        (
            MySharedResources {},
            MyLocalResources {
                led1: Led(led1),
                led1_counter: 0.0,
                led2: Led(led2),
                led3: Led(led3),
            },
        )
    }

    #[task(local=[led1, led1_counter], priority=1)]
    async fn blink_one(cx: blink_one::Context) {
        loop {
            cx.local.led1.off();
            *cx.local.led1_counter = *cx.local.led1_counter + 0.1;
            defmt::info!("LED1 off ***** {=f32}", cx.local.led1_counter);
            Mono::delay(125.millis().into()).await;
            cx.local.led1.on();
            defmt::info!("LED1 on *****");
            Mono::delay(125.millis().into()).await;
        }
    }

    #[task(local=[led2], priority=2)]
    async fn blink_two(cx: blink_two::Context) {
        loop {
            cx.local.led2.off();
            defmt::info!("LED2 off ***************");
            Mono::delay(250.millis().into()).await;
            cx.local.led2.on();
            defmt::info!("LED2 on ***************");
            Mono::delay(250.millis().into()).await;
        }
    }

    #[task(local=[led3], priority=3)]
    async fn blink_three(cx: blink_three::Context) {
        loop {
            cx.local.led3.off();
            defmt::info!("LED3 off ******************************");
            Mono::delay(500.millis().into()).await;
            cx.local.led3.on();
            defmt::info!("LED3 on ******************************");
            Mono::delay(500.millis().into()).await;
        }
    }
}
The RTT version is almost exactly the same, but uses rtt_target::rprintln! to
send strings formatted with core::fmt::Write over RTT.
The instruction trace feature of Ozone, in combination with the J-Trace probe's high-speed trace interface, allows us to capture a record of every instruction executed on our system over a given period. We can then dial in using the Timeline view to see precisely how long our RTIC framework is taking to convert our timer interrupt into the resumption of an async function, and how long our logging calls are taking to complete.
Building our firmware with Ferrocene is as simple as:
DEFMT_LOG=info criticalup run cargo build --release
We can see the rprintln! messages inside Ozone, but we must run defmt-print
to collect messages from the defmt version. The defmt-print tool knows how to
speak to Ozone, or any J-Link DLL based program, over TCP:
cargo install defmt-print
defmt-print -e ./target/thumbv7em-none-eabihf/release/main tcp
The log messages being printed are:
rtt_target::rprintln!("LED1 off ***** {}", cx.local.led1_counter);
// or
defmt::info!("LED1 off ***** {=f32}", cx.local.led1_counter);
The rprintln! version performs the conversion from f32 to &str on the
microcontroller, and takes 1675 instructions to complete.
The defmt::info! version sends the raw f32 value over the logging stream,
along with a unique identifier for the format string (instead of the string
itself). It therefore only takes 1050 instructions to complete - saving us 37%
in instruction count on this small example.
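To make the difference concrete, here is a minimal host-side sketch of the deferred-formatting idea. It is not defmt's actual wire format or API: the 6-byte frame layout, the `LED1_OFF_ID` constant and all three function names are hypothetical, chosen only to show that the device can send an identifier plus raw bytes and leave the string formatting to the host.

```rust
/// Hypothetical identifier standing in for an interned format string.
const LED1_OFF_ID: u16 = 1;

/// Text style: the device formats the f32 into a string and sends the text.
fn encode_text(counter: f32) -> Vec<u8> {
    format!("LED1 off ***** {}", counter).into_bytes()
}

/// Deferred style: the device sends only an ID and the raw bytes of the value.
/// (Assumed layout: 2-byte little-endian ID, then 4-byte little-endian f32.)
fn encode_deferred(id: u16, counter: f32) -> [u8; 6] {
    let mut frame = [0u8; 6];
    frame[..2].copy_from_slice(&id.to_le_bytes());
    frame[2..].copy_from_slice(&counter.to_le_bytes());
    frame
}

/// Host side: look the ID up and do the expensive formatting off-target.
fn decode_deferred(frame: &[u8; 6]) -> Option<String> {
    let id = u16::from_le_bytes([frame[0], frame[1]]);
    let value = f32::from_le_bytes([frame[2], frame[3], frame[4], frame[5]]);
    match id {
        LED1_OFF_ID => Some(format!("LED1 off ***** {}", value)),
        _ => None, // unknown format string ID
    }
}
```

In this sketch the deferred frame is shorter than the formatted text, and the float-to-text conversion never runs on the device side, which is the part of the work the measurements above are comparing.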
It's generally true that the more logging you have, the easier your systems are to debug, and the more efficient your logging, the more you can log for a given cost in terms of time and power, so this kind of saving soon adds up! And we're also getting bonus information about which crate, module, file and line emitted the log message, as well as compile-time control over which log levels are emitted from which modules.
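As a rough worked example of how that saving adds up, the sketch below redoes the arithmetic using the two instruction counts measured above. The function names and the one-million-instruction budget used in the note are illustrative assumptions, and instruction count is only a proxy for time and power.

```rust
/// Instruction counts measured for the two logging calls in this demo.
const RPRINTLN_INSTRUCTIONS: u32 = 1675;
const DEFMT_INSTRUCTIONS: u32 = 1050;

/// Percentage of instructions saved when going from `before` to `after`.
fn saving_percent(before: u32, after: u32) -> f64 {
    f64::from(before - after) / f64::from(before) * 100.0
}

/// How many whole log messages fit in a given instruction budget.
fn messages_within(budget: u32, cost_per_message: u32) -> u32 {
    budget / cost_per_message
}
```

With a hypothetical budget of 1,000,000 instructions, `messages_within` gives 597 messages at the rprintln! cost and 952 at the defmt::info! cost.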
Why not try defmt in your next Embedded Rust project?
Rust for Real-Time Systems
For our second demo, we wanted to try something that hasn't really been done before - running Rust code on a powerful Automotive-grade System-on-Chip using Arm's Cortex-R52 real-time processor. This presents a number of challenges:
- There are very few boards using chips based on the Arm Cortex-R52.
- Those that exist are in short supply and very expensive.
- Rust's corresponding armv8r-none-eabihf target is in Tier 3 and so only works on nightly Rust.
- Standard microcontroller debug tools often don't work with JTAG based Cortex-R processors, which require complex 'scripts' to attach and load code into their on-chip RAM.
- There are basically no Rust examples for Cortex-R to be found online, and certainly no libraries to make bring-up and application development easier.
For the first two problems, we are lucky enough to work with NXP as Registered Partners, giving us early access to their pre-production S32Z2-400EVB board, featuring the NXP S32Z270 Safe and Secure High-Performance Real-Time Processor. This chip contains two clusters of four Cortex-R52 processors, along with a number of Cortex-M33 cores for CAN-bus offloading and system management.
The third issue - the relevant target being Tier 3 in upstream Rust - is easily
resolved by using Ferrocene. We've been shipping a preview of the
armv8r-none-eabihf target since Ferrocene 24.11, but today we can go further.
In the Ferrocene 25.05 release we will be marking this target as Qualification
Ready - signalling that we can run (and pass) the complete Rust Compiler
Test Suite on this target - a key pre-requisite for offering it as a Qualified
Target suitable for safety-critical use cases.
For the fourth problem, we are very grateful to Lauterbach for supplying us with a PowerView X50 JTAG Probe and a copy of their TRACE32 PowerView debug software for Arm. This package is one of the few debuggers that has support for the S32Z2, allowing us to quickly and easily load code into the RAM for the first CPU cluster in the SoC and single-step through it. It was also an opportunity to check out the current state of Rust support within TRACE32 PowerView and work with Lauterbach by providing suggestions for the next release. Generally though, it's in great shape and has been a huge help in debugging.
Debugging? Surely Rust code is just correct by design? Well, not always. And certainly not on the Arm Cortex-R52 where, unlike on an Arm Cortex-M, the first code that executes on the CPU must be written in assembly language. This is because the CPU boots with all eight (!) stack pointers set to zero, and in EL2 - a mode designed for running a Virtual Machine Hypervisor, not regular firmware.
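To give a feel for what "setting up the stack pointers" involves, here is a small host-side model of the address arithmetic. Real start-up code does this in assembly, switching into each processor mode in turn and writing that mode's banked SP. The mode numbers are the standard CPSR mode encodings, but the function name, the region address and the stack size are hypothetical, and this is not the layout used by any particular runtime crate.

```rust
/// Some of the processor modes with their own banked stack pointer,
/// tagged with the mode number used in the CPSR mode field.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Fiq = 0x11,
    Irq = 0x12,
    Svc = 0x13,
    Abt = 0x17,
    Hyp = 0x1A,
    Und = 0x1B,
    Sys = 0x1F,
}

/// Carve one descending stack per mode out of a region ending at `region_top`.
/// Returns (mode, initial stack pointer) pairs, each 8-byte aligned.
fn stack_tops(region_top: u32, stack_size: u32, modes: &[Mode]) -> Vec<(Mode, u32)> {
    // Round the size up and the top down so every SP stays 8-byte aligned.
    let size = (stack_size + 7) & !7;
    let mut top = region_top & !7;
    let mut tops = Vec::with_capacity(modes.len());
    for &mode in modes {
        tops.push((mode, top));
        top -= size; // the next mode's stack sits directly below this one
    }
    tops
}
```

Each stack grows downwards from its own top, so handing every mode a distinct slice of RAM keeps an exception handler from overwriting the stack of the code it interrupted.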
After some trial and error, and much reading of both Arm and NXP technical reference manuals, we had a working software stack. And, as we are an open-source company, we've not only made our stack available under a permissive open-source licence, but we've donated it to the Rust Embedded Working Group, where hopefully it can act as a catalyst for the development of more examples of Rust on Arm Cortex-R. We've also worked with Google on adding support for the Armv8-R architecture to their Generic Interrupt Controller driver (which is great, as it was one less thing for me to write from scratch…)
Our published stack includes:
- cortex-ar, a crate that provides access to CPU registers and common peripherals for Armv7-R, Armv8-R and Armv7-A processors. This includes a full PMSAv8-R Memory Protection Unit driver, and a driver for the Arm Generic Timer. This crate is modelled after the cortex-m crate.
- cortex-r-rt, a crate that provides the start-up code for Armv7-R and Armv8-R, including transferring from EL2 to EL1 mode, initialisation of all the stack pointers, and assembly language trampolines so that Interrupt and Exception Handlers can be written in Rust. This crate is modelled after the cortex-m-rt crate.
- arm-targets, a build.rs helper library that sets up target feature flags at build time based on your build target. This was useful for helping us support both Armv7-R and Armv8-R in the two crates above (because whilst similar, they have different registers and require different start-up code).
- derive-mmio, a derive-macro for making safe MMIO peripherals out of repr(C) structures.
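For readers unfamiliar with the pattern, the sketch below shows by hand the kind of code that building an MMIO peripheral out of a repr(C) structure involves. It is not the API that derive-mmio generates: the `UartRegisters` layout, the `Uart` handle and its methods are hypothetical, and the sketch uses std so that it can be exercised on a host machine against ordinary memory.

```rust
use std::ptr::{addr_of, addr_of_mut, read_volatile, write_volatile};

/// Hypothetical register block for an imaginary UART.
/// `repr(C)` fixes the field order and offsets to match the hardware layout.
#[repr(C)]
struct UartRegisters {
    data: u32,
    status: u32,
}

/// Thin handle that owns access to one register block.
struct Uart {
    regs: *mut UartRegisters,
}

impl Uart {
    /// # Safety
    /// `regs` must point to a valid register block for the lifetime of the
    /// handle, and no other handle may access the same block.
    unsafe fn new(regs: *mut UartRegisters) -> Self {
        Uart { regs }
    }

    /// Write the data register with a volatile store.
    fn write_data(&mut self, value: u32) {
        // SAFETY: `new` requires the pointer to be valid and exclusive.
        unsafe { write_volatile(addr_of_mut!((*self.regs).data), value) }
    }

    /// Read the status register with a volatile load.
    fn read_status(&self) -> u32 {
        // SAFETY: `new` requires the pointer to be valid.
        unsafe { read_volatile(addr_of!((*self.regs).status)) }
    }
}
```

The single `unsafe` constructor concentrates the pointer-validity promise in one place, so the register accessors themselves can be safe functions. A derive-macro can remove the need to write those accessors out for every field.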
As part of our donation, Ferrous Systems have also open-sourced a number of examples which run inside QEMU, the popular open-source system emulator. You can find examples for both the Arm Versatile system (fitted with an Armv7-R Cortex-R5), and the Arm MPS3-AN536 system (fitted with an Armv8-R Cortex-R52, like our NXP system) - both emulated by QEMU. Even at this early stage we've been buoyed by some fantastic community contributions to our crates, even adding support for Armv7-A based processors.
Our example at Embedded World is running on the NXP S32Z270-400EVB board, on Cluster 0. We have a working Generic Interrupt Controller and Generic Timer, with Software Interrupts and Timer Interrupts handled in Rust. We also have debug output over the Arm DCC interface. If you'd like to see further support for this SoC, or similar examples, we'd love to talk.
Live Demos at Embedded World 2025
Whether you're interested in Rust for Microcontrollers, or Rust for Real-Time Systems based on Arm Cortex-R, please reach out or come and visit us at Embedded World - Hall 4, Booth 4-402.