Knurling

A tool set to develop embedded applications faster.

Do you need help with coding?

These days GDB is the go-to development/debugging tool of many embedded Rust developers. We think there's room for complementary tools that could provide a better development workflow and improve productivity. In this blog post we'll share our vision for great embedded Rust tooling and share info about the tools we have been building.

This is an early preview of our tool defmt, which is the first tool included in our knurling project. It is available for preview and usage today for sponsors of the knurling-rs efforts, and will be made public in the future, when we feel the tool is ready for general availabilty (think "0.7" release or so).

If you're interested in trying out an early version of defmt, you can become a Knurling-rs sponsor today!

EDIT: This blog post has been updated to reflect the rename of binfmt to defmt.

GDB is great for post-mortem analysis

GDB main features are:

breakpoints: halting execution once a certain part of the program is reached
inspecting variables
inspecting memory
inspecting registers
printing stack backtraces, which are function call stacks but in reverse.

These are great post-mortem features meant for debugging issues like stack overflows, hardware fault exceptions (e.g. invalid instruction), memory corruption and ABI mismatch in function calls.

If you are writing an application in pure safe Rust, however, it's very unlikely you'll encounter these issues except perhaps for checked stack overflows. During development your debugging questions will be more along the lines of: how has this register changed in response to that event? which steps brought the state machine to this particular state? GDB features do not directly answer these questions but well placed logs statements do.

A story not a snapshot

GDB operates under a "stop the world" model: run until a breakpoint, inspect things, move to the next breakpoint. This model is not a great fit for embedded applications, which usually are reactive (need to respond to external events) and time sensitive in nature. It is particularly problematic in applications that involve communication with other machines like USB, Ethernet and radio protocols (BLE, 802.15.4, etc.). Halting the device often results in a breakdown of the communication which leaves no room for 'continuing the program'.

For these long live evented applications logging is more appropriate. With logging you are not constrained to snapshots; instead you get the full story of your program. Logging can quickly answer questions about the control flow of your program. Not to mention that you can also print registers and data structures in a custom format.

So we think a first additional tool in the embedded toolbox is logging.

"Logging is expensive!"

"Formatting is CPU expensive and storing strings in the device can use significant amounts of Flash" is probably what you are thinking. I agree that core::fmt and anything that uses it, like the log crate, is not a great fit for resource constrained devices. I have built programs with log and formatting indeed takes significant resources. Let's look at the program size breakdown of the "Hello, world" program we used in our latest embedded Rust workshop.

First, this is what the program prints to the host console:

DEBUG:dk -- Initializing the board
DEBUG:dk -- Clocks configured
DEBUG:dk -- RTC started
DEBUG:dk -- I/O pins have been configured for digital output
DEBUG:dk -- Radio initialized and configured
INFO:hello -- Hello, world!
(..)

This is the breakdown of the .text section by function:

NOTE: The .text section contains all the functions in the program

$ cargo bloat -n10 --bin hello
File  .text   Size      Crate Name
0.8%  13.4% 1.3KiB        std <char as core::fmt::Debug>::fmt
0.6%  10.0%   964B      hello hello::__cortex_m_rt_main
0.5%   7.9%   768B        std core::str::slice_error_fail
0.5%   7.9%   766B        std core::fmt::Formatter::pad
0.4%   6.3%   610B        std core::fmt::Formatter::pad_integral
0.4%   6.2%   604B        std core::fmt::num::<...
0.3%   4.8%   466B        std core::fmt::write
0.2%   3.9%   382B        std <core::fmt::builders::PadAdapter as ...
0.2%   3.1%   302B        std core::slice::memchr::memchr
0.2%   2.6%   252B        std core::fmt::num::imp::< ...
(..)
6.1% 100.0% 9.5KiB            .text section size

Formatting functions are among the largest functions in the program according to cargo-bloat and formatting functions in the top 10 take more than 45% of .text (~4.5 KB). In contrast, the main function (the application logic) is 10% of the total program size. Having logging take ~50% of your program sounds not promising at all even if you are only including logs in development builds.

cargo-bloat only reports the size of .text. Strings are stored in .rodata so let's see how much space logging strings are taking using cargo-size:

$ cargo size --bin hello -- -A
hello  :
section              size        addr
(..)
.text                9680       0x100
.rodata              4540      0x26d0
(..)

.rodata is 4.5 KB, about half the size of .text. Also not great.

Making logging very cheap

The first step towards making logging pervasive in embedded code is to make it cheap. To this end we have been working on a logging framework called defmt:

Below is the cargo-bloat output of a minimal async USB application that exposes a fully working HID interface and uses defmt for logging (program output shown in the video above).

$ cargo bloat -n20 --bin hid
File  .text   Size     Crate Name
2.9%  56.0% 4.6KiB       hal USBD
0.3%   6.6%   548B [Unknown] main
0.3%   5.3%   440B       hal POWER
0.1%   2.6%   216B       hal RTC0
0.1%   2.4%   200B       hal hal::usbd::start_epin2
0.1%   2.1%   176B       hal hal::usbd::start_epin0
0.1%   1.9%   160B       hal hal::usbd::EVENTCAUSE
(..)
0.0%   0.7%    58B       hal defmt::binWrite::write_sym
0.0%   0.7%    56B       hal defmt::binWrite::leb128_write
0.0%   0.6%    52B       hal semidap::log
5.2% 100.0% 8.2KiB           .text section size

In contrast to the hello example, which uses the log crate, now logging functions are at the bottom of the top 20 largest functions reported by cargo-bloat.

And if we look at the .rodata section we now see that it's minimal.

$ cargo size --bin hid
hid  :
(..)
section              size        addr
.rodata                 4  0x20003250
.text                8364  0x20003254
(..)

What's more the program size (.text + .rodata) of hid (8 KB) is ~60% of hello (14 KB) and packs way more functionality (USB & an async executor).

As for the difference in runtime work we clocked defmt::info!("x={}", 42) to be in the order of a few microseconds whereas log::info!() took ~60 microseconds or so. I must admit this comparison is sort-of apples to oranges because it's possible to construct (absurd) edge cases that make the ratio 100:1 or higher.

Run embedded apps like native apps

We also think it's important to erase the gap between std and no_std applications. no_std applications usually are executed using third-party subcommands; this makes them feel not first-class. Ideally, cargo run should Just Work with no_std applications. That's why we have created a custom Cargo runner that lets you run cargo-run embedded applications.

With this custom Cargo runner you can, for example, use the standard todo! as a placeholder for code that you have not yet written. Take for instance this piece of no_std USB code:

fn handle_standard_request(req: usb::StandardRequest) {
    defmt::debug!("request: {}", req);
    match req {
        usb::StandardRequest::GetDescriptor {
            descriptor,
            length,
        } => {
            match descriptor {
                Descriptor::DeviceQualifier => todo!(),
                // .. omitted other cases ..
            }
        },
        // .. omitted other cases ..
    }
}

Then you can execute cargo run to run your embedded program on the target device. No new Cargo subcommand with new flags to learn; just the built-in cargo-run subcommand you are familiar with. Then the output would look like:

$ cargo run
(..)
0.255_560s DEBUG request: GetDescriptor { descriptor: Device, ..
+.000_010s INFO  EPIN0: sending 18B of data
+.000_079s INFO  EPIN0: data transmitted
+.000_213s INFO  USB reset
0.328_895s DEBUG request: SetAddress { address: Some(13) }
+.000_005s INFO  moving to the Address state
0.345_497s DEBUG request: GetDescriptor { descriptor: Device, ..
+.000_010s INFO  EPIN0: sending 18B of data
+.000_061s INFO  EPIN0: data transmitted
+.000_148s DEBUG request: GetDescriptor { descriptor: DeviceQual ..
stack backtrace:
   0: 0x2003ee7e - __abort
   1: 0x2003ed78 - rust_begin_unwind
   2: 0x2003d532 - core::panicking::panic_fmt
   3: 0x2003d528 - core::panicking::panic
   4: 0x2003e988 - hal::usbd::handle_standard_request
   5: 0x2003e356 - hal::usbd::handle_control_request
   6: 0x2003d7ec - USBD
      <exception entry>
   7: 0x2003ee8e - __wfe
   8: 0x2003d4cc - main
   9: 0x2003d594 - Reset

$ echo $?
134

You would get all these detailed defmt logs but that wouldn't be all. Upon hitting a panic you would get a stack backtrace; just like the one you would get from a program linked to std. What's more you also get the non-zero exit code from cargo run when you hit a panic.

Unit testing on stable

Being able to run no_std programs like this enables cargo-test-like unit testing of no_std code on the device.

The only extra required ingredient is a no_std exit function that terminates the cargo-run process with an exit-code equal to 0. With that you could, for example, write your tests in the src/bin folder.

// src/bin/usb-enumeration.rs
#![no_std]

#[entry]
fn main() -> ! {
    info!("usb-enumeration: started");
    set_timeout(Duration::from_secs(1));
    block_until_enumeration_completes();
    info!("usb-enumeration: passed");
    // exit-code == 0
    exit(); // enumeration completed!
}

#[interrupt(binds = TIMER0)]
fn timeout() {
    // exit-code != 0
    panic!("USB enumeration didn't complete within expected time");
}

Now running your unit tests would amount to running all the binaries in src/bin.

$ cargo run --bin usb-enumeration && cargo run --bin other-test # ..
INFO  usb-enumeration: started
(..)
INFO  usb-enumeration: passed
INFO  other-test: started
(..)
ERROR other-test: failed

$ echo $?
134

Or instead of manually calling cargo-run you could write a cargo-xtask task that does exactly that:

$ cargo xtask test
INFO  usb-enumeration: started
(..)
INFO  usb-enumeration: passed
INFO  other-test: started
(..)
INFO  other-test: passed

Interactive logs

Once you have cheap logs there's no reason to constrain logging to plain text that's fit for console output. One can make the logs structured and include metadata that other tools can use. With structured logs people can build more featureful frontends. In particular, we have been experimenting with web frontends:

In the above prototype, logs are sorted by timestamp on the fly. Also, Clicking a message highlights other messages of that share the same ID; in practice, messages produced by the same log statement info!("x={}, y={}", x, y) would get assigned the same ID.

A web frontend for defmt could provide the following features:

'Display filters' to hide low severity (e.g. trace and/or debug) logging levels or focus on (i.e. hide the rest) logs from a certain crate.
Search for log messages that match some regex.
Highlight all events that are of the same kind.
Measure the time between recurring events or between two unrelated events
'Go to definition' but from rendered log message to matching line in the source code

Visualization

Finally, we think visualization is quite important for embedded applications that sample sensors, either for monitoring or control purposes. Data visualization can help developers better understand how their control algorithms and digital filters are performing.

This data logging is quite similar to the logging we showed before: instead of data formatted into text you have labeled data. Ideally, we see visualization tools being frontends to logging frameworks like defmt. These front ends should be flexible enough to allow re-structuring graphs of live data like:

changing the scale of the axes from linear to logarithmic
going from separate time series graphs per source to a single graph that displays all sources
going from time series graphs to a single XY graph (for example for calibration purposes)

James recently built a prototype visualization tool on top of defmt. This viz tool shows the state of a LED strip that's controlled by firmware running defmt . You can see a demo below, or view the viz tool in action in this archived stream.

We are really excited about the prospect of embedded Rust tooling. We believe that tooling like this can both make embedded development more accessible to beginners and make embedded developers more productive in their day to day work and are thrilled to be involved in its creation.

defmt is a knurling project and can be funded through GitHub sponsors. Thank you to all of the people already sponsoring our work through the Knurling project!

Embedded Rust tooling for the 21st century

Embedded Rust tooling for the 21st century

GDB is great for post-mortem analysis

A story not a snapshot

"Logging is expensive!"

Making logging very cheap

Run embedded apps like native apps

Unit testing on stable

Interactive logs

Visualization

You might be also interested in these.

Ferrocene 24.08.0 now available!

SBOM support in Rust

Improving 64-bit RISC-V Linux support in Rust

Embedded Rust tooling for the 21st century

Embedded Rust tooling for the 21st century

GDB is great for post-mortem analysis

A story not a snapshot

"Logging is expensive!"

Making logging very cheap

Run embedded apps like native apps

Unit testing on stable

Interactive logs

Visualization

Sponsor this work

You might be also interested in these.

Ferrocene 24.08.0 now available!

SBOM support in Rust

Improving 64-bit RISC-V Linux support in Rust