
Testing an embedded application


    Welcome to the final post in our "testing embedded Rust" series. In the previous posts we covered how to test a Hardware Abstraction Layer, how to test a driver crate and how to set up a GitHub Actions (GHA) workflow to include Hardware-In-the-Loop (HIL) tests. In this blog post we'll cover three different approaches to testing an embedded application.

    Naming

    In this post we'll refer to the embedded device we are developing the application firmware for as the target. The machine where the application firmware is built will be called the host.

    The example application

    Let's say you are tasked with writing an embedded application that involves an nRF52840 microcontroller and a SCD30 air quality sensor. The end goal is some sort of mesh of connected sensors that will monitor air quality in a building. Right now, you are at an early stage and are working with a single device with no wireless communication.

    Let's start with the folder structure for this project.

    Structuring the project for host- and cross-compilation

    NOTE: The full source code for the example covered here can be found on GitHub.

    Our recommended folder structure for embedded applications is two nested Cargo workspaces where the inner one is configured for cross compilation. This lets you separate your hardware-dependent code from the hardware-independent code that can be tested on the host, e.g. your development computer or a CI/CD server.

    $ # instead of `exa` (Rust tool) you can use the `tree` command
    $ # https://crates.io/crates/exa
    
    $ exa -a -I '.git*|*.rs|*.lock' -T
    .
    ├── .cargo
    │  └── config.toml # <- no `build.target`
    ├── .vscode
    │  └── settings.json
    ├── Cargo.toml # <- outer Cargo workspace
    ├── cross
    │  ├── .cargo
    │  │  └── config.toml # <- build.target = thumbv7*-*
    │  ├── app
    │  │  ├── Cargo.toml
    │  │  └── src
    │  ├── board
    │  │  ├── Cargo.toml
    │  │  └── src
    │  ├── Cargo.toml # <- inner Cargo workspace
    │  └── self-tests
    │     ├── Cargo.toml
    │     └── tests
    ├── host-target-tests
    │  ├── Cargo.toml
    │  └── tests
    ├── messages
    │  ├── Cargo.toml
    │  └── src
    └── xtask
       ├── Cargo.toml
       └── src
    

    The inner workspace, called cross above, contains all the code that cannot be executed on the host, e.g. because it contains architecture-specific instructions (assembly) or performs memory-mapped IO (MMIO). There are two mandatory packages in this workspace: the application package app, which produces the firmware binary, and the self-tests package used for on-target testing. In this example, there's one more package called board which contains board (peripherals) initialization code.
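
    For reference, the inner workspace's Cargo configuration could look like the snippet below (a sketch: the thumbv7em-none-eabihf triple matches the nRF52840's Cortex-M4F core, and probe-run is assumed as the Cargo runner).

    # cross/.cargo/config.toml (a sketch)

    [build]
    target = "thumbv7em-none-eabihf" # Cortex-M4F

    [target.'cfg(all(target_arch = "arm", target_os = "none"))']
    runner = "probe-run --chip nRF52840_xxAA"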

    The outer workspace, the root workspace, contains code that will be compiled for the host architecture (e.g. x86) by default. The mandatory packages at this level are host-target-tests and xtask; those two are used together to run tests that involve host-target interactions – more details about that later. In this example, there is one more package in the root: messages. messages is a no_std crate that will be used in the application firmware; it defines the messages that can be exchanged between the host and the target.
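
    The outer manifest needs to exclude the cross folder so that the two workspaces stay independent; a sketch:

    # Cargo.toml (outer workspace, a sketch)

    [workspace]
    members = ["host-target-tests", "messages", "xtask"]
    exclude = ["cross"]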

    🔎 Tool tip: We recommend that VS Code users use the following rust-analyzer configuration. With it, rust-analyzer's "go to definition" works across local crates and all "Run Test" buttons work.

    // .vscode/settings.json
    {
      "rust-analyzer.linkedProjects": [
        // order is important (rust-analyzer/rust-analyzer#7764)
        "Cargo.toml",
        "cross/Cargo.toml",
      ]
    }
    

    Application requirements

    At the current stage, these are the requirements for the application we want to build:

    • REQ1 The nRF52840 gets a new CO2 level measurement from the SCD30 sensor every 2 seconds
    • REQ2 The host can communicate with the nRF52840 over serial port to retrieve the latest CO2 level measurement
    • REQ3 Messages exchanged over serial port shall be binary encoded and the size of an encoded message shall not exceed 64 bytes 1

    Testing on the host

    Whenever possible you should test your firmware code on the host as this provides the fastest edit-compile-test cycles and has the advantage that the standard library can be used in test code. To make this possible you should split off the purely functional parts of your firmware into separate crates and put those crates in the root workspace. Serialization, deserialization, parsing, digital filters, state machines, business logic, etc. all those can go in the root. Everything that's in the root workspace can be tested using Rust's built-in #[test] functionality.

    In our example application, the messages crate is in the root so it can be tested on the host. The messages exchanged between the host and the target will be binary encoded using the postcard library. We can test the size requirement (REQ3) on the host because postcard encoding and decoding is architecture independent.

    The messages crate defines these message types:

    // messages/src/lib.rs
    
    #![no_std]
    
    use serde_derive::{Deserialize, Serialize};
    
    /// A message sent from the host to the target
    #[derive(Clone, Copy, Debug, Deserialize, Serialize)]
    pub enum Host2Target {
        GetLastMeasurement,
    }
    
    /// A message sent from the target to the host
    #[derive(Clone, Copy, Debug, Deserialize, Serialize)]
    pub enum Target2Host {
        NotReady,
        Measurement(Measurement),
    }
    
    /// A measurement reported by the target
    #[derive(
        Clone, Copy, Debug, Deserialize, PartialEq, Serialize
    )]
    pub struct Measurement {
        /// The measurement "identifier"
        /// this is a monotonically increasing counter
        pub id: u32,
        /// A timestamp in unspecified units
        /// it may wrap around
        pub timestamp: u32,
        /// The CO2 concentration in parts per million (ppm)
        pub co2: f32,
    }
    

    Accessing std when testing embedded code (testing Host2Target)

    To test the size of the Host2Target message one may write this test 2:

    // messages/src/lib.rs > mod tests
    
    /// Max payload size for a USB (2.0 Full Speed) HID packet
    const MAX_SIZE: usize = 64;
    
    #[test]
    fn host2target_message_size() -> postcard::Result<()> {
        let msg = Host2Target::GetLastMeasurement;
        let bytes = postcard::to_allocvec(&msg)?;
        assert!(dbg!(bytes).len() <= MAX_SIZE);
        Ok(())
    }
    

    This will not compile because the test includes the dbg! macro, which is defined in the std crate, but the messages crate has to be #![no_std] so we can use it in the firmware.

    Using the standard library in test code is perfectly fine because the test will run on the host: we just need to adjust the crate's #![no_std] attribute:

     // messages/src/lib.rs
    
    -#![no_std]
    +// make `std` available when testing
    +#![cfg_attr(not(test), no_std)]
    

    Then we can run the test

    $ cargo test -p messages host2target -- --nocapture
    running 1 test
    [messages/src/lib.rs:42] bytes = [
        0,
    ]
    test tests::host2target_message_size ... ok
    

    Generating random inputs with quickcheck (testing Target2Host)

    The Target2Host message enum has two variants; we should test the size of both. The test for the NotReady variant is going to be similar to the previous Host2Target test. The Measurement variant is more interesting because it has a payload. If postcard does compression (spoiler: it does) then the size of the message may depend on the payload of the Measurement variant (spoiler: there's no compression in this case).
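
    For completeness, the NotReady size test might look like this (a sketch mirroring the Host2Target test above):

    // messages/src/lib.rs > mod tests

    #[test]
    fn target2host_not_ready_message_size() -> postcard::Result<()> {
        let msg = Target2Host::NotReady;
        let bytes = postcard::to_allocvec(&msg)?;
        assert!(bytes.len() <= MAX_SIZE);
        Ok(())
    }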

    To test the Measurement variant you could either enumerate the edge cases, which would require reading up on how and when postcard does compression, or you could randomize the message payload.

    To do the latter you would need a crate like quickcheck, which depends on the standard library; since we are testing on the host we can use it, so let's do that for demonstration's sake.

    quickcheck will be used only in tests so it should go under the dev-dependencies section of Cargo.toml. Also, you don't want crates that depend on std under dependencies: that would make the messages crate itself depend on std and you wouldn't be able to use it in the firmware, which is a no_std program.

    # messages/Cargo.toml
    
    [dev-dependencies]
    quickcheck = "1"
    quickcheck_macros = "1"
    

    The quickcheck test looks like this:

    // messages/src/lib.rs > mod tests
    
    use quickcheck_macros::quickcheck;
    
    #[quickcheck]
    fn target2host_measurement_message_size(
        id: u32, timestamp: u32, co2: f32,
    ) -> postcard::Result<()> {
        let msg = Target2Host::Measurement(Measurement {
            id, timestamp, co2,
        });
        let bytes = postcard::to_allocvec(&msg)?;
        assert!(bytes.len() <= MAX_SIZE);
        Ok(())
    }
    

    The arguments to this function will be randomly generated and the function will be executed several times giving you wider coverage than writing a few test cases. If you ever change the fields of the Measurement struct you won't need to come up with new edge cases; you'll only need to update the signature of this quickcheck function.

    The bottom line here is that even if you are testing no_std code you can still use the standard library when testing that code on the host. This means that you can use tools like fuzzers and sanitizers, which depend on the standard library, on your no_std code.
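
    For example, a cargo-fuzz target for the messages crate could look like this (hypothetical: it assumes a fuzz directory created with cargo fuzz init):

    // fuzz/fuzz_targets/decode.rs (hypothetical fuzz target)

    #![no_main]

    use libfuzzer_sys::fuzz_target;
    use messages::Host2Target;

    fuzz_target!(|data: &[u8]| {
        // decoding arbitrary bytes must never panic, even when the
        // input is not a valid encoded message
        let _ = postcard::from_bytes::<Host2Target>(data);
    });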

    Do be aware that differences in endianness and machine word size between the target and the host could affect test results if the code under test relies on the memory layout of data structures (e.g. mem::size_of on #[repr(C)] structs).
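
    For instance (a contrived sketch), a layout assertion like the following would pass when compiled for the 32-bit target but fail when run on a 64-bit host:

    // a contrived layout test: `usize` is 4 bytes on thumbv7
    // targets but 8 bytes on an x86_64 host

    use core::mem::size_of;

    #[repr(C)]
    struct Record {
        len: usize,
        timestamp: u32,
    }

    #[test]
    fn record_is_two_words() {
        // 4 + 4 = 8 on the target; on the host the struct is 16
        // bytes (8 for `len`, 4 for `timestamp`, 4 of padding)
        assert_eq!(size_of::<Record>(), 8);
    }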

    Mocks

    The previous section focused on testing pure (IO-less) code but it's also possible to test code that does IO on the host by mocking the embedded hardware. For more details check out the previous post in this series about testing a driver crate.
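
    To give a taste, a host-side mock test for the SCD30 driver could look roughly like this (a sketch built on the embedded-hal-mock crate's 0.x I2C mock; the exact I2C traffic depends on the driver implementation, and the response bytes below are illustrative):

    // a sketch; lives in the driver crate and runs on the host
    use embedded_hal_mock::i2c::{Mock as I2cMock, Transaction};

    #[test]
    fn get_firmware_version_sends_expected_i2c_traffic() {
        let expectations = [
            // "read firmware version" command (0xD100)
            Transaction::write(0x61, vec![0xD1, 0x00]),
            // response: 2 version bytes plus a CRC byte (illustrative)
            Transaction::read(0x61, vec![0x03, 0x42, 0xF3]),
        ];
        let mut i2c = I2cMock::new(&expectations);

        let mut scd30 = Scd30::init(i2c.clone());
        assert_eq!([3, 66], scd30.get_firmware_version().unwrap());

        i2c.done(); // verify all expected transactions took place
    }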

    Testing on the target

    The next set of tests will run on the target without any interaction with the host. We'll use the defmt-test test harness which provides a #[test] attribute similar to the built-in one. The semantics are the same: if a test function panics or returns an Err variant then the test is considered to have failed. Because the target asserts its own state in these tests I've put them in a package named self-tests.

    If you are not familiar with using or setting up defmt-test (and probe-run) check out our previous blog post about using it to test a Hardware Abstraction Layer.

    What can be tested here? The test runs on the target so code that uses the hardware can be tested here.

    What should be tested here? I'll often test hardware components in isolation in these tests. For example, if the application needs to pull data from different sensors, internal or external to the microcontroller, I would test each sensor in a different test file.

    Testing isolated hardware components can help pinpoint issues introduced by changes in the hardware. It's common in embedded development to start writing the firmware for a general purpose development board and then port the firmware to the "production" PCB (Printed Circuit Board) mid-way through development. It's also possible that the microcontroller gets replaced with a "smaller" version (less RAM, slower CPU) late in the development cycle to reduce costs and/or power consumption. These target tests can help identify issues like: the production PCB is missing pull-up resistors or they are not properly sized; the new microcontroller has a silicon bug in its I2C peripheral and a workaround needs to be added to the firmware; etc.

    What are the limitations? These tests run without external stimuli; that limits what can be exercised in these tests.

    In this example application, I would check that communication between the nRF52840 (microcontroller) and the SCD30 (sensor) works but not involve serial communication or concurrency in the tests.

    The board initialization lives in the board crate and the available API looks like this.

    // cross/board/src/lib.rs
    
    /// Peripherals and on-board sensors
    pub struct Board {
        pub scd30: Scd30,
        pub serial: Serial,
    }
    
    impl Board {
        /// Initializes the board's peripherals and sensors
        pub fn init(mut dcb: DCB, mut dwt: DWT) -> Self {
            // omitted: I2C and UARTE peripheral configuration
            Self { scd30: Scd30::init(i2c), serial: uarte }
        }
    }
    

    Scd30 is the SCD30 driver we use in our first knurling session on 'building an air quality sensor'; this driver was also featured in our previous blog post on testing a driver crate. Serial is the serial port (also known as "UART") API provided by the nRF52840 Hardware Abstraction Layer.

    The first test below checks if I2C communication is possible at all.

    // cross/self-tests/tests/scd30.rs > mod tests
    
    #[init]
    fn init() -> Board {
        let cm_periph = unwrap!(cortex_m::Peripherals::take());
        Board::init(cm_periph.DCB, cm_periph.DWT)
    }
    
    #[test]
    fn confirm_firmware_version(board: &mut Board) {
        const EXPECTED: [u8; 2] = [3, 66];
    
        assert_eq!(
            EXPECTED,
            board.scd30.get_firmware_version().unwrap(),
        )
    }
    
    $ # from within the `cross` folder
    $ cargo test -p self-tests
      (HOST) INFO  flashing program (15.38 KiB)
      (HOST) INFO  success!
    ────────────────────────────────────────────────────────────
    0.000009 INFO  (1/1) running `confirm_firmware_version`...
    0.001170 INFO  all tests passed!
    

    This test is useful to find hardware and configuration issues: are the SCD30 and the nRF52840 correctly wired together? Are the pull-up resistors (internal or external) working? Is communication possible at the selected I2C frequency?

    I would additionally check at this level that the sensor works as advertised for the configuration (update rate, sample averaging, etc.) chosen for the application.

    The next test checks that the sensor produces a new measurement, signaled by the data ready flag, every 2 seconds. This covers requirement REQ1.

    #[test]
    fn data_ready_within_two_seconds(board: &mut Board) {
        board
            .scd30
            .start_continuous_measurement()
            .unwrap();
    
        // twice because there may be a cached measurement
        // (the SCD30 sensor is never power-cycled / reset)
        for _ in 0..2 {
            board.delay(Duration::from_millis(2_100));
            assert!(board.scd30.data_ready().unwrap());
    
            // clear data ready flag
            let _ = board.scd30.read_measurement();
        }
    }
    

    This last test 3 checks that the CO2 concentration (f32 value in parts per million) reported by the SCD30 sensor is within the range specified in the SCD30's documentation.

    #[test]
    fn reasonable_co2_value(board: &mut Board) {
        // range reported by the sensor when using I2C
        const MIN_CO2: f32 = 0.;
        const MAX_CO2: f32 = 40_000.;
    
        // do this twice for good measure
        for _ in 0..2 {
            while !board.scd30.data_ready().unwrap() {}
    
            let measurement =
                board.scd30.read_measurement().unwrap();
            assert!(measurement.co2.is_finite());
            assert!(measurement.co2 >= MIN_CO2);
            assert!(measurement.co2 <= MAX_CO2);
        }
    }
    

    These tests may feel redundant if the driver crate itself has target tests. However, unless the driver crate was tested against the exact same hardware you are using in your application, they are still worthwhile: you are testing the integration of the generic driver crate with your particular hardware configuration, that is, with a particular embedded-hal trait implementation.

    Host-target testing

    Next, we'll test the application as a whole instead of testing smaller components. At the current development stage of the example application we can test the host-to-target communication.

    Since these tests involve both the host and the target, they'll go in a host-target-tests package (root workspace). The tests will run on the host and use the built-in #[test] functionality – using the standard library is possible as well.

    To simplify the test code we'll first write an abstraction that represents a serial connection between the host and the target.

    // host-target-tests/tests/serial.rs
    // NOTE: for reuse it'd be better to put this in its own crate
    
    /// A connection between the host and the target over
    /// a serial interface
    pub struct TargetSerialConn { /* .. */ }
    
    impl TargetSerialConn {
        /// Opens a serial connection to the target
        pub fn open() -> Result<Self, anyhow::Error> { /* .. */ }
    
        /// Sends a request to the target and waits for a response.
        /// Returns the target response.
        fn request(
            &mut self,
            request: &Host2Target,
        ) -> Result<Target2Host, anyhow::Error> { /* .. */ }
    
        /// Requests the last measurement
        pub fn get_measurement(
            &mut self,
        ) -> Result<Option<Measurement>, anyhow::Error> {
            let resp =
                self.request(&Host2Target::GetLastMeasurement)?;
    
            Ok(match resp {
                Target2Host::NotReady => None,
                Target2Host::Measurement(measurement) => {
                    Some(measurement)
                }
            })
        }
    }
    

    Our first test will check if we can communicate with the target at all. There are no assertions here but if the get_measurement method returns an error, e.g. due to a timeout, then the test fails.

    // host-target-tests/tests/serial.rs
    
    #[test]
    fn get_measurement_succeeds() -> Result<(), anyhow::Error> {
        let mut target = TargetSerialConn::open()?;
        dbg!(target.get_measurement()?);
        Ok(())
    }
    
    $ cargo test -p host-target-tests -- --nocapture
    running 1 test
    [tests/serial.rs:13] target.get_measurement()? = Some(
        Measurement {
            id: 74,
            timestamp: 1129193247,
            co2: 970.54,
        },
    )
    test get_measurement_succeeds ... ok
    

    In the next test we check that polling the target every 2 seconds returns a new measurement (REQ1). Furthermore, these measurements must have contiguous IDs because we are polling at the SCD30 update rate.

    #[test]
    fn new_measurement_every_2_seconds(
    ) -> Result<(), anyhow::Error> {
        let mut target = TargetSerialConn::open()?;
    
        let samples = (0..3)
            .map(|_| {
                thread::sleep(Duration::from_millis(2_100));
                Ok(dbg!(target.get_measurement()?.unwrap()))
            })
            .collect::<Result<Vec<_>, anyhow::Error>>()?;
    
        // [A, B, C] -> { [A, B], [B, C] }
        for pair in samples.windows(2) {
            // all samples should be different
            assert_ne!(pair[0], pair[1]);
            // new measurements should have contiguous IDs
            assert_eq!(pair[0].id.wrapping_add(1), pair[1].id);
            // timestamps should be different
            // (the timestamp can wrap around so we don't use `>=`)
            assert_ne!(pair[0].timestamp, pair[1].timestamp);
        }
    
        Ok(())
    }
    

    We can also test that polling the target faster than the SCD30 update rate returns the same measurement (REQ2).

    #[test]
    fn consecutive_sampling_returns_same_measurement(
    ) -> Result<(), anyhow::Error> {
        let mut target = TargetSerialConn::open()?;
    
        // sample faster than the SCD30 update rate
        let first = dbg!(target.get_measurement()?);
        let second = dbg!(target.get_measurement()?);
        let third = dbg!(target.get_measurement()?);
    
        // at most one new measurement can occur;
        // 2 samples should be the same measurement
        if first != second {
            assert_eq!(second, third);
        }
    
        if second != third {
            assert_eq!(first, second);
        }
    
        Ok(())
    }
    

    Some considerations

    Some care is needed when running host-target tests because usually there's only one target so tests must not run in parallel without some sort of synchronization. Here's what you should be aware of:

    Re-flash with correct firmware

    Before you run host-target tests the application firmware must be flashed and running on the target. The on-target tests (cross/self-tests) flash different test binaries onto the target, so a cargo flash (or probe-run) invocation is needed between that set of tests and these host-target tests. Furthermore, most of the time you want to test the latest version of the firmware. This requirement can be enforced using an xtask; the details are covered in a later section.

    Ensuring sequential test runs

    #[test] functions within a test file must run sequentially because they communicate with the target and there's only one target (in this example application); however, cargo test runs #[test] functions in parallel (using threads) by default. In our example, we can fulfill this requirement by adding a static Mutex to the TargetSerialConn::open constructor as shown below.

    use parking_lot::{Mutex, MutexGuard};

    impl TargetSerialConn {
        pub fn open() -> Result<Self, anyhow::Error> {
            static MUTEX: Mutex<()> =
                parking_lot::const_mutex(());
    
            // acquire mutex when constructing this value
            let _guard = MUTEX.lock();
    
            // ..
    
            Ok(Self { _guard, /* .. */ })
        }
    }
    
    pub struct TargetSerialConn {
        // ..
        // mutex is released when this struct is dropped
        _guard: MutexGuard<'static, ()>,
    }
    

    The mutex will act like a semaphore: if two threads try to create a TargetSerialConn value, one of them will succeed and the other one will wait (block / sleep) until the first one drops the TargetSerialConn value.

    The other way to meet this requirement is to always pass the --test-threads=1 flag to the test binary. This prevents #[test] functions from running in parallel but doesn't protect against the programmer error of creating more than one TargetSerialConn per test function.
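
    In this example that would be:

    $ cargo test -p host-target-tests -- --test-threads=1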

    Process-level parallelism

    Test binaries must not run in parallel because then you would have more than one process using the serial port. cargo test runs test files (tests/*.rs) sequentially but you need to be careful to not run more than one instance of cargo test. One could add file locking to the TargetSerialConn abstraction to make parallel cargo test processes not collide with each other but this does not protect against other processes that do not respect the file lock, like echo hello > /dev/ttyACM0.
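
    A sketch of that file-locking idea, assuming the fs2 crate and a made-up lock-file path:

    // inside the `TargetSerialConn` implementation (a sketch)

    use fs2::FileExt;
    use std::fs::File;

    fn acquire_port_lock() -> Result<File, anyhow::Error> {
        // hypothetical lock file associated with the serial device
        let lock_file = File::create("/tmp/ttyACM0.lock")?;
        // blocks until no other process holds the lock; the lock is
        // released when the returned `File` is dropped (closed)
        lock_file.lock_exclusive()?;
        Ok(lock_file)
    }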

    Other kinds of tests

    Direct host target communication is one type of host-target test but it's not the only possibility. Here are some other ideas.

    Our example application is meant to evolve into a mesh-connected application. To test that mesh functionality the host could control several targets via serial port or USB to make them join / leave the network.

    Yet another option is to have a second host-controlled microcontroller electrically connected to the target. This second microcontroller can stimulate the target with e.g. GPIO or I2C signals. In the example application, this second microcontroller could fake a SCD30 and report custom CO2 values. With this you can test how the application logic reacts to conditions that are hard, expensive or dangerous to produce in reality, like a very high CO2 concentration.

    xtask orchestration

    We have covered three sets of tests: host-only tests, target self-tests and host-target tests. The last set of tests has a requirement that the firmware needs to be flashed prior to running the tests. We can fulfill that requirement without leaving the Cargo workflow by writing some cargo-xtask tasks; a minimal dispatch sketch follows the task list below. If you haven't heard about cargo-xtask before, it's a technique for creating custom Cargo subcommands within the scope of a single project – a bit like creating make rules, but in Rust.

    • cargo xtask test host-target. This xtask will flash the latest version of the firmware on the target and then it will run the host target tests. The shell commands it invokes are listed below.
    $ cargo flash --chip nRF52840_xxAA --release
    $ cargo test -p host-target-tests
    
    • cargo xtask test host. This xtask runs all the host-only tests. It tests the root workspace but excludes the host-target-tests package.
    $ cargo test --workspace --exclude host-target-tests
    
    • cargo xtask test target. This xtask runs all the defmt-test tests, which run on the target.
    $ pushd cross
    $ cargo test -p self-tests
    $ popd
    
    • cargo xtask test all. This final xtask is a shortcut for running all the sets of tests.
    $ cargo xtask test host
    $ cargo xtask test target
    $ cargo xtask test host-target
    
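    Here's the promised sketch of the xtask dispatch (the helper functions and working directories are illustrative; see the example repository for the real implementation):

    // xtask/src/main.rs (a sketch; error handling kept minimal)

    use std::process::Command;

    fn main() -> Result<(), anyhow::Error> {
        let args: Vec<String> = std::env::args().skip(1).collect();
        let args: Vec<&str> = args.iter().map(String::as_str).collect();

        match args.as_slice() {
            ["test", "host"] => test_host(),
            ["test", "target"] => test_target(),
            ["test", "host-target"] => test_host_target(),
            ["test", "all"] => {
                test_host()?;
                test_target()?;
                test_host_target()
            }
            _ => anyhow::bail!("usage: cargo xtask test <host|target|host-target|all>"),
        }
    }

    fn test_host() -> Result<(), anyhow::Error> {
        // root workspace, minus the tests that need the target
        run(".", &["test", "--workspace", "--exclude", "host-target-tests"])
    }

    fn test_target() -> Result<(), anyhow::Error> {
        // the self-tests live in the inner (cross-compiled) workspace
        run("cross", &["test", "-p", "self-tests"])
    }

    fn test_host_target() -> Result<(), anyhow::Error> {
        // flash the latest firmware, then drive it from the host
        run("cross", &["flash", "--chip", "nRF52840_xxAA", "--release"])?;
        run(".", &["test", "-p", "host-target-tests"])
    }

    /// Runs `cargo <args>` from the given directory
    fn run(dir: &str, args: &[&str]) -> Result<(), anyhow::Error> {
        let status = Command::new("cargo")
            .args(args)
            .current_dir(dir)
            .status()?;
        anyhow::ensure!(status.success(), "`cargo {}` failed", args.join(" "));
        Ok(())
    }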

    Conclusion

    We have seen how to test embedded application code using different approaches:

    • Functional (IO-less) code can be tested on the host using the built-in #[test] machinery
    • Code that interacts with hardware can be tested on the target using defmt-test
    • Code that does IO can also be tested on the host by mocking the embedded hardware
    • The application as a whole can be tested from the host side by doing host-target communication

    With the project structure presented in this post you can perform all these types of testing without leaving the Cargo workflow thanks to cargo xtask!

    Sponsor this work

    defmt-test and probe-run are part of the Knurling-rs project. Knurling-rs is mainly funded through GitHub sponsors. Sponsors get early access to the tools we are building and help us to support and grow the knurling tools and courses. Thank you to all of the people already sponsoring our work through the Knurling project!

    1. The reason for the last requirement is that the plan is to later move the host-to-target communication to USB HID. The maximum size of a HID packet is 64 bytes (USB 2.0 Full Speed) and being able to fit one message in one HID packet simplifies the design of the HID API. 

    2. The encoding in the test does not add COBS framing to the message. COBS framing won't be required when using USB HID because packets are well delimited under the USB protocol. However, when communicating over the serial port, framing is necessary. 

    3. By writing this test I learned that the driver crate is relatively low level. If you do read_measurement before the data ready flag is set (i.e. data_ready() returns false) then the sensor returns a junk value (e.g. all ones, 0xFF) with a valid CRC. This is not a compile time or runtime error in the API so you have to make sure you use the methods correctly.