
Testing sudo-rs and improving sudo along the way


    sudo-rs is a project by Prossimo jointly implemented by Tweede golf and Ferrous Systems.

    As our friends at Tweede golf announced in a previous blog post, Ferrous Systems and Tweede golf have been working together on a re-implementation of sudo in the Rust programming language.

    In this blog post, I'll talk about the approach we are using to test sudo-rs and ensure sudo-rs is "compliant" with the original sudo. Let's dive in!

    NOTE The first half of this post covers technical details and the second half is more high-level. If you are not that interested in the low-level details, feel free to skim the first half until you reach the big picture section.

    How to verify that sudo-rs ≡ ogsudo

    Faced with the task of re-implementing sudo, one of the questions that came up early on was: how do we verify that sudo-rs behaves just like the original sudo ("ogsudo")?

    The obvious answer was: given a particular scenario, I should get the same results regardless of whether I use ogsudo or sudo-rs. So, for example, a command invocation like this:

    $ whoami 
    ferris
    
    $ sudo whoami
    root
    

    should return the same output with either sudo implementation.

    In particular, this single command verifies a few things about a sudo implementation:

    • the command, whoami, is executed as the target user.
    • the default target user is the superuser, root.

    These end-to-end (E2E) tests at the command line interface felt like the right approach to verify our implementation.

    From testing on the command line to automated testing

    To scale this approach, we wanted to be able to run these tests automatically as part of our CI pipeline, so we wrote a test library to be able to write the E2E tests as regular Rust #[test] functions.

    Each test often requires a different set of settings:

    • a custom /etc/sudoers file; this file defines sudo's security policy and other sudo configuration
    • users with certain group memberships, login shells, passwords, etc.
    • other custom /etc files to, for example, configure PAM or the syslog daemon and test how sudo interoperates with them

    Making these changes in the system that runs the tests would not only be cumbersome but could potentially compromise its security, even if just temporarily. Instead of doing that, we made it so that each test runs in an ephemeral test environment isolated from the rest of the system.

    With that requirement in mind, we came up with the following test API. Below is the sudo whoami E2E test written in Rust using our sudo_test library.

    use sudo_test::{Command, Env};
    
    #[test]
    fn default_target_user_is_root() -> Result<()> {
        let username = "ferris";
        let sudoers = "ALL ALL=(ALL:ALL) NOPASSWD: ALL";
        let test_environment = Env(sudoers)
            .user(username)
            .build()?;
    
        let actual = Command::new("sudo")
            .arg("whoami")
            .as_user(username)
            .output(&test_environment)?
            .stdout()?;
    
        let expected = "root";
        assert_eq!(expected, actual);
        Ok(())
    }
    

    Let's break that down a bit.

    The chain of sudo_test::Command methods should look familiar because that API is based on the std::process::Command API. It has a few deviations, though: the extra as_user method lets you run the command as a specific user; and the output method takes a test_environment argument because the command is executed there, and not on the system that's running the test. Other than that, that statement should be quite readable: the actual variable contains the stdout output of running sudo whoami as the user ferris.

    Now let's look at the Env statement. The expression on the right-hand side builds a test environment using the settings specified in the preceding method calls. The test environment will include a user called ferris. The argument of the Env function is the content of the test environment's /etc/sudoers file. The sudoers file used in this test allows everyone to run any command with sudo without password authentication! That policy is likely not something you want to use on a real system, but it's appropriate in this case, where we don't want to test or deal with password authentication.

    Finally, we have the assertion that verifies that the user ferris temporarily became root while running sudo.

    Last thing worth mentioning here: when the test finishes, regardless of success or failure, the test environment gets disposed of automatically.
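
    In Rust, this kind of "clean up no matter what" behavior is usually expressed with a Drop implementation. The sketch below is not the sudo_test code: ScratchEnv, EventLog and run_scenario are hypothetical names, and the real environment is replaced by an in-memory log so that the order of events is easy to observe.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log used only to observe the order of events in this sketch.
type EventLog = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for an ephemeral test environment (hypothetical type).
struct ScratchEnv {
    log: EventLog,
}

impl ScratchEnv {
    /// "Creates" the environment and records that fact in the log.
    fn build(log: &EventLog) -> Self {
        log.borrow_mut().push("created");
        ScratchEnv { log: Rc::clone(log) }
    }
}

impl Drop for ScratchEnv {
    /// Runs whenever the value goes out of scope: on normal return,
    /// on early return, and during panic unwinding.
    fn drop(&mut self) {
        self.log.borrow_mut().push("disposed");
    }
}

/// A test body that may fail part-way through.
fn run_scenario(log: &EventLog, fail: bool) -> Result<(), String> {
    // Bind to a named variable: `let _ = ...` would drop the value immediately.
    let _env = ScratchEnv::build(log);
    if fail {
        return Err("assertion failed".to_string());
    }
    log.borrow_mut().push("finished");
    Ok(())
}
```

    Whether run_scenario returns Ok or Err, "disposed" is always the last entry in the log, which mirrors the guarantee described above.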

    Pick your sudo

    I said we want to run these tests using both ogsudo and sudo-rs to verify that they behave the same. The previous test didn't specify which sudo implementation was being tested, but the code hints that there is a sudo implementation in the test environment.

    The sudo implementation is chosen at runtime using the SUDO_UNDER_TEST environment variable, which is typically set before calling cargo test.

    To run the entire test suite against the sudo-rs implementation you execute

    $ SUDO_UNDER_TEST=ours cargo test
    

    And to run the entire test suite against the original sudo implementation you execute

    $ SUDO_UNDER_TEST=theirs cargo test
    

    or simply omit setting SUDO_UNDER_TEST as the default is to use ogsudo.

    At runtime, sudo_test::Env reads SUDO_UNDER_TEST and installs either a fresh build of sudo-rs in the test environment or some known version of ogsudo.
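
    A minimal sketch of that selection logic could look like the following. The SudoUnderTest enum and both function names are assumptions made for illustration; only the variable name, its two values and the ogsudo default come from the description above. Rejecting unknown values with an error is also a choice made for this sketch.

```rust
/// Which sudo implementation a test run should install (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SudoUnderTest {
    /// sudo-rs, built fresh from the working tree.
    Ours,
    /// The original sudo ("ogsudo").
    Theirs,
}

/// Interprets the raw value of `SUDO_UNDER_TEST`.
/// `None` means the variable is unset, which defaults to ogsudo.
pub fn parse_sudo_under_test(raw: Option<&str>) -> Result<SudoUnderTest, String> {
    match raw {
        None | Some("theirs") => Ok(SudoUnderTest::Theirs),
        Some("ours") => Ok(SudoUnderTest::Ours),
        Some(other) => Err(format!(
            "SUDO_UNDER_TEST must be `ours` or `theirs`, got `{other}`"
        )),
    }
}

/// Reads the variable from the process environment and interprets it.
pub fn sudo_under_test_from_env() -> Result<SudoUnderTest, String> {
    let raw = std::env::var("SUDO_UNDER_TEST").ok();
    parse_sudo_under_test(raw.as_deref())
}
```

    Keeping the parsing separate from the environment lookup makes the default and the error case easy to check without touching the process environment.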

    Dealing with gaps and unimplemented features

    As we started writing these tests early on, we had to deal with the fact that a test would pass with ogsudo but fail with sudo-rs. The way we deal with this scenario is to mark such tests with the #[ignore] attribute. That way, SUDO_UNDER_TEST=ours cargo test runs only the un-#[ignore]-d tests that are known to work.

    #[ignore]-d tests should always pass when run against ogsudo so in CI we pass --include-ignored to a second cargo test invocation to run all tests, including the #[ignore]-d ones, against ogsudo.

    There are two reasons why a test can fail with sudo-rs:

    • the feature is not yet implemented
    • the feature is implemented but we got some detail wrong – I like to think of these as "gaps" between our implementation and ogsudo

    In both cases we track the test failure in the project's issue tracker and, in the latter case, we mark the issue with the non-compliant label.

    As a way to check that all failing tests are linked to more visible issues, we have a CI check that enforces that all #[ignore]-d tests include an issue number using the format shown below:

    #[ignore = "gh1234"]
    fn tests_some_unimplemented_sudo_rs_feature() -> Result<()> {
        // ..
    }
    

    Eventually, features get implemented and gaps get closed. That's when we remove the #[ignore] attributes from the tests and the tests become regression tests for sudo-rs.

    It would be bad if we forgot to un-#[ignore] the tests that a PR fixed, but as we are human, that can happen! So, there is a CI check for that too: all #[ignore]-d tests are run against sudo-rs and if any of them succeeds then CI fails. If an #[ignore]-d test succeeds, that means it should be un-#[ignore]-d and that's what the CI failure message instructs you to do.
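
    One way to build such a check, sketched here under assumptions, is to run only the ignored tests (for example with cargo test -- --ignored) and scan the test harness output for tests that passed. The function name is hypothetical, and the sketch assumes the default human-readable output, where each result is printed as "test <name> ... ok" or "test <name> ... FAILED".

```rust
/// Returns the names of tests that passed, given the captured output of a
/// run that executed only `#[ignore]`-d tests.
fn unexpectedly_passing(test_output: &str) -> Vec<&str> {
    test_output
        .lines()
        // Keep only result lines of the form `test <name> ... ok`.
        .filter_map(|line| line.strip_prefix("test ")?.strip_suffix(" ... ok"))
        .collect()
}
```

    If the returned list is non-empty, the check would fail and list the tests that should be un-#[ignore]-d.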

    The big picture

    So far I've focused on the implementation and the Rust bits because, as a programmer, that's the easier and more fun thing to do when writing a blog post. But let's take a step back from the how and change the focus to what we have accomplished.

    The test suite we have built is effectively an executable specification of the original sudo. Each test verifies a piece of sudo behavior described in sudo's user documentation, which is the closest thing we have to a written specification. In other words, all this time we have not been writing sudo-rs tests, but rather ogsudo tests. (… just don't tell the sudo-rs team; they haven't realized yet).

    By running the test suite against our sudo implementation we are finding all the spots where it doesn't quite match ogsudo's behavior. So, from the point of view of the sudo-rs project these are "compliance" tests that verify that our sudo implementation is compliant with the executable ogsudo specification.

    Having these compliance tests, especially the failing ones, is great for a few reasons:

    • it's easier to fix bugs when you have a failing test case that reproduces the bug
    • when you are going to implement a new feature, the existing ogsudo tests that exercise that feature can serve as your checklist or acceptance criteria. One could even view this aspect as test-driven development or behavior-driven development
    • the number and percentage of failing tests let us track progress towards the goal of being a "drop-in replacement" for sudo

    And, at the end of the day, with this test suite we can confidently say that sudo-rs behaves the same as ogsudo in these hundreds of scenarios.

    Another benefit of writing ogsudo tests that operate at the command line interface is that this process can happen independently of the work that happens on sudo-rs. There is no problem committing tests that exercise a feature that has not yet been implemented in sudo-rs. Because we run the tests against ogsudo and CI checks that all of them pass, we can be confident that each test is correct in itself even when sudo-rs fails it.

    On writing tests

    As you may imagine, we derive test cases from our experience as sudo users as well as the sudo man pages. We test that command-line flags and sudoers settings behave as described in the documentation and that illegal operations and invalid syntax are rejected.

    Sometimes the documentation is not clear about how some features interact with each other. In that case, the observable behaviors of those interactions become test cases.

    For example, these are two features described in the manual:

    • sudo command executes the command in an environment that contains only a handful of environment variables, like SUDO_USER and SUDO_COMMAND, that are set by sudo itself.
    • The env_keep setting in the sudoers file lets you preserve some environment variables set in the invoking user's environment.

    The manual does not describe, however, what happens if you put a variable that sudo sets, like SUDO_USER, in the env_keep list. In that case, is SUDO_USER preserved from the invoking user's environment or does it get set/overwritten by sudo? In practice, it is the latter: sudo overwrites it.
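
    The observed precedence can be captured in a small model. This is a deliberately simplified sketch, not how ogsudo or sudo-rs build the environment (the real rules are more involved); command_env and Environment are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Environment variables as name/value pairs.
type Environment = BTreeMap<String, String>;

/// Builds the environment of the target command in this simplified model.
fn command_env(
    invoking: &Environment,
    env_keep: &[&str],
    set_by_sudo: &Environment,
) -> Environment {
    let mut env = Environment::new();
    // 1. Preserve only the variables named in env_keep.
    for name in env_keep {
        if let Some(value) = invoking.get(*name) {
            env.insert((*name).to_string(), value.clone());
        }
    }
    // 2. Variables set by sudo are applied last, so they overwrite
    //    anything that env_keep preserved under the same name.
    for (name, value) in set_by_sudo {
        env.insert(name.clone(), value.clone());
    }
    env
}
```

    In this model, listing SUDO_USER in env_keep has no visible effect because the value set by sudo is applied afterwards, while a variable that is not listed at all never reaches the command.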

    Another example: the manual says that some env vars like PATH and TERM are preserved if set in the invoking user's environment. Does that sound familiar? At first, that may sound like PATH and TERM are special but the reality is that they are default members of the env_keep list.

    The more surprising bit is that you can write Defaults env_keep -= "PATH TERM" or Defaults !env_keep in the sudoers file to avoid preserving said env vars – that behavior is not mentioned in the sudo manual.

    One of the most interesting findings about ogsudo's behavior was due to carelessness rather than paying attention to the details.

    The sudo_test::Env API has a method to create files in the test environment. The initial version of that API created files with very open permissions; basically it chmod 0777-ed the file (whoops, that was me 🙋).

    To our initial surprise, sudo refuses to run when /etc/sudoers has world-write permissions and gives you a helpful error message.

    That fact makes a lot of sense from a security point of view: if the sudoers file is world writable then a user that should not have sudo perms can simply grant themselves sudo permissions by modifying the sudoers file. This detail is not mentioned in the sudo manual so if that had been our only specification we would have missed it, so I'm glad for my carelessness back then.
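
    The check itself boils down to testing a single permission bit. The sketch below shows the idea on Unix-like systems; the function names are made up for illustration, and it does not claim to reproduce the full set of checks that sudo performs on its policy file.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Permission bit that grants write access to "others".
const S_IWOTH: u32 = 0o002;

/// Returns true if the given mode lets any user write to the file.
fn is_world_writable(mode: u32) -> bool {
    (mode & S_IWOTH) != 0
}

/// Reads the mode of the file at `path` and applies the same check.
fn file_is_world_writable(path: &Path) -> io::Result<bool> {
    let mode = fs::metadata(path)?.permissions().mode();
    Ok(is_world_writable(mode))
}
```

    A file created with mode 0777 trips this check, whereas a restrictive mode such as 0440 does not.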

    All in all, writing tests has improved our collective understanding of sudo, which is a huge plus in my opinion: the better you understand the problem you are trying to solve, the better the solution you'll come up with.

    I should also mention here that the sudo-rs team is in contact with Todd Miller, the author of sudo, and we often receive clarifications on corner cases where it's unclear what the intended sudo behavior should be.

    Finding bugs … in ogsudo

    As I revealed earlier, we have been writing ogsudo tests all this time and those tests have eventually uncovered two bugs in the original sudo. It did come as a surprise though; the test seemed to correctly check the behavior stated in the manual and sudo-rs passed it but ogsudo failed it. That was a first because so far test failures had been exclusive to sudo-rs.

    In any case, I'm happy to report that both bugs have been reported upstream (1, 2), promptly fixed (1, 2) and already released in (og)sudo 1.9.14b2 🎉.

    Wrap up

    To summarize this blog post: with our testing approach, we have:

    • built an executable sudo specification that serves as our sudo compliance test suite
    • found and fixed gaps between our implementation and the original sudo
    • improved our collective understanding of sudo
    • uncovered two bugs in the original sudo implementation

    Now that you know about our testing approach, why not give sudo-rs a try? And if you find that it doesn't support a use case you care about or that it doesn't behave like ogsudo, we would greatly appreciate a pull request adding a new test case to our compliance test suite!