Recap of the first post
In the first post, we started working through the rustls
' pipeline that decodes and decrypts TLS records, removing the heap allocations we came across.
After the changes in that first post, the pipeline we covered looks like this:
- The input to the pipeline is a contiguous byte
buffer
that contains TLS data. - A mutable chunk of bytes – the type is
&'buffer mut [u8]
– is split off from the front ofbuffer
. - The
codec::ReaderMut
API is used to decode the mutable slice into anOpaqueMessage
. OpaqueMessage
consists of a decoded header – you can think of the header as astruct
– and a potentially encrypted payload, which is represented as a mutable slice.- The
RecordLayer::decrypt
API takes theOpaqueMessage
, decrypts the payload (if needed) and returns aPlainMessage
.
All these steps happen in the MessageDeframer::pop
method.
buffer =
+---------------+----------------+----+
| 0x160304007a | 0xd4e4877bac.. |(..)|
+---------------+----------------+----+
rd = codec::ReaderMut::init(&mut buffer[some_range]) ->
+---------------+----------------+
| 0x160304007a | 0xd4e4877bac.. |
+---------------+----------------+
m = OpaqueMessage::read(rd) ->
+---------------+----------------+
| Header { .. } | 0xd4e4877bac.. |
+---------------+----------------+
if m.has_encrypted_payload() then
msg = RecordLayer::::decrypt(m) ->
+---------------+----------------------+
| Header { .. } | b"plaintext payload" |
+---------------+----------------------+
Figure 1. ASCII diagram that depicts the operations that happen inside MessageDeframer::pop
.
In the original rustls
codebase (v0.22.0), the involved types looked roughly like this:
// in mod codec
struct Reader<'a> {
buffer: &'a [u8], // NOTE immutable slice
// ..
}
struct OpaqueMessage {
payload: Vec<u8>,
// ..
}
Note that OpaqueMessage
is a heap allocated buffer (Vec
) so OpaqueMessage::read
needs to allocate that Vec
and copy the data in Reader.buffer
into it.
To remove that allocation, we wanted to change the payload
type from Vec
into a slice but we learned that the RecordLayer::decrypt
operation happens "in place" and needs to mutate OpaqueMessage.payload
.
Therefore, OpaqueMessage
needs to hold a payload with type &mut [u8]
.
However, Reader
cannot create a mutable slice from its buffer
field because that's an immutable slice.
To deal with that, we introduced a mutable version of codec::Reader
, ReaderMut
, that operates on mutable slices and is able to return the mutable slice that'll be stored in OpaqueMessage.payload
.
As we implemented ReaderMut
, we ran into a borrow checker problem which we solved using the core::mem::take
API.
Side note: a reader pointed out that the pattern of using [T]::split_at_mut
, after core::mem::take
, has been referred to as "splitting borrows" in the rustonomicon.
It's great to have names for patterns as it makes it easier to refer to them in discussions so here I'm helping spread the name 📢.
At the end of the first post, our types looked like this:
// mod codec
struct ReaderMut<'a> { // NOTE different type
buffer: &'a mut [u8], // NOTE mutable slice
}
struct OpaqueMessage<'a> {
payload: &'a mut [u8], // NOTE also mutable slice
// ..
}
After the decrypt
operation, the payload does not need to be modified so we coerce it into an immutable slice when we store it in the PlainMessage
struct.
The updated PlainMessage
struct looks like this:
struct PlainMessage<'a> {
payload: &'a [u8],
// this field is part of the header
typ: ContentType,
// .. other header information ..
}
One allocation down, many more to go
By making OpaqueMessage::read
not allocate, we have removed one allocation from the pipeline, but there are more allocations down the line.
Turns out the PlainMessage.payload
field needs further decoding after being decrypted.
The PlainMessage.typ
field indicates which type the payload
bytes should be decoded into.
The decoding itself is done using the Codec
trait (interface) and the codec::Reader
object:
trait Codec: Sized {
// `Reader`'s lifetime parameter has been made explicit
fn read<'r>(
r: &mut Reader<'r>,
) -> Result<Self, /* .. */>;
// ..
}
There are multiple types in the codebase that implement this trait.
The catch is that the implementations perform heap allocations because those types are of the owning kind.
For example, have a look at CertificateRequestPayloadTls13
:
struct CertificateRequestPayloadTls13 {
context: PayloadU8, // <- owning type
extensions: Vec<CertReqExtension>,
}
impl Codec for CertificateRequestPayloadTls13 {
fn read(r: &mut Reader) -> Result<Self, /* .. */> {
Ok(Self {
context: PayloadU8::read(r)?,
extensions: Vec::read(r)?,
})
}
// ..
}
/// Arbitrary, unknown-content, u8-length-prefixed payload
struct PayloadU8(Vec<u8>); // <- owning type
impl Codec for PayloadU8 {
fn read(r: &mut Reader) -> Result<Self, /* .. */> {
let len = u8::read(r)? as usize;
let mut sub = r.sub(len)?;
let body = sub.rest().to_vec(); // <- allocation
Ok(Self(body))
}
// ..
}
To remove those allocations, we'll need to switch from owning types, like Vec<u8>
to borrowed types, like &[u8]
.
The interesting parts of doing that are the changes that need to be done in the Codec
trait and the ramifications of making those changes.
Consider this new, borrowed version of PayloadU8
and the desired Codec
implementation:
struct PayloadU8<'a>(&'a [u8]);
impl<'a> Codec for PayloadU8<'a> {
// ^^
fn read(
r: &mut Reader<'a>,
// ^^
) -> Result<Self, /* ... */> { //~ error
// ..
}
// ..
}
That's what we want to write but note that the impl
-level generic lifetime parameter 'a
appears in the type of input parameter r
so in this implementation the read
method is not generic.
That does not match the declaration of the generic read
method in the Codec
trait so the compiler throws an error:
error: method not compatible with trait
note: expected signature `fn(&mut Reader<'r>) -> Result<_, _>`
found signature `fn(&mut Reader<'a>) -> Result<_, _>`
A borrowing Codec
What we want is a way to tie / associate the lifetime of the Reader
input parameter to the return type of the read
method, Self
.
One way to achieve that is to add a lifetime parameter to the trait itself, like this:
trait Codec<'r>: Sized {
// ^^ new addition
fn read(_: &mut Reader<'r>) -> Result<Self, /* .. */>;
// ^ no generic lifetime parameter here!
}
By doing so, the read
method no longer has a generic lifetime parameter.
With this new Codec
trait, the PayloadU8
implementation becomes:
impl<'a> Codec<'a> for PayloadU8<'a> {
// ^^ ^^ ^^
fn read(r: &mut Reader<'a>) -> Result<Self, /* ... */> {
// ^^
let len = u8::read(r)? as usize;
let mut sub = r.sub(len)?;
let body = sub.rest(); // <- no allocation!
Ok(Self(body))
}
// ..
}
Now we are able to write decoding functions that do not require heap allocations!
By the way, adding a lifetime parameter to the trait is the same approach serde::Deserialize
uses to be able to deserialize into slices and other borrowed data structures.
It can be hard to realize that, however, because serde
has more generics (the Deserializer
trait lets you deserialize not just from byte slices but also from strings, etc.) and a much larger API surface (it uses the visitor pattern to ease the implementation work done by downstream users).
Fewer allocations, more problems
After changing the Codec
interface, all the implementations needed to be updated.
Those updates were not unlike the change in the PayloadU8
implementation so I won't list them here as they don't add anything new.
However, there was code outside of those implementations that broke and deserves some attention.
The easiest code to analyze is the test_enum8
function, which is a test helper:
fn test_enum8<T: Codec>(first: T, last: T) {
// .. ^ lifetime parameter is missing
let mut buf = Vec::new();
// ..
let t = T::read_bytes(&buf).unwrap();
// ..
}
trait Codec<'r>: Sized {
fn read(r: &mut Reader<'r>) -> Result<Self, /* .. */>;
// unveil this method because the signature matters
fn read_bytes(bytes: &'r [u8]) -> Result<Self, /* .. */> {
// ^^
let mut reader = Reader::init(bytes);
Self::read(&mut reader)
}
// ..
}
This unmodified version of the function won't compile because the Codec
trait in the trait bound of T
is missing its lifetime parameter.
The compiler helpfully provides two alternatives to fix the issue:
help: consider making the bound lifetime-generic with a new `'a` lifetime
|
NN | fn test_enum8<T: for<'a> Codec<'a>>(first: T, last: T) {
| +++++++ ++++
help: consider introducing a named lifetime parameter
|
NN | fn test_enum8<'a, T: Codec<'a>>(first: T, last: T) {
| +++ ++++
If you use the trial and error approach, you'll find out that the first alternative works and the second, doesn't, but why is that?
Take 1: adding a lifetime parameter to test_enum8
If we try the second alternative:
fn test_enum8<'a, T: Codec<'a>>(first: T, last: T) {
// .. ^^ ^^
let mut buf = Vec::new(); // start of 'buf
// ..
let t = T::read_bytes(&buf).unwrap();
// ^^^^ error
// ..
// end of 'buf
}
We get this compiler error: "borrowed value does not live long enough".
The rest of the error message tells us that read_bytes
requires that &buf
has lifetime 'a
but buf
has a smaller lifetime.
We used the lifetime parameter 'a
in the trait bound T: Codec<'a>
and the declaration of Codec<'a>::read_bytes
(shown below) requires that its input parameter has type &'a [u8]
.
Those two facts create the requirement that &buf
must be coerce-able into &'a [u8]
.
trait Codec<'r>: Sized {
// ^^
fn read_bytes(bytes: &'r [u8]) -> Result<Self, /* .. */> {
// .. ^^
}
But why is that requirement a problem?
test_enum8
's generic lifetime parameter 'a
is unconstrained because it does not appear in any of the input parameters of the function.
Therefore, the caller of the function is free to pick any value for it.
For example, the caller could instantiate 'a
to 'static
:
test_enum8::<'static, _>(first, last);
&buf
certainly won't have 'static
lifetime because its lifetime lies within the span of the test_enum8
function.
Now you see the problem: &buf
must satisfy the lifetime constraint 'buf: 'a
; that is 'buf
must outlive an arbitrary 'a
chosen by the caller and that can't be satisfied because buf
is destroyed within the body of test_enum8
.
That being said, with a tweak in the function body you can make this alternative work:
fn test_enum8<'a, T: Codec<'a>>(first: T, last: T) {
// ..
let mut buf = Vec::new();
// ..
let boxed_buf: Box<[u8]> = buf.into_boxed_slice();
let static_buf: &'static mut [u8] = Box::leak(boxed_buf);
let t = T::read_bytes(static_buf).unwrap();
// ..
}
By leaking buf
we turn it into a slice with static lifetime, static_buf
, and then pass that into read_bytes
.
This will certainly satisfy the constraint 'static: 'a
where 'a
is an arbitrary lifetime because the "longest" a lifetime can ever be is 'static
.
However, even if this is just test code, we should not be leaking memory just to appease the borrow checker.
Solution: HRTB in test_enum8
How come the first alternative works?
What is this for <'a>
syntax and what are its limitations?
fn test_enum8<T: for<'a> Codec<'a>>(first: T, last: T) {
// .. ^^ ^^
let mut buf = Vec::new(); // start of 'buf
// ..
let t = T::read_bytes(&buf).unwrap();
// ..
}
T: for<'a> Codec<'a>
is a Higher-Ranked Trait Bound and what it's basically saying is that the type T
implements the Codec<'a>
trait for each and every possible value of the generic lifetime parameter 'a
.
How is this different than the previous option? If T
implements Codec<'a>
for any value of 'a
then it certainly implements Codec<'buf>
; 'buf
being the concrete (non-generic) lifetime of the local variable buf
.
This time there's no constraint on what the lifetime of buf
must be, to be able to use read_bytes
so the borrow checker won't complain.
Limitations
This HRTB alternative is good enough for the usage of this generic function because the function is only ever instantiated with C-like enum types like the one shown below.
enum AlertLevel {
Warning,
Fatal,
}
impl Codec<'_> for AlertLevel {
// ..
}
// NOTE: `'_` is short-hand for this
// impl<'a> Codec<'a> for AlertLevel { /* .. */ }
Importantly, these types implement the Codec
trait for every possible lifetime parameter;
that's what the impl Codec<'_>
syntax means.
The limitation of this version of test_enum8
is that it can not be instantiated with types that contain borrowed data.
Consider this made-up generic structure and its accompanying Codec
implementation:
struct Slice<'a> {
bytes: &'a [u8],
}
impl<'a> Codec<'a> for Slice<'a> {
// ..
}
A concrete instance of this type implements the Codec
trait for that single, concrete lifetime.
For example, the type Slice<'static>
only satisfies the bound: Slice<'static>: Codec<'static>
.
Therefore, a concrete type won't satisfy the bound T: for<'a> Codec<'a>
.
Therefore, it cannot be used with the test_enum8
function:
fn fail(first: Slice<'static>, last: Slice<'static>) {
test_enum8(first, last);
//~^ error implementation of `Codec` is not general enough
}
The more detailed error message says
Codec<'0>
would have to be implemented for the typeSlice<'_>
, for any lifetime'0
butCodec<'1>
is actually implemented for the typeSlice<'1>
, for some specific lifetime'1
In the message '1
refers to the concrete lifetime 'static
and '0
refers to the "any" lifetime 'a
in test_enum8
's trait bound: T: for<'a> Codec<'a>
.
Wrap-up
That concludes the second blog post of this series.
In the first one, we removed one allocation in the first decoding stage of the pipeline, which may involve decryption, that all records have to go through.
In this second one, we removed allocations from the second decoding stage that the decrypted payloads have to go through.
In the next and final post, we'll look into storing borrowed data in trait objects (e.g. Box<dyn MyTrait>
) and using Cow
-like enums to avoid and/or delay heap allocations.
If the borrow checker has you in a bind or if you want to optimize the memory usage of your project by switching from heap allocated objects to references, our consulting service can help you with that! Get in touch through our contact form.