Unwrap a value with the question mark operator in Rust

What's wrong with the following function in Rust?

fn measure_cargo_toml() -> usize {
    let toml = std::fs::read_to_string("Cargo.toml")?;
    toml.len()
}

Setting aside for a moment whether this is the best way to measure a file's length, let's look at why this causes a compile-time error.

std::fs::read_to_string("Cargo.toml") returns Result<String, std::io::Error> and we tried to use the question mark operator ? to 'unwrap' the String part that we really care about.

We're faced with the following error however:

cannot use the `?` operator in a function that returns `usize`
this function should return `Result` or `Option` to accept `?`

This is suggesting that the ? is doing a bit more than simply 'unwrapping' a value - why's it mentioning the return type at all?

To understand, let's try to unwrap that String value a couple of other ways...

if let patterns

If we only ever cared about the success case, then technically we could have done the following instead:

fn measure_cargo_toml() -> usize {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        toml.len()
    } else {
        0
    }
}

Notes:

  • (line 2): we're performing a pattern-match on the return value from read_to_string here. The code within the if block will only execute if the pattern Ok(toml) is a match against the result.
  • (line 5) It looks like our design is breaking down here. To keep the type checker happy we're having to use a 'default' value of 0 when the file reading fails. This is far from ideal since 0 could actually be a valid value, but more so because it doesn't correctly encode what our function is capable of.

Rust can help us here though: we can use the type system to encode more meaning into our function signatures. In this case, the presence or absence of a value can be communicated through the Option enum.

If we change our return type to Option<usize>, then we can return None in place of the 0, and callers of this function will have a better understanding of how it behaves.

fn measure_cargo_toml() -> Option<usize> {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        Some(toml.len())
    } else {
        None
    }
}

Notes:

  • (line 1) we changed the return type from usize to Option<usize>
  • (line 3) because of that, we now need to wrap our return value in the Some variant
  • (line 5) None here makes our program more explicit than the previous 'default' value of 0

That's a lot of boilerplate though...

Converting from Result<T, _> to an Option<T> is a fairly common operation, and the standard library comes with a .ok() method to help us reduce a bit of boilerplate:

pub fn measure_cargo_toml() -> Option<usize> {
    std::fs::read_to_string("Cargo.toml") // Result<String, io::Error>
        .ok()                             // Option<String>
        .map(|toml| toml.len())           // Option<usize>
}

Notes:

  • Calling .ok() gives us Some(String) or None depending on whether read_to_string was successful. It's just a convenience method, and it allows us to call .map() to change the inner value, if it exists.

This is a pretty slick example; it's one of the reasons I enjoy Rust so much 🦀

But wait a minute - that's ignoring any errors that might occur when reading the file - what if we wanted to handle that error by passing it on to callers, or even just logging out the error?

To do this, we need to think about our design again. Currently, our signature is this:

pub fn measure_cargo_toml() -> Option<usize> {
    // snip
}

We can produce a usize value, or not - that's it. Our design doesn't incorporate error handling at all. So let's change that.

We can alter our return type to be the following instead:

-pub fn measure_cargo_toml() -> Option<usize> {
+pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    // snip
}

With this change, callers of our function will now be forced to handle the fact that our function can fail. They may decide (as we did before) to ignore the error and use a default value, but at least this design gives them that choice. Option<usize> loses all information if something goes wrong, and we don't want that.
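
To make that choice concrete, here's a minimal sketch of two possible callers. The names length_or_default and describe are made up for illustration, and both take the Result directly so the example doesn't depend on any file existing:

```rust
use std::io;

// Hypothetical caller that ignores the error and falls back to a default.
fn length_or_default(result: Result<usize, io::Error>) -> usize {
    result.unwrap_or(0)
}

// Hypothetical caller that keeps the error long enough to report it.
fn describe(result: Result<usize, io::Error>) -> String {
    match result {
        Ok(len) => format!("{} bytes", len),
        Err(err) => format!("could not measure file: {}", err),
    }
}
```

The first caller throws the error away, just as our Option version did. The second one can only exist because the error is part of the return type.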

So, to update our implementation to forward any errors, we can do the following instead:

The re-wrapping method

pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    match std::fs::read_to_string("Cargo.toml") {
        Ok(toml) => Ok(toml.len()),
        Err(err) => Err(err)
    }
}
  • When we match on the Ok variant, we get a local binding to the value contained inside, named toml here. Because our return type is also a Result, we need to Ok-wrap the value we get from calling toml.len()

  • For the error case we're just re-wrapping the error since it matches our type std::io::Error

That's quite a bit of syntax and ceremony for what is essentially a 'change the value inside the box' operation though - but fear not, Rust has yet more convenience methods to help with situations like this.

The .map() method on Result

We previously saw .map being used to alter the value inside an Option - well it turns out that we can perform the same type of operation on Result too.

pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    let result = std::fs::read_to_string("Cargo.toml");
    result.map(|s| s.len())
}
  • read_to_string returns Result<String, std::io::Error> - that's very close to what we want. The error part matches our signature, but the value part does not.
  • To change just the value then, without explicitly unwrapping it, we can use the .map method as seen here. This will only execute the closure on the value type if the result is of the Ok variant. It's like opening a box, checking that everything inside is ok, then replacing the value and closing the box again.
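
As a small sketch of that 'box' behaviour away from the file system, here's the same idea with string parsing. The doubled helper is hypothetical; the point is only that the closure runs for Ok and is skipped for Err:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a number, then double it only if parsing succeeded.
fn doubled(input: &str) -> Result<u32, ParseIntError> {
    // parse() gives Result<u32, ParseIntError>; .map() only touches the Ok value
    // and passes any Err through unchanged.
    input.parse::<u32>().map(|n| n * 2)
}
```

Calling doubled("21") yields Ok(42), whereas doubled("abc") hands back the original parse error untouched.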

So that's a couple of techniques that we can use to forward possible errors while still transforming the value in the success case.

Understanding use cases for the ? operator

So far though, our implementations have only contained 2 operations:

  • read a file from disk into a string
  • return the number of bytes that make up that string.

Because of this, we've been able to write a couple of different solutions that didn't require much more than 'reaching into a box' to change a value.

However, there are situations where this 'unwrapping' and 're-wrapping' of values is either tedious or just completely overkill for the task at hand.

For example, if we changed our requirement to instead return the combined length from Cargo.toml + Cargo.lock, then we might end up with a solution such as:

use std::fs::read_to_string;

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");
    let mut count = 0;
    match toml {
        Ok(s) => count += s.len(),
        Err(e) => return Err(e),
    }
    match lock {
        Ok(s) => count += s.len(),
        Err(e) => return Err(e),
    }
    Ok(count)
}
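  • Notice how we need to check each result independently so that we can return early if either operation fails. Note that both files have already been read by the time we check the first result; to skip the second read after a failure, the second read_to_string call would have to move below the first match.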

We can of course remove some of that duplication too using a for in loop:

use std::fs::read_to_string;

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");
    let mut count = 0;
    for result in [toml, lock] {
        match result {
            Ok(s) => count += s.len(),
            Err(e) => return Err(e),
        }
    }
    Ok(count)
}
  • The match inside the loop is the key part: we're still returning early with an error should one occur.

Or, if you prefer a more functional approach, we can remove the loop and mutable variables with try_fold:

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            match item {
                Ok(string) => Ok(acc + string.len()),
                Err(e) => Err(e)
            }
        })
}
  • try_fold allows us to reduce a collection down to a single value, but with the added advantage of supporting early returns. It works by continuing to call this closure every time the previous iteration returns an Ok.
  • For us that means that line 9 would forward any errors coming from read_to_string, causing the try_fold to stop iterating and return the error. The error it returns matches our function signature, so we can keep everything in a nice neat package with no external variables to mutate.

We can go one step further here too - notice that inside the try_fold closure we're doing the re-wrapping technique mentioned before. Well since we're forwarding any error as-is we can simplify this down to another .map call.

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            item.map(|string| acc + string.len())
        })
}
  • Here the type of item is Result<String, io::Error>, so if there's an error the closure given to .map will not be executed - the error will be forwarded instead. That will cause the try_fold to exit early which in turn will cause our outer function to also return.
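
To see that early exit in action, here's a sketch that swaps file reading for string parsing and counts how many items were actually touched. sum_and_count is a made-up helper, not part of the original example:

```rust
use std::num::ParseIntError;

// Hypothetical helper: sums numeric strings and also reports how many
// inputs were parsed before try_fold finished (or gave up).
fn sum_and_count(inputs: &[&str]) -> (Result<u32, ParseIntError>, usize) {
    let mut parsed = 0;
    let total = inputs
        .iter()
        .map(|s| {
            parsed += 1; // runs lazily, once per item that try_fold pulls
            s.parse::<u32>()
        })
        // Same shape as above: keep folding while each item is Ok,
        // stop and return the error as soon as one is Err.
        .try_fold(0, |acc, item| item.map(|n| acc + n));
    (total, parsed)
}
```

With inputs like ["1", "x", "3"] the counter stops at 2: the third item is never parsed because try_fold bails out at the first Err. The same laziness means the file-based version never reads Cargo.lock if Cargo.toml fails.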

But...

Back to basics for a second: in the examples above, we've taken the requirement of "read 2 files from disk and sum their byte lengths" and we've ended up with a generic solution that can work with any number of files.

Whether we choose a for x in xs loop, or a chain of iterator methods, we've still leap-frogged from simple -> complex in a heartbeat - is there any middle-ground to explore?

Enter ?

The core issue we're having here is the ergonomics around reaching into a Result type. Because read_to_string forces us to deal with the fact that it can fail, it means we can't just access the values safely without a bit of syntax ceremony...

... but that's exactly what ? (the question mark operator) is here to solve.

If we laser-focus in on just solving the 2-file problem, our solution could be as simple as:

use std::fs::read_to_string;

fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml")?;
    let lock = read_to_string("Cargo.lock")?;
    Ok(toml.len() + lock.len())
}
  • Notice how on both of these lines, we add a ? directly after the read_to_string() call. This will 'unwrap' the value (if it was successful). So the toml and lock bindings here are both of type String - they have been 'unwrapped'. If either of those file reads were to fail though, we'd return early with the error. 👌

How it works: The error types line up!

This may seem like magic, but it's really just a case of our function signature having a return type that's suitable for all places where we've used ?.

So, our return type is:

Result<usize, std::io::Error>

whilst the return type of read_to_string() is

Result<String, std::io::Error>

The types of the values actually differ - our return type has usize for the Ok case whereas read_to_string has String. But for the ? operator to work here, it's only the Err part that needs to line up - and those do! 😎

The Rust compiler will analyze all uses of ? within a function body and determine if each of them is suitable for a possible 'early return' in your function.
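
Here's a small sketch of that idea outside of file reading. The add_mixed function is hypothetical; it uses ? on two calls whose Ok types differ from each other and from the return type, while the Err type (ParseIntError) is the same everywhere:

```rust
use std::num::ParseIntError;

// Hypothetical example: the two `?` uses produce a u8 and an i64,
// yet both compile because their error type matches the function's.
fn add_mixed(small: &str, large: &str) -> Result<i64, ParseIntError> {
    let a: u8 = small.parse()?; // Result<u8, ParseIntError>
    let b: i64 = large.parse()?; // Result<i64, ParseIntError>
    Ok(i64::from(a) + b)
}
```

If either string fails to parse (including a value too big for u8), the function returns that ParseIntError straight away.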

A de-sugared version of ? might look something like this:

use std::fs::read_to_string;

fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = match read_to_string("Cargo.toml") {
        Ok(toml) => toml,
        Err(err) => return Err(err)
    };
    // snip
}
  • Yep! The ? is just de-sugaring to an early return like this, not so magical after all!
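
The compiler error from the start of this post mentioned Option as well as Result, and the same early-return shape applies there. As a hedged sketch (second_word_initial is a made-up helper), a ? on an Option returns None early instead of an Err:

```rust
// Hypothetical helper: the first character of the second word, if there is one.
fn second_word_initial(text: &str) -> Option<char> {
    // nth(1) gives Option<&str>; `?` unwraps it, or returns None from the
    // whole function when there is no second word.
    let word = text.split_whitespace().nth(1)?;
    word.chars().next()
}
```

So second_word_initial("hello world") gives Some('w'), while second_word_initial("hello") returns None at the ? without ever reaching the last line.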

So that's it. The ? operator can be thought of as unwrap or return early -> with the return early bit being the most important part here. Every time you try to use ? you absolutely must consider the context of what an early return would mean.

That can differ greatly based on where you're using ? - something we'll cover in more detail in part 2.

Part 2...

This first post was just a primer to get you thinking about what using ? really means and why it's useful. It's fundamental Rust 🦀 knowledge that you need to have so that we can discuss the many, many more use cases in depth in part 2.

In part 2, we'll cover:

  • using ? in async blocks
  • using ? in closures
  • how ? causes err.into() to be called - allowing automatic conversion between error types

See you then 👋