Unwrap a value with the question mark operator in Rust
What's wrong with the following function in Rust?
fn measure_cargo_toml() -> usize {
    let toml = std::fs::read_to_string("Cargo.toml")?;
    toml.len()
}
Putting whether this is an optimal way to read a file's length to one side for a moment, let's take a look into why this causes a compile-time error.
std::fs::read_to_string("Cargo.toml") returns Result<String, std::io::Error>, and we tried to use the question mark operator ? to 'unwrap' the String part that we really care about.
However, we're faced with the following error:
cannot use the `?` operator in a function that returns `usize`
this function should return `Result` or `Option` to accept `?`
This suggests that ? is doing a bit more than simply 'unwrapping' a value - why is it mentioning the return type at all?
To understand, let's try to unwrap that String value a couple of other ways...
if let patterns
If we only ever cared about the success case, then technically we could have done the following instead:
fn measure_cargo_toml() -> usize {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        toml.len()
    } else {
        0
    }
}
Notes:
- (line 2) We're performing a pattern match on the return value from read_to_string here. The code within the if block will only execute if the pattern Ok(toml) matches the result.
- (line 5) It looks like our design is breaking down here. To keep the type checker happy we have to use a 'default' value of 0 when reading the file fails. This is far from ideal since 0 could actually be a valid value, but more so because it doesn't correctly encode what our function is capable of.
Rust can help us here though: we can use the type system to encode more meaning into our function signatures. In this case the presence or absence of a value can be communicated through the Option enum.
If we change our return type to Option<usize>, then we can return None in place of the 0, and callers of this function will have a better understanding of how it works.
fn measure_cargo_toml() -> Option<usize> {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        Some(toml.len())
    } else {
        None
    }
}
Notes:
- (line 1) We changed the return type from usize to Option<usize>.
- (line 3) Because of that, we now need to wrap our return value in the Some variant.
- (line 5) None here makes our program more explicit than the previous 'default' value of 0.
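To see what the Option return type means for a caller, here is a minimal sketch. It assumes a hypothetical measure_file variant that takes the path as a parameter (the post hard-codes Cargo.toml) and a hypothetical describe caller; neither name comes from the original code.

```rust
use std::fs;

// Hypothetical variant of the function above: the path is a parameter so
// the example does not depend on a Cargo.toml being present.
fn measure_file(path: &str) -> Option<usize> {
    if let Ok(contents) = fs::read_to_string(path) {
        Some(contents.len())
    } else {
        None
    }
}

// Hypothetical caller: the Option forces it to decide what a missing
// value means, instead of silently receiving a 0.
fn describe(path: &str) -> String {
    match measure_file(path) {
        Some(len) => format!("{} is {} bytes", path, len),
        None => format!("{} could not be read", path),
    }
}
```

The caller can no longer confuse "the file is empty" (Some(0)) with "the file could not be read" (None).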
That's a lot of boilerplate though...
Converting from Result<T, _> to Option<T> is a fairly common operation, and the standard library comes with an .ok() method to help us reduce a bit of boilerplate:
pub fn measure_cargo_toml() -> Option<usize> {
    std::fs::read_to_string("Cargo.toml") // Result<String, io::Error>
        .ok() // Option<String>
        .map(|toml| toml.len()) // Option<usize>
}
Notes:
- Calling .ok() gives us an Option<String>: Some(String) if read_to_string was successful, None otherwise. It's just a convenience method, and it allows us to call .map() to change the inner value, if it exists.
This is a pretty slick example; it's one of the reasons I enjoy Rust so much 🦀
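To watch both paths of .ok() and .map() without touching the file system, here is a small sketch that parses a string instead of reading a file. The doubled helper is hypothetical and only exists for illustration.

```rust
use std::num::ParseIntError;

// Hypothetical helper, not from the post: parse a string and double the
// number, discarding the reason for any failure.
fn doubled(input: &str) -> Option<i32> {
    let parsed: Result<i32, ParseIntError> = input.parse::<i32>();
    parsed
        .ok() // Ok(n) becomes Some(n), Err(_) becomes None
        .map(|n| n * 2) // the closure only runs when a value is present
}
```

Calling doubled("21") yields Some(42), while doubled("abc") yields None, and the ParseIntError that explained the failure is gone.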
But wait a minute - that's ignoring any errors that might occur when reading the file - what if we wanted to handle that error by passing it on to callers, or even just logging out the error?
To do this, we need to think about our design again. Currently, our signature is this:
pub fn measure_cargo_toml() -> Option<usize> {
    // snip
}
We can produce a usize value, or not - that's it. Our design doesn't incorporate error handling at all. So let's change that.
We can alter our return type to be the following instead:
-pub fn measure_cargo_toml() -> Option<usize> {
+pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
// snip
}
With this change, callers of our function will now be forced to handle the fact that our function can fail.
They may decide (as we did before) to ignore the error and use a default value, but at least this design gives them that choice. Option<usize> loses all information if something goes wrong, and we don't want that.
So, to update our implementation to forward any errors, we can do the following instead:
The re-wrapping method
pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    match std::fs::read_to_string("Cargo.toml") {
        Ok(toml) => Ok(toml.len()),
        Err(err) => Err(err),
    }
}
- When we match on the Ok variant, we get a local binding to the value contained inside, named toml here. Then, because our return type is also a Result, we need to Ok-wrap the value we get from calling toml.len().
- For the error case we're just re-wrapping the error, since it already matches our error type std::io::Error.
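With the Result-returning signature in place, the decision about what to do with a failure moves to the caller. The sketch below assumes a hypothetical path-taking measure_file (same re-wrapping body as above) plus two hypothetical callers: one that ignores the error, one that logs it first.

```rust
use std::fs;
use std::io;

// Hypothetical path-taking variant of the function above.
fn measure_file(path: &str) -> Result<usize, io::Error> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents.len()),
        Err(err) => Err(err),
    }
}

// Caller 1: ignore the error and fall back to a default value.
fn length_or_zero(path: &str) -> usize {
    measure_file(path).unwrap_or(0)
}

// Caller 2: log the error to stderr, then fall back to the same default.
fn length_with_logging(path: &str) -> usize {
    match measure_file(path) {
        Ok(len) => len,
        Err(err) => {
            eprintln!("could not read {}: {}", path, err);
            0
        }
    }
}
```

Both callers end up with a usize, but only because they chose to; the function itself no longer hides the failure.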
That's quite a bit of syntax and ceremony for what is essentially a 'change the value inside the box' operation though - but fear not, Rust has yet more convenience methods to help with situations like this.
The .map() method on Result
We previously saw .map being used to alter the value inside an Option - well, it turns out that we can perform the same type of operation on Result too.
pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    let result = std::fs::read_to_string("Cargo.toml");
    result.map(|s| s.len())
}
- read_to_string returns Result<String, std::io::Error> - that's very close to what we want. The error part matches our signature, but the value part does not.
- To change just the value then, without explicitly unwrapping it, we can use the .map method as seen here. This will only execute the closure on the value if the result is the Ok variant. It's like opening a box, checking that everything inside is ok, then replacing the value and closing the box again.
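The 'box' behaviour of .map on Result can be checked on in-memory values. This is a sketch with a hypothetical digits helper that parses a number and counts its decimal digits; only the Ok value is transformed, while an Err passes through unchanged.

```rust
use std::num::ParseIntError;

// Hypothetical helper: count the decimal digits of a parsed number.
fn digits(input: &str) -> Result<usize, ParseIntError> {
    input
        .parse::<u32>() // Result<u32, ParseIntError>
        .map(|n| n.to_string().len()) // Result<usize, ParseIntError>
}
```

digits("1234") gives Ok(4), whereas digits("12x") gives back the original ParseIntError without ever running the closure.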
So that's a couple of techniques that we can use to handle possible errors: return a different value when there is nothing to give, or forward the error to the caller.
Understanding use cases for the ? operator
So far though, our implementations have only contained 2 operations:
- read a file from disk into a string
- return the number of bytes that make up that string.
Because of this, we've been able to write a couple of different solutions that didn't require much more than 'reaching into a box' to change a value.
However, there are situations where this 'unwrapping' and 're-wrapping' of values is either tedious or just completely overkill for the task at hand.
For example, if we changed our requirement to instead return the combined length of Cargo.toml + Cargo.lock, then we might end up with a solution such as:
use std::fs::read_to_string;
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");
    let mut count = 0;
    match toml {
        Ok(s) => count += s.len(),
        Err(e) => return Err(e),
    }
    match lock {
        Ok(s) => count += s.len(),
        Err(e) => return Err(e),
    }
    Ok(count)
}
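- Notice how we need to check each result independently so that we can return early if either operation fails. Strictly speaking, both reads have already been attempted by the time we check; to avoid reading the second file after the first has failed, the second read_to_string call would need to move below the first match.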
We can of course remove some of that duplication too, using a for in loop:
use std::fs::read_to_string;
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");
    let mut count = 0;
    for result in [toml, lock] {
        match result {
            Ok(s) => count += s.len(),
            Err(e) => return Err(e),
        }
    }
    Ok(count)
}
- The Err(e) => return Err(e) arm is the key part - we're still returning early with an error should one occur.
Or, if you prefer a more functional approach, we can remove the loop and mutable variables with try_fold:
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            match item {
                Ok(string) => Ok(acc + string.len()),
                Err(e) => Err(e)
            }
        })
}
- try_fold allows us to reduce a collection down to a single value, but with the added advantage of supporting early returns. It works by continuing to call this closure every time the previous iteration returns an Ok.
- For us that means that line 9 would forward any errors coming from read_to_string - causing the try_fold to stop iterating and return the error. The error it returns matches our function signature, so we can keep everything in a nice neat package with no external variables to mutate.
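The early exit of try_fold can be observed directly. The following sketch swaps file reads for string parsing and uses a hypothetical sum_until_error helper that also reports how many items were pulled from the iterator, so the point where iteration stops becomes visible.

```rust
use std::num::ParseIntError;

// Hypothetical helper: sum the numbers in `inputs`, stopping at the first
// one that fails to parse. The second tuple element counts how many inputs
// were actually parsed, which exposes the early exit.
fn sum_until_error(inputs: &[&str]) -> (Result<i32, ParseIntError>, usize) {
    let mut attempts = 0;
    let result = inputs
        .iter()
        .map(|s| {
            attempts += 1; // runs lazily, once per item that try_fold pulls
            s.parse::<i32>()
        })
        .try_fold(0, |acc, item| match item {
            Ok(n) => Ok(acc + n),
            Err(e) => Err(e), // returning Err stops the fold
        });
    (result, attempts)
}
```

With the inputs ["1", "2", "x", "4"], the result is an Err and only three items are parsed; the "4" is never looked at.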
We can go one step further here too - notice that inside the try_fold closure we're doing the re-wrapping technique mentioned before. Well, since we're forwarding any error as-is, we can simplify this down to another .map call.
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            item.map(|string| acc + string.len())
        })
}
- Here the type of item is Result<String, io::Error>, so if there's an error the closure given to .map will not be executed - the error will be forwarded instead. That will cause the try_fold to exit early, which in turn will cause our outer function to also return.
But...
Back to basics for a second: in the examples above, we've taken the requirement of "read 2 files from disk and sum their byte lengths" and we've ended up with a generic solution that can work with any number of files.
Whether we choose a for x in xs loop, or a chain of iterator methods, we've still leap-frogged from simple -> complex in a heartbeat - is there any middle-ground to explore?
Enter ?
The core issue we're having here is the ergonomics around reaching into a Result type. Because read_to_string forces us to deal with the fact that it can fail, we can't just access the values safely without a bit of syntax ceremony...
... but that's exactly what ? (the question mark operator) is here to solve.
If we laser-focus in on just solving the 2-file problem, our solution could be as simple as:
use std::fs::read_to_string;
fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml")?;
    let lock = read_to_string("Cargo.lock")?;
    Ok(toml.len() + lock.len())
}
- Notice how on both of these lines, we add a ? directly after the read_to_string() call. This will 'unwrap' the value (if it was successful). So the toml and lock bindings here are both of type String - they have been 'unwrapped'. If either of those file reads were to fail though, we'd return early with the error. 👌
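Because the early return hands the error to the caller, a caller that itself returns a Result can use ? in exactly the same way. A minimal sketch, assuming a hypothetical path-taking measure_two_files and a hypothetical report function built on top of it:

```rust
use std::fs::read_to_string;
use std::io;

// Hypothetical path-taking variant of measure_cargo_files.
fn measure_two_files(first: &str, second: &str) -> Result<usize, io::Error> {
    let a = read_to_string(first)?; // a String, or an early return with the error
    let b = read_to_string(second)?; // only reached if the first read succeeded
    Ok(a.len() + b.len())
}

// Hypothetical caller: it also returns Result<_, io::Error>, so it can
// forward a failure from measure_two_files with `?` as well.
fn report(first: &str, second: &str) -> Result<String, io::Error> {
    let total = measure_two_files(first, second)?;
    Ok(format!("{} bytes in total", total))
}
```

An error from either read_to_string call travels up through both functions unchanged, with no match blocks in sight.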
How it works: The error types line up!
This may seem like magic, but it's really just a case of our function signature having a return type that's suitable for all places where we've used ?.
So, our return type is:
Result<usize, std::io::Error>
whilst the return type of read_to_string() is
Result<String, std::io::Error>
The types of the values actually differ - our return type has usize for the Ok case whereas read_to_string has String. But for the ? operator to work it's only the Err part that needs to line up - and those do! 😎
The Rust compiler will analyze all uses of ? within a function body and determine if each of them is suitable for a possible 'early return' in your function.
A de-sugared version of ? might look something like this:
use std::fs::read_to_string;
fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = match read_to_string("Cargo.toml") {
        Ok(toml) => toml,
        Err(err) => return Err(err),
    };
    // snip
}
- Yep! The ? is just de-sugaring to an early return like this - not so magical after all!
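To convince yourself the two forms behave the same, here is a side-by-side sketch using string parsing instead of file reads. Both function names are hypothetical, and the hand-written version is only an approximation: it leaves out the error conversion step that the real operator also performs, which makes no difference here because the error types are identical.

```rust
use std::num::ParseIntError;

// Version using `?` twice.
fn add_with_question_mark(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.parse::<i32>()?;
    let y = b.parse::<i32>()?;
    Ok(x + y)
}

// Hand-written approximation: each `?` becomes a match with an early return.
fn add_with_match(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.parse::<i32>() {
        Ok(x) => x,
        Err(err) => return Err(err),
    };
    let y = match b.parse::<i32>() {
        Ok(y) => y,
        Err(err) => return Err(err),
    };
    Ok(x + y)
}
```

For any pair of inputs the two functions return the same Ok or the same Err.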
So that's it. The ? operator can be thought of as 'unwrap or return early', with the 'return early' bit being the most important part here. Every time you try to use ? you absolutely must consider the context of what an early return would mean.
That can differ greatly based on where you're using ? - something we'll cover in more detail in part 2.
Part 2...
This first post was just a primer to get you thinking of what using ? really means and why it's useful. It's fundamental Rust 🦀 knowledge that you need to have so that we can discuss the many, many more use cases in depth in part 2.
In part 2, we'll cover:
- using ? in async blocks
- using ? in closures
- how ? causes err.into() to be called - allowing automatic conversion between error types
See you then 👋