What's wrong with the following function in Rust?
```rust
fn measure_cargo_toml() -> usize {
    let toml = std::fs::read_to_string("Cargo.toml")?;
    toml.len()
}
```
Putting whether this is an optimal way to read a file's length to one side for a moment, let's take a look at why this causes a compile-time error.

`std::fs::read_to_string("Cargo.toml")` returns `Result<String, std::io::Error>`, and we tried to use the question mark operator `?` to 'unwrap' the `String` part that we really care about.
We're faced with the following error, however:

cannot use the `?` operator in a function that returns `usize`
this function should return `Result` or `Option` to accept `?`
This is suggesting that the `?` is doing a bit more than simply 'unwrapping' a value - why's it mentioning the return type at all?

To understand, let's try to unwrap that `String` value a couple of other ways...
`if let` patterns

If we only ever cared about the success case, then technically we could have done the following instead:

```rust
fn measure_cargo_toml() -> usize {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        toml.len()
    } else {
        0
    }
}
```
Notes:

- We're using `if let` to pattern-match on the result of `read_to_string` here. The code within the `if` block will only execute if the pattern `Ok(toml)` is a match against the result.
- We return a default value of `0` when the file reading fails. This is far from ideal since `0` could actually be a valid value, but more so because it doesn't correctly encode what our function is capable of.

Rust can help us here though: we can use the type system to encode more meaning into our function signatures. In this case, the presence vs absence of a value can be communicated through the use of the `Option` enum.
If we change our return type to `Option<usize>`, then we can return a `None` in place of the `0`, and callers of this function will have a better understanding about how this works.
```rust
fn measure_cargo_toml() -> Option<usize> {
    if let Ok(toml) = std::fs::read_to_string("Cargo.toml") {
        Some(toml.len())
    } else {
        None
    }
}
```
Notes:

- The return type has changed from `usize` to `Option<usize>`.
- The length is now wrapped in the `Some` variant.
- Returning `None` here makes our program more explicit than the previous 'default' value of `0`.
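To make the difference between `0` and `None` concrete, here is a small sketch from the caller's point of view. The names `measure_file` and `describe` are hypothetical (they are not part of the original example); `measure_file` simply takes the path as a parameter so the snippet doesn't depend on a real `Cargo.toml`.

```rust
/// Hypothetical variant of `measure_cargo_toml` that takes the path as a
/// parameter, so the example works without a real `Cargo.toml`.
fn measure_file(path: &str) -> Option<usize> {
    if let Ok(contents) = std::fs::read_to_string(path) {
        Some(contents.len())
    } else {
        None
    }
}

/// Hypothetical caller: "empty file" and "unreadable file" are now
/// separate cases instead of both collapsing into `0`.
fn describe(len: Option<usize>) -> String {
    match len {
        Some(0) => String::from("file exists but is empty"),
        Some(n) => format!("file is {} bytes long", n),
        None => String::from("file could not be read"),
    }
}
```

With the earlier `usize` signature, the first and last arms of `describe` would have been indistinguishable.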
Converting from `Result<T, _>` to an `Option<T>` is a fairly common operation, and the standard library comes with a `.ok()` method to help us reduce a bit of boilerplate:
```rust
pub fn measure_cargo_toml() -> Option<usize> {
    std::fs::read_to_string("Cargo.toml") // Result<String, io::Error>
        .ok()                             // Option<String>
        .map(|toml| toml.len())           // Option<usize>
}
```
Notes:

- `.ok()` gives us an `Option<String>`, which is either `Some` or `None` depending on whether `read_to_string` was successful - it's just a convenience method and allows...
- ...`.map()` to change the inner value, if it exists.

This is a pretty slick example; it's one of the reasons I enjoy Rust so much 🦀
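The same `.ok()` + `.map()` chain works on any `Result`, not just file reads. As a minimal sketch, here it is applied to string parsing; `parse_and_double` is a made-up name used purely for illustration.

```rust
/// Parses a string as a number and doubles it, discarding any parse error.
/// (Hypothetical helper, used here only to show the chain in isolation.)
fn parse_and_double(input: &str) -> Option<i32> {
    input
        .parse::<i32>() // Result<i32, ParseIntError>
        .ok()           // Option<i32>
        .map(|n| n * 2) // Option<i32>, the closure only runs for `Some`
}
```

Calling it with `"21"` produces `Some(42)`, whereas `"abc"` produces `None` and the closure never runs.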
But wait a minute - that's ignoring any errors that might occur when reading the file - what if we wanted to handle that error by passing it on to callers, or even just logging out the error?
To do this, we need to think about our design again. Currently, our signature is this:
```rust
pub fn measure_cargo_toml() -> Option<usize> {
    // snip
}
```
We can produce a `usize` value, or not - that's it. Our design doesn't incorporate error handling at all. So let's change that.
We can alter our return type to be the following instead:
```rust
pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    // snip
}
```
With this change, callers of our function will now be forced to handle the fact that our function can fail. They may decide (as we did before) to ignore the error and use a default value, but at least this design gives them that choice. `Option<usize>` loses all information if something goes wrong, and we don't want that.
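As a rough sketch of what "giving callers the choice" might look like, the snippet below uses a hypothetical stand-in called `measure` that has the new return type and fails on demand. The two callers (`length_or_zero` and `length_report`) are also invented for illustration.

```rust
use std::io;

/// Hypothetical stand-in with the same return type as the new signature.
/// It fails on demand so that both outcomes can be exercised.
fn measure(should_fail: bool) -> Result<usize, io::Error> {
    if should_fail {
        Err(io::Error::new(io::ErrorKind::NotFound, "Cargo.toml not found"))
    } else {
        Ok(120)
    }
}

/// A caller that chooses to ignore the error and fall back to a default.
fn length_or_zero(should_fail: bool) -> usize {
    measure(should_fail).unwrap_or(0)
}

/// A caller that chooses to report the error instead.
fn length_report(should_fail: bool) -> String {
    match measure(should_fail) {
        Ok(len) => format!("{} bytes", len),
        Err(err) => format!("failed: {}", err),
    }
}
```

Both callers are valid; the point is that the decision now lives with them rather than being baked into our function.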
So, to update our implementation to forward any errors, we can do the following instead:
The re-wrapping method
```rust
pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    match std::fs::read_to_string("Cargo.toml") {
        Ok(toml) => Ok(toml.len()),
        Err(err) => Err(err),
    }
}
```
Notes:

- In the `Ok` variant, we get a local binding to the value contained inside, named `toml` here. Then, because our return type is also a `Result`, we need to `Ok`-wrap the value we get from calling `toml.len()`.
- In the `Err` variant, we forward the `std::io::Error` unchanged by re-wrapping it in `Err`.

That's quite a bit of syntax and ceremony for what is essentially a 'change the value inside the box' operation - but fear not, Rust has yet more convenience methods to help with situations like this.
The `.map()` method on `Result`

We previously saw `.map` being used to alter the value inside an `Option` - well, it turns out that we can perform the same type of operation on `Result` too.

```rust
pub fn measure_cargo_toml() -> Result<usize, std::io::Error> {
    let result = std::fs::read_to_string("Cargo.toml");
    result.map(|s| s.len())
}
```
Notes:

- `read_to_string` returns `Result<String, std::io::Error>` - that's very close to what we want. The error part matches our signature, but the value part does not.
- So we use the `.map` method as seen here. This will only execute the closure on the value type if the result is of the `Ok` variant. It's like opening a box, checking that everything inside is ok, then replacing the value and closing the box again.

So that's a couple of techniques that we can use to handle possible errors, and to return different values when there are none.
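To convince ourselves that re-wrapping with `match` and calling `.map` really are interchangeable, here is a small sketch using string parsing rather than file access, so it runs anywhere. The function names `double_with_match` and `double_with_map` are hypothetical.

```rust
use std::num::ParseIntError;

/// The 're-wrapping' style: unwrap with `match`, then wrap again.
fn double_with_match(input: &str) -> Result<i32, ParseIntError> {
    match input.parse::<i32>() {
        Ok(n) => Ok(n * 2),
        Err(err) => Err(err),
    }
}

/// The `.map` style: only the `Ok` value is touched; an `Err` passes through.
fn double_with_map(input: &str) -> Result<i32, ParseIntError> {
    input.parse::<i32>().map(|n| n * 2)
}
```

For any input, both functions return the same `Result`, whether it is an `Ok` or an `Err`.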
The `?` operator

So far though, our implementations have only contained 2 operations: reading a file, and taking the length of its contents.

Because of this, we've been able to write a couple of different solutions that didn't require much more than 'reaching into a box' to change a value.

However, there are situations where this 'unwrapping' and 're-wrapping' of values is either tedious or just completely overkill for the task at hand.

For example, if we changed our requirement to instead return the combined length of `Cargo.toml` + `Cargo.lock`, then we might end up with a solution such as:
```rust
use std::fs::read_to_string;

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");

    let mut count = 0;

    match toml {
        Ok(contents) => count += contents.len(),
        Err(e) => return Err(e),
    }

    match lock {
        Ok(contents) => count += contents.len(),
        Err(e) => return Err(e),
    }

    Ok(count)
}
```
We can of course remove some of that duplication too, using a `for in` loop:
```rust
use std::fs::read_to_string;

pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml");
    let lock = read_to_string("Cargo.lock");

    let mut count = 0;

    for result in [toml, lock] {
        match result {
            Ok(contents) => count += contents.len(),
            Err(e) => return Err(e),
        }
    }

    Ok(count)
}
```
Or, if you prefer a more functional approach, we can remove the loop and mutable variables with `try_fold`:
```rust
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            match item {
                Ok(string) => Ok(acc + string.len()),
                Err(e) => Err(e),
            }
        })
}
```
Notes:

- `try_fold` allows us to reduce a collection down to a single value, but with the added advantage of supporting early returns. It works by continuing to call this closure every time the previous iteration returns an `Ok`.
- An `Err` could be returned by any call to `read_to_string` - causing the `try_fold` to stop iterating and return the error. That error matches our function signature, so we can keep everything in a nice neat package with no external variables to mutate.

We can go one step further here too - notice that inside the `try_fold` closure we're doing the re-wrapping technique mentioned before. Well, since we're forwarding any error as-is, we can simplify this down to another `.map` call.
```rust
pub fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let paths = ["Cargo.toml", "Cargo.lock"];
    paths
        .iter()
        .map(|path| std::fs::read_to_string(path))
        .try_fold(0, |acc, item| {
            item.map(|string| acc + string.len())
        })
}
```
Notes:

- `item` is `Result<String, io::Error>`, so if there's an error the closure given to `.map` will not be executed - the error will be forwarded instead. That will cause the `try_fold` to exit early, which in turn will cause our outer function to also return.

Back to basics for a second: in the examples above, we've taken the requirement of "read 2 files from disk and sum their byte lengths" and we've ended up with a generic solution that can work with any number of files.

Whether we choose a `for x in xs` loop, or a chain of iterator methods, we've still leap-frogged from simple -> complex in a heartbeat - is there any middle-ground to explore?
Enter `?`

The core issue we're having here is the ergonomics around reaching into a `Result` type. Because `read_to_string` forces us to deal with the fact that it can fail, it means we can't just access the values safely without a bit of syntax ceremony...

... but that's exactly what `?` (the question mark operator) is here to solve.
If we laser-focus in on just solving the 2-file problem, our solution could be as simple as:
```rust
use std::fs::read_to_string;

fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = read_to_string("Cargo.toml")?;
    let lock = read_to_string("Cargo.lock")?;
    Ok(toml.len() + lock.len())
}
```
Notes:

- We've placed a `?` directly after each `read_to_string()` call. This will 'unwrap' the value (if it was successful). So the `toml` and `lock` bindings here are both of type `String` - they have been 'unwrapped'. If any of those file-reads were to fail though, we'd return early with the error. 👌

This may seem like magic, but it's really just a case of our function signature having a return type that's suitable for all places where we've used `?`.
So, our return type is:

`Result<usize, std::io::Error>`

whilst the return type of `read_to_string()` is:

`Result<String, std::io::Error>`
The types of the values actually differ - our return type has `usize` for the `Ok` case whereas `read_to_string` has `String`. But for the `?` operator to work it's only the `Err` part that needs to line up - and those do! 😎
The Rust compiler will analyze all uses of `?` within a function body and determine if each of them is suitable for a possible 'early return' in your function.
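Here's a minimal sketch of that "only the `Err` part needs to line up" idea, using number parsing instead of file reads so it is easy to run. The function `sum_mixed` is hypothetical; the three `Ok` types involved (`i32`, `u8` and `i64`) all differ, yet `?` is accepted because every `Result` shares the `ParseIntError` error type.

```rust
use std::num::ParseIntError;

/// Adds a parsed `i32` and a parsed `u8`, returning the total as an `i64`.
/// The `Ok` types all differ; only the `Err` type is shared.
fn sum_mixed(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let first: i32 = a.parse::<i32>()?; // Result<i32, ParseIntError>
    let second: u8 = b.parse::<u8>()?;  // Result<u8, ParseIntError>
    Ok(i64::from(first) + i64::from(second))
}
```

If either input fails to parse (for example `"300"`, which doesn't fit in a `u8`), the function returns that `ParseIntError` straight away.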
A de-sugared version of `?` might look something like this:

```rust
use std::fs::read_to_string;

fn measure_cargo_files() -> Result<usize, std::io::Error> {
    let toml = match read_to_string("Cargo.toml") {
        Ok(toml) => toml,
        Err(err) => return Err(err),
    };
    // snip
}
```
Here, `?` is just de-sugaring to an early return like this - not so magical after all!

So that's it. The `?` operator can be thought of as `unwrap or return early` -> with the `return early` bit being the most important part here. Every time you try to use `?` you absolutely must consider the context of what an early return would mean.

That can differ greatly based on where you're using `?` - something we'll cover in more detail in part 2.
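One way to see what an early return 'means' in practice is to watch which lines get skipped. The sketch below is purely illustrative: `sum_two` and its `parsed_count` parameter are invented here to make the skipped work observable.

```rust
use std::num::ParseIntError;

/// Parses both inputs and returns their sum. `parsed_count` records how many
/// inputs were parsed before the function returned (hypothetical bookkeeping,
/// added only so the early return can be observed).
fn sum_two(a: &str, b: &str, parsed_count: &mut u32) -> Result<i32, ParseIntError> {
    let first = a.parse::<i32>()?; // on Err, nothing below this line runs
    *parsed_count += 1;
    let second = b.parse::<i32>()?; // on Err, the increment below is skipped
    *parsed_count += 1;
    Ok(first + second)
}
```

If the second input is invalid, the counter ends up at `1` and the final `Ok(...)` is never reached. Any cleanup or bookkeeping placed after a `?` is subject to the same skipping.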
This first post was just a primer to get you thinking of what using `?` really means and why it's useful. It's fundamental Rust 🦀 knowledge that you need to have so that we can discuss the many, many more use cases in depth in part 2.
In part 2, we'll cover:

- `?` in async blocks
- `?` in closures
- how `?` causes `err.into()` to be called - allowing automatic conversion between error types

See you then 👋