Question: how to encourage such patterns within a team? I often find it difficult to do during code reviews; it tends to lead to unproductive arguments about "code style" and "preferences".
Funnily, these arguments do not happen when a linter pops a warning instead...
So many Rust articles are focused on people doing dark sorcery with "unsafe", and this is just normal everyday API design, which is far more practical for most people.
struct PizzaOrder {
size: PizzaSize,
toppings: Vec<Topping>,
crust_type: CrustType,
ordered_at: SystemTime,
}
The problem they want to address is partial equality: comparing orders while ignoring the ordered_at timestamp. To me, the real problem is throwing too many unrelated concerns into one struct. Ideally, instead of using destructuring to compare only the specific fields you care about, you'd decompose this into two structs:
#[derive(PartialEq, Eq)]
struct PizzaDetails {
size: PizzaSize,
toppings: Vec<Topping>,
crust_type: CrustType,
… // additional fields
}
#[derive(Eq)]
struct PizzaOrder {
details: PizzaDetails,
ordered_at: SystemTime,
}
impl PartialEq for PizzaOrder {
fn eq(&self, rhs: &Self) -> bool {
self.details == rhs.details
}
}
I get that this is a toy example meant to illustrate the point; there are certainly more complex cases where there's no clean boundary to split your struct across. But this should be the first tool you reach for.
If your function gets ownership of, or an exclusive reference to, an object, then you know for sure that this reference, for as long as it exists, is the only one in the entire program that can access this object (across all threads, 3rd party libraries, recursion, async, whatever).
References can't be null. Smart pointers can't be null. Not merely "can't" meaning not allowed and may throw or have a dummy value, but just can't. Wherever such type exists, it's already checked (often by construction) that it's valid and can't be null.
If your object's getter lends an immutable reference to its field, then you know the field won't be mutated by the caller (unless you've intentionally allowed mutable "holes" in specific places by explicitly wrapping them in a type that grants such access in a controlled way).
If your object's getter lends a reference, then you know the caller won't keep the reference for longer than the object's lifetime. If the type is not copyable/cloneable, then you know it won't even get copied.
If you make a method that takes ownership of `self`, then you know for sure that the caller won't be able to call any more methods on this object (e.g. `connection.close(); connection.send()` won't compile, as sketched after this list; `future.then(next)` only needs to support one listener, not an arbitrary number).
If you have a type marked as non-thread safe, then its instances won't be allowed in any thread-spawning functions, and won't be possible to send through channels that cross threads, etc. This is verified globally, across all code including 3rd party libraries and dynamic callbacks, at compile time.
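A minimal sketch of the consume-on-close point from the list above (Connection and its methods are hypothetical):
struct Connection;

impl Connection {
    fn send(&self, _data: &[u8]) { /* use the connection */ }
    fn close(self) { /* takes `self` by value: the connection is consumed */ }
}

fn main() {
    let conn = Connection;
    conn.send(b"hello");
    conn.close();
    // conn.send(b"again"); // error[E0382]: borrow of moved value: `conn`
}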
The same day Cloudflare had its unwrap fiasco, I found a bug in my code because of a slice that in certain cases went past the end of a vector. Switched it to use iterators and will definitely be more careful with slices and array indexes in the future.
Rust encourages that behavior. Sometimes rightly, but it does build a habit.
I spoke previously about how the Rust book uses the external rand crate as a key example and how that sets the tone for developers. I'm changing that stance somewhat, since it was a decent strategic choice to have crypto packages plug-and-play. But it still builds a habit.
One question about avoiding boolean parameters, I’ve just been using structs wrapping bools. But you can’t treat them like bools… you have to index into them like wrapper.0.
Is there a way to treat the enum-style replacement for bools like normal bools, or is it just done with matches! or match statements?
It’s probably not too important but if we could treat them like normal bools it’d feel nicer.
Like whenever I read posts like this, they're always fairly anecdotal. Sometimes there will even be posts about how large refactor x unlocked new capability y. But the rationale always reads somewhat retconned (or again, anecdotal*). It seems to me that maybe such continuous meta-analysis of one's own codebases would have great potential utility?
I'd imagine automated code smell checking tools can only cover so much at least.
* I hammer on about anecdotes, but I do recognize that sentiment matters. For example, if you're planning work, if something just sounds like a lot of work, that's already going to be impactful, even if that judgement is incorrect (since that misjudgment may never come to light).
Actually the From trait documentation is now extremely clear about when to implement it (https://doc.rust-lang.org/std/convert/trait.From.html#when-t...)
"Defensive programming" has multiple meanings. To the extent it means "avoid using _ as a catch-all pattern so that the compiler nags you if someone adds an enum arm you need to care about", "defensive" programming is good.
That said, I wouldn't use the word "defensive" to describe it. The term lacks precision. The above good practice ends up getting mixed up with the bad "defensive" practices of converting contract violations to runtime errors or just ignoring them entirely: the infamous pattern in Java codebases of scrawling the following line of graffiti all over the clean lines of your codebase:
if (someArgument == null) {
throw new NullPointerException("someArgument cannot be null");
}
That's just noise. If someArgument can't be null, let the program crash.
Needed file not found? Just return ""; instead.
Negative number where input must be contractually not negative? Clamp to zero.
Program crashing because a method doesn't exist? if not hasattr(self, "blah"): return None
People use the term "defensive" to refer to code like the above: programs that "defend" against crashes by misbehaving. These programs end up being flakier and harder to debug than programs that are "defensive" in the sense that they continually validate their assumptions and crash if they detect a situation that should be impossible.
The term "defensive programming" has been buzzing around social media the past few weeks and it's essential that we be precise that
1) constraint verification (preferably at compile time) is good; and
2) avoidance of crashes at runtime at all costs after an error has occurred is harmful.
The 'defensive' nature refers to the mindset of the programmer (like when guilty people are defensive when being asked a simple question), that he isn't sure of anything in the code at any point, so he needs to constantly check every invariant.
Enterprise code is full of it, and it can quickly lead to the program becoming like 50% error handling by volume, many of the errors being impossible to trigger because the app logic is validating a condition already checked in the validation layer.
Its presence usually betrays a lack of understanding of the code structure, or even worse, a faulty or often bypassed validation layer, which makes error checking in multiple places actually necessary.
One example is validating every parameter in every call layer, as if the act of passing things around has the ability to degrade information.
Ideally, imho, a struct is a dumb data holder - it is there to pass associated pieces of data together (or hold a complex unavoidable state change hidden from the user like Arc or Mutex).
All that is to say that adding a field to an existing struct and possibly populating it sparsely in some remote piece of code should not change existing behavior.
I wonder whether there's a way to communicate to whoever makes changes to the pizza details struct that it might have unintended consequences down the line.
Should one wrap PizzaDetails with PizzaComparator? Or better yet, provide it as a field in PizzaOrder? Or are we running into Java-esque territory of PizzaComparatorBuilderDefaultsConstructorFactory?
Should we introduce a domain specific PizzaFlavor right under PizzaDetails that copies over relevant fields from PizzaDetails, and PizzaOrder compares two orders by constructing and comparing their flavours instead? A lot of boilerplate.. but what is being considered important to the pizza flavor is being explicitly marked.
In a prod codebase I'd annotate this code with an "if you change X, change Y" pre-submit hook - this constraint appears to be external to the language itself and lives in the domain of "code changes over time". Protobufs successfully folded versioning into the language itself, though. Protobufs also have field annotations; an "{important_to_flavour=true}" field annotation would be useful here.
In other languages I get most of the benefits by sticking to functional programming practices and not mutating stuff all over the place. Rust's type system sort of encodes that, and maybe a little more, by making safe mutation a known non-interfering thing.
Yeah, that originally turned me off from the language entirely. I also changed my mind eventually.
Using ..Default::default() means “whatever additional fields are added later, I don’t care”. Which is great until someone needs to add a field to the struct, and they rely on the compiler to tell them all the places that don’t have a value for the field (so they can pass the right value depending on the situation.) Then the callers with Default are missed, and bugs can result.
Any time you say “I don’t care what happens in the future here”, you better have a good reason for that to be the case, IMO.
It's not too uncommon in other languages (sometimes under the name "immediately invoked function expression"), though depending on the language you may see lambdas involved. For example, here's one of the examples from the article ported to C++:
auto data = []() {
auto data = get_vector();
auto temp = compute_something();
data.insert_range(data.end(), temp);
std::ranges::sort(data);
return data;
}();
Yes. Defensively handle all the failure modes you know how to handle, but nothing else. If you're writing a service daemon and the user passes in a config filename that doesn't exist, crash and say why. Don't try to guess, or offer up a default config, or otherwise try to paper over the fact that the user asked you to do something impossible. Pretty much anything you try other than just crashing is guaranteed to be wrong.
And for the love of Knuth, don't freaking clamp to zero or otherwise convert inputs into semantically different value than specified. (Like, it's fine to load a string representation of a float into an IEEE754 datatype if you're not working with money or other exact values. But don't parse 256 as 255 and call it good enough. It isn't.)
1. It tells you which variable is null. While I think modern Java will include that detail in the exception, that's fairly new. So if you had `a.foo(b.getBar(), c.getBaz())`, was a, b, or c null? Who knows!
2. Putting it in the constructor meant you'd get a stack trace telling you where the null value came from, while waiting until it was used made it a lot harder to track down the source.
Not applicable to all situations, but it could be genuinely helpful, and has been to me.
In PHP a pattern I often employ is:
match ($value) {
static::VALUE_1 => ...,
static::VALUE_2 => ...,
default => static::unreachable()
}
Where unreachable is literally just:
static function unreachable() {
throw new Exception('Unreachable');
}
Now, we don't actually need the default match arm. If we just leave it off entirely, and someone passes in something we can't match, it'll throw a PHP error about unmatched cases.
But what I've found is that if I do that, then other programmers go in later and just add the case to the match statement so it runs. Which, of course, breaks other stuff downstream, because it's not a valid value we can actually use. Or worse: they add a default match arm that doesn't work! Just so the PHP interpreter doesn't complain.
But with this, now the reader knows "the person who wrote this considered what happens when something bad is passed in, and decided we can't handle it. There's probably a good reason for that". So they don't touch it.
Now, PHP has unique challenges because it's so dynamic. If someone passes in the wrong thing we might end up coercing null to zero and messing up calculations, or we might end up truncating a float or something. Ideally we prevent this with enums, but enums are a pain in the ass to write because of autoloading semantics (I don't want to write a whole new file for just a few cases)
A function must check its arguments. It cannot assume that the arguments are already checked (against its own requirements). This is regardless of what called it, or where the values came from.
> Underscore expressions, denoted with the symbol _, are used to signify a placeholder in a destructuring assignment.
[0]: https://doc.rust-lang.org/reference/expressions/underscore-e...
PartialEq and Eq for PizzaDetails is good. If there is a business function that computes whether or not someone orders the same thing, then that should start by projecting the details.
Was it a fiasco? Really? The rust unwrap call is the equivalent to C code like this:
int result = foo(…);
assert(result >= 0);
If that assert tripped, would you blame the assert? Of course not. Or blame C? No. If that assert tripped, it's doing its job by telling you there's a problem in the call to foo().
You can write buggy code in Rust just like you can in any other language.
It's not difficult to write the predicate same_details_as() and then it's obvious to reviewers if that's what we meant and discourages weird ad-hoc code which might stop working when the PizzaDetails is redefined.
I probably don't have enough context but whatever identity makes up "your order" goes in the PizzaOrder and not the PizzaDetails. The delivery address, for example, goes in the PizzaOrder.
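A minimal sketch of that predicate, reusing the PizzaOrder/PizzaDetails decomposition from the earlier comment:
impl PizzaOrder {
    // True when the pizzas themselves match; order metadata is ignored.
    fn same_details_as(&self, other: &Self) -> bool {
        self.details == other.details
    }
}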
We do the work that’s too large in scope for other teams to handle, and clearly documenting and enforcing best practices is one component of that. Part of that is maintaining a comprehensive linting suite, and the other part is writing documentation and educating developers. We also maintain core libraries and APIs, so if we notice many teams are doing the same thing in different ways, we’ll sit down and figure out what we can build that’ll accommodate most use cases.
https://docs.rs/itertools/latest/itertools/trait.Itertools.h...
As you said, calling first() is a far better approach.
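For illustration, a sketch of how first() couples the emptiness check and the access into a single Option (variable name borrowed from the article):
fn main() {
    let matching_users: Vec<String> = Vec::new();
    match matching_users.first() {
        Some(user) => println!("found {user}"),
        None => println!("no matching users"),
    }
}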
So much end user software tries to be "friendly" by just saying "An error occurred" regardless of what's wrong or whether you can do anything about it. Rust does better and it's a reminder that you can too.
enum MyType {
    ...
}
impl MyType {
    pub fn is_useable_in_this_way(&self) -> bool {
        // possibly ...
        match self { ... }
    }
}
and later:
pub fn use_in_that_way(e: MyType) {
    if e.is_useable_in_this_way() { ... }
}
Or if you hate all that, there's always:
if let MyType::Member(x) = e {
    ...
}
Anyone who has learned how to program Rust knows that unwrap() will panic if the thing you are unwrapping is Err/None. It's not unassuming at all. When the only person who could be tripped up by a method name is a complete newbie to the language, I don't think it's actually a problem.
Similarly, assert() isn't immediately obvious to a beginner that it will cause a panic. Heck, the term "panic" itself is non obvious to a beginner as something that will crash the program. Yet I don't hear anyone arguing that the panic! macro needs to be changed to crash_this_program. The fact of the matter is that a certain amount of jargon is inevitable in programming (and in my view this is a good thing, because it enables more concise communication amongst practitioners). Unwrap is no different than those other bits of jargon - perhaps non obvious when you are new, but completely obvious once you have learned it.
I have always been critical of the Rust hype, but unwrap is completely fine. It's an escape hatch and has legitimate uses. Some code is fine to just fail.
It is easy to spot during code review. I have never programmed Rust professionally, and even I would have asked about the unwrap in the Cloudflare code if I had reviewed it. You can even enforce not using unwrap at all through automated tooling.
For ints you can implement the Deref trait on structs, so you can treat YourType(u64) as a u64 without destructuring. I couldn't figure out a way to do that with YourType(bool).
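A minimal sketch of the Deref approach for the u64 case described above:
use std::ops::Deref;

struct YourType(u64);

impl Deref for YourType {
    type Target = u64;
    fn deref(&self) -> &u64 {
        &self.0
    }
}

fn main() {
    let n = YourType(41);
    // The wrapper now dereferences to its inner u64:
    assert_eq!(*n + 1, 42);
}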
If you read the postmortem, they talk in depth about what the issue really was - which from memory is that their software statically allocated room for 20 rules or something. And their database query unexpectedly returned more than 20 items. Oops!
I can see the argument for renaming unwrap to unwrap_or_panic. But no alternate spelling of .unwrap() would have saved cloudflare from their buggy database code.
You can go ahead and grep your codebase for this today, instead of waiting for an incident.
I'm a fairly new migrant from Java to C#, and when I do some kind of collection lookup, I still need to check whether the method will return a null, throw an exception, expect an out variable, or, worst of all, make up some kind of default. C#'s equivalent to unwrap seems to be '!' (or maybe .Val() or something?)
"The fromJust function extracts the element out of a Just and throws an error if its argument is Nothing."
The argument to the contrary is that reading the error out loud showed "the config initializer failing to return a valid configuration". A panic trace saying "config init failed" is a minor improvement.
If we’re gonna guess and point fingers, I think the configuration init should be doing its own panicking and logging when it blows up.
You might think that the Haskell behavior is “safer” in some sense, but there’s a huge gotcha: exceptions in pure code are the mortal enemy of lazy evaluation. Lazy evaluation means that an exception can occur after the catch block that surrounded the code in question has exited, so the exception isn’t guaranteed to get caught.
Exceptions can be ok in a monad like IO, which is what they’re intended for - the monad enforces an evaluation order. But if you use a partial function like fromJust in pure code, you have to be very careful about forcing evaluation if you want to be able to catch the exception it might generate. That’s antithetical to the goal of using exceptions - now you have to write to code carefully to make sure exceptions are catchable.
The bottom line is that for reliable code, you need to avoid fromJust and friends in Haskell as much you do in Rust.
The solution in both languages is to use a linter to warn about the use of partial functions: HLint for Haskell, Clippy for Rust. If Cloudflare had done that - and paid attention to the warning! - they would have caught that unwrap error of theirs at linting time. This is basically a learning curve issue.
Nope. Rust never makes any guarantees that code is panic-free. Quite the opposite. Rust crashes in more circumstances than C code does. For example, indexing past the end of an array is undefined behaviour in C. But if you try that in rust, your program will detect it and crash immediately.
More broadly, safe rust exists to prevent undefined behaviour. Most of the work goes to stopping you from making common memory related bugs, like use-after-free, misaligned reads and data races. The full list of guarantees is pretty interesting[1]. In debug mode, rust programs also crash on integer overflow and underflow. (Thanks for the correction!). But panic is well defined behaviour, so that's allowed. Surprisingly, you're also allowed to leak memory in safe rust if you want to. Why not? Leaks don't cause UB.
You can tell at a glance that unwrap doesn't violate safe rust's rules because you can call it from safe rust without an unsafe block.
[1] https://doc.rust-lang.org/reference/behavior-considered-unde...
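A tiny illustration of that difference, assuming default build settings:
fn main() {
    let v = vec![1, 2, 3];
    // v[10] would panic with "index out of bounds": defined behavior, not UB.
    // The non-panicking alternative returns an Option instead:
    assert_eq!(v.get(10), None);
}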
Fun facts: Rust's default panic handler also throws a runtime exception just like C++ and other languages. Rust also has catch blocks (std::panic::catch_unwind), but it's rarely used. By convention, panicking in Rust is typically reserved for unrecoverable errors, when the program should probably crash. And Result is used when you expect something to be fallible, like parsing user input.
You see catch_unwind in the unit test runner. (That’s why a failing test case doesn’t stop other unit tests running). And in web servers to return 50x. You can opt out of this behaviour with panic=abort in Cargo.toml, which also makes rust binaries a bit smaller.
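A minimal sketch of catch_unwind, assuming the default panic=unwind profile:
use std::panic;

fn main() {
    let result = panic::catch_unwind(|| {
        panic!("boom");
    });
    // The panic was turned into an Err instead of tearing down the process.
    assert!(result.is_err());
}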
All integer overflow, not just unsigned. Similarly, in release mode (by default) all integer overflow is fully defined as two's complement wrap.
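For illustration, the explicit arithmetic methods behave the same in every build profile, which sidesteps the debug/release difference:
fn main() {
    let x: u8 = 255;
    // Debug builds panic on `x + 1`; default release builds wrap to 0.
    assert_eq!(x.wrapping_add(1), 0);
    assert_eq!(x.checked_add(1), None);
    assert_eq!(x.saturating_add(1), 255);
}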
Almost every codebase I’ve ever worked in, in every programming language, could use better human readable error messages. But they’re notoriously hard to figure out ahead of time. You can only write good error messages for error cases you’ve thought through. And most error cases only become apparent when you stub your toe on them for real. Then you wonder how you missed it in the first place.
In any case, this sort of thing has nothing to do with rust.
It's an overstatement to say that these are "shunned by practically everybody". They're commonly used in scenarios where the author is confident that the failure condition can't happen due to e.g. a prior test or guard, or that failure can be reliably caught. For example, you can catch a `read` exception reliably in IO. They're also commonly used in GHCi or other interactive environments.
I disagree that the Rust perspective on unwrap is significantly different. Perhaps for beginning programmers in the language? But the same is often true of Haskell. Anyone with some experience should understand the risks of these functions, and if they don't, they'll eventually learn the hard way.
One pattern in Rust that may mislead beginners is that unwrap is often used on things like builders in example docs. The logic here is that if you're building some critical piece of infra that the rest of the program depends on, then if it fails to build the program is toast anyway, so letting it panic can make sense. These examples are also typically scenarios where builder failure is unusual. In that case, it's the author's choice whether to handle failure or just let it panic.
Also, when I say safety guarantees, I'm not talking about safe rust. I'm talking about Rust features that prevent bugs, like the borrow checker, types like Result and many others.
https://www.moderndescartes.com/essays/readability/
(I have not read this article closely, but it is about the right concept, so I provide it as a starting point since "readability" writ large can be an ambiguous term.)
You’re right that rust forces you to explicitly decide what to do with Result::Err. But that’s exactly what we see here. .unwrap() is handling the error case explicitly. It says “if this is an error, crash the program. Otherwise give me the value”. It’s a very useful function that was used correctly here. And it functioned correctly by crashing the program.
I don’t see the problem in this code, beyond it not giving a good error message as it crashed. As the old joke goes, “Task failed successfully.”
These roles don’t really have standard titles in the industry, as far as I’m aware. At Google we were part of the larger language/library/toolchain infrastructure org.
Much of what we did was quasi-political … basically coaxing and convincing people to adopt best practices, after first deciding what those practices are. Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Speaking to the original question, no, there were no teams just manually reading code and looking for mistakes. If buggy code could be detected in an automated way, then we’d do that and attempt to fix it everywhere. Otherwise we’d attempt to educate and get everyone to level up their code review skills.
> Half of the tips above were probably written by interested people from the engineering org at large and we provided the platform and helped them get it published.
Are you aware how those engineers established their recommendations? Did they maybe perform case studies? Or was it more just a distillation of lived experience type of deal?
I have a hobby.
Whenever I see the comment // this should never happen in code, I try to find out the exact conditions under which it could happen. And in 90% of cases, I find a way to do just that. More often than not, the developer just hasn’t considered all edge cases or future code changes.
In fact, the reason why I like this comment so much is that it often marks the exact spot where strong guarantees fall apart. Often, violations of implicit invariants that aren't enforced by the compiler are the root cause.
Yes, the compiler prevents memory safety issues, and the standard library is best-in-class. But even the standard library has its warts and bugs in business logic can still happen.
All we can work with are hard-learned patterns to write more defensive Rust code, learned throughout years of shipping Rust code to production. I’m not talking about design patterns here, but rather small idioms, which are rarely documented, but make a big difference in the overall code quality.
Here’s some innocent-looking code:
if !matching_users.is_empty() {
let existing_user = &matching_users[0];
// ...
}
What if you refactor it and forget to keep the is_empty() check? The problem is that the vector indexing is decoupled from checking the length. So matching_users[0] can panic at runtime if the vector is empty.
Checking the length and indexing are two separate operations, which can be changed independently. That’s our first implicit invariant that’s not enforced by the compiler.
If we use slice pattern matching instead, we’ll only get access to the element if the correct match arm is executed.
match matching_users.as_slice() {
    [] => todo!("What to do if no users found!?"),
    [existing_user] => {
        // Safe! Compiler guarantees exactly one element.
        // No need to index into the vector;
        // we can directly use `existing_user` here.
    }
    _ => Err(RepositoryError::DuplicateUsers),
}
Note how this automatically uncovered one more edge case: what if the list is empty? We hadn’t explicitly considered this case before. The compiler-enforced pattern matching requires us to think about all possible states! This is a common pattern in all robust Rust code: putting the compiler in charge of enforcing invariants.
Default
When initializing an object with many fields, it’s tempting to use ..Default::default() to fill in the rest. In practice, this is a common source of bugs. You might forget to explicitly set a new field later when you add it to the struct (thus using the default value instead, which might not be what you want), or you might not be aware of all the fields that are being set to default values.
Instead of this:
let foo = Foo {
field1: value1,
field2: value2,
..Default::default() // Implicitly sets all other fields
};
Do this:
let foo = Foo {
field1: value1,
field2: value2,
field3: value3, // Explicitly set all fields
field4: value4,
// ...
};
Yes, it’s slightly more verbose, but what you gain is that the compiler will force you to handle all fields explicitly. Now when you add a new field to Foo, the compiler will remind you to set it here as well and reflect on which value makes sense.
If you still prefer to use Default but don’t want to lose compiler checks, you can also destructure the default instance:
let Foo { field1, field2, field3, field4 } = Foo::default();
This way, you get all the default values assigned to local variables and you can still override what you need:
let foo = Foo {
field1: value1, // Override what you need
field2: value2, // Override what you need
field3, // Use default value
field4, // Use default value
};
This pattern gives you the best of both worlds: you keep the convenience of default values, and the compiler still forces you to revisit this code whenever a new field is added.
Completely destructuring a struct into its components can also be a defensive strategy for API adherence. For example, let’s say you’re building a pizza ordering system and have an order type like this:
struct PizzaOrder {
size: PizzaSize,
toppings: Vec<Topping>,
crust_type: CrustType,
ordered_at: SystemTime,
}
For your order tracking system, you want to compare orders based on what’s actually on the pizza - the size, toppings, and crust_type. The ordered_at timestamp shouldn’t affect whether two orders are considered the same.
Here’s the problem with the obvious approach:
impl PartialEq for PizzaOrder {
fn eq(&self, other: &Self) -> bool {
self.size == other.size
&& self.toppings == other.toppings
&& self.crust_type == other.crust_type
}
}
Now imagine your team adds a field for customization options:
struct PizzaOrder {
size: PizzaSize,
toppings: Vec<Topping>,
crust_type: CrustType,
ordered_at: SystemTime,
extra_cheese: bool, // New field added
}
Your PartialEq implementation still compiles, but is it correct? Should extra_cheese be part of the equality check? Probably yes - a pizza with extra cheese is a different order! But you’ll never know because the compiler won’t remind you to think about it.
Here’s the defensive approach using destructuring:
impl PartialEq for PizzaOrder {
fn eq(&self, other: &Self) -> bool {
let Self {
size,
toppings,
crust_type,
ordered_at: _,
} = self;
let Self {
size: other_size,
toppings: other_toppings,
crust_type: other_crust,
ordered_at: _,
} = other;
size == other_size && toppings == other_toppings && crust_type == other_crust
}
}
Now when someone adds the extra_cheese field, this code won’t compile anymore. The compiler forces you to decide: should extra_cheese be included in the comparison or explicitly ignored with extra_cheese: _?
This pattern works for any trait implementation where you need to handle struct fields: Hash, Debug, Clone, etc. It’s especially valuable in codebases where structs evolve frequently as requirements change.
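For example, here is a sketch of the same trick applied to Hash, assuming PizzaSize, Topping, and CrustType all implement Hash. Ignoring ordered_at here also keeps Hash consistent with the PartialEq above:
use std::hash::{Hash, Hasher};

impl Hash for PizzaOrder {
    fn hash<H: Hasher>(&self, state: &mut H) {
        let Self {
            size,
            toppings,
            crust_type,
            ordered_at: _, // explicitly ignored, matching PartialEq
        } = self;
        size.hash(state);
        toppings.hash(state);
        crust_type.hash(state);
    }
}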
From Impls That Are Really TryFrom
Sometimes there’s no conversion that will work 100% of the time. That’s fine. When that’s the case, resist the temptation to offer a From implementation out of habit; use TryFrom instead.
Here’s an example of TryFrom in disguise:
impl From<&DetectorStartupErrorReport> for DetectorStartupErrorSubject {
fn from(report: &DetectorStartupErrorReport) -> Self {
let postfix = report
.get_identifier()
.or_else(get_binary_name)
.unwrap_or_else(|| UNKNOWN_DETECTOR_SUBJECT.to_string());
Self(StreamSubject::from(
format!("apps.errors.detectors.startup.{postfix}").as_str(),
))
}
}
The unwrap_or_else is a hint that this conversion can fail in some way. We set a default value instead, but is it really the right thing to do for all callers? This should be a TryFrom implementation instead, making the fallible nature explicit. We fail fast instead of continuing with a potentially flawed business logic.
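A sketch of the TryFrom rewrite; MissingIdentifier is a hypothetical error type, and the other names follow the snippet above:
impl TryFrom<&DetectorStartupErrorReport> for DetectorStartupErrorSubject {
    type Error = MissingIdentifier;

    fn try_from(report: &DetectorStartupErrorReport) -> Result<Self, Self::Error> {
        let postfix = report
            .get_identifier()
            .or_else(get_binary_name)
            .ok_or(MissingIdentifier)?;
        Ok(Self(StreamSubject::from(
            format!("apps.errors.detectors.startup.{postfix}").as_str(),
        )))
    }
}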
It’s tempting to use match in combination with a catch-all pattern like _ => {}, but this can haunt you later. The problem is that you might forget to handle a new case that was added later.
Instead of:
match self {
Self::Variant1 => { ... }
Self::Variant2 => { ... }
_ => { /* catch-all */ }
}
Use:
match self {
Self::Variant1 => { ... }
Self::Variant2 => { ... }
Self::Variant3 => { ... }
Self::Variant4 => { ... }
}
By spelling out all variants explicitly, the compiler will warn you when a new variant is added, forcing you to handle it. Another case of putting the compiler to work.
If the code for two variants is the same, you can group them:
match self {
Self::Variant1 => { ... }
Self::Variant2 => { ... }
Self::Variant3 | Self::Variant4 => { /* shared logic */ }
}
_ Placeholders for Unused Variables
Using _ as a placeholder for unused variables can lead to confusion. For example, you might get confused about which variable was skipped. That’s especially true for boolean flags:
match self {
Self::Rocket(_, _, ..) => { ... }
}
In the above example (a tuple-like variant), it’s not clear which fields were skipped and why. Better to use a struct-like variant with descriptive names for the fields that are not used:
match self {
Self::Rocket { has_fuel: _, has_crew: _, .. } => { ... }
}
Even if you don’t use the variables, it’s clear what they represent and the code becomes more readable and easier to review without inline type hints.
If you only want your data to be mutable temporarily, make that explicit.
let mut data = get_vec();
data.sort();
let data = data; // Shadow to make immutable
// Here `data` is immutable.
This pattern is often called “temporary mutability” and helps prevent accidental modifications after initialization. See the Rust unofficial patterns book for more details.
You can go one step further and do the initialization part in a scope block:
let data = {
let mut data = get_vec();
data.sort();
data // Return the final value
};
// Here `data` is immutable
This way, the mutable variable is confined to the inner scope, making it clear that it’s only used for initialization. In case you use any temporary variables during initialization, they won’t leak into the outer scope. In our case above, there were none, but imagine if we had a temporary vector to hold intermediate results:
let data = {
let mut data = get_vec();
let temp = compute_something();
data.extend(temp);
data.sort();
data // Return the final value
};
Here, temp is only accessible within the inner scope, which prevents it from being used accidentally later on.
This is especially useful when you have multiple temporary variables during initialization that you don’t want accessible in the rest of the function. The scope makes it crystal clear that these variables are only meant for initialization.
The following pattern is only truly helpful for libraries and APIs that need to be robust against future changes. In such a case, you want to ensure that all instances of a type are created through a constructor function that enforces validation logic. Because without that, future refactorings can easily lead to invalid states.
For application code, it’s probably best to keep things simple. You typically have all the call sites under control and can ensure that validation logic is always called.
Let’s say you have a simple type like the following:
pub struct S {
pub field1: String,
pub field2: u32,
}
Now you want to add validation logic to ensure invalid states are never created. One pattern is to return a Result from the constructor:
impl S {
pub fn new(field1: String, field2: u32) -> Result<Self, String> {
if field1.is_empty() {
return Err("field1 cannot be empty".to_string());
}
if field2 == 0 {
return Err("field2 cannot be zero".to_string());
}
Ok(Self { field1, field2 })
}
}
But nothing stops someone from bypassing your validation by creating an instance directly:
let s = S {
field1: "".to_string(),
field2: 0,
};
This should not be possible! It is our implicit invariant that’s not enforced by the compiler: the validation logic is decoupled from struct construction. These are two separate operations, which can be changed independently and the compiler won’t complain.
To force external code to go through your constructor, add a private field:
pub struct S {
pub field1: String,
pub field2: u32,
_private: (), // This prevents external construction
}
impl S {
pub fn new(field1: String, field2: u32) -> Result<Self, String> {
if field1.is_empty() {
return Err("field1 cannot be empty".to_string());
}
if field2 == 0 {
return Err("field2 cannot be zero".to_string());
}
Ok(Self { field1, field2, _private: () })
}
}
Now code outside your module cannot construct S directly because it cannot access the _private field. The compiler enforces that all construction must go through your new() method, which includes your validation logic!
Note that the underscore prefix is just a naming convention to indicate the field is intentionally unused; it’s the lack of pub that makes it private and prevents external construction.
For libraries that need to evolve over time, you can also use the #[non_exhaustive] attribute instead:
#[non_exhaustive]
pub struct S {
pub field1: String,
pub field2: u32,
}
This has the same effect of preventing construction outside your crate, but also signals to users that you might add more fields in the future. The compiler will prevent them from using struct literal syntax, forcing them to use your constructor.
There’s a big difference between these two approaches:
#[non_exhaustive] only works across crate boundaries: it prevents construction outside your crate.
_private works at the module boundary: it prevents construction outside the module, even within the same crate.
On top of that, some developers find _private: () more explicit about intent: “this struct has a private field that prevents construction.”
With #[non_exhaustive], the primary intent is signaling that fields might be added in the future, and preventing construction is more of a side effect.
But what about code within the same module? With the patterns above, code in the same module can still bypass your validation:
// Still compiles in the same module!
let s = S {
field1: "".to_string(),
field2: 0,
_private: (),
};
Rust’s privacy works at the module level, not the type level. Anything in the same module can access private items.
If you need to enforce constructor usage even within your own module, you need a more defensive approach using nested private modules:
mod inner {
pub struct S {
pub field1: String,
pub field2: u32,
_seal: Seal,
}
// This type is private to the inner module
struct Seal;
impl S {
pub fn new(field1: String, field2: u32) -> Result<Self, String> {
if field1.is_empty() {
return Err("field1 cannot be empty".to_string());
}
if field2 == 0 {
return Err("field2 cannot be zero".to_string());
}
Ok(Self { field1, field2, _seal: Seal })
}
}
}
// Re-export for public use
pub use inner::S;
Now even code in your outer module cannot construct S directly because Seal is trapped in the private inner module. Only the new() method, which lives in the same module as Seal, can construct it. The compiler guarantees that all construction, even internal construction, goes through your validation logic.
You could still access the public fields directly, though:
let mut s = S::new("valid".to_string(), 42).unwrap();
s.field1 = "".to_string(); // Still possible to mutate fields directly
To prevent that, you can make the fields private and provide getter methods instead:
mod inner {
pub struct S {
field1: String,
field2: u32,
_seal: Seal,
}
struct Seal;
impl S {
pub fn new(field1: String, field2: u32) -> Result<Self, String> {
if field1.is_empty() {
return Err("field1 cannot be empty".to_string());
}
if field2 == 0 {
return Err("field2 cannot be zero".to_string());
}
Ok(Self { field1, field2, _seal: Seal })
}
pub fn field1(&self) -> &str {
&self.field1
}
pub fn field2(&self) -> u32 {
self.field2
}
}
}
Now the only way to create an instance of S is through the new() method, and the only way to access its fields is through the getter methods.
To enforce validation through constructors: add a private field like _private: (), or use #[non_exhaustive].
The key insight is that by making construction impossible without access to a private type, you turn your validation logic from a convention into a guarantee enforced by the compiler. So let’s put that compiler to work!
#[must_use] on Important Types
The #[must_use] attribute is often neglected. That’s sad, because it’s such a simple yet powerful mechanism to prevent callers from accidentally ignoring important return values.
#[must_use = "Configuration must be applied to take effect"]
pub struct Config {
    timeout: Duration,
    // ...
}
impl Config {
    pub fn new() -> Self {
        Self { timeout: Duration::from_secs(60) } // some sensible default
    }
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.timeout = timeout;
        self
    }
}
Now if someone creates a Config but forgets to use it, the compiler will warn them (even with a custom message!):
let config = Config::new();
config.with_timeout(Duration::from_secs(30));
// Warning: Configuration must be applied to take effect
// (the Config returned by with_timeout is dropped unused)
// Correct usage:
let config = Config::new()
.with_timeout(Duration::from_secs(30));
apply_config(config);
This is especially useful for guard types that need to be held for their lifetime and results from operations that must be checked. The standard library uses this extensively. For example, Result is marked with #[must_use], which is why you get warnings if you don’t handle errors.
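As a sketch of the guard-type case (Guard and its methods are made up for illustration):
#[must_use = "the resource is released as soon as the guard is dropped"]
struct Guard;

impl Guard {
    fn acquire() -> Guard {
        Guard
    }
}

impl Drop for Guard {
    fn drop(&mut self) { /* release the resource here */ }
}

fn main() {
    Guard::acquire(); // Warning: unused `Guard` that must be used
    let _guard = Guard::acquire(); // Correct: held until the end of scope
}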
Boolean parameters make code hard to read at the call site and are error-prone. We all know the scenario where we’re sure this will be the last boolean parameter we’ll ever add to a function.
// Too many boolean parameters
fn process_data(data: &[u8], compress: bool, encrypt: bool, validate: bool) {
...
}
// At the call site, what do these booleans mean?
process_data(&data, true, false, true); // What does this do?
It’s impossible to understand what this code does without looking at the function signature. Even worse, it’s easy to accidentally swap the boolean values.
Instead, use enums to make the intent explicit:
enum Compression {
Strong,
Medium,
None,
}
enum Encryption {
AES,
ChaCha20,
None,
}
enum Validation {
Enabled,
Disabled,
}
fn process_data(
data: &[u8],
compression: Compression,
encryption: Encryption,
validation: Validation,
) {
...
}
// Now the call site is self-documenting
process_data(
&data,
Compression::Strong,
Encryption::None,
Validation::Enabled
);
This is much more readable and the compiler will catch mistakes if you pass the wrong enum type. You will notice that the enum variants can be more descriptive than just true or false. And more often than not, there are more than two meaningful options; especially for programs which grow over time.
For functions with many options, you can configure them using a parameter struct:
struct ProcessDataParams {
compression: Compression,
encryption: Encryption,
validation: Validation,
}
impl ProcessDataParams {
// Common configurations as constructor methods
pub fn production() -> Self {
Self {
compression: Compression::Strong,
encryption: Encryption::AES,
validation: Validation::Enabled,
}
}
pub fn development() -> Self {
Self {
compression: Compression::None,
encryption: Encryption::None,
validation: Validation::Enabled,
}
}
}
fn process_data(data: &[u8], params: ProcessDataParams) {
...
}
// Usage with preset configurations
process_data(&data, ProcessDataParams::production());
// Or customize for specific needs
process_data(&data, ProcessDataParams {
compression: Compression::Medium,
encryption: Encryption::ChaCha20,
validation: Validation::Enabled,
});
This approach scales much better as your function evolves. Adding new parameters doesn’t break existing call sites, and you can easily add defaults or make certain fields optional. The preset methods also document common use cases and make it easy to use the right configuration for different scenarios.
Rust is often criticized for not having named parameters, but using a parameter struct is arguably even better for larger functions with many options.
Many of these patterns can be enforced automatically using Clippy lints. Here are the most relevant ones:
| Lint | Description |
|---|---|
| clippy::indexing_slicing | Prevents direct indexing into slices and vectors. |
| clippy::fallible_impl_from | Warns about From implementations that can panic and should be TryFrom instead. |
| clippy::wildcard_enum_match_arm | Disallows wildcard _ patterns in match arms. |
| clippy::unneeded_field_pattern | Identifies when you’re ignoring too many struct fields with .. unnecessarily. |
| clippy::fn_params_excessive_bools | Warns when a function has too many boolean parameters (4 or more by default). |
| clippy::must_use_candidate | Suggests adding #[must_use] to types that are good candidates for it. |
You can enable these in your project by adding them at the top of your crate, e.g.
#![deny(clippy::indexing_slicing)]
#![deny(clippy::fallible_impl_from)]
#![deny(clippy::wildcard_enum_match_arm)]
#![deny(clippy::unneeded_field_pattern)]
#![deny(clippy::fn_params_excessive_bools)]
#![deny(clippy::must_use_candidate)]
Or in your Cargo.toml:
[lints.clippy]
indexing_slicing = "deny"
fallible_impl_from = "deny"
wildcard_enum_match_arm = "deny"
unneeded_field_pattern = "deny"
fn_params_excessive_bools = "deny"
must_use_candidate = "deny"
Defensive programming in Rust is about leveraging the type system and compiler to catch bugs before they happen. By following these patterns, you can move entire classes of mistakes from runtime to compile time and make your implicit invariants explicit.
It’s a skill that doesn’t come naturally and it’s not covered in most Rust books, but knowing these patterns can make the difference between code that works but is brittle, and code that is robust and maintainable for years to come.
Remember: if you find yourself writing // this should never happen, take a step back and ask how the compiler could enforce that invariant for you instead. The best bug is the one that never compiles in the first place.