But the example seems backwards to me: unless every callsite that locks any item always locks the big global lock first (probably not true, because if you serialize all item access on a global lock then a per-item lock serves no purpose...), aren't you begging for priority inversions by acquiring the big global lock before you acquire the item lock?
My only gripe is missing the obvious opportunity for Ferengi memes ("rules of acquisition") :D :D
>This is a deliberate design decision. lock_tree uses a DAG, which lets you declare that branches A and B are independent: neither needs to come before the other. Sounds great, but it has a subtle problem: if thread 1 acquires A then B, and thread 2 acquires B then A, and both orderings are valid in the DAG, you have a deadlock that the compiler happily approved.
Would it be possible to build one at compile time? Static levels seem like they won't let you share code without level-collaboration, so that might be kinda important for larger-scale use.
I don't know enough about Rust's type system to know if that's possible though. Feels like it's pushing into "maybe" territory, like maybe not with just linear types but what about proc macros?
I can definitely see why it's easier to build this way though, and for some contexts that limitation seems entirely fine. Neat library, and nice post :)
One thing I didn't see in the post or the repo: does this work with async code?
I couldn't find the "search" button on Codeberg, and tests/integration.rs didn't have any async.
For embedded, I have had my eye on https://github.com/embassy-rs/embassy (which has an async runtime for embedded) and would love a nice locking crate to go with it.
A pattern I've definitely both seen and used is
let guard1 = datastructure_containing_the_whole_world.lock();
let guard2 = guard1.subset_of_that_datastructure.lock();
guard1.unlock();
// Do expensive work
guard2.unlock();
Which works to parallelize work so long as guard2 isn't contended... and at least ensures correctness and forward progress the rest of the time.

There's no priority inversion possible because locks can only ever be held in decreasing order of priority: you can't acquire a low-priority lock and then a high-priority lock, since your remaining MutexKey won't have the right level.
First, lock acquisition seems to be a blocking method. And I don't see a `try_lock` method, so the naive pattern of spinning on `try_lock` and yielding on failure won't work. It'll still work in an async function, you'll just block the executor if the lock is contested and be sad.
Second, the key and guard types are not Send, otherwise it would be possible to send a key of a lower level to a thread that has already acquired a lock of a higher level, allowing deadlocks. (Or to pass a mutex guard of a higher level to a thread that has a key of a lower level.)
Therefore, holding a lock or a key across an await point makes your Future not Send.
Technically, this is fine. Nothing about Rust async in general requires that your Futures are Send. But in practice, most of the popular async runtimes require this. So if you want to use this with Tokio, for example, then you have to design your system to not hold locks or keys across await points.
This first restriction seems like it could be improved with the addition of an `AsyncLockable` trait. But the second restriction seems to me to be fundamental to the design.
Supercomputing is another domain that has deep insights into scalable systems that is famously so insular that ideas rarely cross over into mainstream scalable systems. My detour through supercomputing probably added as much to my database design knowledge as anything I actually did in databases.
Is it easy, or hard?
Does it demand a new paradigm at large, or is it only an inconvenience in the few places it's used?
Because if the answer is "turns the language into Haskell", then it's a big NOPE!
Also to note, regarding "future not send": in tokio codebases where the general expectation is that futures will be Send, enabling the clippy lint "future_not_send" is extremely helpful in avoiding these kinds of issues, and also in keeping the error localized to the offending function, rather than miles away somewhere it happens to be getting indirectly spawned or whatever: https://rust-lang.github.io/rust-clippy/stable/index.html?se...
One thing that I think does affect things is that language-design discussions tend to be concentrated in their own communities, based on the programming language itself, rather than in one "programming language discussions" place where everyone can more easily cross-pollinate ideas across languages. Luckily, there are some individuals who move between communities without effort, which does lead to a few ideas making it across, but it feels like we're missing out on so much evolution and so many ideas from various languages across the ecosystem.
http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-...
Mutex::new(AppConfig::default());
...is meant to be acquiring a mutex protecting some global config object, yes? That's what I'm calling a "global lock".

> There's no priority inversion possible because locks can only ever be held in decreasing orders of priority
T1                    T2
--                    --
small_lock();
                      big_lock();
                      small_lock();  <--- spins waiting for T1
...and now any other thread that needs big_lock() spins waiting for T2 to release it, but T2 is spinning waiting for T1 to release the (presumably less critical) small lock.

If small_lock is never ever acquired without acquiring big_lock first, small_lock serves no purpose and should be deleted from the program.
Oh, many of these travelers spend a lot of effort!
> Models can be pulled along other axes, however, such as whether memory locations must be tagged in order to be used in a transaction or not, etc. Haskell requires this tagging (via TVars) so that side-effects are evident in the type system as with any other kind of monad. We quickly settled on unbounded transactions.
Snip
> In hindsight, this was a critical decision that had far-reaching implications. And to be honest, I now frequently doubt that it was the right call. We had our hearts in the right places, and the entire industry was trekking down the same path at the same time (with the notable exception of Haskell)
So basically it's not that TM isn't workable; rather, unbounded TM is likely a fool's errand, while Haskell's is bounded TM that requires explicit annotation of the memory that will participate in atomicity.
Look at the API: if big_lock and small_lock are at the same level, you would need to acquire both locks simultaneously, which the library accomplishes by sorting* the locks and then acquiring them. If you fail to acquire small_lock, big_lock isn't held (it's an all-or-nothing situation). This exact scenario is explained in the link, by the way. You can't bypass the "acquire simultaneously" API, because you only have a key for one level.
Your terminology is also off. A lock around a configuration is typically called a fine-grained lock, unless you're holding that lock for large swathes of the program. "Global", as it refers to locking, doesn't refer to the visibility of the lock or to the fact that it does mutual exclusion. For example, a lock on a database that only allows one thread into a hot-path operation at a time is a global lock.
* Sorting is done based on a global construction order grabbed at construction: there's a singleton atomic that hands out IDs for each mutex.
Mutex::new(AppConfig::default()) might very well be a small, leaf mutex.
That's exactly the problem. It sounds like one aggregate person. It's quite unpleasant to read the same turns of phrase again and again and again, especially when it means that the author copped out of writing it themselves.
In fairness I think in this case they mostly did write it themselves.
Email the mods about it rather than replying, subject "Accusation of AI in FP comment" or whatever. It's a guidelines violation to make the accusation in a comment rather than to them by email, and they have tools to deal with it!
The closest actually human style to LLM writing is obnoxious marketing speak. So that also sucks.
So many people who are not great writers lean on LLMs to write, but aren't good enough to see how bad it is. They should be criticised for this. Either use them and be good enough to make it read as human, or just don't use them. No free lunch.
NOTE
Hello r/rust! Thank you for the interest in this work and for the delightful conversation ❤️
I hate deadlocks. Maybe you do too. Back at Fission, whenever someone would suggest a mutex we'd start a chant of "I say mutex, you say deadlock: Mutex! DEADLOCK! Mutex! DEADLOCK!". Deadlocks lurk: perfectly invisible in code review, happy to pass CI a thousand times, then they lock your system up at 3am under a request pattern that no one anticipated.
They have their own tradeoffs, but I miss TVars from Haskell. AFAICT there's no way to do proper TVars in languages that have no way to enforce purity. We "should" prefer lock-free programming, but mutexes are very common in standard Rust style. I often hear that actors eliminate deadlocks, but as someone who's written her fair share of Elixir, this is 100% a lie (though they are less frequent).
Rust famously catches data races at compile time, but for deadlocks? You get a Mutex, a pat on the back, and "good luck, babe". There are some fairly good tools that help analyze your code, but I want feedback during development. I've been thinking about a better approach to this problem for a while, looked at a bunch of other attempts, and have come up with what I hope is a decent ergonomic balance that covers many common use cases in Rust: surelock, a deadlock-freedom library. If your code compiles, it doesn't deadlock. No Result, no Option, no runtime panic on the lock path. Every acquisition is either proven safe by the compiler or rejected at build time [1].
WARNING
This is a pre-release. I think the core design is solid, but I wouldn't be surprised if there are rough edges. Feedback and contributions welcome!
- Multi-lock acquisition at the same level (LockSet)
- Compile-time level ordering (Level<N>)
- unsafe is confined to the raw mutex internals
- no_std compatible, zero required runtime dependencies [2]

The honest answer is that being careful doesn't scale. It's easy to shoot yourself in the foot while composing code. But wait, isn't this the kind of thing that Rust is supposed to help us with? The borrow checker catches data races at compile time; why shouldn't we expect the same for deadlocks?

The fundamental problem is well-understood. Coffman et al. identified four necessary conditions for deadlock back in 1971:
| Condition | Meaning |
|---|---|
| Mutual exclusion | At least one resource is held exclusively |
| Hold and wait | A thread holds one lock while waiting for another |
| No preemption | Locks canβt be forcibly taken away |
| Circular wait | A cycle exists in the wait-for graph |
Break any one of these, and deadlocks become impossible. Mutual exclusion is the whole point of a mutex. Preemption introduces its own class of bugs. That leaves hold-and-wait and circular wait.
QUOTE
A language that doesnβt affect the way you think about programming is not worth knowing.
- Alan Perlis
We identified the solution space over 50 years ago(!). It's a thorny enough problem that we have no single agreed solution, and yet mutexes are sufficiently useful that we still use them. This suggests that until something comes along that completely replaces mutexes, we're in the domain of improving the ergonomics of using mutexes safely.
Only rarely do safety techniques exactly match safe use. Type systems somewhat famously either allow some unsound behaviour or rule out legitimate use (sometimes both), hence all of the effort going into fancier type systems that can more directly match all safe uses while ruling out unsafe ones. The trick is in lining those up as closely as possible while introducing the easiest model to work with: minimal ceremony and/or easy to reason about. I would argue that we want our tools to help us think about the problem.
Surelock is built around a physical-world analogy: to interact with locks, you need a key. In our case, the lock keeps that key while the mutex is in use; you only get it back when you unlock.
We call this a MutexKey, a linear [3] scope token. You get one when you enter a locking scope. When you call .lock(), the key is consumed and a new one is returned alongside the guard. The new key carries a type-level record of what you've already locked, so the compiler knows what you're still allowed to acquire. Try to go backwards and the code doesn't compile.
💡 This is the core trick: by making the key a move-only value that threads through every acquisition, we get a compile-time witness of the current lock state. No global analysis, no runtime tracking; just the type checker doing what it does best.
This analogy only goes so far: MutexKey actually grants you the ability to lock multiple mutexes together atomically. Locks in surelock may be grouped into levels to enable incremental acquisition, and locking returns an attenuated key that can lock fewer levels.
Two existing libraries informed the design, and I want to give them proper credit.
happylock
happylock by botahamec introduced the key insight that a capability token combined with sorted multi-lock acquisition can prevent deadlocks. Surelock borrows this pattern directly.
Where the approaches diverge: happylock breaks the hold-and-wait condition. When you lock through the collection, your key is consumed; you can't go acquire more locks at all until it's released. This is safe, but it means you MUST acquire all of your locks at once. You can't do things like "lock the config, read which account to update, then lock that account". In concurrent systems that need incremental acquisition, this is a real limitation: when you need incremental locks, you really need incremental locks.
happylock also sorts locks by memory address, which is not stable across Vec reallocations or moves. The library goes to some length to maintain address stability with unsafe (Box::leak, transmute), and that unsafety propagates to the public API.
lock_tree
lock_tree from Google's Fuchsia project introduced compile-time lock ordering via a LockAfter trait. Declare that level A comes before level B, and the compiler rejects any code that tries to acquire them in the wrong order.
Surelock extends this in a few ways: same-level multi-lock (which lock_tree can't express), per-instance level assignment via new_higher, and scope-uniqueness enforcement. Surelock also deliberately drops lock_tree's DAG-based ordering in favour of a strict total order; more on why below.
Surelock uses two mechanisms that cover complementary cases. Neither overlaps with the other:
| Mechanism | When | Acquisition | Enforcement |
|---|---|---|---|
| LockSet | Multiple locks at the same level | Atomic (all-or-nothing) | Runtime sort by stable ID |
| Levels (Level<N>) | Locks at different levels | Incremental | Compile-time trait bounds |
LockSet
Every Mutex gets a unique, monotonically increasing LockId from a global AtomicU64 counter at creation time. The ID lives inside the mutex and moves with it; no address stability needed.
When you need multiple locks at the same level, LockSet pre-sorts them by ID at construction time. Two threads requesting the same locks in opposite order both sort to the same acquisition order. No cycle can form.
let alice = Mutex::new("alice");
let bob = Mutex::new("bob");
// Regardless of argument order, acquisition order is deterministic
let set = LockSet::new((&alice, &bob));
key_handle.scope(|key| {
let (a, b) = set.lock(key);
// Both locked, no deadlock possible
});
Thread A                                Thread B
   |                                       |
   v                                       v
LockSet::new((&acct_1, &acct_2))        LockSet::new((&acct_2, &acct_1))
   |                                       |
   v                                       v
sorted: [acct_1, acct_2]                sorted: [acct_1, acct_2]
   |                                       |
   |- takes acct_1 lock                    |
   |                                       |- waits for acct_1 lock
   |- takes acct_2 lock                    |
   |                                       |
~~~~~~~~~~~~~~~~~~~~~ TIME PASSES ~~~~~~~~~~~~~~~~~~~~~
   |                                       |
   |- release acct_2                       |
   |                                       | (still waiting for acct_1 first)
   |- release acct_1                       |
   |                                       |- takes acct_1 lock
   |                                       |- takes acct_2 lock
   |                                       |
   v                                       v
  OK                                      OK (no cycle possible)
Levels
When locks represent different kinds of resources, say a config guard vs. a per-account lock, you assign them to different levels. Level<N> types with LockAfter trait bounds enforce strictly ascending acquisition:
use surelock::level::{Level, lock_scope};
type Config = Level<1>;
type Account = Level<2>;
let config: Mutex<_, Config> = Mutex::new(AppConfig::default());
let account: Mutex<_, Account> = Mutex::new(Balance(0));
lock_scope(|key| {
let (cfg, key) = config.lock(key); // Level 1 -> ok
let (acct, key) = account.lock(key); // Level 2 after 1 -> ok
// account.lock(key) then config.lock(key) -> compile error
});
The key (ha!) is MutexKey. It's consumed by each lock() call and re-emitted at the new level. If you try to go backwards, say acquiring a Level<3> followed by a Level<1>, the LockAfter bound isn't satisfied and the compiler rejects it with a helpful (custom) error message.
This is a deliberate design decision. lock_tree uses a DAG, which lets you declare that branches A and B are independent: neither needs to come before the other. Sounds great, but it has a subtle problem: if thread 1 acquires A then B, and thread 2 acquires B then A, and both orderings are valid in the DAG, you have a deadlock that the compiler happily approved.
Locks are sorted in a total order by (Level, LockId). A total order is more restrictive but actually sound: if two locks participate in the system at all, their relative order is fixed. We could in principle linearise a DAG (much as we do in many op-based CRDTs), but that is harder to reason about and adds a lot of additional machinery.
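The (Level, LockId) ordering can be illustrated with plain tuples (illustrative code, not the library's API):

```rust
// Sorting lock descriptors by (level, id) gives every thread the same
// acquisition sequence for the same set of locks, however it asked for them.
fn acquisition_order(mut locks: Vec<(u8 /* level */, u64 /* id */)>) -> Vec<(u8, u64)> {
    locks.sort(); // tuples sort lexicographically: level first, then id
    locks
}

fn main() {
    let thread_a = acquisition_order(vec![(1, 8), (2, 5), (1, 3)]);
    let thread_b = acquisition_order(vec![(2, 5), (1, 3), (1, 8)]);
    // Same order regardless of the order the locks were requested in.
    assert_eq!(thread_a, thread_b);
    println!("{:?}", thread_a); // [(1, 3), (1, 8), (2, 5)]
}
```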
Both mechanisms, LockSet and levels, use MutexKey. You can think of the key as a kind of scope: it's consumed by each lock() call and re-emitted at the new level. It's branded with an invariant lifetime (the same technique as std::thread::scope) so it can't escape the closure, and it's !Send so it stays on the thread that created it.
There are two ways to get one: implicit and explicit.
lock_scope
I expect this is what most people will reach for. On std, lock_scope handles all the setup internally; you just write:
use surelock::{Mutex, lock_scope};
let data = Mutex::new(42);
lock_scope(|key| {
let (guard, key) = data.lock(key);
assert_eq!(*guard, 42);
});
Under the hood, lock_scope manages a thread_local! KeyHandle for you. The KeyHandle's scope method takes &mut self, which means the borrow checker prevents nested scopes: you can't start a new locking scope while one is already active.
Locksmith and KeyVoucher
For no_std, embedded, or cases where you want tighter control over which core gets which capability, there's a fully explicit chain:
Locksmith Program-wide singleton (!Send)
|
v
KeyVoucher Transferable to other threads (Send)
|
v
KeyHandle Per-thread / claimed once (!Send)
|
v
MutexKey<'scope> Branded lifetime, consumed-and-re-emitted
|
v
MutexGuard RAII access to the data
The Locksmith is a singleton, enforced via AtomicBool rather than convention. It is !Clone, !Copy, and importantly !Send, as an extra layer of safety to prevent it from being duplicated among threads. It issues KeyVouchers, which are Send, so you can distribute keys among threads during setup. A voucher gets redeemed on its destination thread for a KeyHandle, which is !Send and stays put.
This might sound like a lot of ceremony, but it's mainly type machinery to enforce safety invariants. On bare metal with no thread_local!, you want this level of control: you're deciding at init time exactly which core gets a key, and the type system makes sure no one else can conjure one out of thin air.
Here is an example, diagrammed out. You'll notice that the lock levels are in a strict compile-time sequence, but the lock IDs, being assigned at runtime, show up in arbitrary order. This is completely fine; they merely need to be consistent across all callers.
We start with the root key, which can lock any level ("above level 0"). We acquire locks across two levels, sort them by ID, and acquire them in that order (for the reasons described in the timeline earlier). This consumes the root key and emits a key that can lock above level 2. In the future, we can lock level 3 and onwards, until we unlock and recover the earlier keys.
Key{Level > 0}
      |
      v
+-------Level 1-------+      +--------LockSet---------+
|                     |      |                        |
|  lock_1             |      |                        |
|  lock_3 ------------------>| lvl_1-lock_3 ----------------> guard-3
|                     |      |                        |
|  lock_8 ------------------>| lvl_1-lock_8 ----------------> guard-8
|                     |      |                        |
+---------------------+      |                        |
                             |                        |
+-------Level 2-------+      |                        |
|                     |      |                        |
|  lock_5 ------------------>| lvl_2-lock_5 ----------------> guard-5
|                     |      |                        |
|  lock_7             |      |                        |
|                     |      |                        |
+---------------------+      +------------------------+
      |
      v
Key{Level > 2}
      |
      v
+ - - - Level 3 - - - +
|                     |
|  lock_2             |
|  lock_4             |    (lockable later with the
|  lock_6             |     emitted Key{Level > 2})
|                     |
+ - - - - - - - - - - +
no_std and Embedded
The crate is #![no_std] at its core, requiring only alloc. On std, you get StdMutex as the default backend and lock_scope as a convenient entry point. On no_std, you bring your own backend: anything implementing lock_api::RawMutex works via a blanket impl behind a feature flag.
It also supports targets without native CAS (e.g. Cortex-M0) via the portable-atomic and critical-section features. I think this matters because the places where you most need deadlock freedom, bare-metal firmware with no debugger attached, are exactly the places where current tooling helps you least.
Real-world code sometimes needs to break the rules. Surelock provides Mutex::unchecked_lock() behind the escape-hatch feature flag. It's feature-gated so that if you use it, it's visible in Cargo.toml, it's grep-able, and it stands out in code review. IMO this is better than the alternative: developers silently reaching for parking_lot::Mutex alongside their surelock mutexes and defeating the whole system.
Deadlocks are a solved problem in theory; we've known how to prevent them since 1971. The challenge is making that prevention ergonomic enough that people actually use it. Surelock is my attempt at that: lean into Rust's type system to make the correct thing the easy thing, and make the wrong thing a compiler error.
The crate is on Codeberg and published to crates.io. Dual-licensed MIT/Apache-2.0. I'd love feedback, especially from folks working on embedded or no_std targets; that's where I've had the least real-world testing so far.
1. With one exception: acquiring the SAME mutex atomically more than once. I'm not happy that this edge case exists, but it's also not common that you would try to take the same mutex alongside itself.
2. thiserror is the sole dependency, and it is zero-cost at runtime. The default std backend wraps std::sync::Mutex directly.
3. Technically affine in Rust's type system (it can be dropped but not duplicated), but the intent is linear: you're expected to use it, and the API is designed so that dropping it early just ends the scope.