But the example seems backwards to me: unless every callsite that locks any item always locks the big global lock first (probably not true, because if you serialize all item access on a global lock then a per-item lock serves no purpose...), aren't you begging for priority inversions by acquiring the big global lock before you acquire the item lock?
My only gripe is missing the obvious opportunity for Ferengi memes ("rules of acquisition") :D :D
>This is a deliberate design decision. lock_tree uses a DAG, which lets you declare that branches A and B are independent: neither needs to come before the other. Sounds great, but it has a subtle problem: if thread 1 acquires A then B, and thread 2 acquires B then A, and both orderings are valid in the DAG, you have a deadlock that the compiler happily approved.
Would it be possible to build one at compile time? Static levels seem like they won't let you share code without level-collaboration, so that might be kinda important for larger-scale use.
I don't know enough about Rust's type system to know if that's possible though. Feels like it's pushing into "maybe" territory, like maybe not with just linear types but what about proc macros?
I can definitely see why it's easier to build this way though, and for some contexts that limitation seems entirely fine. Neat library, and nice post :)
One thing I didn't see in the post or the repo: does this work with async code?
I couldn't find the "search" button on Codeberg, and tests/integration.rs didn't have any async.
For embedded, I have had my eye on https://github.com/embassy-rs/embassy (which has an async runtime for embedded) and would love a nice locking crate to go with it.
A pattern I've definitely both seen and used is
let guard1 = datastructure_containing_the_whole_world.lock();
let guard2 = guard1.subset_of_that_datastructure.lock();
guard1.unlock();
// Do expensive work
guard2.unlock();
Which works to parallelize work so long as guard2 isn't contended... and at least ensures correctness and forward progress the rest of the time.

There's no priority inversion possible because locks can only ever be held in decreasing order of priority: you can't acquire a low-priority lock and then a high-priority lock, since your remaining MutexKey won't have the right level.
First, lock acquisition seems to be a blocking method. And I don't see a `try_lock` method, so the naive pattern of spinning on `try_lock` and yielding on failure won't work. It'll still work in an async function, you'll just block the executor if the lock is contested and be sad.
Second, the key and guard types are not Send, otherwise it would be possible to send a key of a lower level to a thread that has already acquired a lock of a higher level, allowing deadlocks. (Or to pass a mutex guard of a higher level to a thread that has a key of a lower level.)
Therefore, holding a lock or a key across an await point makes your Future not Send.
Technically, this is fine. Nothing about Rust async in general requires that your Futures are Send. But in practice, most of the popular async runtimes require this. So if you want to use this with Tokio, for example, then you have to design your system to not hold locks or keys across await points.
This first restriction seems like it could be improved with the addition of an `AsyncLockable` trait. But the second restriction seems to me to be fundamental to the design.
Supercomputing is another domain that has deep insights into scalable systems that is famously so insular that ideas rarely cross over into mainstream scalable systems. My detour through supercomputing probably added as much to my database design knowledge as anything I actually did in databases.
Is it easy, or hard?
Does it demand a new paradigm at large, or is it only an inconvenience in the few places it's used?
Because if the answer is "turns the language into Haskell", then it's a big NOPE!
Also to note, regarding "future not send": in tokio codebases where the general expectation is that futures will be Send, enabling the clippy lint "future_not_send" is extremely helpful in avoiding these kinds of issues, and also in keeping the error localized to the offending function, rather than miles away somewhere it happens to be getting indirectly spawned or whatever: https://rust-lang.github.io/rust-clippy/stable/index.html?se...
One thing that I think does affect things is that language-design discussions tend to be concentrated in their own communities, based on the programming language itself, rather than in one "programming language discussions" place where everyone can more easily cross-pollinate ideas across languages. Luckily, there are some individuals who move between communities without effort, which does lead to a few ideas making it across, but it feels like we're missing out on so much evolution and so many ideas from various languages across the ecosystem.
http://joeduffyblog.com/2010/01/03/a-brief-retrospective-on-...
Mutex::new(AppConfig::default());
...is meant to be acquiring a mutex protecting some global config object, yes? That's what I'm calling a "global lock".

> There's no priority inversion possible because locks can only ever be held in decreasing orders of priority
T1                    T2
--                    --
small_lock();
                      big_lock();
                      small_lock();  <--- spins waiting for T1
...and now any other thread that needs big_lock() spins waiting for T2 to release it, but T2 is spinning waiting for T1 to release the (presumably less critical) small lock.

If small_lock is never ever acquired without acquiring big_lock first, small_lock serves no purpose and should be deleted from the program.
Oh, many of these travelers spend a lot of effort!
> Models can be pulled along other axes, however, such as whether memory locations must be tagged in order to be used in a transaction or not, etc. Haskell requires this tagging (via TVars) so that side-effects are evident in the type system as with any other kind of monad. We quickly settled on unbounded transactions.
Snip
> In hindsight, this was a critical decision that had far-reaching implications. And to be honest, I now frequently doubt that it was the right call. We had our hearts in the right places, and the entire industry was trekking down the same path at the same time (with the notable exception of Haskell)
So basically it's not that TM isn't workable; rather, unbounded TM is likely a fool's errand, while Haskell's is bounded TM that requires explicit annotation of the memory that will participate in atomicity.
Look at the API: if big_lock and small_lock are at the same level, you would need to acquire both locks simultaneously, which the library accomplishes by sorting* the locks and then acquiring them. If you fail to acquire small_lock, big_lock isn't held (it's an all-or-nothing situation). This exact scenario is explained in the link, by the way. You can't bypass the "acquire simultaneously" API, because you only have a key for one level.
Your terminology is also off. A lock around a configuration is typically called a fine-grained lock, unless you're holding that lock for large swathes of the program. "Global", as it refers to locking, doesn't refer to the visibility of the lock or to the fact that it does mutual exclusion. For example, a lock on a database that only allows one thread into a hot-path operation at a time is a global lock.
* Sorting is done based on a global construction order grabbed at construction: there's a singleton atomic that hands out IDs for each mutex.
Mutex::new(AppConfig::default()) might very well be a small, leaf mutex.
That's exactly the problem. It sounds like one aggregate person. It's quite unpleasant to read the same turns of phrase again and again and again, especially when it means that the author copped out of writing it themselves.
In fairness I think in this case they mostly did write it themselves.
Email the mods about it rather than replying, subject "Accusation of AI in FP comment" or whatever. It's a guidelines violation to make the accusation in a comment rather than to them by email, and they have tools to deal with it!
The closest actually human style to LLM writing is obnoxious marketing speak. So that also sucks.
So many people who are not great writers lean on LLMs to write, but aren't good enough to see how bad it is. They should be criticised for this. Either use them and be good enough to make it read as human, or just don't use them. No free lunch.
NOTE
Hello r/rust! Thank you for the interest in this work and for the delightful conversation ❤️
I hate deadlocks. Maybe you do too. Back at Fission, whenever someone would suggest a mutex we'd start a chant of "I say mutex, you say deadlock: Mutex! DEADLOCK! Mutex! DEADLOCK!". Deadlocks lurk: perfectly invisible in code review, happy to pass CI a thousand times, then they lock your system up at 3am under a request pattern that no one anticipated.
They have their own tradeoffs, but I miss TVars from Haskell. AFAICT there's no way to do proper TVars in languages that have no way to enforce purity. We "should" prefer lock-free programming, but mutexes are very common in standard Rust style. I often hear that actors eliminate deadlocks, but as someone who's written her fair share of Elixir, this is 100% a lie (though they are less frequent).
Rust famously catches data races at compile time, but for deadlocks? You get a Mutex, a pat on the back, and "good luck, babe". There are some fairly good tools that help analyze your code, but I want feedback during development. I've been thinking about a better approach to this problem for a while, looked at a bunch of other attempts, and have come up with what I hope is a decent ergonomic balance that covers many common use cases in Rust: surelock, a deadlock-freedom library. If your code compiles, it doesn't deadlock. No Result, no Option, no runtime panic on the lock path. Every acquisition is either proven safe by the compiler or rejected at build time [1].
WARNING
This is a pre-release. I think the core design is solid, but I wouldn't be surprised if there are rough edges. Feedback and contributions welcome!
- Multi-lock acquisition at the same level (LockSet)
- Compile-time level ordering (Level<N>)
- unsafe is confined to the raw mutex internals
- no_std compatible, zero required runtime dependencies [2]

The honest answer is that being careful doesn't scale. It's easy to shoot yourself in the foot while composing code. But wait, isn't this the kind of thing that Rust is supposed to help us with? The borrow checker catches data races at compile time; why shouldn't we expect the same for deadlocks?

The fundamental problem is well-understood. Coffman et al. identified four necessary conditions for deadlock back in 1971:
| Condition | Meaning |
|---|---|
| Mutual exclusion | At least one resource is held exclusively |
| Hold and wait | A thread holds one lock while waiting for another |
| No preemption | Locks canβt be forcibly taken away |
| Circular wait | A cycle exists in the wait-for graph |
Break any one of these, and deadlocks become impossible. Mutual exclusion is the whole point of a mutex. Preemption introduces its own class of bugs. That leaves hold-and-wait and circular wait.
QUOTE
A language that doesnβt affect the way you think about programming is not worth knowing.
- Alan Perlis
We identified the solution space over 50 years ago(!). It's a thorny enough problem that we have no single agreed solution, and yet mutexes are sufficiently useful that we still use them. This suggests that until something comes along that completely replaces mutexes, we're in the domain of improving the ergonomics of using mutexes safely.
Only rarely do safety techniques exactly match safe use. Type systems somewhat famously either allow some unsound behaviour or rule out legitimate use (sometimes both), hence all of the effort going into fancier type systems that can more directly match all safe uses while ruling out unsafe ones. The trick is in lining those up as closely as possible while introducing the easiest model to work with: minimal ceremony and/or easy to reason about. I would argue that we want our tools to help us think about the problem.
Surelock is built around a physical-world analogy: to interact with locks, you need a key. In our case, the lock keeps that key while the mutex is in use; you only get it back when you unlock.
We call this a MutexKey, a linear [3] scope token. You get one when you enter a locking scope. When you call .lock(), the key is consumed and a new one is returned alongside the guard. The new key carries a type-level record of what you've already locked, so the compiler knows what you're still allowed to acquire. Try to go backwards and the code doesn't compile.
💡 This is the core trick: by making the key a move-only value that threads through every acquisition, we get a compile-time witness of the current lock state. No global analysis, no runtime tracking; just the type checker doing what it does best.
This analogy only goes so far: MutexKey actually grants you the ability to lock multiple mutexes together atomically. Locks in surelock may be grouped into levels to enable incremental acquisition, and locking returns an attenuated key that can lock fewer levels.
Two existing libraries informed the design, and I want to give them proper credit.
happylock
happylock by botahamec introduced the key insight that a capability token combined with sorted multi-lock acquisition can prevent deadlocks. Surelock borrows this pattern directly.
Where the approaches diverge: happylock breaks the hold-and-wait condition. When you lock through the collection, your key is consumed; you can't go acquire more locks at all until it's released. This is safe, but it means you MUST acquire all of your locks at once. You can't do things like "lock the config, read which account to update, then lock that account". In concurrent systems that need incremental acquisition, this is a real limitation: when you need incremental locks, you really need incremental locks.
happylock also sorts locks by memory address, which is not stable across Vec reallocations or moves. The library goes to some length to maintain address stability with unsafe (Box::leak, transmute), and that unsafety propagates to the public API.
lock_tree
lock_tree from Google's Fuchsia project introduced compile-time lock ordering via a LockAfter trait. Declare that level A comes before level B, and the compiler rejects any code that tries to acquire them in the wrong order.
Surelock extends this in a few ways: same-level multi-lock (which lock_tree can't express), per-instance level assignment via new_higher, and scope-uniqueness enforcement. Surelock also deliberately drops lock_tree's DAG-based ordering in favour of a strict total order; more on why below.
Surelock uses two mechanisms that cover complementary cases. Neither overlaps with the other:
| Mechanism | When | Acquisition | Enforcement |
|---|---|---|---|
| LockSet | Multiple locks at the same level | Atomic (all-or-nothing) | Runtime sort by stable ID |
| Levels (Level<N>) | Locks at different levels | Incremental | Compile-time trait bounds |
LockSet
Every Mutex gets a unique, monotonically increasing LockId from a global AtomicU64 counter at creation time. The ID lives inside the mutex and moves with it; no address stability needed.
When you need multiple locks at the same level, LockSet pre-sorts them by ID at construction time. Two threads requesting the same locks in opposite order both sort to the same acquisition order. No cycle can form.
let alice = Mutex::new("alice");
let bob = Mutex::new("bob");
// Regardless of argument order, acquisition order is deterministic
let set = LockSet::new((&alice, &bob));
key_handle.scope(|key| {
let (a, b) = set.lock(key);
// Both locked, no deadlock possible
});
Thread A                                Thread B
   |                                       |
   v                                       v
LockSet::new((&acct_1, &acct_2))        LockSet::new((&acct_2, &acct_1))
   |                                       |
   v                                       v
sorted: [acct_1, acct_2]                sorted: [acct_1, acct_2]
   |                                       |
   |- takes acct_1 lock                    |
   |                                       |- waits for acct_1 lock
   |- takes acct_2 lock                    |
   |                                       |
~~~~~~~~~~~~~~~~~~~~~ TIME PASSES ~~~~~~~~~~~~~~~~~~~~~
   |                                       |
   |- release acct_2                       |
   |                                       | (still waiting for acct_1 first)
   |- release acct_1                       |
   |                                       |- takes acct_1 lock
   |                                       |- takes acct_2 lock
   |                                       |
   v                                       v
  OK                                      OK (no cycle possible)
Levels
When locks represent different kinds of resources, say a config guard vs. a per-account lock, you assign them to different levels. Level<N> types with LockAfter trait bounds enforce strictly ascending acquisition:
use surelock::level::{Level, lock_scope};
type Config = Level<1>;
type Account = Level<2>;
let config: Mutex<_, Config> = Mutex::new(AppConfig::default());
let account: Mutex<_, Account> = Mutex::new(Balance(0));
lock_scope(|key| {
let (cfg, key) = config.lock(key); // Level 1 -> ok
let (acct, key) = account.lock(key); // Level 2 after 1 -> ok
// account.lock(key) then config.lock(key) -> compile error
});
The key (ha!) is MutexKey. It's consumed by each lock() call and re-emitted at the new level. If you try to go backwards, say acquiring a Level<3> followed by a Level<1>, the LockAfter bound isn't satisfied and the compiler rejects it with a helpful (custom) error message.
This is a deliberate design decision. lock_tree uses a DAG, which lets you declare that branches A and B are independent: neither needs to come before the other. Sounds great, but it has a subtle problem: if thread 1 acquires A then B, and thread 2 acquires B then A, and both orderings are valid in the DAG, you have a deadlock that the compiler happily approved.
Locks are sorted in a total order by (Level, LockId). A total order is more restrictive but actually sound: if two locks participate in the system at all, their relative order is fixed. We could in principle linearise a DAG (much as we do in many op-based CRDTs), but that is harder to reason about and adds a lot of additional machinery.
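The (Level, LockId) ordering can be illustrated with plain tuples (illustrative code, not the library's API):

```rust
// Sorting lock descriptors by (level, id) gives every thread the same
// acquisition sequence for the same set of locks, however it asked for them.
fn acquisition_order(mut locks: Vec<(u8 /* level */, u64 /* id */)>) -> Vec<(u8, u64)> {
    locks.sort(); // tuples sort lexicographically: level first, then id
    locks
}

fn main() {
    let thread_a = acquisition_order(vec![(1, 8), (2, 5), (1, 3)]);
    let thread_b = acquisition_order(vec![(2, 5), (1, 3), (1, 8)]);
    // Same order regardless of the order the locks were requested in.
    assert_eq!(thread_a, thread_b);
    println!("{:?}", thread_a); // [(1, 3), (1, 8), (2, 5)]
}
```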
Both mechanisms, LockSet and levels, use MutexKey. You can think of the key as a kind of scope: it's consumed by each lock() call and re-emitted at the new level. It's branded with an invariant lifetime (the same technique as std::thread::scope) so it can't escape the closure, and it's !Send so it stays on the thread that created it.
There are two ways to get one: implicit and explicit.
lock_scope
I expect this is what most people will reach for. On std, lock_scope handles all the setup internally; you just write:
use surelock::{Mutex, lock_scope};
let data = Mutex::new(42);
lock_scope(|key| {
let (guard, key) = data.lock(key);
assert_eq!(*guard, 42);
});
Under the hood, lock_scope manages a thread_local! KeyHandle for you. The KeyHandle's scope method takes &mut self, which means the borrow checker prevents nested scopes: you can't start a new locking scope while one is already active.
Locksmith and KeyVoucher
For no_std, embedded, or cases where you want tighter control over which core gets which capability, there's a fully explicit chain:
Locksmith Program-wide singleton (!Send)
|
v
KeyVoucher Transferable to other threads (Send)
|
v
KeyHandle Per-thread / claimed once (!Send)
|
v
MutexKey<'scope> Branded lifetime, consumed-and-re-emitted
|
v
MutexGuard RAII access to the data
The Locksmith is a singleton, enforced via AtomicBool rather than convention. It is !Clone, !Copy, and importantly !Send, as an extra layer of safety to prevent it from being duplicated among threads. It issues KeyVouchers, which are Send, so you can distribute keys among threads during setup. A voucher gets redeemed on its destination thread for a KeyHandle, which is !Send and stays put.
This might sound like a lot of ceremony, but it's mainly type machinery to enforce safety invariants. On bare metal with no thread_local!, you want this level of control: you're deciding at init time exactly which core gets a key, and the type system makes sure no one else can conjure one out of thin air.
Here is an example, diagrammed out. You'll notice that the lock levels are in a strict compile-time sequence, but the lock IDs, being assigned at runtime, show up in arbitrary order. This is completely fine; they merely need to be consistent across all callers.
We start with the root key, which can lock any level ("above level 0"). We acquire locks across two levels, sort them by ID, and acquire them in that order (for the reasons described in the timeline earlier). This consumes the root key and emits a key that can lock above level 2. In the future, we can lock level 3 and onwards, until we unlock and recover the earlier keys.
Key{Level > 0}
      |
      v
+-------Level 1-------+      +--------LockSet---------+
|                     |      |                        |
|  lock_1             |      |                        |
|  lock_3 ------------------>| lvl_1-lock_3 ----------------> guard-3
|                     |      |                        |
|  lock_8 ------------------>| lvl_1-lock_8 ----------------> guard-8
|                     |      |                        |
+---------------------+      |                        |
                             |                        |
+-------Level 2-------+      |                        |
|                     |      |                        |
|  lock_5 ------------------>| lvl_2-lock_5 ----------------> guard-5
|                     |      |                        |
|  lock_7             |      |                        |
|                     |      |                        |
+---------------------+      +------------------------+
      |
      v
Key{Level > 2}
      |
      v
+ - - - Level 3 - - - +
|                     |
|  lock_2             |
|  lock_4             |    (lockable later with the
|  lock_6             |     emitted Key{Level > 2})
|                     |
+ - - - - - - - - - - +
no_std and Embedded
The crate is #![no_std] at its core, requiring only alloc. On std, you get StdMutex as the default backend and lock_scope as a convenient entry point. On no_std, you bring your own backend: anything implementing lock_api::RawMutex works via a blanket impl behind a feature flag.
It also supports targets without native CAS (e.g. Cortex-M0) via the portable-atomic and critical-section features. I think this matters because the places where you most need deadlock freedom, bare-metal firmware with no debugger attached, are exactly the places where current tooling helps you least.
Real-world code sometimes needs to break the rules. Surelock provides Mutex::unchecked_lock() behind the escape-hatch feature flag. It's feature-gated so that if you use it, it's visible in Cargo.toml, it's grep-able, and it stands out in code review. IMO this is better than the alternative: developers silently reaching for parking_lot::Mutex alongside their surelock mutexes and defeating the whole system.
Deadlocks are a solved problem in theory; we've known how to prevent them since 1971. The challenge is making that prevention ergonomic enough that people actually use it. Surelock is my attempt at that: lean into Rust's type system to make the correct thing the easy thing, and make the wrong thing a compiler error.
The crate is on Codeberg and published to crates.io. Dual-licensed MIT/Apache-2.0. I'd love feedback, especially from folks working on embedded or no_std targets; that's where I've had the least real-world testing so far.
1. With one exception: acquiring the SAME mutex atomically more than once. I'm not happy that this edge case exists, but it's also not common that you would try to take the same mutex alongside itself.
2. thiserror is the sole dependency, and it is zero-cost at runtime. The default std backend wraps std::sync::Mutex directly.
3. Technically affine in Rust's type system (it can be dropped but not duplicated), but the intent is linear: you're expected to use it, and the API is designed so that dropping it early just ends the scope.