I've been trying to figure out a compile-time approach to that problem. Here's an early version of a tech note on that.[1] It looks possible, but there are many issues. But it's important to look at now. The C++ people are trying to figure out safe backlinks.
[1] https://github.com/John-Nagle/technotes/blob/main/docs/rust/...
If your goal is to not do refcounting, You can tie compile time guaranteed access on a 'parent getter' (but only ever as a non-mut reference) by saying:
- parent element MUST be present at time of construction
- When doing a 'parent get' call, you MUST still prove you're in the parent's lifetime.
This can be done at compile time at 0 runtime cost. The problem is, that these constraints drastically reduce the usability of those back-links.
There is no compiler solution in Rust. Rust lifetimes form a DAG of (overlapping) ranges. (in terms that some leaf nodes are inductive proofs)
To relax those 2 constraints require a bi-directional graph, and I'm not sure if that can ever be computed in a reasonable time.
When we compare one method of early binding with another, it's probably going to be a comparison of the granularity. Arenas are "stupid simple" - they partition out memory into chunks that limit the blast radius of error, but you still make the error. Ownership logic is a "picky generalization" - it lets you be more intricate and gain some assurances if you put up with the procedures necessary, but it starts to inhibit certain uses because its view into usage is too narrow with too many corner cases.
If we take Go's philosophy as an example of "what do you do if you want idioms for a really scalable codebase" - though you can certainly argue that it didn't work - it's that you usually want to lean on the stupid simple stuff and customize what you need for the rest. You don't opt into a picky Swiss Army Knife unless there's a specific problem to solve with it. Larger codebases have proportionately smaller sections that demand intricacy because more and more of the code is going to itself be a method of defining late binding, of configuring and deferring parts of processing.
That said, Rust lets you fall back to "stupid simple", it just doesn't pave the way to make that the default.
The ability for crates to specify whether they use certain features of Rust:
- unsafe
- arena allocation or other future "useful" stuff
- proc_macros, eg. serde (a huge compile speed hit)
- deep transitive dependency (total #, or depth)
- some proxy for the "number of compiler steps" (compiler workload complexity)
We should be able to set a crate-wide or workspace-wide policy that can ban other crates that violate these rules.
We could make this hand-written early on, but eventually enforced by the compiler. You could then easily guarantee that no downstream dependency uses unsafe code, has a crazy number of dependencies, or has crazy compile times.
You could opt-in or opt-out to these things without having to think too much about them or constantly (manually) be on the lookout.
This would be hugely beneficial to the Rust ecosystem, safety, code quality, and developer productivity / happiness.
> Given that the arena will still be subject to array bounds checking, handle bugs won't allow an attacker to overwrite arbitrary memory the way pointer bugs do.
Just because you're in-bounds of the arena, doesn't mean you're not stomping on in-bounds neighboring data -- which can introduce "nondeterminism" as the author defines it. These are hand-waving arguments.
Curious to know in the Linux context whether moves away from doubly linked lists have ever been tested?
This and sibling comments about the supposed uselessness or outdatedness of linked lists taught me something about the disconnect between many systems engineers resistant to Rust and many members of the (evangelical) Rust community.
To address the technical topic first: a doubly linked list is what you reach for when you’re looking for an intrusive data structure with fast append and fast delete. You need to iterate over it, but don’t need random access. Queues where you may need to remove elements. Think a list of completions that can be cancelled.
Why an intrusive data structure? Because you don’t want to, or can’t allocate. Because you want only a non-owning reference to the object, and you don’t want to have to manage any other lifetimes or dynamic (potentially unbounded, or arbitrarily bounded) allocations. Intrusive data structures are a great tool to minimize allocations, and to keep allocations (and ownership) logically separated across modules.
And now onto the cultural topic. That a member of the Rust community might not know this (which is of course totally fine) is what taught me something, which was maybe obvious to many: Rust brings safety to a C++ style language and audience. For the same reasons many dislike C++, many dislike Rust. For the same reasons many would opt for C over C++ when they have the choice (either across all their projects or for certain projects), many continue to opt for C over Rust.
Rust will not be taking over systems programming for the same reasons C++ did not. Well, by the numbers, it probably did (and Rust probably will too), so maybe better to say that it will not stamp out C for systems programming for the same reasons C++ did not. And it’s not just because of the stubbornness of C programmers.
Systems programming has 2 cultures and practice groups: the C camp, and the C++ camp. The C++ camp is a big tent. I’d argue it includes Rust and Swift (and maybe D). Zig belongs to the C camp.
Safety is important, and maybe Rust has the right model for it, but it is not a solution for the C camp. It likely did not intend to be, and again, by the numbers, replacing C++ is probably the more important task.
I thought the term "Arena" referred to linear allocators, but maybe it's not so narrowly defined.
> At one level, this question is not very fair, because an answer to it as stated would be that one simply does not use doubly linked lists. They have been popular in introductory computer science lectures because they are a neat way to explain pointers in a data structure that's easy to draw on a whiteboard, but they are not a good match to modern hardware. The last time I used one was in the nineties. I know the Linux kernel uses them, but that design was also laid down in the nineties; if you were designing a kernel from scratch today, you would probably not do it that way.
The arena data structure you described is inefficient unless it uses a linked list to track empty slots. All general-purpose heap allocators use linked lists in their implementations. Linked lists show up wherever you want pointer stability, low fragmentation, or a way to decouple the storage of objects from their participation in data structures. I struggle to imagine how it would be possible to implement something like Linux without at least one linked list in it.
> Essentially you are bypassing the notion of pointers provided directly by the hardware
Pointers aren't a hardware concept, they're a language concept. Array indices and pointers both rely on indirect addressing, which is the underlying hardware concept. The "handles" strategy in Rust feels like a kludgy approach not because it bypasses the hardware but because Rust's borrow checker isn't actually involved in ensuring safety in that case, just its bounds checking and mandatory initialization (and if all you need is those two things then...).
- Single ownership with back references that never outlive the owning reference
This requires a construct that manages the forward and back references together, preserving the invariant that A->B->A. It's also necessary to insure that no non-weak back reference from B to A exists when A is dropped.
- True multiple ownership with weak back references
This is the hard case. It might be solveable for complicated situations. See the example in my tech note that links to a Rust Playground file.[1] That's not a contrived example; it's code in actual use. It builds a set where all elements of the set can find the set itself. If two sets are merged (union), all elements move to one set struct, and the other now-empty set struct is discarded. All the back references are maintained during this operation.
Most back references seem to fall into one of those two cases. In fact, the second includes the first. The first is so common that it's worth handling separately.
The core problem is finding an approach which is 1) powerful enough to be useful on real code, and 2) not too hard to check at compile time. The second case requires some restrictions on coding style.
The preferred coding style is short borrows with very narrow scope, typically within one statement. That's normally considered bad, because it means extra time checking for double borrows. But if this kind of borrow can be made compile-time, it's free. The compiler is going to have to check for overlapping borrow scopes, and the more local the borrow scope, the less compile time work is involved. Storing a borrowed link in a structure or a static requires too much compile-time analysis, but if the lifetime of the borrow is very local, the analysis isn't hard. Borrow scopes which include a function call imply the need for cross-function call tree analysis. It's been pointed out to me that traits and generics make this especially hard, because you don't know what the instantiation can do before generics are instantiated. That might require some kind of annotation that says to the compiler "Implementations of function foo for trait Foobar are not allowed to do an upgrade of an Rc<Bar>". Then, at the call, the compiler can tell it's OK for a borrow scope to include that call, and in an implementation of the called function, upgrade of Rc<Bar> is prohibited. In fact, if this is done for all functions that receive an Rc<Bar> parameter, it might be possible to do all this without call tree analysis. It's becomes a type constraint. Need to talk to the type theorists, of which I am not one.
All those .borrow() calls are annoying. It might be possible to elide them most of the time, in the same way that you can refer to the item within an Rc without explicitly de-referencing the Rc. That would be nice from an ergonomic perspective. But it runs the risk of generating confusing compiler messages when a problem is detected. It would be like those confusing error messages that arise when type inference can't figure something out or the borrow checker needs explicit lifetime info to resolve an ambiguity.
[1] https://play.rust-lang.org/?version=stable&mode=debug&editio...
Like if your arena contains both a buffer that OOB's and a buffer passed to a syscall, then having memory safety around the arena doesn't help you
And I agree, a custom arena is usually the best option here. Even though it seems like "giving up" and going back to C, it's still way better.
There's this great comparison of Arenas in Rust: https://donsz.nl/blog/arenas/
Some of them have keys that can't be reused too, so you can avoid "use after free" style bugs (at runtime) which is way better than C pointer wrangling.
It's not perfect though. I kind of wish Rust did have some kind of support for this stuff natively. Pretty low on my list of Rust desires though - way behind better compile time and fewer async footguns.
Have people been thinking about that?
It competes against languages that give you memory safety by having garbage collection, or I guess even by using arenas in a few cases. GC comes with a performance hit, but a bit of a performance hit is usually much easier to deal with than having a smaller pool of effective developers
Worse is better
Here's a situation where a traditional pointer-based double-linked list is fine: when the payload is very large (e.g., an entire document in some app).
Actually, it is correct. The more generally you use "arenas" the more they are subject to the same kinds of issues as other manual memory management systems. "arenas" can only avoid/reduce the "nondeterminism" and "security" issues of general manual memory management on top of a buffer of bytes by not becoming general manual memory management on top of a buffer of bytes.
I understand this defeats the point of using Rust. But if the argument is "I'm using C because I actually need doubly linked lists and Rust makes that hard/slow", well C is also unsafe, so using Rust with an unsafe block for your doubly linked list seems like an improvement.
I feel like at least part of the reason not to do this is "because the Rust community will yell at you." Well, maybe they shouldn't do that? If you were using C no one would care, so switching to unsafe Rust shouldn't imply the sky is falling.
1. Implementing a linked list is a one and done deal and should be included in a (standard) library.
2. Would a classic pointer based implementation of a linked list in C really be a cause of security vulnerability??
[workspace.lints.rust]
unsafe_code = "forbid"
https://doc.rust-lang.org/rustc/lints/index.htmlPerhaps what would be useful is crates.io determining through analysis which in the set of all lint(s) a crate is compatible with and exposing that as automatic metadata?
The two times I've had to deal with a really tough debugging problem in Rust, it's been related to some crate which did that.
I certainly appreciate the performance benefits of custom allocators but it's disheartening to see people naively adopting them just as the world's allocators are starting to get their acts together with regards to security.
Leaf pages in InnoDB’s B+tree implementation is the big one I can think of. Turns out being able to quickly walk pages for a range scan is helpful.
I agree that there isn’t wide variety of applications where they jump out as being the best choice. So, yeah, “approximately useless” sounds about right.
I don't see how you would do that in safe rust though. The point of the arena would just be to solve the lifetime issue. It just moves the ownership problem one level higher. Instead of having circular ownership in a linked list, all linked list nodes are owned by the arena and hence have the same lifetime as the arena.
The hardest problems I think is that this requires a fundamental change to the language that something might be already borrowed just by existing. It also likely needs to introduce a notion of "stable" places that will not move when you move a prefix of them, e.g. to allow moving a `Box` holding self-referential data. In other words, there's a lot of bikeshedding to do.
A builtin language feature will for sure be more convenient (and also probably more performant, as most of those crates require an allocation) but it's just really hard to design. Even in very old self-referential crates soundness issues are still being discovered to this day. So it will need to a lot of time and care to design.
And another thing is with current coding agents, like the gpt-5-codex, Rust really shines here. The agent can easily guide you through the first borrow checker issues, and the compiler helps the agent to write good code.
Source: been writing Rust for a decade now. It is easily the best language I've ever used, still.
- Insert a new item
- Delete a specific item (that you have a pointer to
- Move a specific item from list A to list B
- Get the next item to work on
And where pointers to an item need to be stable. Doubly-linked lists are very fast for all of that; their main downside is they are slow to iterate over due to cache incoherency but that doesn't matter if you very rarely or never iterate through the entire list. And since you need to have stable pointers to items in the list that don't change when you insert/remove items, most other data structures would need an extra level of indirection so you lose the cache coherency benefit anyways.
And from its tracking issue [1], it seems there's alternatives with really important trade-offs (especially the very interesting storage API alternative), and it'd be really sad if we got stuck with something that ends up being unusable in many cases people actually want to use them on.
I'm aware of the reasons Linux uses doubly linked lists. I'm still of the opinion this design decision made a lot of sense in the 1990s, and would make less sense in a new project today. You may disagree, and that's fine! Engineering is constrained by hard truths, but within those constraints, is full of judgment calls.
I'm not a member of the evangelical Rust community. I'm not proclaiming 'thou shalt rewrite it all in Rust'. Maybe you have good reasons to use something else. That's fine.
But if you are considering Rust, and are concerned about its ability to handle cyclic data structures, there are ways to do it, that don't involve throwing out all the benefits the language offers.
As for the C camp, I agree it's different. The problem is that we don't know a way to design memory safe GC-free language without it being big and complex. Maybe it is possible. But until we figure out how, projects that need to be memory safe (which I believe is the vast majority of projects, although not all of them) will probably use Rust (and probably should use Rust), even if they would prefer to be pure C, because memory safety is just more important.
If you really wanted this in Rust, you could probably get away with just initialising a Vec::with_capacity(), then have each element’s next and prev be indexes into the vec.
Option<Rc<RefCell<Node<T>>>>
works, but you have to manage your own teardown of the list. If you use Weak, releasing the strong reference to the head of the list tears down the whole list.> The next step of course is to write your own equivalents of malloc and free to allocate and deallocate objects within the arena.
A logic bug in that "allocator" - which are plain indices and not covered by borrow checking - could 100% stomp memory.
It’s a pretty strong value proposition, but it doesn’t mean there aren’t good reasons to pick a GC language if you can afford the performance hit.
The list of reasons to pick C++ over Rust is very short, mostly momentum, recruitment, and legacy. For greenfield projects, the list is even shorter.
The general experience is that developing software in Rust is slightly more expensive compared with, say, C#, and way, way cheaper than C++.
I wouldn’t use it where I could get away with garbage collection (nor C/C++) but that still leaves enough space for Rust to exist in.
* The performance difference can be pretty massive depending on scenarios. Sure, most programs don't need it, but still * Rust has not only memory safety but full safety around it. For instance, in Java if you iterate over a list while mutating it you'll get a runtime error. In Rust, it won't compile, preventing you from a possible bug later on * The time you spend learning Rust is so much time saved afterwards because the language is specifically designed to reduce the risk of programming errors. Many things like the absence of a `null` value like we have in Java is an immense time saver when your program gets bigger
Normal pointers cause UB. That’s astronomically bad. If your arena pointers crash your program in a well-defined way, that’s already a much better situation.
Now I learn that linked lists aren't possible and one has to use some, frankly, bullshit scheme to pretend to have them. IMO the documentation isn't really preparing people for the new reality of what can be done and how to do things.
An extrusive list needs at least 1 more pointer (to the item, separate from the link node), and possibly an additional backpointer to the link node when you want to unlink that node. It also adds allocation overhead and cache misses.
Intrusive lists are one of the few essential tools to achieve performance and low latency.
Or were you thinking of dynamically reallocating vectors? They are not an alternative, they are almost completely unusable in hardcore systems programming. Reallocating destroys pointer stability and adds latency, both very bad for concurrency.
As for "serious code", I admit that Rust is maybe better for low-level security-critical stuff. But for native applications it's on par thanks to smart pointers and also just -fsanitize=address.
(also default copy semantics just seem more intuitive i dunno why)
I'm not convinced not using `Weak` is better, though.
[Citation needed]
On the technical point, I think I do disagree, but open to changing my mind. What would be better? I’m working on an async runtime currently, written in C, and I’m using several intrusive doubly linked lists because of their properties I mentioned.
By the way, unsafe doesn’t defeat the purpose of rust. The entire point of rust is that it lets you build safe abstractions on top of unsafe code, not that it eliminates unsafe code entirely.
They are just an example of the borrow checker’s limitations, which mean that you can only form a DAG of mutable references. But that’s why we have `unsafe`.
But, Rust already provides a linked list in it's standard library: https://doc.rust-lang.org/std/collections/struct.LinkedList.... So a Rust programmer might say there is no need use unsafe if you want a linked list. There is a "source" link on that page, so it isn't hard to see how it's done. Yes, it does use "unsafe".
Where it gets a little complex is a C programmer will look at that implementation and scoff at the linked list the Rust standard library provides. Amazingly the C and Rust programmers agree that this implementation is mostly useless. We know that because the Rust documentation says: "NOTE: It is almost always better to use Vec or VecDeque because array-based containers are generally faster, more memory efficient, and make better use of CPU cache [than this Rust linked list]."
When a C programmer uses a linked list, he almost invariably uses what someone above termed above an "intrusive" implementation. An intrusive implementation is typically faster, more memory efficient, and makes just as good use of the CPU cache as the Rust Vec or VecDeque. An intrusive implementation reserves a bit of memory inside of the object you want to put in a linked list for the linked list code. That memory is typically "owned" by the linked list code, as in it manages it's lifetime, whereas the lifetime of the object the memory is in is managed by some other mechanism. For example, if you delete some unrelated item it list, it might have to reach in and modify the linked list pointer in this item. But Rust has a simple memory model: if you modify an object you have to prove you own it. The linked list code reaching into another object can not do that.
That's one example, but intrusive linked lists break Rust's ownership and borrowing rules in so many ways, they are like oil and water. As far as I can tell there is no way to provide an intrusive linked list library in Rust that is safe to use, which is to say all code that uses it would have to be flagged as unsafe too.
This probably raises as more questions that it answered, and you are probably wondering how could two implementations of the same thing (a linked list) be so different. I'm not going to go down that rabbit hole, other than to remark it's clear a lot of Rust programmers commenting here don't understand the flexibility the intrusive method gives you. But their instincts are right about one thing: they are dangerous. It's the sort of danger a C programmer is entirely comfortable with, but that same danger is what has lead to C programmers being responsible for a of CVE's. They are right about another thing too: if you put a bit of thought into it, you can usually come up with something that is for all practical purposes just as fast, and the Rust compiler can prove is safe. Sometimes though, "a bit of thought" is an understatement, and sometimes it feels like you are in a straight jacket.
The article code be read is someone straining against that straight jacket, doing a lot of thinking and coming up with what a C programmer would call a fairly ugly (and slower) solution. The old C programmer in me says that's a fair summary. But I also have to acknowledge it is a very safe solution.
I say this having years of coding background in languages like C, C#, Java, JavaScript, Python, etc.
As to what would be better - this is also a reply to your sibling comments above - I don't have a single across-the-board solution; the equivalent of std::vector everywhere is fine for some kinds of application code, but not necessarily for system code. Instead, I would start by asking questions.
What kinds of entities are you dealing with, what kinds of collections, and, critically, how many entities along each dimension, to an order of magnitude, p50 and p99? What are your typical access patterns? What are your use cases, so that you can figure out what figures of merit to optimize for? How unpredictable will be the adding of more use cases in the future?
In most kinds of application code, it's okay to just go for big-O, but for performance critical system code, you also need to care about constant factors. As an intuition primer, how many bytes can you memcpy in the time it takes for one cache miss? If your intuition for performance was trained in the eighties and nineties, as mine initially was, the answer may be larger than you expect.
C and Zig have a big advantage for writing maintainable, easy to read unsafe code over unsafe Rust.
> As far as I can tell there is no way to provide an intrusive linked list library in Rust that is safe to use, which is to say all code that uses it would have to be flagged as unsafe too.
But if you need a linked list, would that be so bad? Again, the alternative is C where literally the whole program is unsafe...
All true.
> and has poorly defined rules.
The rules aren't so poorly documented. The options for working with/around them, like say using UnsafeCell, do seem to be scattered across the internet.
But you omitted a key point: unlike C and Zig, unsafe code in Rust is contained. In the Rust std library for example, there are 35k functions, 7.5k are unsafe. In C or Zig, all 35k would be unsafe. If you are claiming those 35k unsafe lines in C or Zig would be easier to maintain safely than those 7.5k lines of unsafe Rust, I'd disagree.
And if you're trying to trade away good big-O for better cache locality, don't forget that in many cases, you're dealing with stateful objects that need to be put into the list. That means you likely need to have a list or queue of pointers to these objects. And no matter how flat or cache-friendly the queue is, adding this indirection is similarly cache-unfriendly whenever you have to actually access the state inside the container.
> as this is often the sign of a bug.
Why? I've just written that yesterday, I had never a problem with that. For example after deleting an element from an array I need to commit this change, by adjusting the size, etc. Why is it so unexpected that I also need to adjust the index? The index needs to be modified anyway.
Where is the notion coming from that this is less preferable than the workarounds, like marking some elements for deletion and then iterating a second time or by making a second container and moving everything that shouldn't be deleted (.filter) ?
As far as I can see, you are indeed going to incur one extra memory access apart from the object itself, for any design other than just 'Temporarily flag the object deleted, sweep deleted objects in bulk later' (which would only be good if deleted objects are uncommon). It still matters how many extra memory accesses; deleting an object from a doubly linked list accesses two other objects.
It also matters somewhat how many cache lines each object takes up. I say 'somewhat' because even if an object is bulky, you might be able to arrange it so that the most commonly accessed fields fit in one or two cache lines at the beginning.
The answer from a Rust programmers perspective is: yes, it's bad.
Rust's secret sauce is if you don't use unsafe, you have a memory safe language. In fact it's better than that. Due to its strong type system "safe Rust" is safer than almost all languages, including Java and the popular interpreted languages like Python, the one notable exception being Ada spark.
The problem with this rosy picture is unsafe Rust. It's harder to get right than C (which 100% unsafe), and you can't write a non-trivial Rust program without at least calling unsafe code. That means there is a trade-off going on. If Rust can keep the amount of unsafe code small and bounded, it wins over 100% unsafe C. If it can't, well you may as well use C.
Today's Rust does manage to keep unsafe code bounded. Just about any program can be written using purely safe code by using Rusts standard library ("std"). Std itself does contain a high'ish percentage of unsafe code (about 20%), but crucially a programmer using std never needs to use unsafe if they are prepared to make the occasional speed tradeoff (as could have been done in the article). Thus Rust + std + the rule "no unsafe", makes for a perfectly memory safe programming environment.
Based on that, a pact has now arisen in the Rust community. You are allowed to write libraries like "std" that use unsafe, but normal usage of those libraries must never require the caller to use unsafe code. The pact ensures unsafe code doesn't spread like a virus, and so Rust will always maintain it's safety advantage over C for your average programmer.
But, it's not possible to write an intrusive implementation of a linked list that doesn't required the caller to use unsafe. Therefore, any such library would be ostracised by the Rust community.
And sometimes it's a simple bug where you never meant to modify the collection.
Compared to C the situation is outlandishly, even hellishly impossible to understand. If I can't understand this one thing then I feel there's no point in continuing so I must stay and battle it till I get it or give up completely. I don't think I've ever hit anything like this in any other language I've ever learned.
> And sometimes it's a simple bug where you never meant to modify the collection.
How do you accidentally modify an array, which would be different from any other variable?
Using one when you don't need to would be considered bad, but unsafe exists to be used when necessary. Nobody shuns unsafe code purely for it being unsafe code.
You can also create a struct, put your "global" variable inside it and then put all the functions that need the variable into an Impl block of that struct. If you then add the parameter `&self` to these functions, you can access the "global"variable any time via `self.global_variable`.
If that is not enough, then you can always make an actual global variable by first wrapping it in a Mutex, to prevent simultaneous access and then wrapping that in an Arc for Atomic Reference Counting. That allows you to pass "copies" of that variable around anywhere, satisfying the borrow-checker (since the variable is now reference-counted in a thread-safe way).
If you need a lot of parallel reading, replacing the Mutex with an RwLock is a good idea, since it allows locking from multiple threads, if you want to read it in most cases.
Maybe if you provided the example, someone might provide an insight. I hate to suggest this - but have you asked an AI? They've seen a lot of code. If you give an AI enough context it will likely cough up examples of how others have solved it.
> How do you accidentally modify an array, which would be different from any other variable?
Like you cause any other simple bug. When there are tons of things and it's not a simple variable. Or even just a typo. That also happens.
For the others he's abstracted the unsafe calls to just cursor_from_ptr and cursor_mut_from_ptr. I suspect they would see heavy use, but he's reduced the safety guarantees you have to prove to the absolute minimum. Colour me impressed. I wish I had thought of that.
I might become the 5,243,766 + 1 user.
As for this:
> Using one when you don't need to would be considered bad, but unsafe exists to be used when necessary. Nobody shuns unsafe code purely for it being unsafe code.
Hmmm. You've just contradicted yourself. You've just said if you were given a safe and unsafe implementation of the same thing, you would shun the unsafe version. Than go on to say "Nobody shuns unsafe code purely for it being unsafe code".
But if you want a another example, try: https://news.ycombinator.com/item?id=45516255. Two of his three reasons are easily solved with unsafe.
I don't want to waste your time. What I'm doing now is just trying to make global variables with builtin data types like String. This takes away some of the options for error. My idea is to get this working before I try to do it with my own data structure:
So with a simpler case where I'm making a global string instead
// ....
static mut HEAP_INSTANCE : OnceLock<String> = OnceLock::new();
#[rocket::main]
async fn main() -> Result<(), rocket::Error> {
HEAP_INSTANCE.set("blah".to_string());
^^^^^^^^^^^^^ use of mutable static
let _rocket = rocket::build()
.mount("/", routes![min,create_heap])
.launch()
.await?;
Ok(())
}
As you can see it's an error to use a static mut. I realise that this thing isn't safe in a multi-threaded program but I had hoped that I might ignore that in my single-threaded program or add some kind of locking container that would make it acceptable to the compiler.If I try to use my heap data structure then I start having a multitude of issues, the most common being that I create something which is owned by a scope that disappears too soon. IOW I need to initialise the global with a structure that has a 'static lifetime and I'm doing that in a context which definitely isn't static. i.e. I get "creates a temporary value which is freed while still in use" or something similar
While writing this reply I might have solved my issue:
//static mut HEAP_INSTANCE : OnceLock<String> = OnceLock::new();
static HEAP_INSTANCE : OnceLock<Box<Heap<String>>> = OnceLock::new();
async fn main() -> Result<(), rocket::Error> {
let heap = Box::new(Heap::new(100));
let _ = HEAP_INSTANCE.set(heap);
// ....
}
I don't fully understand why I don't need the mut and I don't know if I really need a Box or not but at least this compiles.Thank you for helping me to duck debug! :-D
> Like you cause any other simple bug.
Yes, but we don't forbid modifying other things, just because you could modify them by accident, because you want to modify them some of the time. Why does a[i] = ... needs to be prevented, but a = ... is fine?
No, I did not. It is true that nobody shuns unsafe code for simply being unsafe. Choosing a safe alternative over an unsafe one means that there is an alternative. Unsafe code to do the job that unsafe code is meant to do is totally fine. That’s the reason it exists!
That's no longer a simple iterator; it's a collection wrapper that represents in-iteration collection. It can be useful, and it is possible to write! But I don't think this is what programming languages should offer as their default iterator. Also, how do you solve the problem of mutation done through the collection without involving the iterator?
> Yes, but we don't forbid modifying other things, just because you could modify them by accident, because you want to modify them some of the time. Why does a[i] = ... needs to be prevented, but a = ... is fine?
I agree, this is not a strong reason on its own, but it strengthen the main reason.
I am not using a language with iterators in daily work, but that doesn't sound like a real problem to me. The iterator is already provided by the container type and the container already supports removing things:
iter<T>::drop_current () {
this->container.remove_at (this->index);
this->index--;
}
> Also, how do you solve the problem of mutation done through the collection without involving the iteratorSame like you solve mutation done to the container while the iterator exists, when the iterator doesn't allow mutating. That problem already exists.
I fail to see how mutating a container while iterating is more likely to be a bug, than mutating any other thing.
August 5, 2025

Photo by Ant Rozetsky on Unsplash
A fast way to start an argument in a room full of programmers: “How do you implement a doubly linked list in Rust?”
At one level, this question is not very fair, because an answer to it as stated would be that one simply does not use doubly linked lists. They have been popular in introductory computer science lectures because they are a neat way to explain pointers in a data structure that's easy to draw on a whiteboard, but they are not a good match to modern hardware. The last time I used one was in the nineties. I know the Linux kernel uses them, but that design was also laid down in the nineties; if you were designing a kernel from scratch today, you would probably not do it that way.
At another level, it is fair because it is a simple, familiar proxy for all data structures with any kind of circular references. Consider a compiler holding a set of modules that may refer to each other. Or a game where objects may refer to their container. Or a graphic user interface where widgets may refer to a parent window. It might be reasonable to say some particular such structure is not the best solution in some particular case, but it is not reasonable to say that about all such structures in all cases.
And they are tricky in Rust because the language is founded on the idea that memory management should in general, be done by ownership. Essentially, Rust is an answer to the question, what happens if you start with C++ and encode the common ownership/RAII design patterns into the type system so fallible human brains don't need to enforce them. (And while we're at it, drop some of the legacy C baggage. And sprinkle in some features from ML family languages. And... okay, it's never really just one thing. But memory management is the focus here.) Circular references don't necessarily break ownership, but they do break the ability of the type system to keep track of it.
There are several ways to solve this problem. One way is to avoid using direct references to the particular class of objects at all. Instead, allocate a big array of objects, and refer to them with integer indexes into that array. There are several names that have been used for such arrays and indexes; let's call them arenas and handles.
At this point, smart programmers will spot what's going on. Essentially you are bypassing the notion of pointers provided directly by the hardware, only to reimplement your own address space, and your own notion of pointers within it. The next step of course is to write your own equivalents of malloc and free to allocate and deallocate objects within the arena.
Doesn't that mean you are throwing out all the memory safety properties you were hoping to achieve by using Rust in the first place? Wouldn't you be as well off to just go back to C and at least be honest about the fact that you have reverted to purely manual memory management?
That argument sounds logical, but it's not actually correct.
Consider: why were we so scared of memory unsafety in the first place? What's so bad about memory safety bugs that it was considered worth inventing a whole new language to fix them? (Yes, as I mentioned earlier, Rust has some other nice properties, but those would not have sufficed to drive a movement to replace C++ with a new language. They are bonuses. Memory safety was the driving force behind Rust.)
Memory safety bugs have two properties that make them scarier than most other kinds.
An array overflow, or use after free, is likely to manifest as an intermittent crash with no clear connection to the cause. Try to reproduce the problem in a debug build, maybe it goes away. Recompile with a couple of extra logging statements, maybe the crash goes away. Rerun the same binary with the same inputs, and thanks to ASLR, maybe the crash goes away. Maybe one day it shows a minute after the triggering cause, maybe another day it shows an hour after, or not at all.
Handles are deterministic. If a bug made your program crash on the last run, it will crash the same way on this run.
This one is even bigger.
If there exists an input that will cause a given C program to crash with a segmentation fault, what's the probability there exists another input that will allow remote code execution? In practice, the answer tends to be high, more than even expert intuition started off expecting.
And if the program is, say, a web browser, that can be bad. (It's not a coincidence Rust was invented by a browser company.)
There was a time you could say, okay but that only applies to the special category of security-critical programs. But these days (setting aside little scripts for personal use, assuming we are talking about published software), it's programs that don't have to cope with adversarial input, that are the special and shrinking category. This is true even of embedded systems; it's rare nowadays to find a smart device that doesn't expect to be connected to the Internet.
Given that the arena will still be subject to array bounds checking, handle bugs won't allow an attacker to overwrite arbitrary memory the way pointer bugs do. So using handles for your memory management, preserves the property, that bugs in your Rust code may lead to denial of service, but they are much less likely to lead to remote code execution.
And that is why, though arena memory management is isomorphic to old-fashioned manual memory management, it does not keep the particularly bad failure modes that motivated the move to Rust in the first place.