Anti-pattern: Regarding the 2023 e-book edition, I do not see a way to buy it from the site, or even a link to buy.
I’ve seen a lot of threads here and on reddit where people were arguing about terminology purely because of this book alone.
By that definition, C++ code has garbage collection if it uses std::shared_ptr, going against widespread common usage of the term “garbage collected programming language” which specifically contrasts manual languages like C++ or Rust against garbage collected ones.
“Automatic Memory Management” is a much more suitable description of what programmers have to do to manage memory; it is now in the title but still hasn’t become the primary term.
A: Copying Garbage Collector (semi space). Chapter 4!
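For readers who have not met the technique named in that answer, here is a toy sketch of a semi-space copying collector with a Cheney-style scan. It is not the book's pseudocode: the names `Ref`, `SemiSpaceHeap`, `alloc`, `get` and `collect` are made up for illustration, and the "heap" is just a Python list of field lists.

```python
# Toy semi-space copying collector (Cheney-style scan).
# Objects are Python lists of fields stored in a "space"; a Ref wraps the
# index of an object in the current space, and any other field is plain data.
# All names here are hypothetical, chosen for illustration only.

class Ref:
    """A reference to a heap slot (kept distinct from plain integers)."""
    def __init__(self, addr):
        self.addr = addr


class SemiSpaceHeap:
    def __init__(self):
        self.from_space = []          # the space currently allocated into

    def alloc(self, *fields):
        self.from_space.append(list(fields))
        return Ref(len(self.from_space) - 1)

    def get(self, ref):
        return self.from_space[ref.addr]

    def collect(self, roots):
        to_space = []
        forwarding = {}               # old address -> new address

        def copy(ref):
            # Copy an object once; later visits reuse the forwarding address.
            if ref.addr not in forwarding:
                forwarding[ref.addr] = len(to_space)
                to_space.append(list(self.from_space[ref.addr]))
            return Ref(forwarding[ref.addr])

        new_roots = [copy(r) for r in roots]
        scan = 0
        # Cheney scan: to_space itself serves as the work queue.
        while scan < len(to_space):
            obj = to_space[scan]
            for i, field in enumerate(obj):
                if isinstance(field, Ref):
                    obj[i] = copy(field)   # rewrite the old reference
            scan += 1
        self.from_space = to_space    # flip: survivors become the new heap
        return new_roots
```

Only objects reachable from the roots are copied, so unreachable objects (including unreachable cycles) are dropped when the spaces flip, and the survivors end up compacted at the start of the new space.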
Great book. I was always fascinated by Baker's treadmill. Always wanted a real-world case where I could implement one with Fibonacci-sized mills.
When the explicit "free" was invented, automatic memory reclamation while avoiding the non-determinism of garbage collectors had already been known for 4 years, since 1960, when another IBM employee had invented reference counting (as a reaction to the garbage collector of LISP I).
When implemented naively, reference counting has some disadvantages, but those can be circumvented relatively easily in an optimized implementation. The book discussed in the parent article also has a chapter about reference counting.
I have written C programs for many decades, but I have never invoked "free" directly, because I have always used reference counts. I have never encountered a circumstance when I would have wanted to invoke "free" directly.
C has the disadvantage that the compiler will not implicitly do things like virtual function invocation or reference count handling, but any such technique provided by a higher-level language can still be used in a language like C, even if it requires more boilerplate code.
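To show the shape of the bookkeeping being described, here is a minimal sketch of explicit reference counting. The comment is about C; Python is used here only to keep the example short, and `Counted`, `retain`, `release`, `add_child` and `freed_log` are hypothetical names, not anything from the comment or the book. Appending to `freed_log` stands in for the point where a C version would call `free`.

```python
class Counted:
    """Hypothetical object carrying an explicit reference count."""
    def __init__(self, name, freed_log):
        self.name = name
        self.refs = 1              # the creator holds the first reference
        self.children = []         # references this object owns
        self.freed_log = freed_log # stands in for real deallocation


def retain(obj):
    """Take an additional reference to obj."""
    obj.refs += 1
    return obj


def release(obj):
    """Drop one reference; reclaim the object when none remain."""
    obj.refs -= 1
    if obj.refs == 0:
        # Give up every reference this object owned before reclaiming it.
        for child in obj.children:
            release(child)
        obj.children.clear()
        obj.freed_log.append(obj.name)


def add_child(parent, child):
    """Store a counted reference to child inside parent."""
    parent.children.append(retain(child))
```

Client code never reclaims an object directly: it only calls `release`, and reclamation happens as a side effect of the last reference going away. This naive version recurses on release and cannot reclaim cycles, which are among the usual disadvantages of the simple scheme.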
I do not like the "shared_ptr" implementation of reference counting in C++, because that data type is not directly usable in places where a plain reference or pointer is expected. Implementations that do not have this problem exist.
https://www.routledge.com/The-Garbage-Collection-Handbook-Th...
This has been the standard terminology in memory management research for many decades. The only programmers who don't like it are those who don't understand the principles of memory management.
> By that definition, C++ code has garbage collection if it uses std::shared_ptr
That's right.
> going against widespread common usage of the term “garbage collected programming language” which specifically contrasts manual languages like C++ or Rust against garbage collected ones.
Since this contrast mostly exists in the minds of people who don't understand memory management, going against this common misconception is good. That's not to say that there aren't some interesting tradeoffs that often align with the colloquial perception, but "garbage collection" isn't the interesting part. As you said, both C++ and Rust use GC; in fact, they use a GC somewhat similar to the one used by CPython.
GC simply is the only way to approach the clarity of pseudo-code in real code. That's one of my later realizations concerning the subject (https://world-playground-deceit.net/blog/2024/11/how-i-learn...)
The hard part is doing it correctly on a global scope with non-trivial lifetimes, possibly influenced by multiple threads.
And in my experience LLMs are still hit or miss on these kinds of problems: they can find problems from time to time, but they can't reliably reason about more complex global state. They will come up with "hypotheses" like 'oh sure, this is the root cause of the issue' that are completely wrong (which you may or may not notice, only for it to fail later).
The colloquial term is clear in context, and it draws its boundaries in useful places. If academia prefers other boundaries to simplify its formal definitions, that’s understandable. But the rest of us shouldn’t restrict our language in that way.
For allocation there is no difference between automatic memory management with garbage collectors or reference counts and manual memory management, where the programmer is responsible for invoking "free".
These alternative memory management methods differ only in how deallocation is handled.
Allocation must always be done by defining a new object, regardless of how memory is managed. Moreover, allocation also does not depend on whether an object is allocated in static storage, in a stack or in a heap. You must always define the object, so that memory is allocated for it at compile-time if in static storage, or at run-time if in a stack or in a heap.
If anything, I often see a bias against tracing GCs from the people misusing the term, who "hype up" their choice of language as better for not having a (tracing) GC, when it usually just has ref counting, which by many metrics is actually worse given equal usage -- Rust/C++ get away with it because they only use it on a handful of objects, other lifetimes being driven by RAII, which is pretty much just compile-time decidable ref counting?
The source of my second revelation: GC should be opt-out (e.g. SBCL's arena system) instead of opt-in via refcounted types.
As for me personally, I consider refcounting and GC overlapping categories. I am perfectly willing to call CPython’s reference counting plus cycle collector a form of garbage collection, because it is transparent to the programmer. Every memory management technique has tradeoffs and pathological edge cases, but since you don’t have to consider them in the ordinary course of programming I’d say it counts. If you had to break cycles manually, or to annotate which references should be counted, I’d call that refcounting but not GC – as in the C++ stdlib.
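The CPython behaviour mentioned here can be observed with the standard `gc` and `weakref` modules. This is a small sketch that assumes CPython specifically (other Python implementations reclaim memory differently); `Node` and the two function names are hypothetical. The automatic collector is switched off during the check so that only the explicit `gc.collect()` call can reclaim the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical example class; holds a single outgoing reference."""
    def __init__(self):
        self.other = None


def acyclic_is_freed_immediately():
    n = Node()
    probe = weakref.ref(n)      # a weak reference does not keep n alive
    del n                       # count drops to zero: freed right away
    return probe() is None


def cycle_needs_the_collector():
    was_enabled = gc.isenabled()
    gc.disable()                # keep the automatic collector out of the demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a # reference cycle: counts never reach zero
        probe = weakref.ref(a)
        del a, b
        alive_before = probe() is not None
        gc.collect()            # the cycle collector reclaims the pair
        return alive_before, probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

In ordinary code none of this is visible: the programmer neither breaks the cycle nor marks any reference as weak, which is the transparency the comment is pointing at.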
In all seriousness, this is likely a nudge to a preference they have for how they want to sell this and how you should want to buy it.
However, it does inspire me to write. The kernel of all this terminology confusion is under-exposure of industrial programmers to not just academic terminology, but also the very design space you mention (which has always been nicely covered by Jones' outstanding book). Just to take an example from the root of this thread:
>widespread common usage of the term “garbage collected programming language” which specifically contrasts manual languages like C++ or Rust against garbage collected ones
Boehm-Weiser conservative collection for C, among the most manual languages of all, pre-dates its very first ANSI 1989 standard.
This underexposure itself is downstream of the kinds of oversimplifications/lies of marketing and in this particular case came from Java. The evolution I witnessed was roughly 1) linking Boehm with -lgc and deleting (or #define'ing away) all your `free()` calls is conservative - to be precise you need compiler aid and a lot of programmers are "not perfect == awful" personality types, 2) Sun Microsystems wants to leverage a lot of reliability issues with C code and become The Platform and spends gobs of money to win hearts & minds, partly succeeding, 3) part of its ad-warfare against the then Wintel hegemony and/or tutorials/introductory material for Junior Programmers (often the target of "be more reliable" material) plays fast & loose with GC terminology because marketing plays fast & loose structurally for fun but mostly profit, 4) because human language really does == language usage a la Quine, everyone in the industry re-defines what "GC" means to bind it to a programming language instead of to a specific run-time, 5) industry & academics use different language, confusion ensues and so here we are.
This is not even the 100th time that either explicit or implicit forces of marketing have achieved confusion analogously to this. If you believe most people don't need much of what they spend on then confusion is arguably intrinsic to marketing of ideas/products. The highly misleading but suggestive metaphorical language used all over "AI" in both research and in product-lines is a more current case of this, leading anyone who knows much to have to qualify "not AGI" or other such junk just to have a conversation.
So, what is my point? Basically just that the larger problem here will persist as long as there is money to be made/attention to be garnered by sowing confusion/having people talk past each other/think some product is more than it really is. I have no meta-strategy in my back pocket to block these successful confusions, but it does seem worth being aware of it.
In general, by using macros it is possible to transform a C or C++ program so much that it becomes unrecognizable as C/C++ and can mimic reasonably well any other programming language that you might fancy.
The problem is when you work in a team, because even if everyone agrees that such programming languages have great deficiencies, it would be impossible to reach a consensus about what the ideal programming language should look like, so eventually the team remains stuck with writing programs in the ugly standard manner.
Richard Jones’s Garbage Collection (Wiley, 1996) was a milestone book in the area of automatic memory management. Its widely acclaimed successor, The Garbage Collection Handbook: The Art of Automatic Memory Management captured the state of the field in 2012. However, technology developments have made memory management more challenging, interesting and important than ever. This second edition updates the handbook, bringing together a wealth of knowledge gathered by automatic memory management researchers and developers over the past sixty years. The authors compare the most important approaches and state-of-the-art techniques in a single, accessible framework.
The book addresses new challenges to garbage collection made by recent advances in hardware and software, and the environments in which programs are executed. It explores the consequences of these changes for designers and implementers of high performance garbage collectors. Along with simple and traditional algorithms, the book covers state-of-the-art parallel, incremental, concurrent and real-time garbage collection. Algorithms and concepts are often described with pseudocode and illustrations.
The nearly universal adoption of garbage collection by modern programming languages makes a thorough understanding of this topic essential for any programmer. This authoritative handbook gives expert insight on how different collectors work as well as the various issues currently facing garbage collectors. Armed with this knowledge, programmers can confidently select and configure the many choices of garbage collectors.
The e-book enhances the print versions with a rich collection of over 37,000 hyperlinks to chapters, sections, algorithms, figures, glossary entries, index items, original research papers and much more.
Chinese and Japanese translations of the first edition were published in 2016. We thank the translators for their work in bringing our book to a wider audience.
The online bibliographic database includes nearly 3,400 garbage collection-related publications. It contains abstracts for some entries and URLs or DOIs for most of the electronically available ones, and is continually being updated. The database can be searched online, or downloaded as BibTeX, PostScript or PDF.