(emphasis mine)
Not at all true. Assuming the types are such that >> is equivalent to /, modern compilers will implement division by a power of two as a shift every single time.
"Interview with RollerCoaster Tycoon's Creator, Chris Sawyer (2024)" https://news.ycombinator.com/item?id=46130335
"Rollercoaster Tycoon (Or, MicroProse's Last Hurrah)" https://news.ycombinator.com/item?id=44758842
"RollerCoaster Tycoon at 25: 'It's mind-blowing how it inspired me'" https://news.ycombinator.com/item?id=39792034
"RollerCoaster Tycoon was the last of its kind [video]" https://news.ycombinator.com/item?id=42346463
"The Story of RollerCoaster Tycoon" https://www.youtube.com/watch?v=ts4BD8AqD9g
He has lots of videos that are deep dives into how RCT works and how things are implemented!
I really wish I could see the source code.
Each map block was 2x2 cells, and each cell, 8x8 pixels. Made rendering background cells and fog-of-war overlays very straightforward assembly language.
All of Warcraft/etc. had only a few thousand lines of assembly language to render maps/sprites/fonts/fog-of-war into the offscreen buffer, and to blit from the offscreen buffer to the screen.
The rest of the code didn't need to be in assembly, which is too time-consuming to write for code where the performance doesn't matter. Everything else was written in portable assembler, by which I mean C.
Edit:
By way of comparison, Blackthorne for Super Nintendo was all 65816 assembly. The Genesis version (Motorola 68000) and DOS version (Intel 80386) were manually transcribed into their respective assembly languages.
The PC version of Blackthorne also had a lot of custom assembler macros to generate 100K of rendering code to do pixel-scrollable chunky-planar VGA mode X (written by Bryan Waters - https://www.mobygames.com/person/5641/bryan-waters/).
At Blizzard we learned from working on those console app ports that writing assembly code takes too much programmer time.
Edit 2:
I recall that Comanche: Maximum Overkill (1992, a voxel-based helicopter simulator) was written in all assembly in DOS real mode. A huge technical feat, but so much work to port to protected mode that I think they switched to polygon-rendering for later versions.
Numeric characteristics are absolutely still a consideration for game designers even in 2026, one that influences what numbers they use in their game designs. The good ones, anyways. There are, of course, also countless bad developers/designers who ignore these things these days, but not because it is free to do so; rather, because they don't know better, and in many cases it is one of many silent contributing factors to a noticeable decrease in the quality of their game.
> NewValue = OldValue << 2;
I disagree with the framing of this section. Bit shifts are used all the time in low-level code. They're not just some archaic optimisation, they're also a natural way of working with binary data (aka all data on a computer). Modern low-level code continues to use lots of bit shifts, bitwise operators, etc.
Low-level programming is absolutely crucial to performant games. Even if you're not doing low-level programming yourself, you're almost certainly using an engine or library that uses it extensively. I'm surprised an article about optimisation in gaming, of all things, would take the somewhat tired "in ye olde days" angle on low-level code.
I was reminded of the factorio blog. That game's such a huge optimization challenge even by today's standards and I believe works with the design.
One interesting thing I remember is if you have a long conveyor belt of 10,000 copper coils, you can basically simplify it to just be only the entry and exit tile are actually active. All the others don't actually have to move because nothing changes... As long as the belts are fully or uniformly saturated. So you avoid mechanics which would stop that.
> NewValue = OldValue >> 3;
You need to be careful, because this doesn't work if the value is negative. A
And this folks is why an optimizing compiler can never beat sufficient quantities of human optimization.
The human can decide when the abstraction layers should be deliberately broken for performance reasons. A compiler cannot do that.
… oh wait, nvm. Don’t preoptimize!
EA a while back released the source code to (most) of the old Command & Conquer games [2] though interestingly left out Tiberian Sun and Red Alert 2, StarCraft's closest competitors at the time.
Would've been nice for historical preservation to be able to peek behind the curtain and see StarCraft's code in a similar fashion
[1] https://old.reddit.com/r/gamecollecting/comments/68xzxt/star...
Fumito Ueda was notably quite concerned with the technical/production feasibility of his designs for Shadow of the Colossus. [1] Doom was an exercise in both creativity and expertise.
[1] https://www.designroom.site/shadow-of-the-colossus-oral-hist...
By not having to pull in anything from the constant pools and thereby avoid memory stalls in the fast path, we got to use random numbers profligately and still run quickly and efficiently, and get to sleep quickly and efficiently. It was a fun little piece of engineering. I'm not sure how much it mattered, but I enjoyed writing it. (I think I did most of it after hours either way.)
Alas, I don't think it ever shipped because we eventually moved to an even smaller and cheaper Cortex-M0 processor which lacked those instructions. Also my successor on that project threw most of it out and rewrote it, for reasons both good and bad.
dc->color=c++&15;
Hint: it's from this "Lines" demo program, whose source is here: https://web.archive.org/web/20180906060723/https://templeos....
And this is what it looks like when it runs (ignore the fact it's running in Minecraft): https://youtu.be/pAN_Fza6Vy8?t=38
The biggest caveat is that right shifting -1 still produces -1 instead of 0, but that's usually fine for much older game fixed-point maths since -1 is close enough to 0.
However, there is a quirk of the hardware of most CPUs that has been inherited by the C language and by other languages.
There are multiple ways of defining integer division when the dividend is not a multiple of the divisor, depending on the rounding rule used for the quotient.
The two most frequently used definitions are to have a non-negative remainder, which corresponds to rounding the quotient with the floor function, and to have a remainder with the same sign as the dividend, which corresponds to rounding the quotient by truncation.
In most CPUs, the hardware is designed such that for signed integers the division instruction uses the second definition, while the right shift uses the first definition.
This means that when the dividend is a multiple of the divisor, division and right shift are the same, but otherwise the quotient may differ by one unit due to different rounding rules.
Because of this, compilers will not automatically replace divisions with right shifts, because there are operands where the result is different.
Nevertheless, the programmer can always replace a division by a power of two with a right shift. In all the programs that I have ever seen, either the rounding rule for the quotient does not matter or the desired definition for the division is the one with positive remainder, i.e. the definition implemented by right shift.
In those cases when the rounding rule matters, the worrisome case is when you must use division not when you can use right shift, so you must correct the result to correspond to rounding by floor, instead of the rounding by truncation provided by the hardware. For this, you must not use the "/" operator of the C language, but one of the "div" functions from "stdlib.h", or you may use "/" but divide the absolute values of the operands, after which you compute the correct signed results.
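To make the rounding difference concrete, here is a minimal C++ sketch comparing the `/` operator, a right shift, and a floor division built from `/` and `%`. The function names are mine and purely illustrative. Note the assumption that `>>` on a negative value is an arithmetic shift: that is guaranteed since C++20 and only implementation-defined (though near-universal) before that.

```cpp
#include <cstdint>

// Truncating division: what the `/` operator does for signed integers.
constexpr std::int32_t div_trunc8(std::int32_t x) { return x / 8; }

// Right shift by 3. Assumed to be an arithmetic shift, which rounds toward
// negative infinity (guaranteed since C++20, implementation-defined before).
constexpr std::int32_t shift3(std::int32_t x) { return x >> 3; }

// Floor division built from `/` and `%`: if the truncated result was rounded
// toward zero from a negative quotient, step one further down.
constexpr std::int32_t floor_div(std::int32_t x, std::int32_t d) {
    return (x % d != 0 && ((x % d < 0) != (d < 0))) ? x / d - 1 : x / d;
}
```

With these definitions, `div_trunc8(-9)` yields -1 while `shift3(-9)` and `floor_div(-9, 8)` both yield -2; for non-negative inputs all three agree.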
I used to think like this, not anymore.
What convinced me that these sorts of micro-optimizations just don't matter is reading up on the cycle counts of modern processors.
On a Zen 5, integer addition is a single cycle, multiplication 3, and division ~12. But that's not the full story. The CPU can have 5 in-flight multiplications running simultaneously. It can have about 3 divisions running simultaneously.
Back in the day of RCT, there was much less pipelining. For the original Pentium, a multiplication took 11 cycles, and division could take upwards of 46 cycles. These were CPUs with clock speeds around 100 MHz. So not only did each operation take more cycles to finish and couldn't be pipelined, the CPUs were also operating at 1/30th to 1/50th the clock rate of common CPUs today.
And this isn't even touching on SIMD instructions.
Integer tricks and optimizations are pointless. Far more important than those in a modern game is memory layout. That's where the CPU is actually going to be burning most of its time. If you can create and do operations on an int[], you'll be MUCH faster than if you are doing operations against a Monster[]. A cache miss is going to mean anywhere from a 100 to 1000 cycle penalty. That blows out any sort of hit you take cutting your cycles from 3 to 1.
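As a rough sketch of the int[] versus Monster[] point, here are the two layouts side by side in C++. The `Monster` fields and function names are hypothetical, chosen only to illustrate the idea. Both functions compute the same sum, but the second one walks a contiguous array of integers, so each cache line it loads is full of useful data.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Array-of-structs: each health value is surrounded by unrelated fields.
struct Monster {
    std::string name;
    float x, y;
    std::int32_t health;
};

std::int64_t total_health_aos(const std::vector<Monster>& monsters) {
    std::int64_t sum = 0;
    for (const Monster& m : monsters) sum += m.health;
    return sum;
}

// Struct-of-arrays: all health values sit next to each other in memory.
struct Monsters {
    std::vector<std::string> names;
    std::vector<float> xs, ys;
    std::vector<std::int32_t> healths;
};

std::int64_t total_health_soa(const Monsters& monsters) {
    std::int64_t sum = 0;
    for (std::int32_t h : monsters.healths) sum += h;
    return sum;
}
```

How much faster the second form is depends on the struct size and the access pattern, so it is worth measuring rather than assuming.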
All possible numerical representations come with inherent trade-offs around speed, accuracy, storage size, complexity, and even the kinds of questions one can ask (it's often not meaningful to ask if two floats equal each other without an epsilon to account for floating point error, for instance).
"Toward an API for the Real Numbers" ( https://dl.acm.org/doi/epdf/10.1145/3385412.3386037 ) is one of the better papers I've found detailing a sort of staged complexity technique for dealing with this, in which most calculations are fast and always return (arbitrary precision calculations can sometimes go on forever or until memory runs out), but one can still ask for more precise answers which require more compute if required. But there are also other options entirely like interval arithmetic, symbolic algebra engines, etc.
One must understand the trade-offs else be bitten by them.
Game designers are not so constrained anymore by the limits of the hardware, unless they want to push boundaries. Quality of a game is not just the most efficient runtime performance - it is mainly a question of whether the game is fun to play. Do the mechanics work. Are there severe bugs. Is the story consistent and the characters relatable. Is something breaking immersion. So ... frequent stuttering because of bad programming is definitely a sign of low quality - but if it runs smoothly on the target audience's hardware, improvements should rather be made elsewhere.
They’re not pointless; they’re just not the first thing to optimize.
It’s like worrying about cache locality when you have an inherently O(n^2) algorithm and could have an O(n log n) or O(n) one. Fix the biggest problem first.
Once your data layout is good and your cpu isn’t taking a 200 cycle lunch break to chase pointers, then you worry about cycle count and keeping the execution units fed.
That’s when integer tricks can matter. Depending on the micro arch, you may have twice as many execution units that can take integer instructions. And those instructions (outside of division) tend to have lower latency and higher throughput.
And if you’re doing SIMD, your integer SIMD instructions can be 2 or 4x higher throughput than float32 if you can use int16 / int8 data.
So it can very much matter. It’s just usually not the lowest hanging fruit.
The thing that changed during the 90's is that mechanical sympathy became optional to achieving a large production. The data input defining the game world was decoupled into assets authored in disconnected ways and "crunched down" to optimized forms - scans, video, digital painting, 3D models. RCT exhibits some of this, too, in that it's using PCM audio samples and prerendered sprites. If the game weren't also a massive agent simulator it would be unremarkable in its era. But even at this time more complex scripting and treating gameplay code as another form of asset was becoming normalized in more genres.
From the POV of getting a desired effect and shipping product, it's irrelevant to engage with mechanical sympathy, but it turns out that it's a thing that players gradually unravel, appreciate and optimize their own play towards if they stick with it and play to competitive extremes, speedrun, mod, etc.
The 64kb FPS QUOD released earlier this year is a good example of what can happen by staying committed to this philosophy even today: the result isn't particularly ambitious as a game design, but it isn't purely a tech demo, nor does it feel entirely arbitrary, nor did it take an outrageous amount of time to make (about one year, according to the dev).
If your game is small-scale, something like Super Mario Bros., you should be able to get away with not thinking about it in theory. But even then people manage to write simple games with bloated loading times and stuttery performance, so never underestimate the impressive ability of people who are operating solely at the highest level of abstraction to make computers cry.
Which wasn't a problem, but it clearly showed how the programmers improvised to make it perform.
For the lesson here, I think re-contextualizing the product design in order to ease development should be a core tenet of modern software engineering (or really any form of engineering). This is why we usually say that we need to shift left on problems: discussing the constraints up-front lets us inform designers how we might be able to tweak a few designs early in order to save big time on the calendar. All of the projects that I loved being a part of in my career did this well; all of the slogs were ones that employed a leadership-driven approach that amounted to waterfall.
Reminds me of blood moons in Zelda https://www.polygon.com/legend-zelda-tears-kingdom/23834440/...
Somehow even as a child I just knew that it would be a whole new emergent game play experience.
Of course I didn't know what went into making RollerCoaster Tycoon, but I could just tell from a couple of screenshots that this was clearly a ground-up new game with new mechanics that would be extremely fun to play.
I don't get this feeling anymore, as I just assume everything is a clone of another game in the same engine generally.
Unless it's been a decade in production like Breath of the Wild or GTA 5, I just don't expect much.
It does make you wonder if the future of AI-assisted development will look more like the early days of coding, where one single mind can build and deliver a whole piece of software from beginning to end.
For integers the situation is better but even there, it hugely depends on your compiler and how much it cheats. You can't replace trig with intrinsics in the general case (sets errno for example), inlining is at best an adequate heuristic which completely fails to take account what the hot path is unless you use PGO and keep it up to date.
I've managed to improve a game's worst case performance better by like 50% just by shrinking a method's codesize from 3000 bytes to 1500. Barely even touched the hot path there, keep in mind. Mostly due to icache usage.
The takeaway from this shouldn't be that "computers are fast and compilers are clever, no point optimising" but more that "you can afford not to optimise in many cases, computers are fast."
Today, I imagine we have conversations like this happening:
Game designer: We will have 300 different enemy types in the game.
Programmer: Things could be really, really faster if you could limit it to 256 types.
Game designer: ?????
That ????? is the sign of someone who is designing a computer program who doesn't understand the basics of computers.
10000x this. Miyamoto starts with a rudimentary prototype and asks himself this. Sadly it seems for many fun is an afterthought they try to patch in somehow.
Texture resolution mismatches causing blurriness/aliasing, floating point errors and bad level design causing collision detection problems (getting stuck in the walls), frame rate and other update rates not being synced causing stutter and lag (and more collision detection problems), bad illumination parameters ruining the look they were going for, numeric overflow breaking everything, bad approximations of constants also breaking everything somewhere eventually, messy model mesh geometry causing glitches in texturing, lighting, animation, collision, etc.
There's probably a lot more I'm not thinking of. They have nothing to do "with the hardware", but the underlying math and logic.
They're also not bugs to "let the programmer figure out". Good programmers and designers work together to solve them. I could just as easily hate on the many criminally ugly, awkward, and plain unfun games made by programmers working alone, but I'll let someone else do that. :)
> NewValue = OldValue >> 3;
> This is basically the same as
> NewValue = OldValue / 8;
> RCT does this trick all the time, and even in its OpenRCT2 version, this syntax hasn’t been changed, since compilers won’t do this optimization for you.
The author loses a lot of credibility by suggesting the compiler won't replace multiplying or dividing by a factor of 2 with the equivalent bit shift. That's a trivial optimization that's always been done. I'm sure compilers were doing that in the 70s.
"At first this sounds like a strange technical obscurity"
Do we not know binary in 2026? Why is this a surprise to the intended audience?
But if you look at creative writing, story arcs are all about obstacles. A boring story is made interesting by an obstacle. It is what our protagonist needs to overcome. A one-man-band game dev who simultaneously holds the story and the technical challenge in their head might spot the opportunity to use a glitch or limitation as, I dunno, a mini game that riffs on the glitch.
My point wasn't "don't optimize" it was "don't optimize the wrong thing".
Trying to replace a division with a bit shift is an example of worrying about the wrong thing, especially since that's a simple optimization the compiler can pick up on.
But as you said, it can be very worth it to optimize around things like the icache. Shrinking and aligning a hot loop can ensure your code isn't spending a bunch of time loading instructions. Cache behavior, in general, is probably the most important thing you can optimize. It's also the thing that can often make it hard to know if you actually optimized something. Changing the size of code can change cache behavior, which might give you the mistaken impression that the code change was what made things faster when in reality it was simply an effect of the code shifting.
Fortunately, D compilers gdc and ldc take advantage of the gcc and llvm optimizers to stay even with everyone else.
I remember the early Simpsons video game. Sometimes, due to some bug in it (probably a sign error), you could go through the walls and see the rendered scenery from the other side. It was like you went backstage in a play. It would have made a great Twilight Zone episode!
If AI has any benefit to creative endeavors at all it will be because of the challenges of coaxing a machine defined to produce an averaging of a large corpus of work (producing inherently mediocre slop) provides novel limitations, not because it makes art any more "accessible".
Similarly, redstone has 16 power levels: 0 to 15. This allows it to store the power level using 4 bits. In fact, quite a lot of attributes in Minecraft blocks are squeezed into 4 bits. I think the system has grown to be more flexible these days, but I'm pretty sure the chunk data structure used to set aside 4 bits for every block for various metadata.
And of course, the world height used to be at 255 blocks. Every block's Y position could be expressed as an 8-bit integer.
A voxel game like that is a good example of where this kind of efficiency really matters since there's just so much data. A single 16x16x256 chunk is 65.5k blocks. If a game designer says they want to add a new light source with brightness level 20, or a new kind of redstone which can go 25 blocks, it might very well be the right choice to say no.
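To show what "squeezed into 4 bits" can look like in practice, here is a small C++ sketch of a nibble array that packs two 4-bit values into each byte. This is my own illustration, not Minecraft's actual code; the class name and layout are assumptions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Stores N values in the range 0..15, two per byte.
template <std::size_t N>
class NibbleArray {
public:
    std::uint8_t get(std::size_t i) const {
        const std::uint8_t byte = bytes_[i / 2];
        // Even indices live in the low nibble, odd indices in the high nibble.
        return static_cast<std::uint8_t>((i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
    }

    void set(std::size_t i, std::uint8_t value) {
        std::uint8_t& byte = bytes_[i / 2];
        const std::uint8_t nibble = value & 0x0F;  // anything above 15 is cut off
        byte = static_cast<std::uint8_t>(
            (i % 2 == 0) ? ((byte & 0xF0) | nibble) : ((byte & 0x0F) | (nibble << 4)));
    }

private:
    std::array<std::uint8_t, (N + 1) / 2> bytes_{};
};

// One 16x16x256 chunk: 65,536 blocks, so 32,768 bytes of 4-bit metadata.
constexpr std::size_t kChunkBlocks = 16 * 16 * 256;
```

A brightness level of 20 simply does not fit: `set(i, 20)` would store 4, which is why widening the range means changing the storage format rather than just a constant.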
From what I heard, there was a Civilization game which suffered from an unsigned integer underflow error where Gandhi, whose aggression was set to 0, would become "less aggressive" due to some event in the game, but due to integer underflow, this would cause his aggression to go to 255, causing him to nuke the entire map.
The article says this was just an urban legend though. Well, real or not, it's a perfect example of the principle!
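Whether or not the story is true, the mechanism it describes is easy to reproduce. Here is a minimal C++ sketch with hypothetical function names: subtracting from an 8-bit unsigned value wraps around modulo 256, and a saturating variant avoids it.

```cpp
#include <cstdint>

// Plain subtraction on an 8-bit unsigned value wraps around modulo 256.
constexpr std::uint8_t lower_wrapping(std::uint8_t aggression, std::uint8_t amount) {
    return static_cast<std::uint8_t>(aggression - amount);
}

// Saturating version: clamps at 0 instead of wrapping.
constexpr std::uint8_t lower_saturating(std::uint8_t aggression, std::uint8_t amount) {
    return aggression > amount ? static_cast<std::uint8_t>(aggression - amount) : 0;
}
```

Here `lower_wrapping(0, 1)` gives 255, while `lower_saturating(0, 1)` stays at 0.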
> I have calculated the value of Pi on Sausage Island and found it to be 2.
https://web.archive.org/web/20240405034314/https://twitter.c...
(But it definitely helps if the game designer knows of the technical limits)
Now writing very optimized assembly is very hard. Because you need to break your consistency and conventions to squeeze out all the possible performance. The larger "kernel" you optimize the more pattern breaking code you need to keep in your head at a time.
Not saying that it was not a huge feat, but it’s definitely a lot harder to start from scratch nowadays, even for the same platform.
Who formats or cleans up the assets and at least oversees that things are done according to a consistent spec, process, and guidelines? Is that not a game designer or someone under their leadership?
I think in all the cases I gave, what might be completely delegated to "engine design" really should be teamwork with game design and art direction too. This is what the top-level comment was talking about. Even when a game is "well made", they just adopted someone else's standards and that sucks all the soul out of it. This is a common problem in all creative work.
(adding this due to reply depth): Coordination is a big aspect of design and can often be the most impactful to the result.
World of Warcraft (at least originally) encoded every item as an ID. To keep the database simple and small (given millions of players with many characters with lots of items): if you wanted to permanently enchant your item with an upgrade, that was represented essentially as a whole new item. The item was replaced with a different item (your item + enchant). Represented by a different ID. The ID was essentially a bitmask type thing.
This meant that it was baked into the underlying data structures and deep into the core game engine that you could never have more than one enchant at a time. It wasn't like there was a relational table linking what enchants an item in your character's inventory had.
The first expansion introduced "gems" which you could socket into items. This was basically 0-4 more enchants per item. The way they handled this was to just lengthen item Ids by a whole bunch to make all that bitmask room.
I might have gotten some of this wrong. It's been forever since I read all about these details. For a while I was obsessed with how they implemented WoW given the sheer scale of the game's player base 20 years ago.
Four instructions, in about eight chips.
By combining shifts and adds Keith Barr was able to devise all the different filter and delay coefficients for 63 different reverb programs (the 64th one was just dead passthrough).
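As an illustration of the general shift-and-add idea (not the actual coefficients or hardware, which I don't know), here is a short C++ sketch. The coefficient 0.625 is an arbitrary example I picked because it decomposes neatly into 1/2 + 1/8.

```cpp
#include <cstdint>

// Approximates sample * 0.625 as sample/2 + sample/8, using two shifts and an add.
// Each shift discards low bits, so the result can be slightly below the exact value.
constexpr std::int32_t scale_0_625(std::int32_t sample) {
    return (sample >> 1) + (sample >> 3);
}

// Multiplies by 10 as x*8 + x*2, again with only shifts and an add.
constexpr std::uint32_t times10(std::uint32_t x) {
    return (x << 3) + (x << 1);
}
```

For example, `scale_0_625(800)` gives 500 and `times10(57)` gives 570.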
For multiplying with powers of two greater or equal to 16, they use shift left, because LEA can no longer be used.
imul rdx, rdx, 1717986919
shr rdx, 32
sar edx
sar eax, 31
sub edx, eax
mov eax, edx

Due to some lucky circumstances, I recently had the chance to appear on one of the biggest German gaming podcasts, Stay Forever, to talk about the technology of RollerCoaster Tycoon (1999). It was a great interview, and I strongly recommend listening to the whole episode here, at least if you speak German. If not, don’t worry: this article covers what was said (and a little more).

RollerCoaster Tycoon and its sequel are often named as some of the best-optimized games out there, written almost completely in Assembly by their creator, Chris Sawyer. Somehow this game managed to simulate full theme parks with thousands of agents on the hardware of 1999 without breaking a sweat. An immensely impressive feat, considering that even nowadays a lot of similar building games struggle to hit a consistent framerate.

So how did Chris Sawyer manage to achieve this?
There are a lot of answers to this question, some of them small and focused, some broad and impactful. The one which is mentioned first in most articles is the fact that the game was written in the low-level language Assembly, which, especially at the time of the game’s development, allowed him to write more performant programs than if he had used a high-level language like C or C++.
Coding in Assembly had been the standard for game development for a long time, but by this point the practice had largely been abandoned. Even the first Doom, which was released six years earlier, was already mostly written in C with only a few parts written in Assembly, and nobody would argue that Doom was in any way an unoptimized game.
It’s hard to check for sure, but it’s likely that RCT was the last big game developed in this way. How big the performance impact was at the time is hard to quantify, but for what it’s worth, it was probably higher than it would be nowadays. Compilers have gotten much better at optimizing high-level code, and many optimizations that you’d need to do manually back then can be handled by compilers nowadays.
But besides the use of assembly, the code of RCT was aggressively optimized. How do we know this if the source code has never been released? We have something that’s almost as good: A 100% compatible re-implementation of it, OpenRCT2.

Written by (very) dedicated fans, OpenRCT2 manages to reimplement the entirety of RollerCoaster Tycoon 1 & 2, using the original assets. Even though this is NOT the original source code, this re-implementation, especially in its earlier versions, is a very, very close match to the original, being based on years of reverse engineering. Note that by now, OpenRCT2 contains more and more improvements over the original code. I’ll note some of those changes as we come across them.
Also, I won’t go through all optimizations, but I will pick some examples, just to illustrate that every part of the game was optimized to the brink.
How would you store a money value in a game? You would probably start by thinking about the highest possible money value you might need in the game and choose a data type based on that. Chris Sawyer apparently did the same thing, but in a more fine-grained way.

Different money values in the code use different data types, based on what the highest expected value at that point is. The variable that stores the overall park value, for example, uses 4 bytes since the overall park value is expected to use quite high numbers. But the adjustable price of a shop item? This requires a far lower number range, so the game uses only one byte to store it. Note that this is one of the optimizations that has been removed in OpenRCT2, which changed all occurrences to a simple 8-byte variable, since on modern CPUs it doesn’t make a performance difference anymore.
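A minimal C++ sketch of the two approaches might look like this. The struct and field names are hypothetical; only the byte widths come from the description above.

```cpp
#include <cstdint>

// Per-use widths, as described for the original game (names are made up).
struct ParkFinance {
    std::int32_t park_value;  // 4 bytes: the park value can grow large
};

struct ShopItem {
    std::uint8_t price;       // 1 byte: an adjustable price with a small range
};

static_assert(sizeof(ParkFinance) == 4, "park value occupies 4 bytes");
static_assert(sizeof(ShopItem) == 1, "shop price occupies 1 byte");

// One wide type for every money value, as described for OpenRCT2.
using WideMoney = std::int64_t;
static_assert(sizeof(WideMoney) == 8, "uniform money type occupies 8 bytes");
```

The narrow types save memory per value but force the programmer to know each value's range up front; the single wide type trades a few bytes for simplicity.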
When reading through OpenRCT2’s source, there is a common syntax that you rarely see in modern code, lines like this:

NewValue = OldValue << 2;
Thanks to operator overloading, the ‘<<’ operator can mean a lot of things in C++. What the line effectively does is the same as what most coders would write like this:

NewValue = OldValue * 4;
What the ‘<<’ operator does here is called bit shifting, meaning all the bits that store the value of the variable are shifted to the left, in this case by two positions, with the new digits being filled in with zeros. Since the number is stored in a binary system, every shift to the left means the number is doubled.
At first this sounds like a strange technical obscurity, but when multiplying numbers in the decimal system we basically do the same. When you multiply 57 * 10, do you actually ‘calculate’ the multiplication? Or do you just append a 0 to the 57? It’s the same principle just with a different numerical system.
The same trick can also be used for the other direction to save a division:

NewValue = OldValue >> 3;

This is basically the same as

NewValue = OldValue / 8;
RCT does this trick all the time, and even in its OpenRCT2 version, this syntax hasn’t been changed, since compilers won’t do this optimization for you. This might seem like a missed opportunity but makes sense considering that this optimization will return different results for underflow and overflow cases (which the code should avoid anyway).
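To check the equivalence rather than take it on faith, here is a small C++ sketch restricted to unsigned values, where shifting and multiplying or dividing by a power of two agree for every input. For negative signed values the two division forms round differently, so they are not interchangeable there. The function names are mine.

```cpp
#include <cstdint>

constexpr std::uint32_t times4_mul(std::uint32_t v)   { return v * 4; }
constexpr std::uint32_t times4_shift(std::uint32_t v) { return v << 2; }
constexpr std::uint32_t div8_div(std::uint32_t v)     { return v / 8; }
constexpr std::uint32_t div8_shift(std::uint32_t v)   { return v >> 3; }

// Checked at compile time: both spellings give the same result.
static_assert(times4_shift(57) == times4_mul(57), "57 << 2 equals 57 * 4");
static_assert(div8_shift(57) == div8_div(57), "57 >> 3 equals 57 / 8");
```

Comparing the machine code a compiler emits for each pair, for example on Compiler Explorer, is a quick way to see whether the hand-written shift still buys anything on a given toolchain.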
The even more interesting point about those calculations, however, is how often the code is able to do this. Obviously, bit shifting can only be done for multiplications and divisions involving a power of two, like 2, 4, 8, 16, etc. The fact that it is done that often indicates that the in-game formulas were specifically designed to stick to those numbers wherever possible, which in most modern development workflows is basically an impossibility. Imagine a programmer asking a game designer if they could change their formula to use an 8 instead of a 9.5 because it is a number that the CPU prefers to calculate with. There is a very good argument to be made that a game designer should never have to worry about the runtime performance characteristics of binary arithmetic in their life, that’s a fate reserved for programmers. Luckily, in the case of RCT the game designer and the programmer of the game are the same person, which also offers a good transition to the third big optimization:
RCT was never a pure one-man-project, even though it is often described as one. All the graphics of the game and its add-ons, for example, were created by Simon Foster, while the sound was the responsibility of Allister Brimble.
But it’s probably correct to call it a Chris Sawyer game: he was both the main programmer and the only game designer.
This overlap in roles enables some profound optimizations, by not only designing the game based on the expected game experience, but also informed by the performance characteristics of those design decisions.
One great example for this is the pathfinding used in the game. When writing a game design document for a park building game, it’s very easy to design a solution in which guests first decide on which attraction they want to visit (based on the ride preferences of the individual guest), and then walk over to their chosen attraction.

From a tech point of view, this design, however, is basically a worst case scenario. Pathfinding is an expensive task, and running it for potentially thousands of agents at the same time is a daunting prospect, even on modern machines.
That’s probably why the guest behavior in RCT works fundamentally differently. Instead of choosing a ride to visit and then finding a path to it, the guests in RCT walk around the park, basically blind, waiting to stumble over an interesting ride by accident. They follow the current path, not thinking about rides or needs at all. When reaching a junction, they will select a new walking direction almost randomly, only using a very small set of extra rules to avoid dead ends, etc.
This “shortcoming” is actually easy to spot in the game, when following a guest around the park for a while. They don’t walk anywhere on purpose, even when complaining about hunger and thirst, they wouldn’t think of looking for the nearest food stall, they just continue until they randomly walk by a food stall.
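A very rough C++ sketch of such a wandering rule could look like the following. The specific rule used here (never turn straight back unless it is the only way out) is my assumption for illustration; the game's real rule set is not spelled out above.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

enum class Direction : std::uint8_t { North, East, South, West };

constexpr Direction opposite(Direction d) {
    return static_cast<Direction>((static_cast<std::uint8_t>(d) + 2) % 4);
}

// Picks a random exit at a junction. The guest avoids turning straight back
// unless that is the only option (a dead end). No search over the park is done.
Direction choose_direction(const std::vector<Direction>& exits,
                           Direction heading, std::mt19937& rng) {
    std::vector<Direction> candidates;
    for (Direction d : exits) {
        if (d != opposite(heading)) candidates.push_back(d);
    }
    if (candidates.empty()) return opposite(heading);  // dead end: turn around
    std::uniform_int_distribution<std::size_t> pick(0, candidates.size() - 1);
    return candidates[pick(rng)];
}
```

The cost per guest is a handful of comparisons at each junction, independent of how large the park is.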
This doesn’t mean that RCT doesn’t do any pathfinding at all; there are cases where a traditional pathfinder is used. For example, if a mechanic needs to reach a broken ride or a guest wants to reach the park exit, those cases still require traditional, and therefore expensive, pathfinding.
But even for those cases, RCT has some safety nets installed to avoid frame spikes. Most importantly, the pathfinder has a built-in limit on how far it is allowed to traverse the path network for an individual path request. If no path has been found before hitting this limit, the pathfinder is allowed to cancel the search and return a failure as result. As a player, you can actually see the pathfinder failures in real time by reading the guest thoughts:

Yep, every time a park guest complains about not being able to find the exit, this is basically the pathfinder telling the game that there might be a path, but for the sake of performance, it won’t continue searching for it.
This part is especially fascinating to me, since it turns an optimization done out of technical necessity into a gameplay feature, something that can barely happen in “modern” game development, where the roles of coders and game designers are strictly separated. In the case of the pathfinding limit, even more game systems were connected to it. By default, the pathfinder is only allowed to traverse the path network up to a depth of 5 junctions, but this limit isn’t set in stone. Mechanics, for example, are seen as more important for the gameplay than normal guests, which is why they are allowed to run the pathfinder with a search limit of 8 junctions.
But even a normal park guest is allowed to run the pathfinder for longer, for example by buying a map of the park, which is sold at the information kiosk.
When searching for a path for a guest who bought a map, the pathfinder limit is increased from 5 to 7, making it easier for guests to find the park exit.
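The idea of a search that gives up after a fixed number of junctions can be sketched as follows. This is a simplified illustration, not RCT's pathfinder: it uses a plain breadth-first search over a hypothetical junction graph (`PathNetwork`), and the names `Walker`, `junctionLimit`, and `findPath` are made up for this example. Only the limits 5, 7, and 8 come from the text above.

```cpp
#include <queue>
#include <utility>
#include <vector>

enum class Walker { Guest, GuestWithMap, Mechanic };

// Search limits as described in the text: 5 by default,
// 7 for guests carrying a park map, 8 for mechanics.
constexpr int junctionLimit(Walker w) {
    return w == Walker::Mechanic ? 8 : (w == Walker::GuestWithMap ? 7 : 5);
}

// Hypothetical path network: graph[j] lists the junctions that can be
// reached directly from junction j.
using PathNetwork = std::vector<std::vector<int>>;

// Breadth-first search that refuses to expand junctions lying `limit`
// steps away from `start`. Returns true if `goal` was reached in time,
// false if the search ran out of budget (or no path exists).
bool findPath(const PathNetwork& graph, int start, int goal, int limit) {
    std::vector<bool> seen(graph.size(), false);
    std::queue<std::pair<int, int>> open;  // (junction, depth)
    open.push({start, 0});
    seen[start] = true;
    while (!open.empty()) {
        const int node = open.front().first;
        const int depth = open.front().second;
        open.pop();
        if (node == goal) return true;
        if (depth == limit) continue;  // budget exhausted on this branch
        for (int next : graph[node]) {
            if (!seen[next]) {
                seen[next] = true;
                open.push({next, depth + 1});
            }
        }
    }
    return false;  // the caller turns this into a "can't find the exit" thought
}
```

In this sketch, a goal six junctions away is out of reach for `junctionLimit(Walker::Guest)` but within reach for `junctionLimit(Walker::GuestWithMap)`, which mirrors the effect of buying a map.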
Changing the design of a game to improve its performance can seem like a radical step, but if done right, it can result in gains that no amount of careful micro-optimization could ever achieve.
Another example of this is how RCT handles overcrowded parks. Congested paths are a common sight in every theme park, and obviously, the game also has to account for them somehow. But the obvious solution, implementing some form of agent collision or avoidance system, would do to the framerate what Kryptonite does to Superman.

The solution, again, is just to bypass the technical challenge altogether. The guests in RCT don’t collide with each other, nor do they try to avoid each other. In practice, even thousands of them can occupy the same path tile:
However, this doesn’t mean that the player doesn’t need to account for overcrowded parks. Even though guests don’t interact with the guests around them, they do keep track of them. If too many other guests are close by, this will affect their happiness and trigger a complaint to the player. The outcome for the player is similar, as they still need to plan their layout to avoid overly crowded paths, but the calculations needed for this implementation are an order of magnitude faster to handle.
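A rough sketch of why counting is so much cheaper than colliding: instead of comparing every guest against every other guest, it is enough to count guests per tile once and then let each guest look up the count for its own tile. Everything here is assumed for illustration, including the `Guest` fields, the threshold, the penalty, and the simplification that "close by" means "on the same tile"; none of it is taken from the game's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Guest {
    int tile;                  // hypothetical flat index of the guest's path tile
    int happiness;             // hypothetical scale, clamped at 0
    bool complainsAboutCrowd;  // would surface as a guest thought
};

// Assumed tuning values, for illustration only.
constexpr int kCrowdThreshold = 10;
constexpr int kCrowdPenalty = 1;

// Two linear passes: count guests per tile, then let every guest react
// to the count on its own tile. No guest-to-guest comparison is needed.
void updateCrowding(std::vector<Guest>& guests, std::size_t tileCount) {
    std::vector<int> perTile(tileCount, 0);
    for (const Guest& g : guests) {
        ++perTile[g.tile];
    }
    for (Guest& g : guests) {
        const int others = perTile[g.tile] - 1;  // exclude the guest itself
        g.complainsAboutCrowd = others > kCrowdThreshold;
        if (g.complainsAboutCrowd) {
            g.happiness = std::max(0, g.happiness - kCrowdPenalty);
        }
    }
}
```

The work grows linearly with the number of guests, whereas a naive pairwise collision check grows quadratically.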
RCT might have been the “perfect storm” for this specific approach to optimization, but this doesn’t mean that it can’t be done anymore. It just means more dialogue between coders and game designers is needed, and often, the courage to say “No” to technical challenges, no matter how much you’d wish to solve them.
sar eax, 0x1f
and eax, 7
add eax, edx
sar eax, 3
You get 4 instructions instead of one because value >> 3 rounds towards negative infinity and value / 8 rounds towards zero. And while this wouldn't apply to C++, in languages with checked arithmetic, the left shift won't necessarily set the overflow flag, so the compiler often can't use it.
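The rounding difference is easy to reproduce in C++. The sketch below uses made-up function names; `divideBy8ViaShift` is my reading of what the four instructions above compute, assuming the input value sits in both eax and edx on entry. Note that right-shifting a negative signed value is only guaranteed to be an arithmetic shift since C++20; before that it is implementation-defined, although mainstream compilers behave this way.

```cpp
#include <cstdint>

// Signed division by 8 rounds toward zero: -9 / 8 == -1.
constexpr std::int32_t divideBy8(std::int32_t v) { return v / 8; }

// Arithmetic right shift by 3 rounds toward negative infinity: -9 >> 3 == -2.
constexpr std::int32_t shiftBy3(std::int32_t v) { return v >> 3; }

// Division expressed with shifts, step by step:
//   v >> 31         -> 0 for non-negative v, -1 (all bits set) for negative v
//   (...) & 7       -> 0 or 7, the bias needed to round toward zero
//   v + bias        -> corrected dividend
//   (...) >> 3      -> final shift
constexpr std::int32_t divideBy8ViaShift(std::int32_t v) {
    return (v + ((v >> 31) & 7)) >> 3;
}
```

For non-negative inputs all three functions agree, which is why the compiler can only drop the extra instructions when it knows the value cannot be negative (for example, when the type is unsigned).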
> RCT does this trick all the time, and even in its OpenRCT2 version, this syntax hasn’t been changed, since compilers won’t do this optimization for you.
MS has been loosening up on the 4-bit limit and has created a CPP variant of Minecraft which performs better, but they've also introduced their unified login garbage that has almost made me give up Minecraft completely.
The 4-bit stuff is a hangover from Notch doing this (I'd maybe even say a similar-calibre programmer to Chris Sawyer...). The sound has nothing to do with technical limits, that's a post-facto rationalisation.
The game never played midi samples, it was always playing "real" audio. The style was an artistic choice, many similar retro-looking games were using chiptune and the sorts. It's a deliberate juxtaposition...
The CPP variant doesn't really perform better anymore either.