Instead, we get a zooming in/out raccoon (making fun of the reader, IMO) for recognizing this problem via the OP author.
Maybe it's just a really hard problem to solve across all devices & latencies... Perhaps more time needs to be spent on "problem solving" vs "problem description".
I just look at the largest tech companies in the world, which even with their unlimited finances cannot produce software that isn't glitchy like this.
At the same time, why does everything need motion? My understanding is that motion should be used if an action subtly changes the UI in a region that's different from where the action was triggered (e.g. toasts).
I think many of these transitions are unnecessary and would feel just as good if they snapped immediately with instantaneous reflow.
Often without it your brain has to rescan the entire page on each refresh.
The notification area doesn't need animations either, because a GUI is only appropriate for displaying non-urgent notifications. If something really needs urgent attention, you need alarms and flashing lights, not an animated "toast".
https://tonsky.me/blog/every-frame-perfect/toolbar@2x.mp4, for example
I don't think I would have to rescan the entire page to figure out where things were afterwards. Everything's shifted to the right, just like when I open my browser bookmarks.
"Back-in-the-days" you'd click and stuff would instantly happen, and I don't remember anything being more difficult to visually interpret.
On my Kubuntu desktop, if I disable all animations (the whole compositor), I don't feel there is an increased cognitive load from rescanning things - but maybe it's my preexisting memory of the UIs and certain baked-in UI expectations. Maybe this animated stuff helps people who are computer illiterate? (software made for the lowest common denominator)
A while ago I was reading about Wayland and this quote stuck with me:
A stated goal of Wayland is “every frame is perfect”.
And I think this is a goal we should all aspire to. Wayland is talking about the technical side of things (modern GPU stacks are very complex and Wayland is trying to take control back) but it could be applied to UI too.
The rule of thumb is:
If I take a screenshot of your app at any moment, it must make sense
Why care about every frame? It builds trust. Users can’t see the code, so UI is the only way for them to judge the quality of the app. If UI looks good, that means developers had time to polish it, which means that they probably spent a comparable amount of time to iron out the code. It’s a heuristic, but a reasonable one.
Now, what does it mean in practice? I can think of a few things:
Animations often end up being forgotten. A UI might look great in both start and end states but very janky in between. Like this:
If you feel like there are weird things going on there, there are! Look at the slowed-down version:
Now let’s apply our rule and take screenshots in the middle of the animation. This doesn’t look right:

Neither does this:

Both of these frames are not perfect.
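The screenshot rule can also be checked mechanically: sample the animation frame by frame and test an invariant on each sample. Here is a minimal sketch of that idea, not how any of the apps above are built. The `Track` shape, the function names and the "no overlap, stay inside the container" invariant are all my assumptions for illustration, and items are assumed to be listed left to right.

```typescript
// All names here are hypothetical, for illustration only.
interface Rect { x: number; width: number; }

// One animated item: where it starts, where it ends, and its own timing.
interface Track { from: Rect; to: Rect; delayMs: number; durationMs: number; }

const lerp = (a: number, b: number, t: number): number => a + (b - a) * t;

// Progress of a single track at a given time, clamped to [0, 1].
// A zero duration means the item snaps to its end state once its delay has passed.
function progress(track: Track, timeMs: number): number {
  if (track.durationMs <= 0) return timeMs >= track.delayMs ? 1 : 0;
  const raw = (timeMs - track.delayMs) / track.durationMs;
  return Math.min(1, Math.max(0, raw));
}

// The "screenshot" at a given time: every item at its own progress.
function frameAt(tracks: Track[], timeMs: number): Rect[] {
  return tracks.map((tr) => {
    const t = progress(tr, timeMs);
    return { x: lerp(tr.from.x, tr.to.x, t), width: lerp(tr.from.width, tr.to.width, t) };
  });
}

// Assumed invariant: items stay inside the container and never overlap
// their right-hand neighbour.
function frameMakesSense(rects: Rect[], containerWidth: number): boolean {
  const eps = 1e-9;
  return rects.every((r, i) => {
    const next: Rect | undefined = rects[i + 1];
    const inside = r.x >= -eps && r.x + r.width <= containerWidth + eps;
    const clear = next === undefined || r.x + r.width <= next.x + eps;
    return inside && clear;
  });
}

// Returns the sample times (in ms) at which the frame violates the invariant.
function brokenFrames(
  tracks: Track[],
  containerWidth: number,
  totalMs: number,
  stepMs: number = 1000 / 60,
): number[] {
  const bad: number[] = [];
  for (let i = 0; i * stepMs <= totalMs; i++) {
    const timeMs = i * stepMs;
    if (!frameMakesSense(frameAt(tracks, timeMs), containerWidth)) bad.push(timeMs);
  }
  return bad;
}

// Item A grows while item B slides right to make room.
const a = { from: { x: 0, width: 100 }, to: { x: 0, width: 200 } };
const b = { from: { x: 100, width: 100 }, to: { x: 200, width: 100 } };

// Same timing for both: B moves exactly as fast as A grows.
const synced: Track[] = [
  { ...a, delayMs: 0, durationMs: 300 },
  { ...b, delayMs: 0, durationMs: 300 },
];

// A snaps to its final width immediately while B is still sliding.
const desynced: Track[] = [
  { ...a, delayMs: 0, durationMs: 0 },
  { ...b, delayMs: 0, durationMs: 300 },
];
```

Both `synced` and `desynced` have identical start and end states, so looking only at those would not tell them apart. Calling `brokenFrames(desynced, 300, 300)` reports the in-between frames where A already covers part of B, which is the kind of thing the screenshots above show.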
Let’s look at another example. Safari:
The placeholder text here moves from the center, but the cursor animates from the left position:
Not the end of the world by any means, but it does create a feeling that these two components are not in sync with each other. Next thought: maybe they weren’t designed together? If so, then they might not work well together. That’s how trust is lost.
This desynchronization can lead to a lot of confusion. For example, in Photos, when switching between Crop and Adjust mode, the picture snaps into place immediately but the crop border is animated:
This creates a false feeling that something subtly changes when you switch between modes. And you know what? I don’t want my UI to give me false feelings. I want it to be a precise instrument, not an animated toy.
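One way to make this kind of desync hard to introduce is to derive every property that must move together from a single progress value, rather than giving each component its own animation. A minimal sketch follows. The `CropView` fields, the easing curve and the function name are my assumptions and have nothing to do with how Photos is actually implemented.

```typescript
// Hypothetical state: horizontal position of the picture and of the crop border.
interface CropView { imageX: number; borderX: number; }

// One easing curve shared by everything in the transition.
const easeOutCubic = (t: number): number => 1 - Math.pow(1 - t, 3);

// Both fields are computed from the same eased progress, so the border
// cannot run ahead of or lag behind the picture.
function cropViewAt(
  from: CropView,
  to: CropView,
  elapsedMs: number,
  durationMs: number,
): CropView {
  const linear = durationMs <= 0 ? 1 : Math.min(1, Math.max(0, elapsedMs / durationMs));
  const t = easeOutCubic(linear);
  return {
    imageX: from.imageX + (to.imageX - from.imageX) * t,
    borderX: from.borderX + (to.borderX - from.borderX) * t,
  };
}
```

If the picture and the border line up at the start and at the end, they line up in every frame in between, because there is only one clock and one curve.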
Sometimes animations are supposed to help you understand a transition, so it’s doubly sad when they make it harder. Follow the magnifying glass:
Same with YouTube. They had the simplest task in the world: move a rectangle from one position to another! Yet they decided to do something very strange:
Can you explain this? Does it make sense?

Probably a technical limitation of the DOM architecture they decided on earlier. I call these situations “The technology has outsmarted the programmer”. But no matter the reason, the result is an imperfect frame.
Sometimes animations are left out as an afterthought. Whatever happens, happens. Then we get this:
The details are fascinating to watch:
So yeah. Please pay attention not only to the start and end states, but also to everything in between. Every frame matters.
I’ll leave you with this unprovoked zoom animation from the Preview app. Take care!