Separating the Wayland Compositor and Window Manager

The fact that Wayland can't just substitute out pluggable WMs without changing a bunch of other unrelated infrastructure is IMO one of the biggest user-facing losses relative to X11. Anybody who is working to improve that is doing god's work as they say.

I've never used a system with Wayland (been on i3 for ~15 years) but every time a project like this comes up, I have to wonder why Wayland is even a thing. So many hoops to jump through for things that should be simple.

Sure, X11 has warts but I can make it do basically anything I want. Wayland seems like it will always have too much friction to ever consider switching.

This is a really interesting direction.

Separating the compositor and window manager feels like one of those ideas that seems obvious in hindsight, but the protocol/state-machine design here shows why it took real work to make it practical.

Lowering the barrier for writing Wayland window managers without forcing everyone to build a full compositor seems like a big win.

I'm currently using a fully vibe-coded, personal River window manager that works just how I want it to. I switched to it after I realized I couldn't do everything I wanted in Hyprland (e.g. tile windows to equal areas instead of BSP by default).

Simple example of how impactful this separation has been for me.

If Wayland doesn't get this solved then I'll just use X11 forever, with coding agents to keep it running if I have to.

super interested to hear more on this.

i'm a little thrown, because the Wayland diagram doesn't feel quite right. the compositor does lie between the kernel and the apps, but IIRC the apps have their own graphics buffers from the kernel that they are drawing into directly. the compositor then composites them together. to me, that feels more like the kernel is at the center of the diagram here: the wayland compositor is between the kernel and the output / input.

i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).

it means that the task of writing the display-server / compositor is much much much simpler. it's still hard! but the kernel is helping so much. there's an assumed base of having working GPU drivers!

author appears to super know their stuff. alas the FOSDEM video they link to is not loading for me. :(

one major question, since this is a protocol, how viable is it to decompose the window management tasks? rather than have a monolithic window manager, does this facilitate multiple different programs working together to run a desktop? not entirely sure the use case, but a more pluggable desktop would be interesting!

Not only a loss but a key disabler. Having used the same customized window manager for decades, I find it impossible to change to Wayland until there's a fully equivalent interface for managing windows so that everything works as I want, from mouse clicks to keyboard shortcuts. Maybe it could be an existing window manager adding support for River, or a Wayback layer that reimplements an X11 desktop root on top of a minimal Wayland compositor, but none of the current Wayland compositors even scratch the surface of this.

You only need a single implementation that exposes an API for running a WM as an extension.

I don't really get why it would be a good idea to somehow mandate a specific architecture design from the standard.

It's a damper on development of new WMs and DEs, too. I have ideas for my own desktop I'd like to explore at some point, and if I do it'll almost certainly be X11 based initially because it's so much more quick and easy to wrap one's head around and get the iteration loop up and running with.

I'm not anti-Wayland and I think X11 has enough issues that it's worth transitioning over to something better but this is a critical weakness in Wayland's design.

You can do that already with libraries such as wlroots or Smithay

Handwaving "just expose an API" ignores the mess at the extension boundary. Modular only works if the contract is airtight, and with Wayland's churn and "sorta spec" documentation, that sounds optimistic at best.

Every "flexible" API turns into a leaky mess unless someone is paid to write the dullest test suite in existence, and nobody is. Mandating one design is ugly, but pretending composition is free is a fairy tale.

We need a compositor that exposes everything as an extension. Preferably in a hot-reloadable, tweakable way, say, using Lua (with JIT). And also exposing its APIs in a way that allows having an analog of xdotool.

I've been on wayland since KDE had it available (like the KDE 5 days) because it offered fractional HiDPI scaling that wasn't buns. As a laptop user, it has been one of the best features of Wayland.

Furthermore, getting stuff like VRR on Wayland working is way easier than X.org. And, Wayland also supports HDR.

The hoop I recently jumped through:

There's a type of input called "DeviceEvent" which is a bit lower level than "Window event". It also occurs even if the window isn't "active".

Windows and X11 support this, but Wayland doesn't except for mouse movement. I noticed my program stopped working on Linux after I updated it. Ended up switching to Window Events, but still kind of irritating.

> I can make it do basically anything I want

X11 can't do high refresh rates every time that I've tried to do so.

Sway is basically i3 on Wayland. You pretty much keep your config file (with a few modifications), there really isn’t much friction.

That’s not a reason to do it of course, for me the driver was support for multiple monitors with different scaling requirements.

Are you human? If yes sorry for the offensive question. Your account is new.

>i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).

Are you an AI bot? Modern X11 servers using DRM are more than 20 years old. You are talking about how X11 servers worked in the 90s.

You could use xlibre, although some people say it's a joke

How is a WM not just a simple plugin/extension? Find a display server you like and write an extension for it!

That's not the same thing. It's way easier to write an X11 window manager than to write a Wayland compositor, even with something like wlroots, because the window manager can speak the same protocol that clients speak, and it runs as a separate process.

As a concrete example, Emacs' EXWM package works by implementing an X11 client library in Emacs Lisp, then using it to talk to the X server (which is a separate process, so this works fine) and telling it how to position windows.

Whereas on Wayland, this is not possible without re-implementing a standalone compositor process, because otherwise architecturally it doesn't work. Emacs can't both do the drawing and be drawn.

No, that still requires you to make the whole thing, you just get help. For instance, I've run into a problem where I try some great new compositor that uses wlroots, and even though wlroots has good support for keyboard layouts I can't actually set the layout because the compositor hasn't wired up that functionality.

The article already addresses that...

It's not easy and the major compositors (Gnome, KDE) are NOT wlroots based, making this point mostly moot anyway.

This protocol at least has a chance of using a custom WM with an advanced compositor (which wlroots is not).

Especially with LLMs, the cost here is down significantly. People also drastically over-idealize what making an X window manager entailed: sure X had its compositor, but you had to build so so much yourself.

I'm glad River is trying to create a bigger base here; this is way cool. And it sort of proves the value of Wayland: someone can just go do that. Someone can just make a generic compositor/display-server now, with their own new architecture and plugin system, and it'll just work with existing apps.

We were so locked in to such a narrow limited system, with its own parallel abstraction layer to what the kernel now offers (that didn't exist when X was created). It's amazing that we have a chance for innovation and improvement now. The kernel as a stable base of the pyramid, wlroots/sway as a next layer up, and now River as a higher layer still for folks to experiment and create with. This could not be going better, and there's so much more freedom and possibility; this is such a great engine for iteration and improvement.

"Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that.""

https://news.ycombinator.com/newsguidelines.html

Yes, and I am praising them for tackling the idea. I don't know how you managed to misread me like that. I also read the article before commenting.

The Wayland standard does not prescribe it (unlike X), and the reference implementations were monolithic for a very long time.

Wayland in general had a rather cavalier approach to doing away with things that X users take for granted, like, well, making screenshots. Eventually, under pressure, those in charge agreed that these features are actually very important for real users, so implementations appeared. It's an understandable way to discover the minimal usable subset of features, but the process of it is a bit frustrating for the early adopters.

We just read titles here

Binary space partitioning

It runs just fine at 165 hz for me. Given that xrandr and CRTs have been around for a while, and both have supported high refresh rates for a long while, something seems fishy here. Something is probably at fault, but it's not X11.

Huh ? It did in 2000.

That's what the anti-Wayland people want: for things to work exactly as they did in the 90s. It's not an accident.

The Xorg codebase still includes some of those old drivers and is structured to allow them to exist.

That would suffice if I were only looking to build a WM, but my goal is a full (lean) DE.

EWM implements a Wayland compositor as a native thread spawned by a dynamic module in Emacs, it's a full compositor within the Emacs process: https://codeberg.org/ezemtsov/ewm

So it is architecturally possible (but infeasible in plain Emacs Lisp).

For river (the thing this article is about) I wrote an Emacs WM, but also opted for a dynamic module for the Wayland protocol parts: https://code.tvl.fyi/tree/tools/emacs-pkgs/reka

This one could technically be written in plain Emacs Lisp, but I'm happy to use something that already has all the XML codegen stuff for Wayland figured out. Dynamic modules work pretty well, fwiw.

First time I've seen you gray. What days to live in!

Sorry, I didn't address that at you but rather the other replies in this thread.

> so implementations appeared

Indeed - implementations, plural. Incompatible with each other, naturally.

X11 can't do different hz on different screens. If you have a dual screen setup where one screen is 165 hz and the other is 60 you're SOL.

Just to be clear the hardware abstraction layer used by wayland and any current Xserver is exactly the same.

Yes exactly. DRM exists, but there's still what I called the X "kernel", all of its heavyweight abstractions.

To the previous a-hole, frak you: not an AI. That's rude as frak. Also, you manage to be incredibly wrong. Even an AI wouldn't overlook such an obvious error; maybe it'd be better to have it replace you. So rude dude! Behave!

I am sorry if I mistook you for a bot, but the model you are describing has not been implemented by any graphics driver in decades.

Traditional Wayland compositors have a monolithic architecture that combines the compositor and window manager into a single program. This has the downside of requiring Wayland window managers to do the significant work of implementing an entire Wayland compositor as well.

The new 0.4.0 release of river, a non-monolithic Wayland compositor, breaks from this traditional architecture and splits the window manager into a separate program. There are already many window managers compatible with river.

The stable river-window-management-v1 protocol gives window managers full control over window position, keybindings, and all other window management policy while river itself provides frame-perfect rendering, good performance, and all the low-level plumbing required.

This blog post gives a high level overview of the design decisions behind this protocol. This is roughly the same information I presented in my FOSDEM 2026 Talk.

Display Server, Compositor, Window Manager

The traditional Wayland architecture combines three separate roles in the compositor process:

Display Server: Route input events from the kernel to windows and give the kernel buffers to display.
Compositor: Combine all buffers from visible windows into a single buffer to be displayed by the kernel.
Window Manager: Arrange windows, define keybindings, other user-facing behavior.

To understand why Wayland compositors thus far have chosen to combine these roles, it is first necessary to understand some of the fundamental problems with X11’s architecture that Wayland was designed to solve.

With the X11 protocol, the display server is a separate process from the compositor and window manager:

Following the steps taken from a user clicking on a button in a window to the change in the window’s displayed content is quite informative. Referring to the numbers in the diagram above:

The user clicks on a button in a window.
The kernel sends an input event to the display server.
The display server decides which window to route the input event to. Already there is a problem here: since the display server is not aware of the compositor’s scene graph it cannot be 100% sure which window is rendered under the user’s mouse at the time of the click. The display server makes its best guess and sends the event to a window.
The window submits a new buffer to the display server.
The display server passes the window’s new buffer on to the compositor.
The compositor combines the window’s new buffer with the rest of the user’s desktop and sends the new buffer to the display server.
The display server passes the compositor’s new buffer to the kernel.
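The input routing problem in the steps above can be made concrete with a small sketch. Everything here is hypothetical (the `Rect` type, the `hit_test` helper, the window names, the 100 px offset) and none of it is X11 or Wayland API. It only illustrates that hit-testing against the geometry the display server knows can pick a different window than hit-testing against what the compositor is actually drawing, for example while a window is mid-animation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(windows_top_first, px, py) -> Optional[str]:
    """Return the name of the topmost window under the point, if any."""
    for window in windows_top_first:
        if window.contains(px, py):
            return window.name
    return None


# Geometry as a separate display server knows it (assumed values).
server_view = [Rect("editor", 0, 0, 800, 600), Rect("browser", 800, 0, 800, 600)]
# Where the compositor is actually drawing the windows right now
# (hypothetically shifted 100 px to the right by an animation).
scene_graph = [Rect("editor", 100, 0, 800, 600), Rect("browser", 900, 0, 800, 600)]

click = (850, 300)
guessed = hit_test(server_view, *click)
rendered = hit_test(scene_graph, *click)
print(f"display server guesses {guessed!r}, user clicked on {rendered!r}")
```

A process that owns the scene graph can run the second lookup directly, so it never has to guess.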

With this architecture, the display server is acting as an unnecessary middle-man between the kernel, X11 windows, and compositor. This results in unnecessary roundtrips between the display server and compositor, adding latency to every frame.
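As a rough tally of the roundtrip cost, the messages in the steps above can be written down as (sender, receiver) pairs and counted. This is a simplified model, assuming one message per step; the process labels and the `userspace_hops` helper are illustrative only and do not correspond to any real API.

```python
# One (sender, receiver) pair per message on the click-to-pixels path.
X11_PATH = [
    ("kernel", "display server"),      # input event
    ("display server", "window"),      # routed input event
    ("window", "display server"),      # new window buffer
    ("display server", "compositor"),  # buffer forwarded
    ("compositor", "display server"),  # composited desktop buffer
    ("display server", "kernel"),      # final buffer handed to the kernel
]

# Same path when display server and compositor are one process.
MONOLITHIC_WAYLAND_PATH = [
    ("kernel", "compositor"),
    ("compositor", "window"),
    ("window", "compositor"),
    ("compositor", "kernel"),
]


def userspace_hops(path):
    """Count messages that cross between two userspace processes."""
    return sum(1 for src, dst in path if "kernel" not in (src, dst))


x11_hops = userspace_hops(X11_PATH)
wayland_hops = userspace_hops(MONOLITHIC_WAYLAND_PATH)
print(x11_hops, wayland_hops)
```

In this model the two extra userspace hops on the X11 path are exactly the display server relaying buffers to and from the compositor.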

The traditional Wayland architecture eliminates these unnecessary roundtrips and solves the input routing problem described in the steps above by combining the display server and compositor into a single process:

Traditionally, Wayland compositors have taken on the role of the window manager as well, but this is not in fact a necessary step to solve the architectural problems with X11. Although I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance. It is not trivial to design a window management protocol that keeps all the advantages of Wayland, but it’s certainly possible.

Window Management Protocol Design Constraints

The river-window-management-v1 protocol is designed to give window managers maximum control without losing any of the key advantages of Wayland. Specifically, the window management protocol must not require a roundtrip every frame or every input event. There should be no input latency penalty compared to the traditional monolithic Wayland architecture when, for example, typing into a terminal emulator or playing a game.

Furthermore, the Wayland ideal of “frame perfection” must be upheld. To illustrate what frame perfection means in the context of window management, consider the following example: several windows are in a tiled layout taking up the entire display area and a new window is opened. The window manager decides to add the new window to the tiled layout, resizing and rearranging the existing windows to make space.

In this case, frame perfection means that the user must not see a frame where there is a gap in the tiled layout or where windows are overlapping and have dimensions that do not fit cleanly with their neighbors. Achieving frame perfection here requires waiting to render the new state until all windows have submitted buffers with the newly requested dimensions.

Note, however, that frame perfection is only achievable if the windows are drawn by well-implemented programs. The compositor cannot delay rendering the new state forever while waiting for windows to submit new buffers; delaying too long makes things feel less responsive to the user rather than smoother. To solve this, the compositor uses a short timeout. If windows are too slow, frame perfection is not possible.
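The wait-then-commit behavior described above could look roughly like the following sketch. It is not river's implementation: `LayoutTransaction`, `tile_columns`, the window names, and the 0.1 second timeout are all assumptions made for illustration, and time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass, field


def tile_columns(windows, width, height):
    """Split the display into equal-width columns, one per window."""
    column = width // len(windows)
    return {name: (column, height) for name in windows}


@dataclass
class LayoutTransaction:
    """Holds a new layout back until every window has redrawn or a timeout expires."""
    requested: dict          # window name -> (width, height) requested
    started_at: float
    timeout: float = 0.1     # hypothetical value, in seconds
    acked: set = field(default_factory=set)

    def submit_buffer(self, window, size):
        # Only a buffer with the requested dimensions counts as ready.
        if self.requested.get(window) == size:
            self.acked.add(window)

    def poll(self, now):
        if self.acked == set(self.requested):
            return "commit (frame perfect)"
        if now - self.started_at >= self.timeout:
            return "commit (timed out)"
        return "wait"


# A third window opens in a two-window tiled layout on a 1920x1080 display.
layout = tile_columns(["term", "editor", "browser"], 1920, 1080)
txn = LayoutTransaction(requested=layout, started_at=0.0)
txn.submit_buffer("term", (640, 1080))
txn.submit_buffer("editor", (640, 1080))
state_mid = txn.poll(now=0.01)    # "browser" has not redrawn yet
txn.submit_buffer("browser", (640, 1080))
state_done = txn.poll(now=0.02)

# A transaction where no window ever responds falls back to the timeout.
slow = LayoutTransaction(requested=layout, started_at=0.0)
state_slow = slow.poll(now=0.5)
print(state_mid, state_done, state_slow)
```

The old layout stays on screen while `poll` returns "wait", which is what prevents a frame with gaps or overlapping windows from being shown.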

Window Management State Machine

The river-window-management-v1 protocol divides the state managed by the window manager into two disjoint categories: window management state and rendering state.

Window management state influences the communication between the compositor and individual windows. It includes window dimensions, fullscreen state, keyboard focus, keyboard bindings, etc.

Rendering state on the other hand only affects the rendered output of the compositor and does not influence communication between the compositor and individual windows. It includes the position and rendering order of windows, server-side decorations, window cropping, etc.

To achieve frame perfection, the modifications made to this state by the window manager are batched into atomic updates by the river-window-management-v1 protocol. Changes to window management state can only occur during a “manage sequence” and changes to rendering state can only occur during a “render sequence.”
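One way to picture the batching is a small gatekeeper that buffers changes and only accepts each kind of state during its own sequence. The names below (`Sequence`, `ALLOWED_IN`, `StateBatcher`, and the state keys) are invented for this sketch and are not the requests or events defined by river-window-management-v1.

```python
from enum import Enum, auto


class Sequence(Enum):
    IDLE = auto()
    MANAGE = auto()
    RENDER = auto()


# Which sequence each kind of state may change in, following the split above.
ALLOWED_IN = {
    "dimensions": Sequence.MANAGE,
    "fullscreen": Sequence.MANAGE,
    "keyboard_focus": Sequence.MANAGE,
    "position": Sequence.RENDER,
    "stacking_order": Sequence.RENDER,
    "decorations": Sequence.RENDER,
}


class StateBatcher:
    def __init__(self):
        self.sequence = Sequence.IDLE
        self.pending = {}
        self.applied = {}

    def begin(self, sequence):
        self.sequence = sequence
        self.pending = {}

    def set(self, key, value):
        if ALLOWED_IN[key] is not self.sequence:
            raise RuntimeError(f"{key} cannot change during {self.sequence.name}")
        self.pending[key] = value  # buffered, not yet visible

    def finish(self):
        # All buffered changes take effect together as one atomic update.
        self.applied.update(self.pending)
        self.pending = {}
        self.sequence = Sequence.IDLE


batcher = StateBatcher()
batcher.begin(Sequence.MANAGE)
batcher.set("dimensions", (640, 1080))
batcher.set("keyboard_focus", "browser")
visible_before_finish = dict(batcher.applied)
batcher.finish()
print(visible_before_finish, batcher.applied)
```

Calling `batcher.set("position", ...)` during the manage sequence raises an error in this sketch, mirroring the rule that rendering state belongs to the render sequence.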

As seen in the state machine above, the compositor initiates manage/render sequences and no roundtrip with the window manager is required when no window-management-related state has changed. In other words, the window manager stays idle while the user is, for example, typing into a terminal and is woken up again when, for example, the user triggers a window manager keybinding or a new window is opened.
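The "idle while typing" property can be sketched as follows. The `Compositor` class and the `Super+Return` binding are hypothetical; the real protocol has more moving parts, but the shape of the decision is the same: only input that touches window management state wakes the window manager.

```python
class Compositor:
    def __init__(self, wm_bindings):
        self.wm_bindings = set(wm_bindings)
        self.wm_wakeups = 0
        self.forwarded_to_window = []

    def handle_key(self, key):
        if key in self.wm_bindings:
            # Window management state is affected: the compositor
            # starts a manage sequence and the window manager wakes up.
            self.wm_wakeups += 1
        else:
            # Ordinary input goes straight to the focused window
            # with no roundtrip to the window manager.
            self.forwarded_to_window.append(key)


compositor = Compositor(wm_bindings={"Super+Return"})
for key in list("ls -la") + ["Super+Return"]:
    compositor.handle_key(key)

print(compositor.wm_wakeups, len(compositor.forwarded_to_window))
```

Six keystrokes reach the terminal directly and the window manager is woken exactly once, for the keybinding.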

At the same time, frame perfection is possible even with complex tiled layouts or server-side decorations rendered by the window manager. The compositor handles all the synchronization work with the windows and the window manager only needs to, for example, adjust the position of windows or size of its server side decorations to adapt to the new window dimensions.

This state machine is not really a new idea; something similar can be found hiding inside most existing Wayland compositors, including river-classic and sway. In a way, this state machine is a clarification and formalization of the internal architecture used by older river versions. It is the result of 6+ years of experience working on river and slowly refining the architecture over time.

Motivation

Separating the Wayland compositor and window manager greatly lowers the barrier to entry for writing a Wayland window manager. Window manager authors can focus on window management policy without needing to implement an entire Wayland compositor as well. While libraries such as wlroots make writing a compositor somewhat easier, it remains a great deal of work. Writing a Wayland compositor is not a weekend project, but with the new river-window-management-v1 protocol writing a basic but usable Wayland window manager over the weekend is now very possible.

The window manager developer experience is also greatly improved. A window manager crash does not cause the Wayland session to be lost. Window managers can be restarted and switched between. Debugging a window manager is much less of an ordeal than debugging a Wayland compositor. Anyone who has written a Wayland compositor knows the pain of debugging issues that only reproduce when running on “bare metal” (i.e. using DRM/KMS directly); one might be forced to ssh in from a different computer to figure out what has gone wrong.

Furthermore, window managers can be implemented in slow, high-level, garbage-collected languages without sacrificing compositor performance/latency. Having a garbage collector in the compositor is a great way to miss frame deadlines and cause input latency spikes. However, since the river-window-management-v1 protocol does not require a roundtrip with the window manager every frame, having a garbage collector in the window manager does not really matter. I don’t have any performance issues daily driving my slow, garbage-collected window manager on a >10 year old ThinkPad X220.
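A toy calculation shows why the roundtrip frequency matters more than the garbage collector itself. All numbers here are invented (a 60 Hz display, a 1 ms normal reply, a 40 ms pause every 50th frame, two keybinding presses), and `frames_waiting_on_wm` is a hypothetical helper, not a measurement of river.

```python
FRAME_BUDGET_MS = 1000 / 60   # one frame at 60 Hz
WM_NORMAL_MS = 1.0            # assumed normal window manager reply time
WM_GC_PAUSE_MS = 40.0         # assumed garbage collection pause


def frames_waiting_on_wm(total_frames, gc_frames, wm_event_frames,
                         roundtrip_every_frame):
    """Count frames that need a window manager reply which arrives
    later than the frame budget."""
    late = 0
    for frame in range(total_frames):
        wm_involved = roundtrip_every_frame or frame in wm_event_frames
        if not wm_involved:
            continue
        latency = WM_GC_PAUSE_MS if frame in gc_frames else WM_NORMAL_MS
        if latency > FRAME_BUDGET_MS:
            late += 1
    return late


gc_frames = set(range(0, 600, 50))   # a pause on every 50th frame
wm_event_frames = {100, 333}         # two keybinding presses in 10 seconds

per_frame = frames_waiting_on_wm(600, gc_frames, wm_event_frames, True)
event_driven = frames_waiting_on_wm(600, gc_frames, wm_event_frames, False)
print(per_frame, event_driven)
```

With a roundtrip every frame, every pause lands on a frame; with roundtrips only on window management events, a pause only matters when it happens to coincide with one.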

Wayland currently does not come close to the diversity of X11 window managers. I believe that separating the Wayland compositor and window manager will change this and I see the beginnings of this change with the 15 window managers already written for river!

Limitations

The river-window-management-v1 protocol does not support use-cases that deviate from the traditional, 2D desktop paradigm. This means no VR support for example.

Crazy visual effects like wobbly windows are also out of scope for river currently, though simple animations already work well. I am open to exploring custom shaders to give window managers more control over rendering eventually, but don’t expect that to happen for a year or two at the earliest; there are other priorities for now.

I am not aware of any limitations river places on window management policy that cannot be resolved by extending the protocol. If you are developing a window manager and have a use-case that river does not yet support please open an issue and we will figure out how to get it supported!

Roadmap

With the 0.4.0 release, river is already more than featureful enough to daily drive in combination with a window manager of your choice. Furthermore, the river-window-management-v1 protocol is stable: we do not break window managers.

The best way to get a sense of what features are planned to be added in the future is to look at the accepted label on our issue tracker.

As far as what needs to happen before river 1.0.0, I want to explore some ideas for how to improve the UX of starting and switching between river compatible window managers. All window managers written for river 0.4.0 will remain compatible with river 1.0.0 and beyond, but I may need to make minor breaking changes to river’s CLI depending on how those plans work out. In any case, expect the next major river release to be 1.0.0!

Donate

Unfortunately, the current pace of river’s development is not sustainable without more financial support. If my work on river adds value to your life, please consider setting up a recurring donation through Liberapay. You can also support me with a one-time or monthly donation on GitHub Sponsors or Ko-fi, though I prefer Liberapay as it is run by a non-profit. Thank you for your support!

Gallery

To make all this talk about window managers a bit more tangible, please enjoy these screenshots of window managers running under river. Note that this selection is heavily biased towards the most visually interesting window managers; there are plenty of other excellent window managers to choose from!

Canoe - Stacking window manager with classic look and feel:

reka - An Emacs-based window manager for river (similar to EXWM):

tarazed - A powerful and distraction-free desktop experience:

rhine - Recursive and modular window management with animations: