I don't get the frustration with wayland (the protocol) in the comments. This project shows that having a separate window manager was always possible. First we got wlroots as a library that did most of the heavy lifting, and now we got river as an even higher level abstraction.
Sure I agree that wayland (the project) could have provided these abstractions much earlier. But anyone else could have done it, too. We get all of this for free, so we shouldn't complain if other people don't do the work that we could do just as well.
> I don't get the frustration with wayland (the protocol) in the comments.
They took a firm principled stance against screenshots to start with, which set them up for the COVID WFH wave. Then we've got this questionable design that seems hard to make accessible since accessibility is a security risk and we're heading right into Agentic AI which will be interesting. I've been avoiding the Wayland ecosystem for as long as I can after the initial burn and it'll be curious to see how well it supports bringing in new AI tooling. Maybe quite well, I gather that Pipewire is taking over the parts of the ecosystem that Wayland left for someone else and maybe the community has grown to the point where it has overcome the poor design of Wayland's security model by routing around it.
My guess is the frustration is coming from a similar perspective because it is a bit scary seeing Wayland getting picked up everywhere as a default and the evidence to date is they don't really consider a user-friendly system as a core design outcome. Realistically Wayland is 2 steps forward even if there is a step back here or there. The OSS world has never been defined by a clean and well designed graphics stack.
I think wayland is OK as a user. But Wayland is just not really that UNIX.
As an ordinary user, I actually don't care about any of this. However, from another perspective, I think this is a bad thing: open source projects have become product-centered, defaulting to the assumption that users are ignorant fools. This isn't how community projects should behave, but those projects are not that community-driven anyway.
After all, for a long time, so-called security has only been a misused justification: never letting users make mistakes is just a pretty excuse, meant to keep users from being able to easily access something, and eventually from ever accessing it at all.
I'd guess it's because of the general attitude of the project's community, specifically GNOME people and their “my way or the highway” style of answering questions, e.g. about CSD or other non-critical stuff not directly related to the core protocol. If they were a bit more accommodating to reasonable requests from outside, they'd get less backlash in comments. There's plenty of exemplary behaviour elsewhere in adjacent communities; they could have taken the hint multiple times.
That they provide this stuff for free would be a good argument if the stuff wasn't pushed down people's throats with no working alternative and Xorg being discontinued.
And how would they be able to "push stuff down people's throats" if people could walk away towards alternatives? When such alternatives don't exist, that's exactly what "they do stuff for free and nobody else is putting in the work to make something else" looks like.
The problem isn't them "pushing stuff down your throats", it's nobody else (including you) making alternatives that you like better. You are voluntarily ingesting their stuff because your only alternative is starving.
> And how would they be able to "push stuff down people's throats" if people could walk away towards alternatives?
It's a forcing of their narrow opinion on what should be allowed onto the ecosystem at large, because all of these things are connected. You can leave to a different DE/distro, but if every DE is doing its own thing for global hotkeys or whatever, then software in the ecosystem is going to be hacky/bespoke or have an unreasonable maintenance burden.
Even if you in particular can move elsewhere, the ecosystem is still held back. We only recently got consensus on apps being able to request a window position on screen, which is something X11, macOS, and Windows all allow you to do. SSD and tray icons are other examples of things found everywhere else that they did not want to support. Some applications are just broken without tray icon support.
This bleeds over into work for folks releasing software for Linux in general. By not supporting SSD they were pushing the burden of drawing window decorations onto every single app author, and while most frameworks will handle this, it's not like everyone is using qt or gtk. App authors will get bug reports and the burden of releasing software on Linux needlessly climbs again.
It's hard to convey how unreasonable I feel their stance was on tray icons / SSD. It should be the domain of the DE from a conceptual but also a practical point of view, even just from the amount of work involved. It reminds me of LSP enabling text editors to have great support for every language. And again, Gnome was the odd man out in this. They want extra attention and work when Linux has the lowest desktop market share by far, and they themselves are not the overwhelming majority, but they are large enough that you really do need to make sure your software runs well on Gnome if you want to support Linux.
People think Gnome pushes stuff down your throat because they have the power and influence to impact the ecosystem, and they use that power and influence to die on absolutely absurd hills.
To me, this is the first time Wayland feels like it's not a waste of time. The display server does not need to have the complexity of window management on top of the surface management. I certainly share the author's sentiment:
> Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance.
Although I'm not sure if it was the least resistance per se (as a social phenomenon), but just that it's an easier problem to tackle. Or maybe the author means the same thing.
(That and the remote access story needs to be fixed. It just works in X11. Last time I tried it with a system that had a 90 degree display orientation, my input was 90 degrees off from the real one. Now, this is of course just a bug, but I have a strong feeling that the way Wayland's architecture has been built makes these kinds of bugs much easier to create than in X11.)
X11 has some tricky, impossible to fix (within the confines of the existing protocol) issues because of the separation between the X server and the window manager. Things like (IIRC) initial window placement and whatnot, where ideally the window manager would make choices about things before the process continues, but the reality of distributed systems makes everything hard. Combining some things into an integrated process fixes that, but comes with other issues.
There were probably other ways to fix those issues, but it would still be a fair amount of churn.
> impossible to fix (within the confines of the existing protocol)
X11's extension mechanism can be, and has been, used to enable backwards-incompatible protocol changes. E.g. BigRequest changes the length and format of every single protocol request.
Very few client libraries are only capable of speaking "the existing" protocol if you take that to mean the original unextended X11 protocol.
Adding an X11 extension that when enabled cleans up a lot of cruft would not have been a problem.
> where ideally the window manager would make choices about things before the process continues
Nothing stops you from introducing an extension that when enabled requires the client to wait for a new notification type before continuing, or re-defines behaviour. That said, using my own custom window manager, I don't know what you mean here. My WM does decide the initial window placement and size, and it's the client's damn problem if it can't handle a resize before I allow the window to be mapped.
The X protocol is crusty in places, but it is very flexible. People haven't fixed these things because they chose to invoke compatibility hindrances that weren't real, and then their response was to invent an entirely new protocol with no compatibility at all.
Too late to edit, but one minor self-correction: BigRequest changes the allowed length and format of every single protocol request. For small requests they are the same, but if length is set to 0, an extra 4 octets are inserted to allow encoding a larger packet length.
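To make that length encoding concrete, here is a minimal Python sketch of how something reading the byte stream might work out a request's size. It assumes a little-endian connection (X11 clients pick the byte order at setup) and that the 32-bit length counts the whole request, extended field included, in 4-byte units. `request_size` is a made-up helper name for illustration, not part of any X library.

```python
import struct


def request_size(buf: bytes, big_requests: bool = True) -> tuple[int, int]:
    """Return (total_bytes, header_bytes) for the request at the start of buf.

    Hypothetical helper; assumes little-endian byte order.
    """
    # Core header: major opcode (1 byte), one data byte, then a 16-bit
    # length expressed in 4-byte units.
    _opcode, _data, length16 = struct.unpack_from("<BBH", buf, 0)
    if length16 != 0:
        # Small request: same layout with or without the extension.
        return length16 * 4, 4
    if not big_requests:
        raise ValueError("zero length without the extension enabled")
    # Extended form: 4 extra octets carry a 32-bit length instead.
    (length32,) = struct.unpack_from("<I", buf, 4)
    return length32 * 4, 8


# A small 8-byte request, and a large one announcing 70000 four-byte units.
small = struct.pack("<BBH", 1, 0, 2) + bytes(4)
large = struct.pack("<BBHI", 1, 0, 0, 70000)
```

The small request parses the same way it always did, which is why enabling the extension does not break clients that only ever send short requests.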
I think it's quite ironic that everybody nowadays complains about Wayland and the "good old days" of X. Back in the day, everybody and their dog complained about X being "archaic", "slow", "takes 20 operations to draw a line", etc. XComposite and XRender were just hacks. Everybody hated on X and anything else was considered better.
On a tangent, also very ironic that X (the successor of Twitter) has the exact same logo as X (the window system). It's like Elon Musk just Googled for the first X logo that came along and appropriated that and nobody seems to notice or care.
I think most smaller Wayland compositors are using a library (wlroots, smithay) for most (?) of the compositing. If using a library provides a few extra options, while still allowing sharing code, it feels like the API boundary was put in the right place.
When there was the 90deg off bug, was that a bug in the compositor or in wlroots?
Remote access on X11 is a mess and I won't miss it, at least on Wayland everyone is funneled through EGL or Vulkan and there's a reasonable path to layering remote access on top of that.
X11 remote access has worked really well for me. And the best part is that it worked even when the client machine has no graphical subsystem installed. I can launch GUI applications remotely with a non-privileged account and it shows on my machine as if it was native.
Wayland can use RDP and some other remote desktop protocols, but it is not what I want, I want a window, not a desktop. There is Waypipe now, I heard it works fine now, but I am still doing "ssh -X", because it just works.
The problem with Wayland is that it is very much "batteries not included". To all the things that worked well in X11, the response has been "it can be done, our protocol is very flexible, ask the guys writing the compositor", not "that's how it's done". The result: Wayland is 18 years old and it is only starting to work well, with some pain points still remaining, and display forwarding is one of them.
It is funny you mention a "reasonable path" by the way, as it is exactly that problem, I don't want a "reasonable path", I want it to work, and after 18 years, I think it is a reasonable expectation. To their credit, it seems we are getting there: waypipe, and now window managers, we may finally have feature parity.
This is also where I'm at. I don't care what protocol or whatever is running underneath but I just want things to work and Wayland doesn't do that. It has lately been better, previously I would try Wayland and run into problems within minutes, recent attempts have given me hours without running into a problem. And as an end user I don't want to care that the problems I get aren't with Wayland but rather a particular compositor/WM implementation or whatever. I want it to work but it's only in the last year or so that basic functionality like screenshots has become reliable.
What gets me is how old Wayland is. It's now older than Linux itself was when Wayland started. It started in the era of the 2.6 kernel series, when most software was still 32-bit, systemd didn't exist, when the Motorola Razr was more common than iPhones, when native desktop applications were still the norm, Node.js didn't yet exist and Google Chrome was a completely new beta browser. Wayland is now reaching feature parity and some kind of "it works out of the box, usually" state when it's from a completely different era of computing.
The nearest point of comparison is perhaps systemd, another Linux project that is very large in scope, complicated, critical and must interface well with lots of pre-existing software. Four years after Poettering's "Rethinking PID 1" post that introduced systemd, it was enabled and in use on many distros. The conservative Debian adopted it within five years. It has clearly been a major success, whereas Wayland has perhaps been the slowest serious software project in development.
> Although I'm not sure if it was the least resistance per se (as a social phenomenon), but just that it's an easier problem to tackle. Or maybe the author means the same thing.
Or maybe it’s desktop environments pulling the ladder up behind them.
Well, it only took 15 years for someone to fix one of many Wayland design flaws and start to make it feel usable.
Now it will take another 15 years for people to settle on a set of common protocols instead of writing their own extension protocols, and another 15 years for window managers to mature to the same level as the X11 window managers.
Then, people who think they know better than everyone else will throw Wayland away and start from zero all over again.
Which is why WSL and Virtualization Framework have become the best way to have the Year of Desktop Linux; I really don't bother any other way.
I thought I still did as my travel netbook died, but then I ended up in a UEFI mess, regardless of the distro, and decided in the end to give that role to a Samsung tablet with DeX support instead.
As a 25 year user of Linux, I have loved Wayland since cutting over to it about 5 years ago. No tearing ever, which I always had to battle with in X. Certain developers that must interact with the Wayland stack will have to do more work now, and some projects may no longer be viable, I get it. I've been following the comments in Linux forums for years. But users exist too, and here is one data point for you.
Another datapoint: I'm a 20+ year off-and-on user of Linux who was never able to switch to desktop Linux for daily use because of the graphics issues. After recently deciding to abandon Windows for good and suffer whatever problems I would have with Linux, I was pleasantly surprised by the experience on Fedora/Wayland/Gnome. No issues with high DPI, per-monitor fractional scaling, tearing, bad performance, etc. which have plagued my Linux experiences in the past. There are still minor issues with Nvidia drivers, but this is very likely the fault of Nvidia and not the OSS community.
Whatever ideological debates there are underneath the X vs. Wayland divide, ultimately what I care about is things working as well as or better than other mainstream operating systems, and Wayland seems to deliver on that.
Recently I fixed my X11 tearing by simply running a compositor (compton). Not having a compositor didn't prevent xserver from working but it was tearing. I remember that historically it was a mess, some x11 gpu driver even had a "TearFree" configuration option.
I am still a bit sad that window shading isn't supported. I wonder if I am going to continue saying this until I am like all those people 20 years ago complaining about things they liked in CDE not being available in more modern DEs.
Basically something that us grey beards like in several window managers, it is not supported in GNOME since the version 3.0 reboot, and relates to minimizing a window to the title bar.
You can move the title bar around and, depending on the window manager, either double-click to drop its contents down again, or leave the mouse pointer hovering over it for a few seconds to temporarily reveal its contents.
Yeah, it took me a while to discover that it was removed from KWin. (I eventually ended up reading the sources). It didn't help that KWin also does 3D effects, so all of my Web searches for "shading" were returning results for "shader" :-(
River was really great even before this split, so I'm very excited to see what happens in the space in the future. I switched to Niri while waiting for it to happen, and I'll probably switch back at some point.
If you were an Xmonad user I feel pretty confident in saying River is the Wayland WM for you.
I'm still on Xmonad mainly because I've only tried hyprland and it couldn't handle the master/slave stack the way xmonad does. On River, when I create a new window will it be inserted above the current selected window even if the current window is the master?
Also, when it was split up what did he call his window manager? Looks like the River repo is just for his display server/compositor
Last I checked the idealists weren't even willing to commit to a virtual 2D rectangular grid of pixels of arbitrary width and height. I think we'll be waiting a while (or more likely using a soft fork of the spec).
Ah, but now instead of one old difficult-to-maintain way of allowing an application to draw on a rectangle on the screen, we have five (and counting!) new and somewhat incompatible ways to do it. Progress.
I'm currently using a fully vibe-coded, personal River window manager that works just how I want it to. I switched to it after I realized I couldn't do everything I wanted in Hyprland (e.g. tile windows to equal areas instead of BSP by default).
Simple example of how impactful this separation has been for me.
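For anyone wondering what "equal areas instead of BSP" means in practice, here is a rough Python sketch of the two layout policies. It is not the commenter's window manager and does not use river's protocol at all. `equal_columns` and `bsp_area_fractions` are hypothetical names, and the BSP model is simplified to "each new window takes half of the newest window's space".

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def equal_columns(count: int, screen_w: int, screen_h: int) -> list[Rect]:
    """Give every window a full-height column of (nearly) identical width."""
    if count <= 0:
        return []
    base, extra = divmod(screen_w, count)
    rects = []
    x = 0
    for i in range(count):
        # Spread leftover pixels over the first `extra` columns.
        w = base + (1 if i < extra else 0)
        rects.append(Rect(x, 0, w, screen_h))
        x += w
    return rects


def bsp_area_fractions(count: int) -> list[float]:
    """Share of the screen each window gets under the simplified BSP model."""
    fractions = []
    remaining = 1.0
    for i in range(count):
        if i < count - 1:
            remaining /= 2  # split the remaining space in half
        fractions.append(remaining)
    return fractions
```

With three windows the equal layout gives each one a third of the screen, while the simplified BSP model gives the first window half and the other two a quarter each.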
I encountered similar setbacks with hyprland (https://github.com/ArikRahman/hydenix), and I eventually wound up preferring scrollable tiling managers. I restarted from scratch with niri, and have found it to be a stable platform to develop against. Here's my current dotfiles (https://github.com/ArikRahman/dotfiles)
Wasn't one of Wayland's key design features combining the window manager and compositor? I am not too familiar with its history but surely there have been presentations or papers about the Wayland designers' reasoning for doing so.
When the window manager is a separate process with async communication between the WM and display server things can get out of sync for a frame or two which leads to visual artifacts. In Wayland the window manager works synchronously with the compositor so that it's never out of sync.
Yeah, that makes sense. It seems like instead of introducing another IPC protocol like this project does, there could be a compositor that loads different window managers as plugins. Then everything is in the same process and there is no need for async communication. Of course a crash in the window manager would take down the compositor, but this is already true for Wayland compositors that combine both.
> It seems like instead of introducing another IPC protocol like this project does
It doesn't introduce a new IPC, it uses the Wayland protocol with the river-window-management-v1 extension. The extension mainly defines new objects and verbs for them, but it's the same protocol.
Separate process means that the window manager can be written in any language (even, e.g.: Python).
Window Management -- Overview:
F R A Hopgood, D A Duce, E V C Fielding, K Robinson, A S Williams. 29 April 1985.
This is the Proceedings of the Alvey Workshop at Cosener's House, Abingdon that took place from 29 April 1985 until 1 May 1985. It was input into the planning for the MMI part of the Alvey Programme.
The Proceedings were later published by Springer-Verlag in 1986.
James Gosling: SunDew - A Distributed and Extensible Window System.
As a result, one of the most amazing pieces of literature to come out of the X Consortium is the “Inter Client Communication Conventions Manual,” more fondly known as the “ICCCM”, “Ice Cubed,” or “I39L” (short for “I, 39 letters, L”). It describes protocols that X clients must use to communicate with each other via the X server, including diverse topics like window management, selections, keyboard and colormap focus, and session management. In short, it tries to cover everything the X designers forgot and tries to fix everything they got wrong. But it was too late — by the time ICCCM was published, people were already writing window managers and toolkits, so each new version of the ICCCM was forced to bend over backwards to be backward compatible with the mistakes of the past.
The ICCCM is unbelievably dense, it must be followed to the last letter, and it still doesn’t work. ICCCM compliance is one of the most complex ordeals of implementing X toolkits, window managers, and even simple applications. It’s so difficult, that many of the benefits just aren’t worth the hassle of compliance. And when one program doesn’t comply, it screws up other programs. This is the reason cut-and-paste never works properly with X (unless you are cutting and pasting straight ASCII text), drag-and-drop locks up the system, colormaps flash wildly and are never installed at the right time, keyboard focus lags behind the cursor, keys go to the wrong window, and deleting a popup window can quit the whole application. If you want to write an interoperable ICCCM compliant application, you have to crossbar test it with every other application, and with all possible window managers, and then plead with the vendors to fix their problems in the next release.
In summary, ICCCM is a technological disaster: a toxic waste dump of broken protocols, backward compatibility nightmares, complex nonsolutions to obsolete nonproblems, a twisted mass of scabs and scar tissue intended to cover up the moral and intellectual depravity of the industry’s standard naked emperor.
Using these toolkits is like trying to make a bookshelf out of mashed potatoes. - Jamie Zawinski
X Myths
X is a collection of myths that have become so widespread and so prolific in the computer industry that many of them are now accepted as “fact,” without any thought or reflection.
Myth: X Demonstrates the Power of Client/Server Computing
At the mere mention of network window systems, certain propeller heads who confuse technology with economics will start foaming at the mouth about their client/server models and how in the future palmtops will just run the X server and let the other half of the program run on some Cray down the street. They’ve become unwitting pawns in the hardware manufacturers’ conspiracy to sell newer systems each year. After all, what better way is there to force users to upgrade their hardware than to give them X, where a single application can bog down the client, the server, and the network between them, simultaneously!
The database client/server model (the server machine stores all the data, and the clients beseech it for data) makes sense. The computation client/server model (where the server is a very expensive or experimental supercomputer, and the client is a desktop workstation or portable computer) makes sense. But a graphical client/server model that slices the interface down some arbitrary middle is like Solomon following through with his child-sharing strategy. The legs, heart, and left eye end up on the server, the arms and lungs go to the client, the head is left rolling around on the floor, and blood spurts everywhere.
The fundamental problem with X’s notion of client/server is that the proper division of labor between the client and the server can only be decided on an application-by-application basis. Some applications (like a flight simulator) require that all mouse movement be sent to the application. Others need only mouse clicks. Still others need a sophisticated combination of the two, depending on the program’s state or the region of the screen where the mouse happens to be. Some programs need to update meters or widgets on the screen every second. Other programs just want to display clocks; the server could just as well do the updating, provided that there was some way to tell it to do so.
The right graphical client/server model is to have an extensible server. Application programs on remote machines can download their own special extension on demand and share libraries in the server. Downloaded code can draw windows, track input events, provide fast interactive feedback, and minimize network traffic by communicating with the application using a dynamic, high-level protocol.
As an example, imagine a CAD application built on top of such an extensible server. The application could download a program to draw an IC and associate it with a name. From then on, the client could draw the IC anywhere on the screen simply by sending the name and a pair of coordinates. Better yet, the client can download programs and data structures to draw the whole schematic, which are called automatically to refresh and scroll the window, without bothering the client. The user can drag an IC around smoothly, without any network traffic or context switching, and the server sends a single message to the client when the interaction is complete. This makes it possible to run interactive clients over low-speed (that is, low-bandwidth) communication lines.
Sounds like science fiction? An extensible window server was precisely the strategy taken by the NeWS (Network extensible Window System) window system written by James Gosling at Sun. With such an extensible system, the user interface toolkit becomes an extensible server library of classes that clients download directly into the server (the approach taken by Sun’s TNT Toolkit). Toolkit objects in different applications share common objects in the server, saving both time and memory, and creating a look-and-feel that is both consistent across applications and customizable. With NeWS, the window manager itself was implemented inside the server, eliminating network overhead for window manipulation operations, and along with it the race conditions, context switching overhead, and interaction problems that plague X toolkits and window managers.
Ultimately, NeWS was not economically or politically viable because it solved the very problems that X was designed to create.
... or the WM loads the compositor, or the WM links to a compositor library (i.e. wlroots). The point is there are options...
Honestly, every time this topic comes up, I feel like the person complaining just doesn't want to put in the work and they are angry that they don't get an easy win. And maybe that's a good thing. Do we really need more half baked WMs?
This also means low-performing clients can make the whole Desktop stutter/freeze and it is one of the many reasons the Wayland architecture is beyond idiotic. High responsiveness is obviously far more important than avoiding the occasional artifact.
Clients (apps) should still be async in Wayland; it's just the window manager that's tightly integrated. Wayland compositors should probably use some kind of fair queuing to prevent misbehaving apps from spamming the event loop but they probably don't.
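As a rough illustration of the fair queuing idea (not how any real compositor does it), here is a small Python sketch of a round-robin dispatcher that caps how many events each client gets per pass. `dispatch_round` and the `quantum` parameter are names invented for this example.

```python
from collections import deque


def dispatch_round(queues: dict[str, deque], quantum: int = 1) -> list[tuple[str, str]]:
    """Handle at most `quantum` pending events per client, in client order.

    Returns the (client, event) pairs handled in this pass; anything left
    in a queue waits for the next pass.
    """
    handled = []
    for client, pending in queues.items():
        for _ in range(min(quantum, len(pending))):
            handled.append((client, pending.popleft()))
    return handled


# One client floods its queue, the other sends a single event.
queues = {
    "spammy": deque(f"damage-{i}" for i in range(5)),
    "quiet": deque(["key-press"]),
}
first_pass = dispatch_round(queues, quantum=2)
```

Here the quiet client's key press is handled in the first pass even though the spammy client still has three events waiting.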
Example: Windows can only be resized when the client finished drawing the new resized window. Otherwise you would get "visual artifacts". So typical operations dealing with window management can not be async when the insistence on the nonsensical "every frame is perfect" mantra is upheld.
Graceful degradation is a thing. If you don't get a response from a client that you are redrawing, you may just put a translucent shader on the window, show the user the changing borders immediately, and then resume whenever it's ready.
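A sketch of that decision in Python, purely as an illustration: the 100 ms patience value and the names `ResizeStep` and `resize_step` are assumptions made up for this example, not anything a real compositor defines.

```python
from enum import Enum


class ResizeStep(Enum):
    WAIT = "keep showing the old buffer"
    PLACEHOLDER = "draw the new borders with a translucent overlay"
    PRESENT = "show the client's freshly drawn buffer"


def resize_step(client_ready: bool, waited_ms: float, patience_ms: float = 100.0) -> ResizeStep:
    """Pick what to show for a window that was asked to resize."""
    if client_ready:
        # The client delivered a buffer at the new size: the perfect-frame path.
        return ResizeStep.PRESENT
    if waited_ms < patience_ms:
        # Still within budget, so holding the old frame is not noticeable yet.
        return ResizeStep.WAIT
    # The client is too slow: degrade gracefully instead of stalling the user.
    return ResizeStep.PLACEHOLDER
```

The point of the design is that a slow client only costs itself a placeholder; the user's drag keeps tracking the pointer either way.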
Well, that's exactly what the article is about. Wayland put everything together into one process in order to avoid unnecessary context switches. This protocol aims to keep the performance advantages of Wayland without giving up on the separation of graphics server and window manager.
I was responding to this comment in the article and wondering about the historical context:
> Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance.
why? because the compositor completely changed the geometry of the desktop. being 3-dimensional. now, today most wayland compositors are 2 dimensional, but they don't have to be. a window manager that works with a 2 dimensional compositor would not work with a 3 dimensional one and vice versa. the window managers listed in the article are only compatible with this particular compositor.
it seems to me that separating window managers from compositors from the start would have created the expectation that all window managers work on all compositors. that is not and will not be the case.
if all compositors start separating out the window manager then the result would be that as a user i now have to choose two components, a compositor and a window manager, and i have to make sure they are compatible.
I think wmf's comment in this thread was absolutely correct and succinct, so I won't repeat, but I think it's worth noting that many (all?) of the Wayland devs were actually Xorg devs. Make of that what you will.
I've never used a system with Wayland (been on i3 for ~15 years) but every time a project like this comes up, I have to wonder why Wayland is even a thing. So many hoops to jump through for things that should be simple.
Sure, X11 has warts but I can make it do basically anything I want. Wayland seems like it will always have too much friction to ever consider switching.
I've been on wayland since KDE had it available (like the KDE 5 days) because it offered fractional HiDPI scaling that wasn't buns. As a laptop user, it has been one of the best features of Wayland.
Furthermore, getting stuff like VRR on Wayland working is way easier than X.org. And, Wayland also supports HDR.
From a user perspective it just works, the online issues are all hypotheticals or very specific scenarios. On wayland my computer just works, my displays can have different dpi scales, my video doesn't tear, and programs don't have full access to record the screen and keylog without asking permission.
My reason for switching from i3 to sway (about 8 years ago) is DPI support. High DPI is a pain in Xorg, and essentially impossible with heterogeneous monitors.
The migration was a one way thing. Lots of things are smoother and simpler, and not having to ever again touch Xorg.conf has improved my quality of life.
To this day, I still have different monitors with different scale factors.
The funny thing is that X11 can actually do heterogeneous dpi and Wayland can't.
Unfortunately you will never find yourself in a situation to actually use a mixed dpi X11 setup (you lose your homogeneous desktop) and Wayland is better at spoofing it (for whatever reason fractional scaling works better in Wayland).
"If you think this idea is a bit stupid, shed a tear for the future of the display servers: this same mechanism is essentially how Wayland compositors —Wayland being the purported future replacement for X— cope with mixed-DPI setups."
Yeah I've done that, I used my linux box for years with a 24" 1920x1200 screen and a 32" 4k screen next to each other.
Doing some basic mathematics and xrandr command-line wizardry to apply scaling factors to each display, I was able to get Xorg to render to a virtual framebuffer and treat the monitors as appropriately scaled windows onto it, so that dragging applications from one screen to the other didn't result in any noticeable change in size.
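The "basic mathematics" is roughly the following, sketched in Python. It assumes "4k" means 3840x2160 and that the goal is for the lower-density monitor to match the pixel density of the denser one; the helper names are made up. The resulting factor is the kind of number one would then feed to xrandr's `--scale` option for the lower-density output.

```python
import math


def dpi(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Pixel density from resolution and diagonal size in inches."""
    return math.hypot(width_px, height_px) / diagonal_in


def match_density(low_res: tuple[int, int], low_dpi: float, high_dpi: float):
    """Scale factor and virtual resolution that make the low-DPI monitor
    cover as many framebuffer pixels per inch as the high-DPI one."""
    scale = high_dpi / low_dpi
    virtual = (round(low_res[0] * scale), round(low_res[1] * scale))
    return scale, virtual


small_dpi = dpi(1920, 1200, 24)   # the 24" 1920x1200 screen
large_dpi = dpi(3840, 2160, 32)   # the 32" screen, assuming 3840x2160
scale, virtual = match_density((1920, 1200), small_dpi, large_dpi)
```

For these two monitors the factor comes out at roughly 1.46, so the 1920x1200 panel ends up showing a downscaled view of a region about 2802x1751 pixels in the shared framebuffer.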
> not having to ever again touch Xorg.conf has improved my quality of life
I haven't touched xorg.conf in decades. I suppose you might have to do it to configure some unique setup, but for me this hasn't been an issue in a long time.
Now with Wayland, instead of having to touch a single config file, we have to learn how each compositor/WM is configured, and do it there instead. It hardly seems like an improvement in that regard, IMO.
There's a type of input called "DeviceEvent" which is a bit lower level than "Window event". It also occurs even if the window isn't "active".
Windows and X11 support this, but Wayland doesn't except for mouse movement. I noticed my program stopped working on Linux after I updated it. Ended up switching to Window Events, but still kind of irritating.
I don't think so, and it's something every Windows and X11 Linux application can do. Perhaps this perspective is a divide between people writing/using applications, and those using/writing web servers? But maybe the Wayland team disagrees, and this is one of the reasons for this restriction? I'm speculating.
> If you want that start your processes as different users.
How does this make any difference if they're going to connect to the same IPC that handles input/display?
The display server must absolutely enforce some kind of security boundary between clients. Clients that are running untrusted code (e.g. a web browser) must not be able to be hijacked into controlling a potentially privileged client (e.g. a root terminal).
It runs just fine at 165 hz for me. Given that xrandr and CRTs have been around for a while, and both have supported high refresh rates for a long while, something seems fishy here. Something is probably at fault, but it's not X11.
Well, in my case, because the VCF on my synth sets its cutoff frequency based on the pointer's Y position and its resonance based on the pointer's X position
Sure, but that is pretty niche. My point was it’s pretty low friction for most people and certainly to try it.
But obviously be pragmatic, if it doesn’t work for you because you have a particular requirement, or even if it doesn’t offer any improvement over what you have then don’t use it.
I'm with you. I haven't had major issues with X11 for a good couple of decades. Ever since I didn't have to manually edit xorg.conf, I forget when that happened.
Granted, my requirements were simple, a laptop and occasionally one external monitor, though the issues I did run into were related to graphics drivers and NVIDIA shenanigans, but not to X11.
Now that I'm on Wayland, I do feel that visuals are slightly more responsive and crisper, but honestly, it wasn't worth replacing a bunch of my programs, significantly altering my workflow, and dealing with numerous new issues I didn't have to deal with before.
Unfortunately, the momentum is now fully with Wayland, and it's only a matter of time until X11 stops being supported altogether. The XLibre project is a noble idea, but a few contributors can't maintain an entire ecosystem on their own.
I completely agree. This feels like the typical “open-source developers rewriting something that already works well, then forcing it on people who never asked for it” kind of project.
It will take years to reach the feature set of X11. And for what? From my perspective as both a developer and a user, I see no tangible benefits.
On top of that, it breaks software I rely on professionally. Code that worked perfectly fine under X11.
Meanwhile, I can still build and run Windows programs I wrote 30 years ago on Windows 11.
Many theories. A simple one is that corporations wanted more control. See systemd's rise - not related to wayland as such, but to corporate-driven influence.
I am not saying all of the design is corporate-controlled. But a ton of propaganda is associated with how wayland was advertised, until some folks had enough with it and decided to stop buying the "xorg is dead" routine these corporations push on:
It will be interesting to see what will happen though. The GTK devs said they will help kill off xorg with GTK5. KDE also wants to kill xserver. It would be kind of cool if that would not happen - imagine if a non-corporate controlled ecosystem would emerge. Not likely to happen, but it would be a lot of fun. As well as more real competition with wayland. Wayland broke its biggest promise: that it is a viable alternative to the xorg-server. I don't want to lose any feature, so it is a drawback for me.
Xorg is not dead. It is maintained by one of those corporations you are talking about. And since 2/3 of the Xorg code is in Xwayland, they will be maintaining it for a very long time to come. It is not going anywhere.
But already a majority of Linux desktop users have stopped using it. And it will be 90% in the next 2-3 years. GNOME, KDE, Budgie, and COSMIC are effectively Wayland only now and XFCE and Cinnamon will be Wayland native before then.
The GTK5 devs do not have to “kill off” X11. But it is not really worth their time either.
Keep using Xorg, or Xlibre, or Phoenix. It should keep working.
But don’t mind if the rest of us keep building on Wayland.
By the way, I use Niri, a very cool and absolutely non-corporate Wayland compositor. Not sure how that fits into your narrative.
The fact that Wayland can't just substitute out pluggable WMs without changing a bunch of other unrelated infrastructure is IMO one of the biggest user-facing losses relative to X11. Anybody who is working to improve that is doing god's work as they say.
Not only a loss but a key disabler. Having been used to the same customized window manager for decades, it's impossible for me to change to Wayland until there's a fully equivalent interface for managing windows so that everything works as I want, from mouse clicks to keyboard shortcuts. Maybe it could be an existing window manager adding support for River, or a Wayback layer that reimplements an X11 desktop root on top of a minimal Wayland compositor, but none of the current Wayland compositors even scratch the surface of this.
It's a damper on development of new WMs and DEs, too. I have ideas for my own desktop I'd like to explore at some point, and if I do it'll almost certainly be X11 based initially because it's so much more quick and easy to wrap one's head around and get the iteration loop up and running with.
I'm not anti-Wayland and I think X11 has enough issues that it's worth transitioning over to something better but this is a critical weakness in Wayland's design.
Handwaving "just expose an API" ignores the mess at the extension boundary. Modular only works if the contract is airtight, and with Wayland's churn and "sorta spec" documentation, that sounds optimistic at best.
Every "flexible" API turns into a leaky mess unless someone is paid to write the dullest test suite in existence, and nobody is. Mandating one design is ugly, but pretending composition is free is a fairy tale.
Well, window management is likely compositor-specific in the first place, so I don't really see the problem. It's not part of the spec, because it would unreasonably constrain what an implementation could do. E.g. what window management does a kiosk compositor have? What even is window placement, what if I am writing a 3D compositor so x/y positioning doesn't even make sense?
We need a compositor that exposes everything as an extension. Preferably in a hot-reloadable, tweakable way, say, using Lua (with JIT). And also exposing its APIs in a way that allows having an analog of xdotool.
Honestly, probably the best Linux GUI stack would look like a root Wayland server (not running as root ofc), inside which are nested per-user Wayland servers (which can be switched between rendering to a monitor or offscreen for a remote login), inside each of which is nested an X11 server (which is freed from having to care about hardware), inside which runs a normal window manager.
"Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that.""
The Wayland standard does not prescribe it (unlike X), and the reference implementations were monolithic for a very long time.
Wayland in general had a rather cavalier approach to doing away with things that X users take for granted, like, well, making screenshots. Eventually, under pressure, those in charge agreed that these features are actually very important for real users, so implementations appeared. It's an understandable way to discover the minimal usable subset of features, but the process of it is a bit frustrating for the early adopters.
That's not the same thing. It's way easier to write an X11 window manager than to write a Wayland compositor, even with something like wlroots, because the window manager can speak the same protocol that clients speak, and it runs as a separate process.
As a concrete example, Emacs' EXWM package works by implementing an X11 client library in Emacs Lisp, then using it to talk to the X server (which is a separate process, so this works fine) and telling it how to position windows.
Whereas on Wayland, this is not possible without re-implementing a standalone compositor process, because otherwise architecturally it doesn't work. Emacs can't both do the drawing and be drawn.
EWM implements a Wayland compositor as a native thread spawned by a dynamic module in Emacs, it's a full compositor within the Emacs process: https://codeberg.org/ezemtsov/ewm
So it is architecturally possible (but infeasible in plain Emacs Lisp).
This one could technically be written in plain Emacs Lisp, but I'm happy to use something that already has all the XML codegen stuff for Wayland figured out. Dynamic modules work pretty well, fwiw.
Oh, reka looks interesting. Thanks for linking it. I don't disagree with you about dynamic modules, I just think that EWM's architecture shouldn't be necessary. (In which I think we agree?)
No, that still requires you to make the whole thing, you just get help. For instance, I've run into a problem where I try some great new compositor that uses wlroots, and even though wlroots has good support for keyboard layouts I can't actually set the layout because the compositor hasn't wired up that functionality.
Especially with LLMs, the cost here is down significantly. People also drastically over-idealize what making an X window manager entailed: sure, X had its compositor, but you had to build so, so much yourself.
I'm glad River is trying to create a bigger base here; this is way cool. And it sort of proves the value of Wayland: someone can just go do that. Someone can just make a generic compositor/display-server now, with their own new architecture and plugin system, and it'll just work with existing apps.
We were so locked in to such a narrow, limited system, with its own abstraction layer parallel to what the kernel now offers (and which didn't exist when X was created). It's amazing that we have a chance for innovation and improvement now. The kernel as a stable base of the pyramid, wlroots/sway as a next layer up, and now River as a higher layer still for folks to experiment and create with. This could not be going better, and there's so much more freedom and possibility; this is such a great engine for iteration and improvement.
Traditionally, X11 didn’t have compositors, and didn’t need the extra round trip wayland exists to remove.
I wonder if there’s space for a project like xlibre (or x.org, if it were revived) to update the x11 protocol to fill whatever gap compositors were meant to fill.
For what it’s worth, I’ve been moving all my machines to lxde.
Apparently, I accidentally switched back to a compositor free desktop without noticing. High framerate, vsync/tear-free and high dpi work fine. So does fractional scaling, but I disable it.
Personally, I’d rather these hypothetical x11 devs focused on reverse engineering hdmi vrr (blocked by lawyers at the moment), and HDR / expanded color spaces.
I guess this kind of misses the point. I was an early compiz user (wobbly windows, fire effect and all), but at this point I just don’t miss it (literally: I thought I was running a compositor for the last ~5 years, and just… wasn’t.)
The X windows paradigm was fine, and still works great with modern hardware.
Lots of weird misinformation in the comments here. Wayland doesn't choose anything. It leaves the compositor to decide where to position a window and whether that window receives key presses. A program can't draw wherever it wants, receive system-wide keystrokes, or receive input on behalf of another program. When appropriately implemented, the screenshot system is built directly into the compositor. It's an API that lets a program request read access to a part of the screen, which the compositor provides upon approval. It's much more secure that way and it works perfectly fine these days. Unfortunately not every compositor implements this.
A project that has a daemon run in the background as a root service and that can provide an appropriate shim to pass key strokes to anything you want.
And just to be clear the appropriate secure model is to have a program request to register a "global" hot key and then the compositor passes it to the appropriate program once registered. This is already a thing in KDE Plasma 6 and works just fine.
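To make that register-then-dispatch model concrete, here is a toy sketch. Everything in it (`ToyCompositor`, `register_hotkey`, `key_pressed`, the approval callback) is a made-up name for illustration; it is not the KDE or portal API, just the shape of the idea: the compositor owns the key bindings, and a program only hears about a key combination it asked for and was granted.

```python
from typing import Callable, Dict

class ToyCompositor:
    """Hypothetical stand-in for a compositor that owns global hotkeys."""

    def __init__(self, approve: Callable[[str, str], bool]):
        # approve(app_id, combo) models the user or policy decision.
        self._approve = approve
        self._bindings: Dict[str, Callable[[], None]] = {}

    def register_hotkey(self, app_id: str, combo: str,
                        callback: Callable[[], None]) -> bool:
        """Grant the binding only if it is free and the request is approved."""
        if combo in self._bindings or not self._approve(app_id, combo):
            return False
        self._bindings[combo] = callback
        return True

    def key_pressed(self, combo: str) -> bool:
        """Deliver the combo to its registered owner, and to nobody else."""
        callback = self._bindings.get(combo)
        if callback is None:
            return False
        callback()
        return True

# Example policy: only an app called "recorder" may claim global hotkeys.
events = []
compositor = ToyCompositor(approve=lambda app_id, combo: app_id == "recorder")
granted = compositor.register_hotkey("recorder", "Ctrl+Alt+R",
                                     lambda: events.append("toggle"))
denied = compositor.register_hotkey("browser", "Ctrl+Alt+K",
                                    lambda: events.append("sniff"))
compositor.key_pressed("Ctrl+Alt+R")
compositor.key_pressed("Ctrl+Alt+K")
print(granted, denied, events)
```

The approval callback is where an "allow everything by default" policy or a user prompt would plug in.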
The thing is: they mostly do implement xdg-desktop-portal's screenshot API since that does also handle permission management.
In effect the modern desktop is wayland (window communication), pipewire (audio/video), and xdg-desktop-portal (compositor/environment requests) which all kinda have to be worked with for a desktop application.
Separating the compositor and window manager feels like one of those ideas that seems obvious in hindsight, but the protocol/state-machine design here shows why it took real work to make it practical.
Lowering the barrier for writing Wayland window managers without forcing everyone to build a full compositor seems like a big win.
I've been working in the tech field for a while, but I'm new to HN. I'd never explored the platform in depth before, and recently decided to start participating and interacting with people here.
The discussions and knowledge shared here have already been very valuable to my own learning. So I hope to contribute to the community in the same way... but I felt I needed to be more active in the community before that.
If anyone is curious or still has questions, I can also share my LinkedIn or GitHub.
Insightful article. I don't recall ever viewing an easy-to-follow lesson, tutorial or book for that matter that clearly explained the various components of a Linux Desktop environment. Always had to follow complicated and obscure guides to do this and that, when solving issues, but seldom did any explain their functions clearly.
I'm currently using an old window manager that I dug out of the depths of history. 1.0 was released in 1985. 2.0 shortly after. 3.0 a few years later. But version 3.1 was when things really got good. It's been great ever since.
At this point, take all the lessons of wayland, plan everything in advance rather than incrementally deciding basic things like screenshotting and then build something new, superseding wayland so that power users like me and app developers will stop clinging to X. Right now I have no confidence in wayland and I know I'm not alone.
It is 18 years old (started in 2008 IIRC) and just now approaching something usable. So on the one hand it is a really old project whose original design considerations became obsolete a decade ago - I remember people were very bothered by the performance loss of needing several process switches with the X11 damage model in order to push an update to the screen, but on today's multi-core hardware that is basically free and everyone is using browser engines and writing their GUI in javascript anyway. But on the other, do you really want to spend another 10-20 years rewriting the Linux GUI stack from scratch only to reimplement "Wayland with best established extensions"?
Its biggest hurdle is having to explain even to tech people on HN that it's actually a good idea to have a UI where a user can approve a screen sharing request. You'd think for folks that claim to care about security that'd be a prime concern. It really is so weird how difficult that is for people to grasp. The implementation is likewise not complicated. Seriously, how hard is it to draw a box selector and show an okay / cancel box?
If everyone appears to be missing something that's so easy to understand and implement, perhaps they're not missing it. They could have a different security/threat model than you're using. They could be expressing frustrations with being forced to manually approve something every time. They could be hitting dumb bugs in the implementation. There could be different people clamoring for more security and less intrusive security.
Honestly most people are just being lazy about it. You don't even need to prompt the user if you wanna allow everything by default. You just need to implement the screenshots, screensharing, and hot keys APIs. All 3 are super simple.
Then we also hit the question of who we're talking to/about. If you want to tell devs that they should implement a handful of general, simple APIs, that's probably fair. (Please start with the GNOME devs.) But some of us are just users explaining why Wayland doesn't work for us; even if we wanted, we can't fix it.
i'm a little thrown, because the Wayland diagram doesn't feel quite right. the compositor does lie between the kernel and the apps, but IIRC the apps have their own graphics buffers from the kernel that they are drawing into directly. the compositor then composites them together. to me, that feels more like the kernel is at the center of the diagram here: the wayland compositor is between the kernel and the output / input.
i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).
it means that the task of writing the display-server / compositor is much much much simpler. it's still hard! but the kernel is helping so much. there's an assumed base of having working GPU drivers!
author appears to super know their stuff. alas the FOSDEM video they link to is not loading for me. :(
one major question, since this is a protocol, how viable is it to decompose the window management tasks? rather than have a monolithic window manager, does this facilitate multiple different programs working together to run a desktop? not entirely sure the use case, but a more pluggable desktop would be interesting!
> the compositor then composites them together. to me, that feels more like the kernel is at the center of the diagram here: the wayland compositor is between the kernel and the output / input.
It's also possible to use hardware planes to get the actual graphics device to composite for you directly from its video memory, effectively reducing latency to the lowest possible.
> i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).
Are you an AI bot? Modern X11 servers using DRM have been around for more than 20 years. You are talking about how X11 servers worked in the '90s.
Yes exactly. DRM exists, but there's still what I called the X "kernel", with all of its heavyweight abstractions.
To the previous a-hole, frak you: not an AI. That's rude as frak. Also, you manage to be incredibly wrong. Even an AI wouldn't overlook such an obvious error; maybe it'd be better to have it replace you. So rude dude! Behave!
I switched to niri a few months ago, and while I like it for the most part, it feels too... busy for my taste. It defaults to a bunch of animations and decorations, all of which I've turned off. I'm happy with my current setup (aside from Wayland quirks[1]), but river's design and simplicity are very appealing. It reminds me of the philosophy of bspwm/sxhkd which I used for years on X11.
I do need scrollable tiling now that I've tried it, and I'm happy that there are a couple of options to choose from with river.
[1]: Seriously, why does copy/pasting sometimes simply not work?? I often have to copy twice for it to succeed. It's not related to Xwayland -> Wayland apps, and vice versa, or with copying from closed windows, etc. I neither use nor want a clipboard "manager". I just want my clipboard to work consistently. I've read many reports of this same bug on different distros and DEs, and nobody has figured it out. It's infuriating that such a basic feature is half-broken in a project that is 17 years old now!
> If Sway, Hyprland, and others each implement their own WM separation protocol
I think that's pretty unlikely. The smaller compositors actually collaborate fairly well, and if sway, hyprland, niri, KDE etc. decide to implement this, I think they will probably work with river to create a standardized protocol that works across compositors. That has happened before. Hyprland is maybe more likely to do their own thing than the others, but if a standardized protocol caught on I think they would follow that.
Gnome though... I don't think there is a great chance they implement something like this even if several other compositors implement a standardized protocol for it.
I feel like the word "protocol" is tripping you up. This isn't meant to be some standard that gets a bunch of traction in other projects. It's a protocol for the River compositor, as the name suggests. Before this there was, I believe, river-layout-v3. It's all just getting taken to the next level: from layout to full window management.
As an ordinary user, I actually don't care about any of this. However, from another perspective, I think this is a bad thing: open source projects have become product-centered, defaulting to the assumption that users are ignorant fools. This isn't how community projects should behave, but those projects are not that community-driven anyway.
After all, for a long time, so-called security has only been a misused justification—never letting users make mistakes is just a pretty excuse, meant to keep users from being able to easily access something, and eventually from ever accessing it at all.
That they provide this stuff for free would be a good argument if the stuff wasn't pushed down people's throats with no working alternative and Xorg being discontinued.
because the devs actually have implemented things that i cared about
The problem isn't them "pushing stuff down your throats", it's nobody else (including you) making alternatives that you like better. You are voluntarily ingesting their stuff because your only alternative is starving.
It's a forcing of their narrow opinion on what should be allowed onto the ecosystem at large, because all of these things are connected. You can leave to a different DE/distro, but if every DE is doing its own thing for global hotkeys or whatever, then software in the ecosystem is going to be hacky/bespoke or have an unreasonable maintenance burden.
Even if you in particular can move elsewhere, the ecosystem is still held back. We only recently got consensus on apps being able to request a window position on screen, which is something x11, macos, and windows all allow you to do. SSD and tray icons are other examples of things found everywhere else that they did not want to support. Some applications are just broken without tray icon support.
This bleeds over into work for folks releasing software for Linux in general. By not supporting SSD they were pushing the burden of drawing window decorations onto every single app author, and while most frameworks will handle this, it's not like everyone is using qt or gtk. App authors will get bug reports and the burden of releasing software on Linux needlessly climbs again.
Hard to convey how unreasonable I feel their stance was on tray icons / SSD. It should be the domain of the DE from a conceptual but also practical point of view, even from just the amount of work involved. It reminds me of LSPs enabling text editors to have great support for every language. And again, Gnome was the odd man out in this: they want extra attention and work when Linux has the lowest desktop market share by far, and they themselves are not the overwhelming majority, but they are large enough that you really do need to make sure your software runs well on Gnome if you want to support Linux.
People think Gnome push stuff down your throat because they have the power and influence to impact the ecosystem, and they use that power and influence to die on absolutely absurd hills.
> Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance.
I'm not sure it was least resistance per se (as a social phenomenon) so much as that it's simply an easier problem to tackle. Or maybe the author means the same thing.
(That, and the remote access story needs to be fixed. It just works in X11. Last time I tried it with a system that had a 90-degree display orientation, my input was 90 degrees off from the real one. Now, this is of course just a bug, but I have a strong feeling that the way Wayland's architecture has been built makes these kinds of bugs much easier to create than in X11.)
There were probably other ways to fix those issues, but it would still be a fair amount of churn.
X11's extension mechanism can be, and has been, used to enable backwards-incompatible protocol changes. E.g. the BIG-REQUESTS extension changes the length and format of every single protocol request.
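For readers who haven't looked at the wire format, here is a small sketch of what that extension changes, as I understand it. A core X11 request starts with a 4-byte header whose 16-bit length field counts 4-byte units; once BIG-REQUESTS is enabled, a length field of zero means an extra 32-bit length follows, and that length counts the extra word too. The function name, the `big_requests` flag, and the choice of little-endian byte order are mine, for illustration only.

```python
import struct

def encode_request(opcode: int, minor: int, payload: bytes,
                   big_requests: bool = False) -> bytes:
    """Build an X11-style request: header followed by a padded payload."""
    if len(payload) % 4:
        raise ValueError("payload must be padded to a multiple of 4 bytes")
    words = 1 + len(payload) // 4          # header word + payload words
    if words <= 0xFFFF:
        # Core format: opcode, one data byte, 16-bit length in 4-byte units.
        return struct.pack("<BBH", opcode, minor, words) + payload
    if not big_requests:
        raise ValueError("request too long for the core protocol")
    # Extended format: 16-bit length of 0, then a 32-bit length that also
    # counts the extra length word itself.
    return struct.pack("<BBHI", opcode, minor, 0, words + 1) + payload

small = encode_request(1, 0, bytes(8))
big = encode_request(1, 0, bytes(4 * 0x10000), big_requests=True)
print(len(small), struct.unpack("<BBH", small[:4]))
print(len(big), struct.unpack("<BBHI", big[:8]))
```

The point for the argument above is that a client and server that have both opted in interpret the same bytes differently from the unextended protocol, which is exactly the kind of opt-in break an extension can carry.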
Very few client libraries are only capable of speaking "the existing" protocol if you take that to mean the original unextended X11 protocol.
Adding an X11 extension that when enabled cleans up a lot of cruft would not have been a problem.
> where ideally the window manager would make choices about things before the process continues
Nothing stops you from introducing an extension that, when enabled, requires the client to wait for a new notification type before continuing, or redefines behaviour. That said, as someone using my own custom window manager, I don't know what you mean here. My WM does decide the initial window placement and size, and it's the client's damn problem if it can't handle a resize before I allow the window to be mapped.
The X protocol is crusty in places, but it is very flexible. People haven't fixed these things because they chose to invent compatibility hindrances that weren't real when their response was to invent an entirely new protocol with no compatibility at all.
On a tangent, also very ironic that X (the successor of Twitter) has the exact same logo as X (the window system). It's like Elon Musk just Googled for the first X logo that came along and appropriated that and nobody seems to notice or care.
When there was the 90deg off bug, was that a bug in the compositor or in wlroots?
Wayland can use RDP and some other remote desktop protocols, but it is not what I want, I want a window, not a desktop. There is Waypipe now, I heard it works fine now, but I am still doing "ssh -X", because it just works.
The problem with Wayland is that it is very much "batteries not included". To all the things that worked well in X11, the response has been "it can be done, our protocol is very flexible, ask the guys writing the compositor", not "that's how it's done". The result: Wayland is 18 years old and it is only starting to work well, with some pain points still remaining, and display forwarding is one of them.
It is funny you mention a "reasonable path" by the way, as it is exactly that problem, I don't want a "reasonable path", I want it to work, and after 18 years, I think it is a reasonable expectation. To their credit, it seems we are getting there: waypipe, and now window managers, we may finally have feature parity.
What gets me is how old Wayland is. It's now older than Linux itself was when Wayland started. It started in the era of the 2.6 kernel series, when most software was still 32-bit, systemd didn't exist, the Motorola Razr was more common than iPhones, native desktop applications were still the norm, Node.js didn't yet exist and Google Chrome was a completely new beta browser. Wayland is only now reaching feature parity and some kind of "it works out of the box, usually" state, when it's from a completely different era of computing.
The nearest point of comparison is perhaps systemd, another Linux project that is very large in scope, complicated, critical and must interface well with lots of pre-existing software. Four years after Poettering's "Rethinking PID 1" post that introduced systemd, it was enabled and in use on many distros. The conservative Debian adopted it within five years. By now it has clearly been a major success, while Wayland has perhaps been the slowest serious software project in development.
Or maybe it’s desktop environments pulling the ladder up behind them.
Now it will take another 15 years for people to settle on a set of common protocols instead of writing their own extension protocols, and another 15 years for window managers to mature to the same level as the X11 window managers.
Then, people who think they know better than everyone else will throw Wayland away and start from zero all over again.
I thought I still did as my travel netbook died, but then I ended up in UEFI mess, regardless of the distro, and decided in the end to give that role to a Samsung tablet with DEX support instead.
The "limitations" are political, not technical.
Whatever ideological debates there are underneath the X vs. Wayland divide, ultimately what I care about is things working as well as or better than other mainstream operating systems, and Wayland seems to deliver on that.
You can move the title bar around, and depending on the window manager you either double-click to drop down its contents again, or hover the mouse pointer over it for a few seconds to temporarily reveal its contents.
https://en.wikipedia.org/wiki/WindowShade
If you were an Xmonad user I feel pretty confident in saying River is the Wayland WM for you.
Also, when it was split up what did he call his window manager? Looks like the River repo is just for his display server/compositor
Anything else will be taken by derivatives like Android, ChromeOS, or the VMs on top of Windows/macOS.
Simple example of how impactful this separation has been for me.
hy3?
https://github.com/outfoxxed/hy3
(I'm an ex i3/now sway user and hy3 is the only way I can bear using hyprland)
It doesn't introduce a new IPC, it uses the Wayland protocol with the river-window-management-v1 extension. The extension mainly defines new objects and verbs for them, but it's the same protocol.
Separate process means that the window manager can be written in any language (even, e.g.: Python).
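To make the "any language" point concrete, here is a minimal sketch of the kind of logic a separate window-manager process would own: a master/stack tiling layout in plain Python. It deliberately does not speak the Wayland wire protocol or use river-window-management-v1; `Rect` and `tile` are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def tile(n_windows: int, screen: Rect, master_ratio: float = 0.5) -> list[Rect]:
    """Return one rectangle per window.

    The first window is the master on the left; the remaining windows
    are stacked vertically on the right.
    """
    if n_windows <= 0:
        return []
    if n_windows == 1:
        return [screen]
    master_w = int(screen.w * master_ratio)
    rects = [Rect(screen.x, screen.y, master_w, screen.h)]
    stack_n = n_windows - 1
    stack_w = screen.w - master_w
    for i in range(stack_n):
        # Integer division on both edges so the stack exactly fills the height.
        top = screen.y + (screen.h * i) // stack_n
        bottom = screen.y + (screen.h * (i + 1)) // stack_n
        rects.append(Rect(screen.x + master_w, top, stack_w, bottom - top))
    return rects
```

In a real setup the resulting rectangles would be sent to the compositor as position and size requests; how that is done depends on the protocol and is not shown here.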
https://en.wikipedia.org/wiki/NeWS
Everything old is new again.
Alan Kay on “Should web browsers have stuck to being document viewers?” and a discussion of Smalltalk, HyperCard, NeWS, and HyperLook:
https://donhopkins.medium.com/alan-kay-on-should-web-browser...
Window Management -- Overview: F R A Hopgood, D A Duce, E V C Fielding, K Robinson, A S Williams. 29 April 1985.
This is the Proceedings of the Alvey Workshop at Cosener's House, Abingdon that took place from 29 April 1985 until 1 May 1985. It was input into the planning for the MMI part of the Alvey Programme.
The Proceedings were later published by Springer-Verlag in 1986.
James Gosling: SunDew - A Distributed and Extensible Window System.
https://www.chilton-computing.org.uk/inf/literature/books/wm...
The X-Windows Disaster: This is Chapter 7 of the UNIX-HATERS Handbook. The X-Windows Disaster chapter was written by Don Hopkins.
https://donhopkins.medium.com/the-x-windows-disaster-128d398...
As a result, one of the most amazing pieces of literature to come out of the X Consortium is the “Inter Client Communication Conventions Manual,” more fondly known as the “ICCCM”, “Ice Cubed,” or “I39L” (short for “I, 39 letters, L”). It describes protocols that X clients must use to communicate with each other via the X server, including diverse topics like window management, selections, keyboard and colormap focus, and session management. In short, it tries to cover everything the X designers forgot and tries to fix everything they got wrong. But it was too late — by the time ICCCM was published, people were already writing window managers and toolkits, so each new version of the ICCCM was forced to bend over backwards to be backward compatible with the mistakes of the past.
The ICCCM is unbelievably dense, it must be followed to the last letter, and it still doesn’t work. ICCCM compliance is one of the most complex ordeals of implementing X toolkits, window managers, and even simple applications. It’s so difficult, that many of the benefits just aren’t worth the hassle of compliance. And when one program doesn’t comply, it screws up other programs. This is the reason cut-and-paste never works properly with X (unless you are cutting and pasting straight ASCII text), drag-and-drop locks up the system, colormaps flash wildly and are never installed at the right time, keyboard focus lags behind the cursor, keys go to the wrong window, and deleting a popup window can quit the whole application. If you want to write an interoperable ICCCM compliant application, you have to crossbar test it with every other application, and with all possible window managers, and then plead with the vendors to fix their problems in the next release.
In summary, ICCCM is a technological disaster: a toxic waste dump of broken protocols, backward compatibility nightmares, complex nonsolutions to obsolete nonproblems, a twisted mass of scabs and scar tissue intended to cover up the moral and intellectual depravity of the industry’s standard naked emperor.
Using these toolkits is like trying to make a bookshelf out of mashed potatoes. - Jamie Zawinski
X Myths
X is a collection of myths that have become so widespread and so prolific in the computer industry that many of them are now accepted as “fact,” without any thought or reflection.
Myth: X Demonstrates the Power of Client/Server Computing
At the mere mention of network window systems, certain propeller heads who confuse technology with economics will start foaming at the mouth about their client/server models and how in the future palmtops will just run the X server and let the other half of the program run on some Cray down the street. They’ve become unwitting pawns in the hardware manufacturers’ conspiracy to sell newer systems each year. After all, what better way is there to force users to upgrade their hardware than to give them X, where a single application can bog down the client, the server, and the network between them, simultaneously!
The database client/server model (the server machine stores all the data, and the clients beseech it for data) makes sense. The computation client/server model (where the server is a very expensive or experimental supercomputer, and the client is a desktop workstation or portable computer) makes sense. But a graphical client/server model that slices the interface down some arbitrary middle is like Solomon following through with his child-sharing strategy. The legs, heart, and left eye end up on the server, the arms and lungs go to the client, the head is left rolling around on the floor, and blood spurts everywhere.
The fundamental problem with X’s notion of client/server is that the proper division of labor between the client and the server can only be decided on an application-by-application basis. Some applications (like a flight simulator) require that all mouse movement be sent to the application. Others need only mouse clicks. Still others need a sophisticated combination of the two, depending on the program’s state or the region of the screen where the mouse happens to be. Some programs need to update meters or widgets on the screen every second. Other programs just want to display clocks; the server could just as well do the updating, provided that there was some way to tell it to do so.
The right graphical client/server model is to have an extensible server. Application programs on remote machines can download their own special extension on demand and share libraries in the server. Downloaded code can draw windows, track input events, provide fast interactive feedback, and minimize network traffic by communicating with the application using a dynamic, high-level protocol.
As an example, imagine a CAD application built on top of such an extensible server. The application could download a program to draw an IC and associate it with a name. From then on, the client could draw the IC anywhere on the screen simply by sending the name and a pair of coordinates. Better yet, the client can download programs and data structures to draw the whole schematic, which are called automatically to refresh and scroll the window, without bothering the client. The user can drag an IC around smoothly, without any network traffic or context switching, and the server sends a single message to the client when the interaction is complete. This makes it possible to run interactive clients over low-speed (that is, slow-bandwidth) communication lines.
Sounds like science fiction? An extensible window server was precisely the strategy taken by the NeWS (Network extensible Window System) window system written by James Gosling at Sun. With such an extensible system, the user interface toolkit becomes an extensible server library of classes that clients download directly into the server (the approach taken by Sun’s TNT Toolkit). Toolkit objects in different applications share common objects in the server, saving both time and memory, and creating a look-and-feel that is both consistent across applications and customizable. With NeWS, the window manager itself was implemented inside the server, eliminating network overhead for window manipulation operations — and along with it the race conditions, context switching overhead, and interaction problems that plague X toolkits and window manager.
Ultimately, NeWS was not economically or politically viable because it solved the very problems that X was designed to create.
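The CAD example in the quoted chapter can be sketched roughly as follows. This is only a toy: NeWS used PostScript sent over the network, whereas this uses in-process Python callables, and `ExtensibleServer`, `download`, `draw` and `ic_outline` are hypothetical names invented for the illustration.

```python
class ExtensibleServer:
    """Toy server that lets clients upload named drawing procedures."""

    def __init__(self):
        self._procs = {}
        self.display = []  # drawn primitives, standing in for a screen

    def download(self, name, proc):
        """The client uploads code once ("download" from the server's side)."""
        self._procs[name] = proc

    def draw(self, name, x, y):
        """Later requests carry only a name and a pair of coordinates."""
        self.display.extend(self._procs[name](x, y))


def ic_outline(x, y):
    # Hypothetical IC symbol: a body rectangle plus one pin on each side.
    return [
        ("rect", x, y, 40, 20),
        ("line", x - 5, y + 10, x, y + 10),
        ("line", x + 40, y + 10, x + 45, y + 10),
    ]


server = ExtensibleServer()
server.download("ic", ic_outline)
server.draw("ic", 100, 50)
server.draw("ic", 200, 50)
```

After the one-time upload, each additional IC costs the client a single small request, which is the traffic saving the chapter describes.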
Honestly, every time this topic comes up, I feel like the person complaining just doesn't want to put in the work and they are angry that they don't get an easy win. And maybe that's a good thing. Do we really need more half baked WMs?
> Although, I do not know for sure why the original Wayland authors chose to combine the window manager and Wayland compositor, I assume it was simply the path of least resistance.
why? because the compositor completely changed the geometry of the desktop. being 3-dimensional. now, today most wayland compositors are 2 dimensional, but they don't have to be. a window manager that works with a 2 dimensional compositor would not work with a 3 dimensional one and vice versa. the window managers listed in the article are only compatible with this particular compositor.
it seems to me that separating window managers from compositors from the start would have created the expectation that all window managers work on all compositors. that is not and will not be the case.
if all compositors start separating out the window manager then the result would be that as a user i now have to choose two components, a compositor and a window manager, and i have to make sure they are compatible.
Sure, X11 has warts but I can make it do basically anything I want. Wayland seems like it will always have too much friction to ever consider switching.
Furthermore, getting stuff like VRR on Wayland working is way easier than X.org. And, Wayland also supports HDR.
The migration was a one way thing. Lots of things are smoother and simpler, and not having to ever again touch Xorg.conf has improved my quality of life.
To this day, I still have different monitors with different scale factors.
Unfortunately you will never find yourself in a situation to actually use a mixed dpi X11 setup (you lose your homogeneous desktop) and Wayland is better at spoofing it (for whatever reason fractional scaling works better in Wayland).
http://wok.oblomov.eu/tecnologia/mixed-dpi-x11/
My favorite quote from that writeup.
"If you think this idea is a bit stupid, shed a tear for the future of the display servers: this same mechanism is essentially how Wayland compositors —Wayland being the purported future replacement for X— cope with mixed-DPI setups."
Doing some basic mathematics and xrandr command-line wizardry to apply scaling factors to each display, I was able to get Xorg to render to a virtual framebuffer and treat the monitors as appropriately scaled windows onto it, so that dragging applications from one screen to the other didn't result in any noticeable change in size.
Worked pretty well.
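For anyone wanting to reproduce the arithmetic, here is a minimal sketch of one way to do it, not necessarily the exact commands used above. It assumes outputs are laid out left to right and everything is rendered at the highest DPI; the output names and DPI values are made up, and only the `--output`, `--scale`, `--pos` and `--fb` options of xrandr are relied on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    name: str    # hypothetical output name, as listed by `xrandr`
    width: int   # native mode width in pixels
    height: int  # native mode height in pixels
    dpi: float   # physical pixel density


def plan_layout(outputs: list[Output]) -> tuple[list[str], tuple[int, int]]:
    """Return per-output xrandr arguments and the virtual framebuffer size.

    Each output gets scale = target_dpi / its_dpi, so a low-DPI monitor
    shows a proportionally larger slice of the shared framebuffer.
    """
    target = max(o.dpi for o in outputs)
    args, x, fb_h = [], 0, 0
    for o in outputs:
        s = target / o.dpi
        part_w, part_h = round(o.width * s), round(o.height * s)
        args.append(f"--output {o.name} --scale {s:g}x{s:g} --pos {x}x0")
        x += part_w
        fb_h = max(fb_h, part_h)
    return args, (x, fb_h)


args, fb = plan_layout([
    Output("eDP-1", 3840, 2160, 192.0),
    Output("HDMI-1", 1920, 1080, 96.0),
])
command = f"xrandr --fb {fb[0]}x{fb[1]} " + " ".join(args)
```

The script only builds the command string; whether a given driver handles the resulting framebuffer size well is a separate question.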
I haven't touched xorg.conf in decades. I suppose you might have to do it to configure some unique setup, but for me this hasn't been an issue in a long time.
Now with Wayland, instead of having to touch a single config file, we have to learn how each compositor/WM is configured, and do it there instead. It hardly seems like an improvement in that regard, IMO.
There's a type of input called "DeviceEvent" which is a bit lower level than "Window event". It also occurs even if the window isn't "active".
Windows and X11 support this, but Wayland doesn't except for mouse movement. I noticed my program stopped working on Linux after I updated it. Ended up switching to Window Events, but still kind of irritating.
Meanwhile if you have root you're still free to do so directly.
How does this make any difference if they're going to connect to the same IPC that handles input/display?
The display server must absolutely enforce some kind of security boundary between clients. Clients that are running untrusted code (e.g. a web browser) must not be able to be hijacked into controlling a potentially privileged client (e.g. a root terminal).
X11 can't do high refresh rates every time that I've tried to do so.
That’s not a reason to do it of course, for me the driver was support for multiple monitors with different scaling requirements.
But obviously be pragmatic, if it doesn’t work for you because you have a particular requirement, or even if it doesn’t offer any improvement over what you have then don’t use it.
Granted, my requirements were simple, a laptop and occasionally one external monitor, though the issues I did run into were related to graphics drivers and NVIDIA shenanigans, but not to X11.
Now that I'm on Wayland, I do feel that visuals are slightly more responsive and crisper, but honestly, it wasn't worth replacing a bunch of my programs, significantly altering my workflow, and dealing with numerous new issues I didn't have to deal with before.
Unfortunately, the momentum is now fully with Wayland, and it's only a matter of time until X11 stops being supported altogether. The XLibre project is a noble idea, but a few contributors can't maintain an entire ecosystem on their own.
It will take years to reach the feature set of X11. And for what? From my perspective as both a developer and a user, I see no tangible benefits.
On top of that, it breaks software I rely on professionally. Code that worked perfectly fine under X11.
Meanwhile, I can still build and run Windows programs I wrote 30 years ago on Windows 11.
I am not saying all of the design is corporate-controlled. But a ton of propaganda is associated with how wayland was advertised, until some folks had enough with it and decided to stop buying the "xorg is dead" routine these corporations push on:
https://github.com/X11Libre/xserver
It will be interesting to see what will happen though. The GTK devs said they will help kill off xorg with GTK5. KDE also wants to kill xserver. It would be kind of cool if that would not happen - imagine if a non-corporate controlled ecosystem would emerge. Not likely to happen, but it would be a lot of fun. As well as more real competition with wayland. Wayland broke its biggest promise: that it is a viable alternative to the xorg-server. I don't want to lose any feature, so it is a drawback for me.
But already a majority of Linux desktop users have stopped using it. And it will be 90% in the next 2-3 years. GNOME, KDE, Budgie, and COSMIC are effectively Wayland only now and XFCE and Cinnamon will be Wayland native before then.
The GTK5 devs do not have to “kill off” X11. But it is not really worth their time either.
Keep using Xorg, or Xlibre, or Phoenix. It should keep working.
But don’t mind if the rest of us keep building on Wayland.
By the way, I use Niri, a very cool and absolutely non-corporate Wayland compositor. Not sure how that fits into your narrative.
I'm not anti-Wayland and I think X11 has enough issues that it's worth transitioning over to something better but this is a critical weakness in Wayland's design.
Or build on River as this article suggests.
https://github.com/X11Libre/xserver
I think x11libre specifically is a little suspicious though. There's nothing wrong with the xorg packages I use today that requires a fork.
I don't really get why it would be a good idea to somehow mandate a specific architecture design from the standard.
Every "flexible" API turns into a leaky mess unless someone is paid to write the dullest test suite in existence, and nobody is. Mandating one design is ugly, but pretending composition is free is a fairy tale.
https://news.ycombinator.com/newsguidelines.html
Wayland in general had a rather cavalier approach to doing away with things that X users take for granted, like, well, making screenshots. Eventually, under pressure, those in charge agreed that these features are actually very important for real users, so implementations appeared. It's an understandable way to discover the minimal usable subset of features, but the process of it is a bit frustrating for the early adopters.
Indeed - implementations, plural. Incompatible with each other, naturally.
As a concrete example, Emacs' EXWM package works by implementing an X11 client library in Emacs Lisp, then using it to talk to the X server (which is a separate process, so this works fine) and telling it how to position windows.
Whereas on Wayland, this is not possible without re-implementing a standalone compositor process, because otherwise architecturally it doesn't work. Emacs can't both do the drawing and be drawn.
So it is architecturally possible (but infeasible in plain Emacs Lisp).
For river (the thing this article is about) I wrote an Emacs WM, but also opted for a dynamic module for the Wayland protocol parts: https://code.tvl.fyi/tree/tools/emacs-pkgs/reka
This one could technically be written in plain Emacs Lisp, but I'm happy to use something that already has all the XML codegen stuff for Wayland figured out. Dynamic modules work pretty well, fwiw.
It's not easy and the major compositors (Gnome, KDE) are NOT wlroots based, making this point mostly moot anyway.
This protocol at least has a chance of using a custom WM with an advanced compositor (which wlroots is not).
I'm glad River is trying to create a bigger base here; this is way cool. And it sort of proves the value of Wayland: someone can just go do that. Someone can just make a generic compositor/display-server now, with their own new architecture and plugin system, and it'll just work with existing apps.
We were so locked in to such a narrow limited system, with its own parallel abstraction layer to what the kernel now offers (that didn't exist when X was created). It's amazing that we have a chance for innovation and improvement now. The kernel as a stable base of the pyramid, wlroots/sway as a next layer up, and now River as a higher layer still for folks to experiment and create with. This could not be going better, and there's so much more freedom and possibility; this is such a great engine for iteration and improvement.
I wonder if there’s space for a project like xlibre (or x.org, if it were revived) to update the x11 protocol to fill whatever gap compositors were meant to fill.
For what it’s worth, I’ve been moving all my machines to lxde.
Apparently, I accidentally switched back to a compositor free desktop without noticing. High framerate, vsync/tear-free and high dpi work fine. So does fractional scaling, but I disable it.
Personally, I’d rather these hypothetical x11 devs focused on reverse engineering hdmi vrr (blocked by lawyers at the moment), and HDR / expanded color spaces.
Any gotchas or regrets?
Haven't used it in many years and now considering going all in on making it (scaffolding of) next DE. So looking at the same move.
note that compiz is also a windowmanager, so already then compositor and window manager were one unit.
The X windows paradigm was fine, and still works great with modern hardware.
However if you really really really wanna side step this you can look at keyd - https://github.com/rvaiya/keyd
A project that has a daemon run in the background as a root service and that can provide an appropriate shim to pass key strokes to anything you want.
And just to be clear the appropriate secure model is to have a program request to register a "global" hot key and then the compositor passes it to the appropriate program once registered. This is already a thing in KDE Plasma 6 and works just fine.
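A toy sketch of that register-then-dispatch model, to show why it is safer than letting every client see every keystroke. This is not KDE's API or any real portal interface; `HotkeyRegistry` and its methods are hypothetical names.

```python
from typing import Callable, Dict


class HotkeyRegistry:
    """Toy stand-in for the compositor side of a global-shortcut protocol."""

    def __init__(self) -> None:
        self._bindings: Dict[str, Callable[[], None]] = {}

    def register(self, combo: str, callback: Callable[[], None]) -> bool:
        """Grant the binding only if no other client already holds it."""
        if combo in self._bindings:
            return False
        self._bindings[combo] = callback
        return True

    def dispatch(self, combo: str) -> bool:
        """Called for every key combo; only registered combos reach a client."""
        callback = self._bindings.get(combo)
        if callback is None:
            return False
        callback()
        return True


fired = []
registry = HotkeyRegistry()
first = registry.register("Ctrl+Alt+M", lambda: fired.append("mute"))
second = registry.register("Ctrl+Alt+M", lambda: fired.append("other"))
handled = registry.dispatch("Ctrl+Alt+M")
ignored = registry.dispatch("Ctrl+C")
```

The client never observes unregistered input such as the Ctrl+C above; it only gets a callback for the combo it explicitly asked for.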
That's kind of a big sticking point. When GNOME, KDE, and eg. Sway all have different screenshot APIs, the (eco)system doesn't work.
In effect the modern desktop is wayland (window communication), pipewire (audio/video), and xdg-desktop-portal (compositor/environment requests) which all kinda have to be worked with for a desktop application.
I've been heard!
Separating the compositor and window manager feels like one of those ideas that seems obvious in hindsight, but the protocol/state-machine design here shows why it took real work to make it practical.
Lowering the barrier for writing Wayland window managers without forcing everyone to build a full compositor seems like a big win.
I've been working in the tech field for a while, but I'm new to HN. I'd never explored the platform in depth before, and recently decided to start participating and interacting with people here.
The discussions and knowledge shared here have already been very valuable to my own learning. So I hope to contribute to the community in the same way... but I felt I needed to be more active in the community before that.
If anyone is curious or still has questions, I can also share my LinkedIn or GitHub.
i'm a little thrown, because the Wayland diagram doesn't feel quite right. the compositor does lie between the kernel and the apps, but IIRC the apps have their own graphics buffers from the kernel that they are drawing into directly. the compositor then composites them together. to me, that feels more like the kernel is at the center of the diagram here: the wayland compositor is between the kernel and the output / input.
i don't think it has a huge impact on the discussion here. but this is such a key difference versus X, that i think is hugely under-told: Wayland compositors all rely on lots of kernel facilities to do the job, whereas X is basically its own kernel, has origins where it effectively was the device driver for the gpu, talking to it over pci, and doing just about everything. when people contrast wayland versus X as wayland compositors needing to do so much, i can't help but chuckle, because it feels like the kernel does >50% of what X used to have to do itself; it's a much simpler world, using the kernel's built-in abstractions, rather than being multiple stacked layers of abstractions (kernels + X's own).
it means that the task of writing the display-server / compositor is much much much simpler. it's still hard! but the kernel is helping so much. there's an assumed base of having working GPU drivers!
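a rough sketch of the "clients draw into their own buffers, compositor combines them" model described above. real compositors work with shared-memory or GPU buffers and hand the result to the kernel's display pipeline; this toy uses plain lists of pixel values, and `composite` is a hypothetical name.

```python
def composite(width, height, surfaces, background=0):
    """Blit client buffers onto one output frame.

    surfaces: list of (x, y, buffer) in bottom-to-top stacking order,
    where buffer is a list of rows of pixel values owned by the client.
    """
    frame = [[background] * width for _ in range(height)]
    for x0, y0, buf in surfaces:
        for dy, row in enumerate(buf):
            for dx, pixel in enumerate(row):
                x, y = x0 + dx, y0 + dy
                # Clip anything that falls outside the output.
                if 0 <= x < width and 0 <= y < height:
                    frame[y][x] = pixel
    return frame


terminal = [[1, 1], [1, 1]]  # a 2x2 client buffer
popup = [[2, 2]]             # a 2x1 client buffer stacked on top
frame = composite(4, 2, [(0, 0, terminal), (1, 0, popup)])
```

the compositor never needs to know how each client produced its pixels, only where each buffer goes and in what order.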
author appears to super know their stuff. alas the FOSDEM video they link to is not loading for me. :(
one major question, since this is a protocol, how viable is it to decompose the window management tasks? rather than have a monolithic window manager, does this facilitate multiple different programs working together to run a desktop? not entirely sure the use case, but a more pluggable desktop would be interesting!
It's also possible to use hardware planes to get the actual graphics device to composite for you directly from its video memory, effectively reducing latency to the lowest possible.
Are you an AI bot? Modern X11 servers using DRM are more than 20 years old. You are talking about how X11 servers worked in the 90's.
To the previous a-hole, frak you: not an AI. That's rude as frak. Also, you manage to be incredibly wrong. Even an AI wouldn't overlook such an obvious error; maybe it'd be better to have it replace you. So rude dude! Behave!
It's vastly deeper than what Wayland does.
I switched to niri a few months ago, and while I like it for the most part, it feels too... busy for my taste. It defaults to a bunch of animations and decorations, all of which I've turned off. I'm happy with my current setup (aside from Wayland quirks[1]), but river's design and simplicity are very appealing. It reminds me of the philosophy of bspwm/sxhkd which I used for years on X11.
I do need scrollable tiling now that I've tried it, and I'm happy that there are a couple of options to choose from with river.
[1]: Seriously, why does copy/pasting sometimes simply not work?? I often have to copy twice for it to succeed. It's not related to Xwayland -> Wayland apps, and vice versa, or with copying from closed windows, etc. I don't use nor want a clipboard "manager". I just want my clipboard to work consistently. I've read many reports of this same bug on different distros and DEs, and nobody has figured it out. It's infuriating that such a basic feature is half-broken in a project that is 17 years old now!
I think that's pretty unlikely. The smaller compositors actually collaborate fairly well, and if sway, hyprland, niri, KDE etc. decide to implement this, I think they will probably work with river to create a standardized protocol that works across compositors. That has happened before. Hyprland is maybe more likely to do their own thing than the others, but if a standardized protocol caught on I think they would follow that.
Gnome though... I don't think there is a great chance they implement something like this even if several other compositors implement a standardized protocol for it.