Wayland 1.5 is released. It’s a pretty exciting release, with plenty of features, but the most exciting thing about it is that we can begin work on Wayland 1.6!

… No, I’m serious. Wayland 1.6’s release schedule matches up pretty well with GNOME’s. Wayland 1.6 will be released in the weeks leading up to GNOME 3.14, the first version of GNOME with full Wayland support out of the box.

Since development is opening again, we can resume work on xdg-shell, the new desktop shell protocol to replace wl_shell. Kristian Hoegsberg and I have been prototyping and implementing this in toolkits and Wayland compositors. We’re extremely happy with our current revision of the bare-bones protocol, so it’s at this point that we want to start evangelizing and reaching out to other communities to make sure that everybody can use it. We’ve been working closely with and taking input from the Wayland community. That means that we’ve been working with the Qt/KDE and Enlightenment/EFL Wayland teams, but anybody who isn’t paying close attention to the Wayland community is out of the loop. This needs to change.

Ironically, as the main Wayland developer for GNOME, I haven’t talked too much about the Wayland protocol. My only two posts on Wayland were a user post about the exciting new features, and one about the legacy X11 backwards compatibility mode, XWayland.

Let’s start with a crash course in Wayland protocols.


As odd as it sounds, Wayland doesn’t have a built-in way to get something like a desktop window system, with draggable, resizable windows. As a next-generation display server, Wayland’s protocol is meant to be a bit more generic than that. Wayland can already be found on mobile devices as part of SailfishOS through the hard work of Jolla and other companies. Engineers at Toyota and Jaguar/Land Rover use Wayland for media centers in cars, as part of a custom Linux distribution called GENIVI. I’m also told that LG’s webOS, as used in its smart TVs, is investigating Wayland for a display server as well. Dragging and resizing tiny windows on a phone, or inside a car, or on a TV just isn’t going to be a great experience. Wayland was designed, from the start, to be flexible enough to support a wide variety of use cases.

However, that doesn’t mean that Wayland is all custom protocols: there’s a common denominator between all of these cases. Wayland has a core protocol object called a wl_surface, on which clients can show some pixels for output and from which they can receive various kinds of input. This is similar to the concept of X11’s “windows”, which I explain in Xplain. However, the wl_surface isn’t simply a subregion of the overall front buffer. Instead of owning parts of the screen, Wayland clients create their own pixel buffers, draw to them, and then “attach” them to the wl_surface, causing a new pixel buffer to be displayed. The wl_surface concept is fairly versatile, and is used any time we need a “live surface” to play around with. For instance, the mouse cursor is done simply by providing the Wayland compositor with a wl_surface. The same thing is done for drag-and-drop icons as well.
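To make the attach model a little more concrete, here is a toy sketch in Python. It is not the real libwayland API: the Surface and Buffer classes and the solid_buffer helper are hypothetical names made up for illustration. In the real protocol the corresponding requests are wl_surface.attach and wl_surface.commit, and an attached buffer only takes effect once the surface is committed, which is what the sketch tries to mimic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    """Client-owned pixels; hypothetical stand-in for a wl_buffer."""
    width: int
    height: int
    pixels: bytes


class Surface:
    """Toy stand-in for a wl_surface: pending state, applied on commit."""

    def __init__(self) -> None:
        self.pending: Optional[Buffer] = None
        self.current: Optional[Buffer] = None  # what the compositor shows

    def attach(self, buffer: Buffer) -> None:
        # Attaching only records the buffer; nothing is displayed yet.
        self.pending = buffer

    def commit(self) -> None:
        # Commit makes the pending buffer the displayed one in a single step.
        if self.pending is not None:
            self.current = self.pending
            self.pending = None


def solid_buffer(width: int, height: int, rgba: bytes) -> Buffer:
    """Draw a solid color into a fresh client-side buffer (4 bytes per pixel)."""
    return Buffer(width, height, rgba * (width * height))


# The client draws into its own memory, then hands the result over.
surface = Surface()
surface.attach(solid_buffer(2, 2, b"\xff\x00\x00\xff"))
surface.commit()
```

The point of the sketch is that the client never touches the screen itself: it only ever fills its own buffers and swaps which one the surface shows.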

An interesting aside is that the model taken by Wayland with wl_surface can actually require fewer copies and be more efficient than X11 with modern systems. More and more GPUs have more interesting and fancy hardware at scanout time. With the rise of low-power phones that require rich graphics, we’re seeing a resurgence in fixed-function alpha blending and compositing hardware when doing scanout, similar to what game consoles like the NES and SNES had (but they called them “sprites”). X11’s model of a giant front buffer that apps draw to means that we must eventually copy all contents to the front buffer from the CPU, while Wayland’s model means that applications can simply hand us their pixel buffers, and we can choose to show them as overlays, which removes any copy. And if an application is full-screen, we can simply tell the GPU to scan out from that application’s buffer directly, instead of having to copy.
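The choice described above can be sketched as a small decision function. This is a deliberately simplified model, not how any particular compositor is written: the Window class, choose_path, plan_frame, and the idea of counting free overlay planes are all assumptions for illustration, and a real compositor also has to check things like buffer formats and what the hardware planes can actually do.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Window:
    """Hypothetical client window with a buffer ready to display."""
    name: str
    fullscreen: bool = False


def choose_path(window: Window, free_overlay_planes: int) -> str:
    """Pick how a client buffer reaches the screen, cheapest option first."""
    if window.fullscreen:
        return "direct scanout"   # GPU reads the client's buffer as-is
    if free_overlay_planes > 0:
        return "overlay plane"    # hardware blends it at scanout time
    return "composite"            # fall back to copying into a shared buffer


def plan_frame(windows: List[Window], overlay_planes: int) -> Dict[str, str]:
    """Assign a path to every window, using up overlay planes as we go."""
    plan = {}
    for window in windows:
        path = choose_path(window, overlay_planes)
        if path == "overlay plane":
            overlay_planes -= 1
        plan[window.name] = path
    return plan


plan = plan_frame([Window("video"), Window("terminal")], overlay_planes=1)
```

Only the last branch involves a copy. With X11’s single front buffer, every window effectively takes that branch.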


OK, so I’ve talked about wl_surface. How does this relate to xdg-shell? Since a wl_surface can be used for lots of different purposes, like cursors, simply creating the wl_surface and attaching a buffer doesn’t put it on the screen. Instead, we first need to let the Wayland compositor know that this wl_surface is intended to be a desktop-style window that can be dragged around and resized. It should appear in Alt-Tab, and clicking on it should give it keyboard focus, etc.

Wayland’s approach here is a bit odd: to give a wl_surface a role, we construct a new wrapper object which has all of our desktop-level protocol functions, and then hand it the wl_surface. In this case, the protocol that we use to create this role is known as “xdg-shell”, and the wrapper object is known as an “xdg_surface”. The name is a reference to the FreeDesktop Group, an open mailing list where cross-desktop standards are discussed between all the different desktops. For historical reasons, it’s abbreviated as “XDG”, after the project’s original name, the X Desktop Group. Members from the XDG community have all been contributing to xdg-shell.
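Here is a toy model of the wrapper-object idea, again in Python rather than real Wayland code. The class names are made up for illustration; get_xdg_surface and set_title are named after requests in the xdg-shell protocol, but the bodies are only a sketch of the behaviour. The one-role-per-surface check reflects the rule that a wl_surface can only have a single role.

```python
class Surface:
    """Toy wl_surface: just pixels and input, no role yet."""

    def __init__(self) -> None:
        self.role = None


class XdgSurface:
    """Toy xdg_surface: wraps a Surface and carries desktop-level requests."""

    def __init__(self, surface: Surface) -> None:
        if surface.role is not None:
            # A surface that is already a cursor, a window, etc. cannot be reused.
            raise ValueError("surface already has a role")
        surface.role = self
        self.surface = surface
        self.title = ""

    def set_title(self, title: str) -> None:
        # Desktop-only state lives on the wrapper, not on the bare surface.
        self.title = title


class Shell:
    """Toy xdg_shell global: the factory that hands out the window role."""

    def get_xdg_surface(self, surface: Surface) -> XdgSurface:
        return XdgSurface(surface)


shell = Shell()
window_surface = Surface()
window = shell.get_xdg_surface(window_surface)
window.set_title("Terminal")
```

Note that the bare Surface knows nothing about titles or Alt-Tab. Everything desktop-specific hangs off the wrapper, so a phone or car compositor can simply not offer that wrapper at all.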

The approach of a low-level structure with a high-level role is actually fairly similar to the approach taken in X11. X11 simply provides a data structure called a “window”, as I explained in Xplain: a tool that you can use to construct your interface by pushing pixels here, and getting input there. An external process called a “window manager” turns this window from a simple region of the front buffer into a window with a title and icon that the user can move around, resize, minimize and maximize with keyboard shortcuts and a taskbar. The window manager and the client applications both agree to cooperate and follow a series of complex standards like the ICCCM and EWMH that allow you to provide this “role”. Though I’ve never actually worked on any environments other than traditional desktops, I’d imagine that in more special-case environments, different protocols are used instead, and the concept of the window manager is completely dropped.

X11 has no easy, simple way of creating protocol extensions. Something as simple as a new request or a new event requires a bunch of byte-marshalling code in client libraries, extra support code added in the server and in toolkits, and a new set of APIs. It’s a pain, trust me. Instead, X11 does provide a generic way to send events to clients, and a series of key/value pairs on windows called “properties”, so standards like these often use the generic mechanisms rather than building an actual new protocol, since the effort is better spent elsewhere. It’s an unfortunate consequence of how X11 was developed.

Wayland makes it remarkably easy to create a new protocol extension involving new objects and custom methods. You write up a simple XML description of your protocol, and an automatic tool, wayland-scanner, generates server-side and client-side marshalling code for you. All that you need to do now is write the implementation side of things. On the client, that means creating the object and calling methods on it. Because it’s so easy to write custom extensions in Wayland, we haven’t even bothered creating a generic property or event mechanism. Giving everything an explicit structure buys us a lot more stability and rigidity.
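To give a feel for what such an XML description looks like, here is a small sketch that reads one with Python’s standard library and lists its requests and events. The protocol itself (example_notes, with an example_note interface) is entirely hypothetical and only follows the general shape of Wayland protocol files. The describe function is not wayland-scanner, which generates C marshalling code; this just walks the same kind of file.

```python
import xml.etree.ElementTree as ET

# Hypothetical protocol, written in the XML shape that wayland-scanner reads.
PROTOCOL_XML = """
<protocol name="example_notes">
  <interface name="example_note" version="1">
    <request name="set_text">
      <arg name="text" type="string"/>
    </request>
    <request name="destroy" type="destructor"/>
    <event name="dismissed"/>
  </interface>
</protocol>
"""


def signature(message: ET.Element) -> str:
    """Render one request or event as name(arg: type, ...)."""
    args = ", ".join(
        f"{arg.get('name')}: {arg.get('type')}" for arg in message.findall("arg")
    )
    return f"{message.get('name')}({args})"


def describe(xml_text: str) -> dict:
    """Map each interface name to its lists of requests and events."""
    root = ET.fromstring(xml_text)
    result = {}
    for interface in root.findall("interface"):
        result[interface.get("name")] = {
            "requests": [signature(m) for m in interface.findall("request")],
            "events": [signature(m) for m in interface.findall("event")],
        }
    return result


for name, messages in describe(PROTOCOL_XML).items():
    print(name, messages)
```

Requests flow from client to compositor and events flow back. Because both are declared with typed arguments up front, there is no need for a catch-all property bag like X11’s.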


Long-time users or developers of Wayland might notice this sounds similar to an older protocol known as wl_shell or wl_shell_surface. The intuition is correct: xdg-shell is a direct replacement for wl_shell. wl_shell_surface had a number of frustrating limitations, and due to its inclusion in the Wayland 1.0 core, it is hard to change and improve. As Fred Brooks told us, “plan to throw one away”.

xdg-shell solves a number of fundamental issues and race conditions in wl_shell_surface which I’d prefer not to go into here (but if you ask nicely in the comments, I might oblige!). You’ll have to trust me when I say that they were highly visible user-facing bugs and frustrations: weird lagginess and flickering when using Wayland. We’re happy that these are gone.

A call to arms

The last remaining ticket item I have to work on in xdg-shell is related to “window geometry”, as a way of communicating where the user’s concept of the edge of the window is. This requires significant reworking of the code in weston, mutter, and GTK+. After that, it will serve the needs of GTK+ and GNOME perfectly.
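As a rough illustration of why window geometry matters, assume a client that draws its own drop shadow into the same surface as the window contents, so the surface is bigger than what the user thinks of as the window. The sketch below is only that assumption worked through with arithmetic: Rect, window_geometry, and surface_position are hypothetical helpers, not part of any protocol or toolkit.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def window_geometry(buffer_width: int, buffer_height: int, shadow: int) -> Rect:
    """Rectangle, in surface coordinates, that the user sees as the window."""
    if shadow < 0 or 2 * shadow >= min(buffer_width, buffer_height):
        raise ValueError("shadow margin leaves no room for the window")
    return Rect(shadow, shadow,
                buffer_width - 2 * shadow, buffer_height - 2 * shadow)


def surface_position(geometry: Rect, target_x: int, target_y: int) -> Tuple[int, int]:
    """Where to put the surface so the visible edge lands at the target."""
    return (target_x - geometry.x, target_y - geometry.y)


# An 840x640 surface with a 20-pixel shadow on every side.
geometry = window_geometry(840, 640, shadow=20)
position = surface_position(geometry, 0, 0)
```

Without this information, a compositor that snaps a window to the screen edge would line up the shadow with the edge rather than the window frame itself.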

Does it serve the needs of your desktop? The last thing we want to do is to solidify the xdg-shell protocol, only to find that a few months later, it doesn’t quite work right for tiling WMs or for EFL or KDE applications. We’re always experimenting with things like this to make sure everything can work, but really, we’re only so many people, and others testing it out and porting their toolkits and compositors over can’t ever hurt.

So, any and all help is appreciated! If you have any feedback on xdg-shell so far, or need any help understanding this, feel free to post about it to the Wayland mailing list or poke me on IRC (#wayland on Freenode, my nick in there is Jasper).

As always, if anybody has any questions, no matter how dumb or stupid, please let me know! Comments are open, and I always try to reply.