Xwayland

Last week I wrote about Wayland in 3.12 and promised that I’d be writing again soon. I honestly didn’t expect it to be so soon!

But first, a quick notice. Some people let me know that they were having issues with running Wayland on top of F20 with the GNOME 3.12 COPR. I’ve been testing on rawhide, and since it worked fine for me, I thought the same would be true for the GNOME 3.12 COPR. It seems this isn’t the case. I tried last night to get a system to test with and failed. I’m going to continue to investigate, but I first have to get a system up and running to test with. That may take some time.

Sorry that this happened. I know about it, though, and I’ll get to the bottom of this one way or another. And hey, maybe it will be magically solved by…

A new Xwayland

Last night something very, very exciting happened. Xwayland landed in the X server. I’m super thrilled to see this land; I honestly thought it would be at least another year before we’d see it upstream. Keep in mind, it’s been in the works for three years now.

So, why did it land so quickly? To put it simply, Xwayland has been completely rearchitected to be leaner, cleaner, faster, and better than ever before. It’s not done yet; direct rendering (e.g. games using OpenGL) and, by extension, 2D acceleration aren’t supported yet, but they’re in the pipeline.

I also talked about this somewhat in the last blog post, and in The Linux Graphics Stack, but since it’s the result of a fairly recent development, let’s dive in.

The new architecture

Traditionally, the Xorg stack, even within the FOSS graphics stack, has had a number of different moving parts.

The X server codebase is large, but it’s somewhat decently structured. It houses several different X servers for different purposes. The one you’re used to, the one you log into, is Xorg, and it lives in the hw/xfree86 directory. It’s named like that for legacy reasons. There’s also Xnest and Xephyr, which implement a nested testing environment. Then there are the platform adaptations like hw/xwin and xquartz, Win32 and OS X servers designed to be seamless: the X11 windows that pop up look and behave like any other window on your system.

There’s plenty of code that can be shared across all the different servers. If somebody presses a key on their keyboard, the code to calculate the key press event, do keysym translation, and then send it to the right application should be shared between all the different servers. And it is. This code lives in a part of the source tree called Device-Independent X, or “DIX” for short. A lot of the common functionality related to implementing the protocol is done in here.

The different servers, conversely, are named “Device-Dependent X”s, or “DDX”en, and that’s what the hw/ directory path means. They hook into the DIX layer by installing various function pointers in different structs, and by exporting various public functions. The architecture isn’t 100% clean; there are mistakes here and there, but for a codebase that’s almost 30 years old, it’s fairly modern.
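To make that concrete, here’s a rough, illustrative sketch of the classic “wrapping” pattern a DDX (or driver) uses to hook into DIX. The type and field names (ScreenPtr, CreateWindow and friends) come from the X server tree, but this isn’t code from any particular DDX, and the headers are server-internal, so it only builds inside the server tree; treat it as a paraphrase of the idiom rather than real driver code.

    /* Illustrative only: the "wrap the screen hook" idiom. At screen init
     * we stash DIX's CreateWindow hook and install our own; our version
     * does its device-dependent bookkeeping and chains to the original. */
    #include <scrnintstr.h>   /* ScreenRec and the *ProcPtr typedefs */
    #include <windowstr.h>    /* WindowRec */

    static CreateWindowProcPtr saved_create_window;

    static Bool
    my_create_window(WindowPtr window)
    {
        ScreenPtr screen = window->drawable.pScreen;
        Bool ret;

        /* Unwrap and chain to whoever was there before us... */
        screen->CreateWindow = saved_create_window;
        ret = (*screen->CreateWindow)(window);

        /* ...do our device-dependent work here... */

        /* ...then wrap again so we keep seeing future calls. */
        saved_create_window = screen->CreateWindow;
        screen->CreateWindow = my_create_window;
        return ret;
    }

    static Bool
    my_screen_init(ScreenPtr screen, int argc, char **argv)
    {
        /* Hook installation: remember DIX's hook, put ours in its place. */
        saved_create_window = screen->CreateWindow;
        screen->CreateWindow = my_create_window;
        return TRUE;
    }

Variations of this pattern show up all over the servers, the acceleration code, and the video drivers.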

Since Xorg is the server most users have been running, it’s the biggest and most active DDX codebase by far. It has a large module system that loads hardware-specific video and input drivers into it. Input is a whole other topic, so let’s just talk about video drivers today. These video drivers have names like xf86-video-intel, and they plug directly into Xorg in the same way: they install function pointers in various structs that override default functionality with something hardware-specific.

(Sometimes we call the xf86- drivers themselves the “DDX”en. Technically, these are the parts of the Xorg codebase that actually deal with device-dependent things. But really, the nomenclature is just there as a shorthand. Most of us work on the Xorg server, not on Xwin, so we say “DDX” instead of “xf86 video driver”, because we’re lazy. To be correct, though, the DDX is the server binary, e.g. Xorg, and its corresponding directory, e.g. hw/xfree86.)

What do these video drivers actually do? They have two main responsibilities: managing modesetting and doing accelerated rendering.

Modesetting is the responsibility of configuring each monitor and setting the buffer it displays. This is one of those things you would think would have been simple and standardized a long time ago, but for a few reasons that never happened. The only two standards here are the VESA BIOS Extensions and its replacement, the UEFI Graphics Output Protocol. Unfortunately, neither is powerful enough for the features we need to build a competitive display server, like an event for when the monitor has vblanked, or flexible support for hardware overlays. Instead, we have a set of hardware-specific implementations in the kernel, along with a common userspace API. This is known as “KMS”.

The first of those responsibilities has now been lifted out of the video driver. We can simply use KMS as a hardware-independent modesetting API. It isn’t perfect, of course, but it’s usable. This is what the xf86-video-modesetting driver does, for instance, and you can get a somewhat-credible X server up and running that way.
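If you’re curious what “just use KMS” looks like from userspace, here’s a minimal, heavily simplified sketch using libdrm. It hard-codes the first connector and CRTC, skips all error handling, and uses a dumb (CPU-rendered) buffer, so treat it as an outline of the API rather than something a real display server would ship.

    /* Minimal KMS sketch: pick a connector, allocate a scanout buffer,
     * and do a modeset with libdrm. No error handling whatsoever. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    int main(void)
    {
        int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);

        /* Ask the kernel what connectors (monitors) and CRTCs it has. */
        drmModeRes *res = drmModeGetResources(fd);
        drmModeConnector *conn = drmModeGetConnector(fd, res->connectors[0]);
        drmModeModeInfo mode = conn->modes[0];   /* usually the preferred mode */

        /* Create a dumb buffer big enough for that mode and wrap it in a
         * framebuffer object the kernel can scan out. */
        struct drm_mode_create_dumb create = {
            .width = mode.hdisplay,
            .height = mode.vdisplay,
            .bpp = 32,
        };
        drmIoctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);

        uint32_t fb_id;
        drmModeAddFB(fd, mode.hdisplay, mode.vdisplay, 24, 32,
                     create.pitch, create.handle, &fb_id);

        /* The actual "modeset": light up the connector with our buffer. */
        drmModeSetCrtc(fd, res->crtcs[0], fb_id, 0, 0,
                       &conn->connector_id, 1, &mode);
        return 0;
    }

Build it against libdrm (pkg-config --cflags --libs libdrm) and run it from a bare VT, since you need to be DRM master; you should see the monitor switch to the new buffer, which will be black because we never draw into it.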

So now we have a pixel buffer being displayed on the monitor. How do we get the pixels into the pixel buffer? While we could do this with software rendering with a library like pixman or cairo, it’s a lot better if we can use the GPU to its fullest extent.

Unfortunately, there’s no industry standard API for accelerated 2D graphics, and there likely never will be. There are plenty of options: in the web stack we have Flash, CSS, SVG, VML, <canvas>, PostScript, and PDF. On the desktop side we have GDI, Direct2D, Quartz 2D, cairo, Skia, AGG, and plenty more. The one attempt at a hardware-accelerated 2D rendering standard, OpenVG, ended in disaster. NVIDIA is pushing a more flexible approach that integrates better with 3D geometry: NV_path_rendering.

Because of the lack of an industry standard, we created our own: the X RENDER extension. It supplies a set of high-level 2D rendering operations to applications. Video drivers often implement these with hardware fast paths. Whenever you hear talk about EXA, UXA or SNA, this is all they’re talking about: complex, sophisticated implementations of RENDER.
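For a sense of what RENDER looks like from the application side, here’s a small libXrender sketch that composites a translucent rectangle into a window with PictOpOver. It skips proper event handling (a real program would wait for Expose before drawing), but whatever ends up executing this request on the GPU is the driver’s RENDER implementation described above.

    /* A tiny taste of RENDER from the client side.
     * Build with: cc render-demo.c -lX11 -lXrender */
    #include <unistd.h>
    #include <X11/Xlib.h>
    #include <X11/extensions/Xrender.h>

    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                         0, 0, 200, 200, 0, 0, 0xffffff);
        XMapWindow(dpy, win);
        XSync(dpy, False);   /* crude; a real program waits for Expose */

        /* Wrap the window in a RENDER Picture we can composite into. */
        XRenderPictFormat *fmt =
            XRenderFindVisualFormat(dpy, DefaultVisual(dpy, DefaultScreen(dpy)));
        Picture dst = XRenderCreatePicture(dpy, win, fmt, 0, NULL);

        /* RENDER colors are 16 bits per channel and premultiplied:
         * this is fully saturated red at 50% alpha. */
        XRenderColor red = { 0x8000, 0x0000, 0x0000, 0x8000 };
        XRenderFillRectangle(dpy, PictOpOver, dst, &red, 20, 20, 160, 160);

        XFlush(dpy);
        sleep(5);
        return 0;
    }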

As we get newer and newer hardware up and running under Linux, and as CPUs get faster and faster, it’s getting less important to write a fast RENDER implementation in your custom video driver.

We also do have an industry standard for generic hardware-accelerated rendering: OpenGL. Wouldn’t it be nice if we could take the hardware-accelerated OpenGL stack we created, and use that to create a credible RENDER implementation? And that’s exactly what the glamor project is about: an accelerated RENDER implementation that works for any piece of hardware, simply by hoisting it on top of OpenGL.

So now, the two responsibilities of an X video driver have been moved to other places in the stack. Modesetting has been moved into the kernel with KMS. Accelerated 2D rendering has been pushed onto the 3D stack we already have in place anyway. Both of these are reusable components that don’t need custom drivers and that we can reuse in Wayland and Xwayland. And that’s exactly what we’re going to do.

So, let’s add another DDX to our list. We arrive at hw/xwayland. This acts like Xwin or Xquartz by proxying all windows through the Wayland protocol. It’s almost impressive how small the code is now. Seriously, compare that to hw/xfree86.
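From the compositor’s point of view, Xwayland is just another Wayland client. As a hedged illustration of what that client side involves, here’s a minimal libwayland-client program that connects to the compositor and looks up the wl_compositor global, the interface a client uses to create a wl_surface, which is what Xwayland does for each X11 window it proxies. (This is standard Wayland client boilerplate, not code from hw/xwayland itself.)

    /* Connect to the Wayland compositor and discover its globals. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <wayland-client.h>

    static struct wl_compositor *compositor;

    static void handle_global(void *data, struct wl_registry *registry,
                              uint32_t name, const char *interface,
                              uint32_t version)
    {
        /* wl_compositor is what lets a client create surfaces. */
        if (strcmp(interface, "wl_compositor") == 0)
            compositor = wl_registry_bind(registry, name,
                                          &wl_compositor_interface, 1);
    }

    static void handle_global_remove(void *data, struct wl_registry *registry,
                                     uint32_t name)
    {
    }

    static const struct wl_registry_listener registry_listener = {
        handle_global,
        handle_global_remove,
    };

    int main(void)
    {
        struct wl_display *display = wl_display_connect(NULL);
        struct wl_registry *registry = wl_display_get_registry(display);

        wl_registry_add_listener(registry, &registry_listener, NULL);
        wl_display_roundtrip(display);   /* wait for the initial globals */

        printf("connected; wl_compositor %s\n",
               compositor ? "found" : "missing");

        wl_display_disconnect(display);
        return 0;
    }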

It’s also faster and leaner. A large part of Xorg is code for running directly on the hardware: reading events from raw input devices, modesetting, VT switching and handling. The old Xwayland played tricks with the Xorg codebase to try and get it to stop doing those things. Now we have a simple, clean way to get it to stop doing those things: never run that code in the first place!

In the old model, things like modesetting were unfortunately done in the video driver, so we simply patched Xorg with a special magical mode that told video drivers not to do anything too tricky. For instance, the xf86-video-intel driver had a special branch for Xwayland support. For generic hardware support, we wrote a generic, unaccelerated driver that stubbed out most of the functions we needed. With the new approach, we don’t need to patch anything at all.

Unfortunately, there are some gaps in this plan. James Jones from NVIDIA recently let us know they were expecting to use their video driver in Xwayland for backwards compatibility with legacy applications. A few of us had a private chat afterwards about how to move forward here. We’re still forming a plan, and I promise I’ll tell you about it when it’s more solidified. It’s exciting to hear that NVIDIA is on board!

And while I can’t imagine that custom xf86-video-* drivers are ever going to go away completely, I think it’s plausible that the xf86-video-modesetting driver could gain glamor support, with the rest of the FOSS DDXen dying out in its favor.

OK, so what does this mean for me as a user?

The new version of Xwayland is hardware-independent. Previously in Fedora, we only built xf86-video-intel with Wayland support. While there was a generic video driver, xf86-video-wayland, we never built it in Fedora, and that meant you couldn’t try out Wayland on non-Intel GPUs. This was a Fedora packaging bug, not a fundamental issue with Wayland, nor a GNOME bug as I’ve seen some try to claim.

It is true, however, that we mostly test on Intel graphics. Most engineers I know of that develop GNOME run on Lenovo ThinkPads, and those tend to have Intel chips inside.

Now, this is all fixed, and Xwayland can work on all hardware regardless of which video drivers are built or installed. And for the record, the xf86-video-wayland driver is now considered legacy. We hope to ship these as updates in the F20 COPR, but we’re still working out the logistics of packaging all this.

I’m still working hard on Wayland support everywhere I go, and I’m not going to slow down. Questions, comments, everything welcome. I hope the next update can come as quickly!