You are viewing airlied

Back Viewing 10 - 20 Forward

Virgil is a research project I've been working on at Red Hat for a few months now and I think is ready for at least announcing upstream and seeing if there is any developer interest in the community in trying to help out.

The project is to create a 3D capable virtual GPU for qemu that can be used by Linux and eventually Windows guests to provide OpenGL/Direct3D support inside the guest. It uses an interface based on Gallium/TGSI along with virtio to communicate between guest and host, and it goal is to provided an OpenGL renderer along with a complete Linux driver stack for the guest.

The website is here with links to some videos:

some badly formatted Questions/Answers (I fail at github):

Just a note and I can't stress this strongly enough, this isn't end user ready, not even close, it isn't even bleeding edge user ready, or advanced tester usage ready, its not ready for distro packaging, there is no roadmap or commitment to finishing it. I don't need you to install it and run it on your machine and report bugs.

I'm announcing it because there maybe other developers interested or other companies interested and I'd like to allow them to get on board at the design/investigation stage, before I have to solidify the APIs etc. I also don't like single company projects and if I can announcing early can help avoid that then so be it!

If you are a developer interested in working on an open source virtual 3D GPU, or you work for a company who is interested in developing something in this area, then get in touch with me, but if you just want to kick the tyres, I don't have time for this yet.

So I've been involved in a recent dispute on the wayland project, with a person I'd classify as a poisonous person. Basically a contributor who was doing more damage than good, and was causing unneeded disturbances. I won't comment any further on that here, but just setting the scene for writing this.

So everytime something like this happens in a project, there emerges from the woodwork, people who claim that having public discussions about these sort of things is bad for open source, or makes us look like a crowd of juvenile developers, also how you never see this thing on closed sourced projects, or with open-source projects developer in-house and thrown over the wall. I've also recently seen this crop up when Linus flamed people, and everyone wondered why he didn't do it on some sort of private list or something.

Now I can only think these people are one of:

a) never worked in a company on a major closed source project.

b) if they have, its been top down development, where managers are telling them what to do, and maybe some architect dude has drawn a load of pretty pictures and docs. Of course the architect is never wrong, but its above your pay grade to talk to someone of such authority, so when you find problems with the architecture you hack around them instead of growing a pair and standing your ground, or else you aren't good enough to notice anything wrong.

I've seen plenty of companies where developers leave due to in-fighting or transfer to a different department, this stuff never comes out and you all are none the wiser.

So open source doesn't have top-down development, its all bottom up, most contributors to major projects do so with some ideas of what they want, but they aren't been driven by a management chain. However it means that there is generally nobody to force someone into their views, and when two people collide (or in this case, one person and everyone else), something has to give, and its best to give in public, so nobody can say it was some sort of cabal or closed decision.

Now open-source is about seeing the sausage making process, you get to see all the bits of stuff you don't want to think go into the sausages, you have to face a lot more truth, and you have to be willing to stand up against things without mummy manager to back you up. You can't have all the nice benefits of open-source development without having the bad side, the public blowups and discussion, it just can't work like that. If we take all those discussions to private lists or emails, where do you draw the line, are the people on that private list some sort of shadowy cabal overlords? Do you want an open-source development model that isn't public?

I'm sure people will say why can't we all just get along? and why can't everyone act mature? well a) we are human, b) there is no HR department frontend blocking the people at the gate, there's no interview process to weed out undesirable traits before they join the project. So when someone submits patches that work you generally accept them as a contributor, and it can take a while before you realise they are doing more harm than good, at which point its going to be public.

[update: Mir page removed most of the reasons Wayland wasn't suitable, so why did they not use wayland again?]
[update: still my opinion, really, nobody is making me say shit, lwn commenters really like to believe I've got a hand up my ass]

Okay I'm going to write a short piece on why I believe Mir isn't a good idea. If you don't know what mir is then don't bother reading the rest of this until you do.

So lets take a look at Mir from a cynical pov (I'm good at that): Say this is nothing more than a shallow power play by Canonical to try and control parts of the graphics infrastructure on Linux. It must be really frustrating to have poured so much money into a company and not have 100% control over all the code the company produces and have the upstream community continually ignore your "leadership". This would leave you wanting to exert control where you can and making decisions on what spaces you can do that in internally.

So in order to justify the requirement that Mir is required by the community at large above the current project in the space, Wayland, it is necessary to bash wayland in order that your community can learn the lines so they can repeat them right or wrong across the Internet. So you post a page like this
and a section called "Why Not Wayland / Weston?".

Now I've been reliably informed by people who know, that nothing in that section makes any sense for anyone who studied wayland for longer than 5 mins a year or two ago, especially the main points about the input handling. Nobody from Canonical has ever posted any questions to wayland mailing lists or contacted Wayland developers asking to support a different direction.

So maybe I'm being too cynical and Hanlon's razor applies, "Never attribute to malice that which is adequately explained by stupidity".

Now the question becomes do you want the display server that you are going to base the future of the Linux desktop and possible mobile spaces on a server written by people too stupid to understand the current open source project in the space?

The thing is putting stuff on the screen really isn't the hard part of display servers, getting input to where it needs to go is, and making it secure. Input methods are hard, input is hard, guess what they haven't even contemplated implementing yet?

Valve? NVIDIA? AMD? I'd be treading carefully :-)

(all my own opinion, not speaking for my employer or anyone really). Probably should comment on the g+ threads or lwn or somewhere cool.

So I took some time today to try and code up a thing I call reverse optimus.

Optimus laptops come in a lot of flavours, but one annoying one is where the LVDS/eDP panel is only connected to the Intel and the outputs are only connected to the nvidia GPU.

Under Windows, either the intel is rendering the compositor and the nvidia GPU is only used for offloads (when no monitors are plugged in), but when a monitor is plugged in, generally the nvidia takes over the compositor rendering, and just gives the Intel GPU a pixmap to put on the LVDS/eDP screen.

Now under Linux the first case mostly works OOTB on F18 with intel/nouveau, but switching compositors on the fly is going to take a lot more work, particularly with compositor writers, and I haven't see much jumping up on down on the client side to lead the way.

So I hacked up a thing I called reverse optimus, it kinda sucks, but it might be a decent stop gap.

The intel still renders the compositor, however it can use the nvidia to output slaved pixmaps. This is totally the opposite of how the technology was meant to be used, and it introduces another copy. So the intel driver now copies from its tiled rendering to a shared linear rendering (just like with USB GPUs), however since we don't want nouveau scanning out of system RAM, the nouveau driver then copies the rendering from the shared pixmap into the nvidia VRAM object. So we get a double copy, and we chew lots of power, but hey you can see stuff. Also the slave output stuff sucks for synchronisation so far, so you will also get tearing and other crappyness.

There is also a secondary problem with the output configuration. Some laptops (Lenovo I have at least), connect DDC lines to the Intel GPU for outputs which are only connected to the nvidia GPU, so when I enable the nvidia as a slave, I get some cases of double monitor reporting. This probably requires parsing ACPI tables properly like Windows does, in order to make it not do that. However I suppose having two outputs is better than none :-)

So I've gotten this working today with two intel/nvidia laptops, and I'm contemplating how to upstream it, so far I've just done some hackery to nouveau, that along with some fixes in intel driver master, and patch to the X server (or Fedora koji 1.13.1-2 server) makes it just work,

I really dislike this solution, but it seems that it might be the best stopgap until I can sort out the compositor side issues, (GL being the main problem).

update: I've pushed reverse-prime branches to my X server and -ati repo.

So I awake to find an announcement that the userspace drivers for the rPI have been released, lots of people cheering, but really what they've released is totally useless to anyone who uses or develops this stuff.

(libv commented on their thread:
maybe he'll follow up with a blog post at some point).

So to start the GLES implementation is on the GPU via a firmware. It provides a high level GLES RPC interface. The newly opened source code just does some marshalling and shoves it over the RPC.

Why is this bad?
You cannot make any improvements to their GLES implementation, you cannot add any new extensions, you can't fix any bugs, you can't do anything with it. You can't write a Mesa/Gallium driver for it. In other words you just can't.

Why is this not like other firmware (AMD/NVIDIA etc)?
The firmware we ship on AMD and nvidia via nouveau isn't directly controlling the GPU shader cores. It mostly does ancillary tasks like power management and CPU offloading. There are some firmwares for video decoding that would start to fall into the same category as this stuff. So if you want to add a new 3D feature to the AMD gpu driver you can just write code to do it, not so with the rPI driver stack.

Will this mean the broadcom kernel driver will get merged?

This is like Ethernet cards with TCP offload, where the full TCP/IP stack is run on the Ethernet card firmware. These cards seem like a good plan until you find bugs in their firmware stack or find out their TCP/IP implementation isn't actually any good. The same problem will occur with this. I would take bets the GLES implementation sucks, because they all do, but the whole point of open sourcing something is to allow other to improve it something that can't be done in this case.

So really Rasberry Pi and Broadcom - get a big FAIL for even bothering to make a press release for this, if they'd just stuck the code out there and gone on with things it would have been fine, nobody would have been any happier, but some idiot thought this crappy shim layer deserved a press release, pointless. (and really phoronix, you suck even more than usual at journalism).

Two videoes on youtube:

Randr 1.5 GPU offload:

Randr 1.5 USB hotplug

So I've been slowly writing the hotplug support v3 in between all the real jobs I have to do.

[side note: When I started out on hotplug. one of my goals was to avoid changing the server API/ABI too much so I could continue side by side testing.]

how did I get to v3?
v0.1: was called dynerama it failed miserably and proved that using Xinerama as the plugging layer was a bad plan.
v1: was the first time I decided to use an impedance layer between some server objects and driver objects.
v2: was the a major rebase of v1.

v2 was trucking along nicely and I managed to get the design to the stage where PRIME offloading intel/nouveau worked, USB device hotplug with udl worked, and GPU switch worked between two drivers. However v2 duplicated a lot of code and invented a whole new set of API objects called DrvXRec, so DrvScreenRec, DrvPixmapRec, DrvGCRec etc, this lead me to looking at the pain of merging this into the drivers and the server, and my goals of avoiding changing the API/ABI was getting in my way.

So before starting v3 I decided to rework some of the server "APIs".

The X server has two main bodies of code, one called DIX, and one called DDX. The DIX (device independent X) code and the DDX (Device dependent X code). In the tree the dix lives up in the top level dirs, and for server the DDX lives in hw/xfree86. The main object with info about protocol screens and GPUs is called ScreenRec in the DDX and ScrnInfoRec in the DIX. These are stored in two arrays, screenInfo.screens in the DIX and xf86Screens in the DDX, when code wants to convert between these it can do one of a few things.

a) lookup by index, both structs have an index value, so to go from ScrnInfo to Screen you look at screenInfo.screens[scrninfo->scrnIndex] and other way is xf86Screens[screen->myNum]. This is like the I didn't try and make an API, I just exposed everything.

b) ScrnInfo has a ScreenPtr in it, so some code can do ScrnInfo->pScreen to get the pointer to the dix struct. But this pointer is initialised after a bunch of code is called, so you really can't guarantee this pointer is going to be useful for you.

c) XF86SCRNINFO uses the DIX private subsystem to lookup the Scrn in the Screen's privates. This is the least used and probably slowest method.

So also screenInfo.screens contains the protocol screens we exposed to the clients, so this array cannot really change or move around. So I'd like to add screeninfo.gpuscreens and xf86GPUScreens and not have drivers know which set of info they are working on, however (a) totally screws this idea, since the indices are always looked up directly in the global arrays.

Now lots of the Screen/ScrnInfo APIs exposed to the drivers pass an int index as the first parameter, the function in the driver then goes and looks up the global arrays.

So my first API changes introduce some standard conversion functions xf86ScreenToScrn and xf86ScrnToScreen, and converts a lot of the server to use those. Yay an API. The second set of changes then changes all of the index passing APIs to pass ScrnInfoPtr or ScreenPtr, so the drivers don't go poking into global arrays. Now this is a major API change, it will involve slightly messy code in drivers that want to work with both servers, but I can't see a nicer way to do it. I've done a compat header file that will hopefully allows to cover a lot of this stuff where we don't see it.

I've ono other API introduction on the list, Glyph Pictures are another global array indexed by screen index, I've gone and added an accessor function so that drivers don't use the index anymore to get at the array contents directly.

Once this stuff lands in the server, a team of people will go forward and port the drivers to the new APIs (who am I kidding).

No new videos yet, need to fix some more rendering bugs so it looks nicer :)

So I've been working towards 3 setups:

a) intel rendering + nouveau offload
b) nouveau rendering + DVI output + intel LVDS output
c) hotplug USB with either intel or nvidia rendering.

Categorisation of devices roles:
I've identified 4 devices roles so far:
preferred master: the device is happy to be master
reluctant master: the device can be a master but would rather not be
offload slave: device can be used as an additional DRI2 renderer for a master
output slave: device can be used an additional output for a master

For the 3 setups above:
a) intel would be preferred master, nvidia would be offload slave
b) nvidia would be preferred master, intel would be output slave
c) usb devices would be output slaves, however if no master exists, usb device would be reluctant master.

I've rebased the prime work[1] on top of the dma-buf upstream work, and worked through most of the lifetime problems. Some locking issues still exist, and I'll have to get back to them. But the code works and doesn't oops randomly which is good.

prime is the kernel magic needed for this work, as it allows sharing of a buffer between two drm drivers, so for (a) it shares the dri2 front pixmap between devices, for (b/c) it shares a pixmap that the rendering gpu copies dirty updates to and the output slaves use as their scanout pixmap.

So I've done nearly all the work to share between intel and nouveau and I've done the kernel driver work for udl, but I haven't done the last piece in userspace for (c), which is to use the shared pixmap as usb scanout via the modesetting driver.

Today I hacked in a switch on the first randr command, so I can start the X server with intel as master and nouveau in offload mode. I can run gears on intel or nouveau, then after the randr command and another randr command to set a mode, the X server migrates everything to the nouveau driver, puts it in master mode, and places the intel driver into output slave mode. It seems to render my xterm + metacity content fine.

So the current short-term TODO is:
fix some issues with my nouveau/exa port rendering
fix some issues with xcompmgr
add usb output slave support.

Medium-term TODO:
worked out how to control this stuff, via randr protocol. How much information do we need to expose to clients about GPUs, and how do we control them. Open issues with atomicity of updates to avoid major uglys. Switching from intel master to nvidia master + intel outputs, means we have to reconfigure the Intel output to point at the new pixmap, but the more steps we put in there for clients to do, the more ugly and flashing we'll see on screen, however we probably want a lot of this to be client driven (i.e. gnome-settings-daemon).

Longer term TODO:
Get GLX_ARB_robustness done, now that Ian has done the context creation stuff, this should be a lot more trivial. (so trivial someone else could do it :)


So today I managed to see something on screen doing hotplug work. So I present to you live plugging.

Pretty much is a laptop running xf86-video-modesetting driver, with my server, an xterm + metacity. Plug in a USB displaylink device, with a kernel drm driver I wrote for it. (Sneaky xrefresh in the background). and the USB device displays the xterm and metacity.

So what actually happens?

The X server at the start had loaded drivers using udev, and the a new driver ABI. It exports one X protocol screen and plugs an internal DrvScreen into it.

When the hotplug happens, the server inits another DrvScreen for the new device, and plugs it into the single protocol screen. It also copies all the driver level pixmaps/pictures/GCs into the new driver. The single protocol screen at the bottom layer multiplexes across the plugged in drvscreens.

This is like Xinerama pushed down a lot further in the stack, so instead of doing iterations at the protocol level, we do it down at the acceleration level. Also I have randr code hooked up so all the outputs appear no matter what GPU they are from.

This isn't exactly what I want for USB hotplug, ideally we'd use the main GPU to render stuff and only scanout using the USB device, but this is step one. I also need the ability to add/remove drvscreens and all the associated data in order to support dynamic GPU switching.

The real solution is a still a long ways off, but this is just a small light in a long tunnel, I've been hacking on this on/off for over a year now, so its nice to see something visible for the first time.

If you've seen the social network, you know when he launches "The Facebook" the first time he needs his friend with the non-nerd contacts email list, I get the feeling that google plus never got this step.

When I joined facebook it was because my mother sent me an invite.

The thing is finding out what my nerd friends are doing is easy, they are always on irc or mailing lists or twittering. I'm not sure what google+ adds to this.

Back Viewing 10 - 20 Forward