Developing for Linux Virtualization - When Software Development goes Schizoid
So... First and foremost... Are you nuts? Even a tiny little bit? Just a pinch? No? Interested in Virtualization? Well, I suggest you find a different field of interest.
Working on Virtualization in general, and on Linux in particular, requires you to have a bit of a crazy streak. If you are missing this crazy streak, this is really not for you. The reason is that achieving high performance when running a virtualized OS on top of itself requires you to simultaneously juggle both sides of the virtualization software - the guest (in) side and the host (out) side.
As a mental exercise, this is easiest on Type 1 hypervisors. These have distinct differences between the "in" and "out" software environments. In addition to that, you are usually limited by the relative simplicity of the tools and libraries available for developing on Type 1.
It becomes harder when you move on to Type 2 hypervisors. You have a full-blown OS at your disposal on the "out" side. The in and the out become very much alike, and you need to put some mental effort into keeping track of what you are allowed to do and what you are not, especially if you start working on acceleration, high performance networking and the cutting of corners which comes with it.
This becomes hardest at the pinnacle of virtualization madness - UML (User Mode Linux). From the perspective of the "are you in or out?" question, working on UML is a case of going off the deep end - you constantly move between the two. You may cross the rather fuzzy out/in boundary as often as 10 times within a call chain. Sometimes more. This makes UML an excellent starting platform to learn how, where and when to cut a corner when crossing the VM boundary to achieve performance.
Compared to UML, the additional restrictions imposed by QEMU and Xen make life significantly easier, not harder. The number of approaches you can try is limited and you are expected to comply with the Das Ordnung of the environment. You know what to do and you just go and do it.
Why "waste" time with UML in the first place?
I have covered some of this already in one of the other articles. UML performance is nowhere near as bad as it is described in a lot of books on virtualization. First of all, a lot of them continue to quote benchmarks from the days when it was using the tracing thread (tt) mode. Even the newer benchmarks are from before Thomas Meyer and I fixed its timer subsystem. Second, nearly all benchmarks used to demonstrate how slow it is are userspace and fork/exec bound - something which has an extraordinary one-off cost in UML. If you are running monolithic software and/or running in kernel space, its performance is significantly better than expected. It also allows us to try and analyze the impact of various techniques which will not make a dent in QEMU performance, because QEMU is constrained elsewhere by the buffering/queuing semantics in the network subsystem.
The Kernel (In) side and the User (Out) side - implementing a UML feature
Most UML functionality is split between two halves: User and Kernel. The User portion is the equivalent of the hypervisor. In fact, it can be considered a subspecies of Type 2 hypervisor. The Kernel part is the architecture-specific part of the Linux kernel which matches the corresponding userspace portion as necessary to implement a working machine emulation. UML implements all key parts of a Linux kernel architecture, the same as a port to a specific type of hardware does. These include memory management, an IRQ controller, realtime clock emulation as well as various virtual device drivers. It also implements its own method of running the Linux userspace, which is an area I will not get into. It works, but it is of no interest for most experiments with virtualization and Linux kernel concepts.
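To make the two halves a bit more concrete, here is a minimal sketch of a virtual clock split the same way. It is not UML's actual code: the names host_clock_ns and vclock_read_ns are made up for illustration, and both halves sit in one file only so that the example compiles on its own. In a real tree they would be separate files.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* ---- User (out) half: free to use host libc headers and syscalls ---- */

/* Hypothetical helper: host monotonic time in nanoseconds, -1 on failure. */
int64_t host_clock_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;      /* failure crosses the boundary as a plain integer */
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* ---- Kernel (in) half: sees only a prototype and plain C types ---- */

int64_t host_clock_ns(void);    /* all the "kernel" knows about the host */

static int64_t vclock_last_ns;  /* last value handed out to the guest */

/* Hypothetical guest-visible clock read; never steps backwards. */
int64_t vclock_read_ns(void)
{
    int64_t now = host_clock_ns();

    if (now > vclock_last_ns)   /* also ignores a failed (-1) host read */
        vclock_last_ns = now;
    assert(vclock_last_ns >= 0);
    return vclock_last_ns;
}
```

The point of the sketch is the shape of the interface: nothing from the host headers (struct timespec here) leaks into the kernel half, only integers do.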
This setup has a number of pros and cons. First of all, we have no cost for crossing the hypervisor boundary. The boundary is "virtual" - it is in our head and the only enforcer is the compiler. A file which has been designated as user in the Makefiles is compiled with the normal userspace include directories. Everything else is treated as a portion of the Linux kernel. This allows us to run all kinds of weird mental (for all meanings of this word) experiments, violate address space constraints and do things that will never be allowed on a "grown up" hypervisor. That is a definite pro. At the same time it is a con in the long run. Any "running barefoot in the rain" experiments will need to be cleaned up considerably before they are applicable to QEMU or other hypervisors.
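A rough illustration of what "no cost for crossing the boundary" can look like in practice: the user half reads from a host file descriptor straight into a buffer owned by the kernel half, with no copy and no address translation in between. Again, this is a sketch under assumptions, not code from UML. The names host_read_into, vnic_rx and struct vnic_frame are hypothetical, and an ordinary pipe or socket stands in for whatever the host side really talks to.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* ---- User (out) half ---- */

/* Hypothetical helper: read from a host fd into a caller-supplied buffer.
 * Returns the number of bytes read, or -1 on error. */
long host_read_into(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    return n < 0 ? -1 : (long)n;
}

/* ---- Kernel (in) half ---- */

/* Hypothetical receive buffer owned by the "kernel" side of a vNIC. */
struct vnic_frame {
    size_t len;
    unsigned char data[1514];
};

/* The kernel half hands its own buffer address to the user half.
 * A "grown up" hypervisor would not allow this shortcut. */
long vnic_rx(int host_fd, struct vnic_frame *frame)
{
    long n = host_read_into(host_fd, frame->data, sizeof(frame->data));

    if (n < 0) {
        frame->len = 0;
        return -1;
    }
    assert((size_t)n <= sizeof(frame->data));
    frame->len = (size_t)n;
    return n;
}
```

Calling vnic_rx() on the read end of a pipe that has had five bytes written to it fills frame->data with those bytes and sets frame->len to 5. This is exactly the kind of shortcut that has to be replaced by a proper shared buffer or copy before the idea can move to QEMU.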
So let's see how this works on the example of implementing a hypothetical (actually not very hypothetical - it is one of my old experiments) driver: a direct PWE to vNIC L2TPv3 driver (similar to the one I have contributed to QEMU).
To be continued
- 29 Mar 2017