- Mastering KVM Virtualization
- Humble Devassy Chirammal, Prasad Mukhedkar, Anil Vettathu
Internal workings of libvirt
Let me give some details about the libvirt source code. If you really want to know more about the implementation, it is good to poke around in the source itself. Get the libvirt source code from the libvirt Git repository:
[root@node]# git clone git://libvirt.org/libvirt.git
Once you have cloned the repo, you can see the following hierarchy of files inside it:
The libvirt code is based on the C programming language; however, libvirt has language bindings in different languages such as C#, Java, OCaml, Perl, PHP, Python, Ruby, and so on. For more details on these bindings, please refer to the documentation available in the source code repo and also at http://libvirt.org.
Let us move on. If we look at the libvirt internals, we can see that libvirt operates, or starts the connection path, based on driver modes; that is, different types or levels of driver are part of the libvirt implementation. At the time of initialization, these drivers are registered with libvirt. If you are confused by the term "drivers", they are the basic building blocks of libvirt functionality that provide the capability to handle specific hypervisor driver calls. These drivers are discovered and registered at the time of connection processing, as you can see at http://libvirt.org/api.html:
"Each driver has a registration API, which loads up the driver specific function references for the libvirt APIs to call. The following is a simplistic view of the hypervisor driver mechanism. Consider the stacked list of drivers as a series of modules that can be plugged into the architecture depending on how libvirt is configured to be built"
As in the preceding figure, there is a Public API that is exposed to the outside world. Depending on the connection URI (for example: virsh --connect qemu://xxxx/system) passed by the clients when initializing the library, this Public API delegates its implementation to one or more internal drivers. Yes, there are different categories of driver implementations in libvirt. For example, there are hypervisor, interface, network, nodeDevice, nwfilter, secret, storage drivers, and so on. Refer to driver.h inside the libvirt source code to learn about the driver data structures and other functions associated with the different drivers.
For example:
struct _virConnectDriver {
    virHypervisorDriverPtr hypervisorDriver;
    virInterfaceDriverPtr interfaceDriver;
    virNetworkDriverPtr networkDriver;
    virNodeDeviceDriverPtr nodeDeviceDriver;
    virNWFilterDriverPtr nwfilterDriver;
    virSecretDriverPtr secretDriver;
    virStorageDriverPtr storageDriver;
};
The struct fields are self-explanatory and convey which type of driver is represented by each of the field members. As you might have assumed, one of the important or main drivers is the hypervisor driver, which is the driver implementation for the different hypervisors supported by libvirt. The drivers are categorized as primary and secondary drivers. The hypervisor driver is a primary-level driver and there is always a hypervisor driver active. If the libvirt daemon is available, usually a network and a storage driver are active as well. So, the libvirt code base is well segregated, and for each supported hypervisor there is a driver implementation (or there should be). The following list gives us some idea of the hypervisors supported by libvirt. In other words, hypervisor-level driver implementations exist for the following hypervisors (reference: the README and the libvirt source code):
bhyve/: The BSD hypervisor
esx/: VMware ESX and GSX support using the vSphere API over SOAP
hyperv/: Microsoft Hyper-V support using WinRM
lxc/: Linux Native Containers
openvz/: OpenVZ containers using CLI tools
phyp/: IBM Power Hypervisor using CLI tools over SSH
qemu/: QEMU/KVM using the QEMU CLI/monitor
remote/: Generic libvirt native RPC client
test/: A "mock" driver for testing
uml/: User Mode Linux
vbox/: VirtualBox using the native API
vmware/: VMware Workstation and Player using the vmrun tool
xen/: Xen using hypercalls, XenD SEXPR, and XenStore
xenapi/: Xen using libxenserver
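From a client's point of view, which of these drivers ends up servicing a connection is decided purely by the URI scheme passed when opening it. Here is a small, hedged example using the public C API; it uses the built-in test driver (test:///default), so it should work on any host with libvirt installed, and you link it against libvirt with -lvirt:

#include <stdio.h>
#include <libvirt/libvirt.h>

int main(void)
{
    /* The scheme in the URI decides which hypervisor driver serves the
     * connection; test:///default is a built-in mock hypervisor, so it
     * works even without QEMU/KVM on the host. */
    virConnectPtr conn = virConnectOpenReadOnly("test:///default");
    if (!conn) {
        fprintf(stderr, "failed to open connection\n");
        return 1;
    }

    /* virConnectGetType() reports the name of the driver that was picked. */
    printf("driver: %s\n", virConnectGetType(conn));

    virConnectClose(conn);
    return 0;
}

Swapping the URI for qemu:///system would route the very same calls to the QEMU driver instead.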
Previously, I mentioned that there are secondary-level drivers as well. Not all, but some secondary drivers (see the following) are shared by several hypervisors. That said, currently these secondary drivers are used by hypervisors such as the LXC, OpenVZ, QEMU, UML, and Xen drivers. The ESX, Hyper-V, Power Hypervisor, Remote, Test, and VirtualBox drivers all implement secondary drivers directly.
Examples of secondary-level drivers include:
cpu/: CPU feature management
interface/: Host network interface management
network/: Virtual NAT networking
nwfilter/: Network traffic filtering rules
node_device/: Host device enumeration
secret/: Secret management
security/: Mandatory access control drivers
storage/: Storage management drivers
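These secondary drivers are driven through the very same connection object as the hypervisor driver. As a rough illustration (assuming a running libvirtd and the qemu:///system URI; compile with -lvirt), listing storage pools exercises the storage driver and listing networks exercises the network driver:

#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    virConnectPtr conn = virConnectOpenReadOnly("qemu:///system");
    if (!conn)
        return 1;

    /* Answered by the storage secondary driver. */
    virStoragePoolPtr *pools = NULL;
    int npools = virConnectListAllStoragePools(conn, &pools, 0);
    for (int i = 0; i < npools; i++) {
        printf("storage pool: %s\n", virStoragePoolGetName(pools[i]));
        virStoragePoolFree(pools[i]);
    }
    free(pools);

    /* Answered by the network secondary driver. */
    virNetworkPtr *nets = NULL;
    int nnets = virConnectListAllNetworks(conn, &nets, 0);
    for (int i = 0; i < nnets; i++) {
        printf("network: %s\n", virNetworkGetName(nets[i]));
        virNetworkFree(nets[i]);
    }
    free(nets);

    virConnectClose(conn);
    return 0;
}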
Node resource operations, which are needed for the management and provisioning of virtual machines (also known as guest domains), are also in the scope of the libvirt API. The secondary drivers are consumed to perform these operations, such as interface setup, firewall rules, storage management, and general provisioning of APIs. From https://libvirt.org/api.html:
"OnDevice the application obtains a virConnectPtr connection to the hypervisor it can then use it to manage the hypervisor's available domains and related virtualization resources, such as storage and networking. All those are exposed as first class objects and connected to the hypervisor connection (and the node or cluster where it is available)".
The following figure shows the five main objects exported by the API and the connections between them:
I will give some details about the main objects available in the libvirt code. Most functions inside libvirt make use of these objects for their operations:
virConnectPtr: As we discussed earlier, libvirt has to connect to a hypervisor and act. The connection to the hypervisor is represented by this object, which is one of the core objects in libvirt's API.
virDomainPtr: VMs or guest systems are generally referred to as domains in the libvirt code. virDomainPtr represents an object for an active/defined domain/VM.
virStorageVolPtr: There are different storage volumes exposed to the domains/guest systems. virStorageVolPtr generally represents one of these volumes.
virStoragePoolPtr: The exported storage volumes are part of one of the storage pools. This object represents one of the storage pools.
virNetworkPtr: In libvirt, we can define different networks. A single virtual network (active/defined status) is represented by the virNetworkPtr object.
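To see two of these objects in action, here is a small, hedged example (again assuming qemu:///system and a running libvirtd; link with -lvirt) that walks from a virConnectPtr to the virDomainPtr objects it exposes; the storage and network objects follow exactly the same list/lookup/free pattern:

#include <stdio.h>
#include <stdlib.h>
#include <libvirt/libvirt.h>

int main(void)
{
    /* virConnectPtr: the hypervisor connection everything else hangs off. */
    virConnectPtr conn = virConnectOpenReadOnly("qemu:///system");
    if (!conn)
        return 1;

    /* virDomainPtr: one handle per active/defined guest. */
    virDomainPtr *domains = NULL;
    int ndomains = virConnectListAllDomains(conn, &domains, 0);
    for (int i = 0; i < ndomains; i++) {
        printf("domain: %s (active: %d)\n",
               virDomainGetName(domains[i]),
               virDomainIsActive(domains[i]));
        virDomainFree(domains[i]);
    }
    free(domains);

    virConnectClose(conn);
    return 0;
}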
You should now have some idea about the internal structure of libvirt implementations; this can be expanded further:
Among the different hypervisor driver implementations, our area of interest is QEMU/KVM, so let's explore it further. Inside the src directory of the libvirt source code repository, there is a directory for the QEMU hypervisor driver implementation code.
I would say, pay some attention to source files such as qemu_driver.c, which carries the core driver methods for managing QEMU guests.
For example:
static virDrvOpenStatus qemuConnectOpen(virConnectPtr conn,
                                        virConnectAuthPtr auth ATTRIBUTE_UNUSED,
                                        unsigned int flags)
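This open method, along with the many other qemu* callbacks in qemu_driver.c, is plugged into the hypervisor driver table. The following is an abridged sketch of that table; I have only picked a handful of fields, and the exact field names can differ between libvirt versions:

/* Abridged sketch of the QEMU driver's method table in qemu_driver.c;
 * only a few of the callbacks are shown. */
static virHypervisorDriver qemuHypervisorDriver = {
    .name = QEMU_DRIVER_NAME,          /* "QEMU" */
    .connectOpen = qemuConnectOpen,    /* invoked for qemu:// URIs */
    .connectClose = qemuConnectClose,
    .domainCreateXML = qemuDomainCreateXML,
    .domainShutdown = qemuDomainShutdown,
    /* ... many more virDrv* callbacks ... */
};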
libvirt makes use of different driver code to probe the underlying hypervisor/emulator. In the context of this book, the component of libvirt responsible for finding out the QEMU/KVM presence is the QEMU driver code. This driver probes for the qemu-kvm binary and the /dev/kvm device node to confirm that KVM fully virtualized, hardware-accelerated guests are available. If these are not available, the possibility of a QEMU emulator (without KVM) is verified by the presence of binaries such as qemu, qemu-system-x86_64, qemu-system-mips, qemu-system-microblaze, and so on.
The validation can be seen in qemu_capabilities.c:

/* from qemu_capabilities.c */
static int virQEMUCapsInitGuest(.., ..,
                                virArch hostarch,
                                virArch guestarch)
{
    .....
    binary = virQEMUCapsFindBinaryForArch(hostarch, guestarch);

    /* qemu-kvm/kvm binaries can only be used if
     *  - host & guest arches match
     *  - hostarch is x86_64 and guest arch is i686 (needs -cpu qemu32)
     *  - hostarch is aarch64 and guest arch is armv7l (needs -cpu aarch64=off)
     *  - hostarch and guestarch are both ppc64*
     */
    native_kvm = (hostarch == guestarch);
    x86_32on64_kvm = (hostarch == VIR_ARCH_X86_64 &&
                      guestarch == VIR_ARCH_I686);
    arm_32on64_kvm = (hostarch == VIR_ARCH_AARCH64 &&
                      guestarch == VIR_ARCH_ARMV7L);
    ppc64_kvm = (ARCH_IS_PPC64(hostarch) && ARCH_IS_PPC64(guestarch));

    if (native_kvm || x86_32on64_kvm || arm_32on64_kvm || ppc64_kvm) {
        const char *kvmbins[] = {
            "/usr/libexec/qemu-kvm", /* RHEL */
            "qemu-kvm",              /* Fedora */
            "kvm",                   /* Debian/Ubuntu */
            ...
        };
        .....
        kvmbin = virFindFileInPath(kvmbins[i]);
        .....
        virQEMUCapsInitGuestFromBinary(caps, binary, qemubinCaps,
                                       kvmbin, kvmbinCaps, guestarch);
        .....
    }
Then, KVM enablement is performed as shown in the following:
int virQEMUCapsInitGuestFromBinary(..., *binary, qemubinCaps,
                                   *kvmbin, kvmbinCaps, guestarch)
{
    .....
    if (virFileExists("/dev/kvm") &&
        (virQEMUCapsGet(qemubinCaps, QEMU_CAPS_KVM) ||
         virQEMUCapsGet(qemubinCaps, QEMU_CAPS_ENABLE_KVM) ||
         kvmbin))
        haskvm = true;
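For reference, the virFileExists("/dev/kvm") test above boils down to a simple file-access check. A minimal standalone equivalent would look like the following (my own sketch, not libvirt code; my assumption is that the real helper in src/util/ of the libvirt tree amounts to the same access(2) test):

#include <stdio.h>
#include <unistd.h>

/* Minimal stand-in for a "does this file exist?" helper. */
static int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

int main(void)
{
    if (file_exists("/dev/kvm"))
        printf("KVM hardware acceleration is available\n");
    else
        printf("no /dev/kvm; only plain QEMU emulation is possible\n");
    return 0;
}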
Even though it is self-explanatory, note that libvirt's QEMU driver looks for different binaries, in different paths, on different distributions (for example, qemu-kvm in RHEL/Fedora). It also finds a suitable QEMU binary based on the architecture combination of the host and the guest. If both the QEMU binary and KVM are found, then KVM fully virtualized, hardware-accelerated guests will be available. It is also libvirt's responsibility to form the entire command-line argument list for the QEMU-KVM process. Finally, after forming the complete command-line arguments and inputs (see qemu_command.c), libvirt calls exec() to create the QEMU-KVM process:
/* util/vircommand.c */
static int virExec(virCommandPtr cmd)
{
    .....
    if (cmd->env)
        execve(binary, cmd->args, cmd->env);
    else
        execv(binary, cmd->args);
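The pattern here is the classic fork-and-exec. The following standalone sketch mimics it with a made-up, heavily abbreviated QEMU argument vector (the real argument list libvirt derives from the domain XML is far longer); it illustrates the mechanism and is not libvirt code:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void)
{
    /* Illustrative arguments only; libvirt builds the real list from the
     * domain XML (see qemu_command.c). */
    char *const args[] = {
        "/usr/bin/qemu-system-x86_64",
        "-name", "demo-guest",
        "-m", "1024",
        "-smp", "2",
        NULL
    };

    pid_t pid = fork();
    if (pid == 0) {
        execv(args[0], args);     /* returns only on failure */
        perror("execv");
        _exit(127);
    } else if (pid > 0) {
        int status;
        waitpid(pid, &status, 0); /* libvirtd instead keeps the QEMU process running and monitors it */
    } else {
        perror("fork");
        return 1;
    }
    return 0;
}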
In KVM land, there is a misconception that libvirt directly uses the device file (/dev/kvm) exposed by the KVM kernel modules, and instructs KVM to do the virtualization via the different ioctls() available with KVM. This is indeed a misconception! As mentioned earlier, libvirt spawns the QEMU-KVM process and it is QEMU that talks to the KVM kernel modules. In short, QEMU talks to KVM via different ioctl() calls to the /dev/kvm device file exposed by the KVM kernel module. To create a VM (for example, with virsh create), all libvirt does is spawn a QEMU process, which in turn creates the virtual machine. Please note that a separate QEMU-KVM process is launched for each virtual machine by libvirtd. The properties of the virtual machines (the number of CPUs, memory size, and I/O device configuration) are defined in separate XML files, which are located in the /etc/libvirt/qemu directory. libvirtd uses the details from these XML files to derive the argument list that is passed to the QEMU-KVM process. The libvirt clients issue requests via the AF_UNIX socket /var/run/libvirt/libvirt-sock on which libvirtd is listening.
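To make the division of labour clear, the ioctl() path below is what QEMU itself (not libvirt) exercises against /dev/kvm. This is a bare-bones sketch using the stock KVM API from <linux/kvm.h>; run it on a KVM-capable host and it simply reports the API version and creates an (empty) VM file descriptor:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
    /* Open the KVM control device, just as QEMU does at start-up. */
    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) {
        perror("open /dev/kvm");
        return 1;
    }

    /* Sanity-check the stable KVM ioctl interface. */
    int version = ioctl(kvm, KVM_GET_API_VERSION, 0);
    printf("KVM API version: %d\n", version);

    /* Every VM is just another file descriptor; QEMU goes on to create
     * vCPUs and map guest memory through further ioctl()s on it. */
    int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);
    if (vmfd < 0)
        perror("KVM_CREATE_VM");
    else
        close(vmfd);

    close(kvm);
    return 0;
}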
Well, we have discussed libvirt and its connection to QEMU/KVM; however, users/developers periodically pose this question: why do we need libvirt and what advantages does it bring? I would say this is best answered by Daniel P. Berrange, one of the core maintainers of libvirt, here: https://www.berrange.com/posts/2011/06/07/what-benefits-does-libvirt-offer-to-developers-targetting-qemukvm/.