Integrating Xen on the Ampere Platform: a first look

Hardware Apr 3, 2024

In this article, we'll talk about our initial work to run a Xen-based system on an Ampere platform, which is built on the Arm CPU architecture (specifically ARMv8.2+).

Our collaboration with Ampere

We're excited to tell you we've forged a technical collaboration with Ampere Computing! The complete announcement is available here:

Bringing modern virtualization in the Arm-powered Datacenter
Learn about our exciting collaboration with Ampere Computing, a major stride in advancing efficient virtualization within the datacenter.

In concrete terms, they've given us access to two impressive machines. Each is a 2U rack server with dual Altra CPUs of 80 cores each at 3 GHz, for 160 cores in total. That's perfect for getting Xen working on those beasts!

2U Mt. Collins (details)

But this partnership isn't just about hardware; it's also about preparing the future of Arm in the datacenter with a secure hypervisor (Xen), building on these already mature machines.

Context

But before taking a technical dive, we need to understand a bit more of the context around this platform and Arm in general.

💡
A quick reminder for our non-tech readers: ARM is a family of RISC instruction set architectures (ISAs) for computer processors. Arm Ltd develops the ISAs and licenses them to other companies, who build the physical devices that use the instruction set. It also designs and licenses cores that implement these ISAs. More details on this link.

Arm in the Datacenter

The road to getting Arm CPUs into the big server rooms wasn't smooth sailing from the get-go. Back in the early 2010s, some companies tried to make it work, but it was too soon. They were ahead of their time, trying to introduce these chips to a world that wasn't quite ready for them. It was a tough start, with lots of challenges, especially because everyone was so used to the usual processors and the software wasn't quite there yet for Arm.

Amazon demonstrated that Arm-based CPUs could work in the datacenter at scale with its Graviton CPU (using the 64-bit Arm architecture), showing that the performance and maturity were there. Funny story: AWS originally demonstrated the concept of the cloud with... Xen! Kind of coming full circle. Sadly, those chips are only available through AWS cloud instances.

But then came 2018, and things started to look up. A company called Ampere Computing, started by a former Intel president, decided to dive deep into Arm-based processors for servers.

They came up with something called Ampere Altra, which was all about giving datacenters what they really needed: performance and the ability to handle lots of work without using too much power.

Now, why is all this important, especially for those of us interested in running Xen on an Ampere Altra CPU? Well, it's like finding the perfect partner for a dance. Xen, with its great features for managing virtual servers, works wonderfully with the energy-saving and flexible nature of the Ampere Altra. This combo is all about tackling big tasks in datacenters in a smarter, more efficient way. The test servers we received have 2 sockets populated with 80 cores each. That's great density for spawning many VMs with very contained power usage.

So as we dive into how to set up Xen on these Arm CPUs, let's remember the bigger picture. It's a story of innovation, of finding new ways to do things better, and of a tech community that's always looking to push the envelope. It's about making our datacenters not just work harder, but smarter.

Our target use case

Having our whole platform on those machines makes a lot of sense, and the first clear target is the "hosting" or "private cloud" provider who wants good compute density while staying power-efficient. Today's energy prices are clearly a reason to bring more Arm-based machines into the datacenter! Also, those providers usually deploy Linux-based workloads, like Debian/Ubuntu/RHEL with PHP, plus PostgreSQL or MySQL, and other programs that run well on Arm anyway. For them, the CPU architecture transition isn't a problem!

But they need a powerful and easy-to-manage virtualization stack. That's exactly why we answered the call, built the partnership with Ampere, and started working to get Xen running on it.

Our first steps

Our initial goal is to run a "vanilla" Xen (from upstream) with its xl toolstack. So we are not yet at the point of getting XCP-ng running on it, but this is a required first step.

Boot a Linux distribution without Xen

Our first step was to run any available distribution on top of the Ampere Mt. Collins. We chose Debian (Bookworm).

Nothing special was needed to run Debian on the platform; download the ISO image from the Debian website and follow the installation procedure.

Everything works well after installation except gdm3 (the default display manager), which fails because the platform has an Aspeed VGA output. Some display managers choose it by default even if it's not connected, and even though HDMI is available, the ast driver fails. As a workaround, we blacklisted the ast driver:

root@ava:~# cat > /etc/modprobe.d/ast.conf <<'EOF'
blacklist ast
EOF
root@ava:~# reboot

Another workaround is to switch to a different display manager; lightdm works fine.
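
To double-check that the blacklist entry is in place before rebooting, a small helper like the one below can be used. This is only a sketch: the function name is_blacklisted is ours, and it simply looks for a "blacklist MODULE" line in the *.conf files of a directory (by default /etc/modprobe.d).

```shell
#!/bin/sh
# is_blacklisted MODULE [DIR]
# Succeeds if some *.conf file in DIR (default: /etc/modprobe.d) contains
# a "blacklist MODULE" line. The function name is hypothetical.
is_blacklisted() {
    module=$1
    dir=${2:-/etc/modprobe.d}
    # -q: quiet, -s: ignore unreadable/missing files, -E: extended regex
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$dir"/*.conf
}
```

For example, `is_blacklisted ast && echo "ast is blacklisted"`. After the reboot, `lsmod | grep -w ast` should print nothing if the driver was not loaded.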

Besides this, other issues occurred that require additional investigation. First, hardware errors were reported by the APEI (ACPI Platform Error Interface) framework during system initialization, along with an error record from a previous boot stored in the BERT (Boot Error Record Table). There were also some PCI-related issues:

[ 0.513812] pci 000a:00:01.0: BAR 13: failed to assign [io size 0x1000]
[ 0.513815] pci 000a:00:03.0: BAR 13: no space for [io size 0x1000]
[ 0.513816] pci 000a:00:03.0: BAR 13: failed to assign [io size 0x1000]

And finally some failures in allocating contiguous memory areas (CMA) during system initialization:

[ 0.755777] cma: cma_alloc: reserved: alloc failed, req-size: 256 pages, ret: -12
[ 0.763412] arm-smmu-v3 arm-smmu-v3.8.auto: allocated 65536 entries for cmdq
[ 0.763416] cma: cma_alloc: reserved: alloc failed, req-size: 256 pages, ret: -12
[ 0.771035] arm-smmu-v3 arm-smmu-v3.8.auto: allocated 32768 entries for evtq
[ 0.771037] cma: cma_alloc: reserved: alloc failed, req-size: 128 pages, ret: -12

As you can see, there are still some rough edges, but nothing catastrophic.
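
To keep an eye on these messages from one boot to the next, the kernel log can be summarized with a short filter. The sketch below only counts lines that look like the two kinds of messages shown above; the function name and the output format are our own choices. Note that ret: -12 in the CMA messages is -ENOMEM.

```shell
#!/bin/sh
# summarize_boot_log: reads a kernel log on stdin and counts
#  - PCI BAR assignment problems ("failed to assign" / "no space for")
#  - CMA allocation failures ("cma_alloc: ... alloc failed")
# Hypothetical helper, matching only the message shapes seen in this article.
summarize_boot_log() {
    awk '
        /BAR [0-9]+: (failed to assign|no space for)/ { pci++ }
        /cma: cma_alloc: .*alloc failed/              { cma++ }
        END { printf "pci_bar_issues=%d cma_alloc_failures=%d\n", pci, cma }
    '
}
```

Typical usage would be `dmesg | summarize_boot_log`, run as root if the kernel log is restricted.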

Booting Xen

With Debian running, the next step was to build Xen, the xl toolstack, the kernel and finally QEMU.

For simplicity, all of this software was built on the platform itself. For the current experiment, cloning the upstream versions is enough. First, let's build a kernel:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux/
$ git checkout v6.6-rc2
$ make menuconfig # enable CONFIG_XEN_NETDEV_BACKEND
$ make
$ make modules_install
$ make install
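
Before installing, it can be worth confirming that the option selected in menuconfig really ended up in the generated .config. A minimal sketch, assuming the usual OPTION=y / OPTION=m format of kernel config files (the helper name config_enabled is ours):

```shell
#!/bin/sh
# config_enabled OPTION [CONFIG_FILE]
# Succeeds if OPTION is built in (=y) or built as a module (=m) in the
# given kernel config file (default: ./.config). Hypothetical helper.
config_enabled() {
    option=$1
    config=${2:-.config}
    grep -qE "^${option}=(y|m)\$" "$config"
}
```

From the linux/ directory: `config_enabled CONFIG_XEN_NETDEV_BACKEND || echo "netback is not enabled"`.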

Now, build Xen and its xl toolstack following those instructions.

The final build is QEMU (v8.1.0 is the minimum, as the xenpvh machine was introduced there, but v8.2.0 is the safer choice):

$ git clone -b staging-8.2 https://gitlab.com/qemu-project/qemu.git
$ cd qemu; mkdir build; cd build
$ ../configure --target-list=aarch64-softmmu
$ make
$ make install # optional: the binary can also be used directly from the build directory

Now that everything is installed to the /boot directory, we need to set the boot arguments for Xen and the Dom0 kernel. This can be done by updating the file /etc/default/grub with the following content:

GRUB_CMDLINE_XEN="noreboot dom0_mem=1024M bootscrub=0 iommu=on loglvl=all guest_loglvl=all"
GRUB_CMDLINE_LINUX_XEN_REPLACE="console=hvc0 earlycon=xenboot"
GRUB_CMDLINE_LINUX_XEN_REPLACE_DEFAULT=""

We are ready to update the GRUB entries by running the command update-grub. Once the command has run, the Xen entry will be created:

menuentry 'Debian GNU/Linux, with Xen 4.19-unstable and Linux 6.6.2-g5645b4e1d273 (recovery mode)' --class debian --class gnu-linux --class gnu --class os --class xen $menuentry_id_option 'xen-gnulinux-6.6.2-g5645b4e1d273-recovery-0ef5c67d-e9bc-480c-858b-509fc93e80e1' {
    insmod part_gpt
    insmod ext2
    search --no-floppy --fs-uuid --set=root 0ef5c67d-e9bc-480c-858b-509fc93e80e1
    echo    'Loading Xen 4.19-unstable ...'
    if [ "$grub_platform" = "pc" -o "$grub_platform" = "" ]; then
        xen_rm_opts=
    else
        xen_rm_opts="no-real-mode edd=off"
    fi
    xen_hypervisor  /boot/xen-4.19-unstable placeholder noreboot dom0_mem=1024M bootscrub=0 iommu=on loglvl=all guest_loglvl=all ${xen_rm_opts}
    echo    'Loading Linux 6.6.2-g5645b4e1d273 ...'
    xen_module      /boot/vmlinuz-6.6.2-g5645b4e1d273 placeholder root=UUID=0ef5c67d-e9bc-480c-858b-509fc93e80e1 ro single console=hvc0 earlycon=xenboot
    echo    'Loading initial ramdisk ...'
    xen_module      --nounzip   /boot/initrd.img-6.6.2-g5645b4e1d273
}

If, for some reason, you are creating a Xen GRUB entry manually, please note that the xen_hypervisor command must be given at least one argument, even an empty one (""); otherwise, a NULL pointer dereference will occur in GRUB.
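
A hand-written entry can be checked for this pitfall with a simple scan. The sketch below is an assumption-laden helper of ours (not part of GRUB or Xen): it flags any xen_hypervisor line that has nothing after the hypervisor image path, relying on plain whitespace splitting.

```shell
#!/bin/sh
# xen_hypervisor_has_args GRUB_CFG
# Prints every xen_hypervisor line that has no argument after the image
# path and fails if at least one such line exists. Hypothetical helper;
# an explicit "" counts as an argument here.
xen_hypervisor_has_args() {
    awk '
        $1 == "xen_hypervisor" && NF < 3 { bad++; print "line " NR ": " $0 }
        END { exit (bad > 0) }
    ' "$1"
}
```

For example: `xen_hypervisor_has_args /boot/grub/grub.cfg && echo "all Xen entries look fine"`.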

After the system reboots, choose the Xen GRUB entry during boot. Once the system finishes booting, you can log in on the console (at the moment, the display manager fails to start, which requires additional investigation) and create a guest domain.

The next step is to provide what is needed to boot a guest domain, starting with a guest.cfg domain config file:

name="test"
memory=8096
vcpus=4

kernel = "/boot/vmlinuz-6.6.2-g5645b4e1d273"
ramdisk = "/boot/initrd.img-6.6.2-g5645b4e1d273"

device_model_version="qemu-xen"
device_model_override="/home/ok/Projects/guest/qemu_device_model.py"
extra = " root=/dev/xvda1 rw"
virtio_qemu_domid = 0

disk = ['format=qcow2, vdev=xvda, access=rw, backendtype=qdisk, target=/home/ok/Projects/guest/debian-12-nocloud-arm64.qcow2']

vif = ['mac=00:16:3E:74:34:32,script=vif-bridge,bridge=xenbr0']

Then we need to make some modifications to qemu_device_model.py: update QEMU_BINARY_PATH to point to the QEMU compiled above, and LOG_FILE to wherever you would like to see QEMU's log file. qemu_device_model.py should be located in the same directory as guest.cfg. Get the file from this link.
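
Since guest.cfg points at several files by absolute path, a typo there is an easy way to end up with a guest that does not start. The following sketch (our own helper, not an xl feature) extracts the kernel, ramdisk and device_model_override values from the config and reports any path that does not exist:

```shell
#!/bin/sh
# check_guest_cfg_paths CFG
# Reports files referenced by kernel, ramdisk and device_model_override
# in an xl domain config that do not exist. Hypothetical helper; it only
# understands simple  key = "value"  lines.
check_guest_cfg_paths() {
    cfg=$1
    status=0
    for key in kernel ramdisk device_model_override; do
        path=$(sed -nE "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"([^\"]*)\".*/\1/p" "$cfg" | head -n 1)
        if [ -z "$path" ]; then
            continue            # key not present in this config
        fi
        if [ ! -e "$path" ]; then
            echo "missing: $key -> $path"
            status=1
        fi
    done
    return $status
}
```

Running `check_guest_cfg_paths guest.cfg` before `xl create` prints nothing and returns success when every referenced file is present.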

A Xen bridge interface must be set up to provide networking in the guest domain:

$ apt-get install bridge-utils
$ brctl addbr xenbr0
$ brctl addif xenbr0 enP2p2s0f0
$ ip link set dev xenbr0 up
$ dhclient xenbr0
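
On systems where bridge-utils is not available, the same bridge can be created with iproute2. The sketch below only prints the equivalent ip commands so they can be reviewed first; the function name is ours, and the bridge and interface names are simply the ones used above.

```shell
#!/bin/sh
# print_bridge_setup BRIDGE IFACE
# Prints (does not run) iproute2 commands equivalent to the brctl steps:
# create the bridge, enslave the physical interface, bring the bridge up.
# Hypothetical helper.
print_bridge_setup() {
    bridge=$1
    iface=$2
    printf 'ip link add name %s type bridge\n' "$bridge"
    printf 'ip link set dev %s master %s\n' "$iface" "$bridge"
    printf 'ip link set dev %s up\n' "$bridge"
}
```

`print_bridge_setup xenbr0 enP2p2s0f0` shows the three commands; once reviewed, they can be run as root, followed by `dhclient xenbr0` as before.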

Booting your first VM

Now you need to start the xencommons service with systemctl start xencommons. And finally, start your VM with xl create guest.cfg. That's it!

A guest domain should now be started. To verify this, use the xl list command:

$ xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  1024   128  r-----     22.7
test         1  8096     4  -b----      7.2

To switch to the guest domain's console, run xl console test. To go back to Dom0, press Ctrl+] (Ctrl+5 sends the same escape in many terminals). Since we have so many cores, we can start dozens of VMs on this machine while keeping overall power usage very low.
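
When starting many VMs from a script, it helps to check each domain's state automatically rather than reading the table by eye. A minimal sketch, assuming the column layout of the xl list output shown above (the helper name domain_state is ours): in the State column, r means running and b means blocked, i.e. idle and waiting for an event.

```shell
#!/bin/sh
# domain_state NAME
# Reads `xl list` output on stdin and prints the State column for the
# domain called NAME. Fails if no such domain is listed.
# Hypothetical helper, relying on State being the fifth column.
domain_state() {
    awk -v name="$1" '
        NR > 1 && $1 == name { print $5; found = 1 }
        END { exit !found }
    '
}
```

For example: `xl list | domain_state test` prints the state flags of the guest created above.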

Current Xen limitations

  1. PCI passthrough: work on PCI passthrough in Xen for Arm is ongoing; some patches are available on the Xen mailing list. Some community members have started work on virtio-pci for Ampere Altra. See the useful links at the end of the article.
  2. GPU support hasn't been properly checked. Sharing the GPU would require virtio-gpu or something similar, which is not supported yet, although some work on this topic is underway. Check the useful links at the end of the article.
  3. CONFIG_ACPI for Xen is still under development and has UNSUPPORTED status; additional clarification is needed here.

Other known issues

When started on top of Xen, the ITS visible to Linux is modified, as Xen does not expose exactly the same area to Dom0. As a consequence, Linux finds a difference from what is given in the ACPI tables and prints a warning for each CPU.

Then, Xen modifies the IORT table but fails to update the checksum properly:

[    0.009029] ACPI: Core revision 20200925
[    0.013288] ACPI BIOS Warning (bug): Incorrect checksum in table [IORT] - 0x95, should be 0xB3 (20200925/tbprint-173)
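
For context, an ACPI table is considered valid when all of its bytes, including the checksum byte in the header, sum to zero modulo 256. This can be checked by hand on a dumped table; the sketch below uses a helper name of our own and assumes the table has been saved to a file (on Linux, raw tables are normally readable by root under /sys/firmware/acpi/tables/).

```shell
#!/bin/sh
# acpi_table_sum FILE
# Prints the sum of all bytes of FILE modulo 256. For a valid ACPI table
# the result is 0. Hypothetical helper.
acpi_table_sum() {
    # od prints every byte as an unsigned decimal; awk adds them up.
    od -An -v -tu1 "$1" | awk '
        { for (i = 1; i <= NF; i++) sum += $i }
        END { print sum % 256 }
    '
}
```

As root in Dom0, `acpi_table_sum /sys/firmware/acpi/tables/IORT` printing anything other than 0 would match the warning above.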

Also, IOREGs are not supported, so no area is allocated for them. As a consequence, Linux fails to allocate space for the IOREGs. There is also an open issue with PCI MSI IRQ setup and teardown for igb.

Work on virtio-pci and virtio-gpu for Ampere Altra:

Files · testing/virtio-gpu-fixes-xen · Manos Pitsidianakis / QEMU · GitLab
Files · new-attempt · Manos Pitsidianakis / xen · GitLab
GitHub - epilys/linux at 6.6.2-ampere-pci-fixes

Some other useful docs about Ampere Altra and Xen:

Project Orko Madrid Connect Demo - Orko (ORKO) - Confluence

If you want to actively participate in this kind of work, we welcome you in the xen-on-ampere channel on the Xen Project Matrix server!

The future

Getting Xen to work on the Ampere platform is a big deal for us, especially since we managed to do it without needing any special tweaks. Yes, we're missing a few things like PCI passthrough, but the future looks good from here. Our next steps? Filling in those gaps and beefing up the rest of our platform. XCP-ng is more than just Xen; it's a whole set of tools, storage solutions, and software that makes everything run smoother.

It won't be a quick job, but we've got the right tools and direct support from Ampere's engineers, which makes a huge difference. Working together, we're set to make our platform even better and ready for whatever comes next. In short, we're on a promising path, and with some teamwork and tech know-how, there's a lot we can achieve.


Oleksii Kurochko

Along with Olivier Lambert

Hypervisor and Kernel Software Engineer at Vates. Focused on porting Xen Hypervisor to the RISC-V platform.