XO Packer template disk issues
-
I investigated this with @BMeach in Discord (link). The issue is that the broken packer run is creating a template that contains a populated `disks` property in the `other-config` parameter (more details here). XO's logic for hiding this is located here. I don't know the historical context, but when I initially created the terraform provider I modeled VM creation off that logic in order to get the provider's `vm.start` RPC calls to succeed with cloud-init.
I will investigate why packer is doing something different from the manual terraform template creation process described here, but I'm also interested in hearing more from the XO team about why it is set up this way.
-
It's been a long time since we did that, so I can't trust my memory 100%. Here's what I recall (@julien-f, correct me if I'm wrong):
Default templates
For "default templates" (templates without any actual disk present, i.e. coming from the XCP-ng installation), the `disks` field contains a list of suggested disks to generate on VM creation. It's a guide that can be fetched by XAPI clients (XO, XenCenter) to pre-fill the disk creation fields in their respective UIs. E.g. a Windows template will usually advise creating a rather large disk (40GiB) by default, compared to a Linux template. XenCenter and XO aren't aware of those things; they just fetch the "provision" data to display what's expected in the template itself and present it to the user.
During the VM creation, this `provision` field should disappear (by XAPI, automatically, on provisioning), since it's not needed anymore: the VM is now created, and it won't be an empty template anymore. Either it stays a regular VM, or it becomes a template with a disk (see below).
User created templates
Any template other than the "original/default" ones will very likely already contain a disk with a system installed (whatever this system is). Only those templates can be used with cloudinit, because you need to have a disk where cloudinit is installed.
A way to detect that a template was provisioned in the past is to rely on this information. There's one exception (again IIRC): Other install media, which is a special template without any guidance on disks to be created since it's a "misc" template.
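The detection rule described above can be sketched as follows. This is a minimal illustration, not XO's actual code: it assumes the template's `other_config` has already been fetched as a plain dictionary, the function name `looks_unprovisioned` is hypothetical, and the sample XML value is illustrative only. Per the exception noted above, a template such as "Other install media" carries no `disks` guidance, so this check alone would not flag it as a default template.

```python
def looks_unprovisioned(other_config: dict) -> bool:
    """Return True if the template still carries disk-provisioning guidance.

    A non-empty "disks" entry suggests a default template whose disks have
    not been created yet, i.e. there is no installed system for cloud-init.
    """
    return bool(other_config.get("disks", "").strip())


# Illustrative values only; real other_config maps contain many more keys.
default_template = {
    "disks": '<provision><disk device="0" size="10737418240" sr="" '
             'bootable="true" type="system"/></provision>'
}
user_template = {"install-methods": "cdrom"}

print(looks_unprovisioned(default_template))  # True
print(looks_unprovisioned(user_template))     # False
```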
-
I can confirm that XAPI is removing the `disks` key in `other_config` during the `VM.provision` call:
-
So maybe the Packer plugin isn't doing what it should to create the template? We probably need to discuss this together
-
@olivierlambert thanks for that background on the Default vs user created templates and that matches my mental model of how things work with XO and terraform -- that a template must have an installed OS (disk) in order to work.
So maybe the Packer plugin isn't doing what it should to create the template?
As for what packer is doing, it's using the xapi VM.set_is_a_template call.
I believe this is identical to what XO does when you convert a VM to a template via the Advanced tab (source). That handler ultimately sets is_a_template, which seems equivalent to what packer is calling.
I definitely agree we need to discuss what the right solution is, but from the information I shared above I'm still unsure that we understand why a packer built template is behaving the way it is.
-
I can confirm that XAPI is removing the `disks` key in `other_config` during the `VM.provision` call:
I understand very little OCaml, but is the code that you linked to part of the VM creation lifecycle? It seems like it's a performance test from what I can tell.
-
Okay, so after closer inspection, it's not XAPI's but the client's responsibility to clean this key after install, as we do in XO:
https://github.com/vatesfr/xen-orchestra/blob/e6b893977205a90f5d7b96e8e0e0f65b030fa195/packages/xo-server/src/xapi/mixins/vm.mjs#L83-L85C37
    // Removes disks from the provision XML, we will create them by
    // ourselves.
    await this.call('VM.remove_from_other_config', vmRef, 'disks')::ignoreErrors()
However, it appears to be the standard way to do so. Let's see, as a first hint, the examples given by the XAPI project in their Python bindings:
https://github.com/xapi-project/xen-api/blob/14ee043bcabf16cb58c2fbb974fd90ee81fb067e/scripts/examples/python/provision.py#L86
    session.xenapi.VM.remove_from_other_config(vm, "disks")
XenCenter is also doing exactly this.
So the issue is that the Packer plugin doesn't have this "normal" behavior, which leads to the original problem. In other words, a template with this key is considered as "without any already installed system", and adding a Cloudinit config for it doesn't make sense at all => the logic is good.
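To make the ordering concrete, here is a rough sketch of the clean-up step a client would perform before turning a VM into a template. It uses the XenAPI Python binding call shown above (`VM.remove_from_other_config`) plus `VM.set_is_a_template`, the call the Packer plugin was said to use. The helper name `finalize_template` and the recording stub session are hypothetical and exist only so the snippet runs without a XAPI host; this is not the code from the Packer plugin.

```python
class _Recorder:
    """Stand-in for session.xenapi that records calls instead of sending them."""

    def __init__(self, log, prefix=""):
        self._log = log
        self._prefix = prefix

    def __getattr__(self, name):
        # Build dotted method names such as "VM.set_is_a_template".
        full = f"{self._prefix}.{name}" if self._prefix else name
        return _Recorder(self._log, full)

    def __call__(self, *args):
        self._log.append((self._prefix, args))


class StubSession:
    """Hypothetical replacement for a real XenAPI session, for illustration."""

    def __init__(self):
        self.calls = []
        self.xenapi = _Recorder(self.calls)


def finalize_template(session, vm_ref):
    # Drop the provisioning hint first, so the resulting template is treated
    # as one that already has an installed system on disk.
    session.xenapi.VM.remove_from_other_config(vm_ref, "disks")
    # Then flag the VM as a template.
    session.xenapi.VM.set_is_a_template(vm_ref, True)


session = StubSession()
finalize_template(session, "OpaqueRef:example")
for method, args in session.calls:
    print(method, args)
```

Against a real host, the same two calls would be issued through a logged-in XenAPI session instead of `StubSession`.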
Maybe it's another argument to get a Packer plugin on top of XO API to hide those implementation details…
-
@olivierlambert I think I have found a fix on the Packer side like you were talking about. I now have working templates being built from the "Ubuntu Focal Fossa 20.04" default template. I created a pull request here; let me know what you think. This is my first excursion with golang.
-
@olivierlambert ah, nice find and that explains the final missing piece. The xenserver packer plugin should definitely follow the same steps as XO and XenCenter.
I definitely agree that it would be worth considering an XO-specific packer plugin, but for now that should address this bug.
-
Release v0.5.1 of ddelnano/packer-plugin-xenserver has been published and includes @bagas's fix for this issue.