DevOps Megathread: what you need and how we can help!
Hello everyone,
We're gathering all your DevOps needs in this thread to ensure our DevOps tools (Terraform, Packer, Pulumi, Ansible, and more) support what matters to you.
Don't hesitate to share what you need! If it's not available yet, we'll do our best to make it happen.
Looking forward to your input
@olivierlambert Wow, we'll surely ask for many things in this thread. Thank you.
Hi there,
We would need an Ansible module for XOA to be able to create virtual machines from a template, the way the VMware one does:

- name: clone VM
  vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: "{{ vcenter_validate_certs }}"
    datacenter: "{{ vcenter_datacenter }}"
    cluster: "{{ vcenter_cluster }}"
    name: SRV-NAMEVM
    folder: FOLDERTEST
    template: "{{ vm_template }}"
    networks:
      - name: LAN NETWORK
        ip: "{{ new_ip }}"
        netmask: "{{ netmask }}"
        gateway: "{{ gateway }}"
        domain: "{{ domain }}"
    wait_for_ip_address: True
    customization:
      hostname: "{{ hostname }}"
      domain: "{{ domain }}"
      dns_servers:
        - "{{ dns1 }}"
        - "{{ dns2 }}"
      dns_suffix:
        - "{{ domain }}"
    state: poweredon
Same for retrieving information about the VM and deleting it from Ansible:

- name: INFO VM
  vmware_guest_info:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: "{{ vcenter_validate_certs }}"
    datacenter: "{{ vcenter_datacenter }}"
    name: SRV-NAMEVM
  delegate_to: localhost
  register: vm_info

- name: Shutdown VM...
  vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: "{{ vcenter_validate_certs }}"
    datacenter: "{{ vcenter_datacenter }}"
    name: SRV-NAMEVM
    state: poweredoff

- name: delete VM...
  vmware_guest:
    hostname: "{{ vcenter_hostname }}"
    username: "{{ vcenter_username }}"
    password: "{{ vcenter_password }}"
    validate_certs: "{{ vcenter_validate_certs }}"
    datacenter: "{{ vcenter_datacenter }}"
    name: SRV-NAMEVM
    state: absent
@kiu Thanks, it's in our backlog. We'll update you when it moves to planned tasks.
The existing technical documentation is great. An operations guide would be helpful. Here are a couple of chapter ideas:
- Best practices for setting up your environment (e.g., odd number of hosts, isolate the management network, treat your hosts like cattle, etc.)
- Preparing for disasters (e.g., requirements for restoring pool metadata, how to recover a single VM if your pool data is gone, etc.)
Hello everyone,
I just got the news email about the DevOps team.
I recently used the offsite backup replication method for the full backups only (there is an option to enable this). But this is still not handy, because I only need the last 3 full backups to go offsite.
So to do this, I created a separate backup job (weekly, full only, retention 3) on another backup repository, so that I can choose this repository for the offsite backup replication job.
XOA is great! I'm just saying there is room for improvement in the options for offsite backups.
My main task is to create a base VM image or update an existing one, apply a few system tweaks, and sometimes change the disk volume.
I do most things with Ansible, but not VM creation. I tried to get into Terraform/Packer, but there are almost no how-tos about it. There are also the license scandals and the new forks. I'll wait to see who survives.
Have you checked ?
@olivierlambert Bookmarked. I have no time for this right now.
Backup management with the Terraform provider would be a great feature. Maybe also for an upcoming Ansible module. I always struggle to find the right backup for a VM, since I grouped them into logical groups, so one backup job may handle multiple VMs. Sometimes it would just be easier to edit some IaC than to use the GUI, especially when I destroy a VM: I always forget to check whether backups exist for it.
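To illustrate the "which backup job covers this VM?" check described above, here is a minimal sketch in Python. The job structure (a `name` plus a list of VM ids under `vms`) is a hypothetical shape chosen for illustration. It is not the actual Xen Orchestra job schema, and fetching the jobs from XO is left out entirely.

```python
from typing import Dict, List


def jobs_covering_vm(jobs: List[Dict], vm_id: str) -> List[str]:
    """Return the names of the backup jobs whose VM list contains vm_id."""
    return [job["name"] for job in jobs if vm_id in job.get("vms", [])]


# Hypothetical job data: names and VM ids are made up for the example.
jobs = [
    {"name": "web-tier-nightly", "vms": ["vm-a", "vm-b"]},
    {"name": "db-weekly-full", "vms": ["vm-c"]},
    {"name": "all-prod-delta", "vms": ["vm-a", "vm-c"]},
]

# Before destroying "vm-a", list every job that still references it.
print(jobs_covering_vm(jobs, "vm-a"))
```

A check like this could run as a pre-destroy step in a pipeline, so a VM is never removed while a job still lists it.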
@bufanda I think we'll be able to add backup support to Terraform once 1. the provider uses the new REST API, and 2. this API offers endpoints for backup management. I took note. (This won't be done in minutes.)
About Ansible, it will also depend on if/when we start work on it.
@Davidj-0 Let me ping @thomas-dkmt about the docs.
@nathanael-h said in DevOps Megathread: what you need and how we can help!:
@bufanda I think we'll be able to add backup support to Terraform once 1. the provider uses the new REST API, and 2. this API offers endpoints for backup management. I took note. (This won't be done in minutes.)
About Ansible, it will also depend on if/when we start work on it.
No worries, it's not a critical issue for me. It's just been on my wish list, and I was already about to look into it myself, either by writing an Ansible module for it or by finally having a reason to learn Go. But as you said, the current API has some limitations in that regard.
I use Ansible on XCP-ng hosts for a few things:
- basic server configuration (hostname, syslog target, NTP, DNS, NUT service if needed, Dell OpenManage, ...)
- network creation (with VLANs on existing PIFs)
- storage creation (on local storage, it runs the xe sr-create command on a user-created partition)
- VM creation (including bootstrap in a dedicated iPXE VLAN and network interfaces recreation after the initial iPXE boot to put the VM back on the right network). To do this my ansible playbook runs a lot of xe commands on the dom0 (vm-install/vm-memory-limits-set/vm-param-set/vdi-create/vdi-param-set/vif-create/...)
Most of my pools are single-host pools with local storage. Almost all my VMs are Debian VMs.
This process has worked well since XenServer 7.2, but it is not future-proof, since Ansible dropped support for Python 2.x / 3.6 in its latest release and XCP-ng doesn't provide a newer python3 release (yet?).
I understand that the goal is to do as few things in the dom0 as possible and use Xen Orchestra and its APIs for everything.
I tried to use custom templates with cloud-init, but I stopped using them, since they need to be stored on each pool, whereas an NFS ISO storage can be shared between pools, and iPXE boot doesn't need a "full template" with a VDI.
I would like to have a few Ansible modules to manage VMs / hosts / storage through Xen Orchestra: to create / resize / delete VMs, and to create networks and storage.
I do not have any asks ATM, but I thought I would just share my plan that I use to create k8s clusters that we have been using for a while now.
It has grown over time and may be a bit messy, but I figured it's better than nothing. We use this for RKE2 Rancher k8s clusters deployed onto our XCP-ng cluster. We use XOSTOR for drives, and the VLAN5 network is for the Piraeus operator to use for PVs. We also use IPVS. We are using a Rocky Linux 9 VM template.
If these are useful to anyone and they have questions I will do my best to answer.
variable "pool" {
  default = "OVBH-PROD-XENPOOL04"
}

variable "network0" {
  default = "Native vRack"
}

variable "network1" {
  default = "VLAN80"
}

variable "network2" {
  default = "VLAN5"
}

variable "cluster_name" {
  default = "Production K8s Cluster"
}

variable "enrollment_command" {
  default = "curl -fL https://rancher.<redacted>.net/ | sudo sh -s - --server https://rancher.<redacted>.net --label '' --token <redacted>"
}

variable "node_type" {
  description = "Node type flag"
  default = {
    "1" = "--etcd --controlplane",
    "2" = "--etcd --controlplane",
    "3" = "--etcd --controlplane",
    "4" = "--worker",
    "5" = "--worker",
    "6" = "--worker",
    "7" = "--worker --taints smtp=true:NoSchedule",
    "8" = "--worker --taints smtp=true:NoSchedule",
    "9" = "--worker --taints smtp=true:NoSchedule"
  }
}

variable "node_networks" {
  description = "Node network flag"
  default = {
    "1" = "--internal-address --address <redacted>",
    "2" = "--internal-address --address <redacted>",
    "3" = "--internal-address --address <redacted>",
    "4" = "--internal-address --address <redacted>",
    "5" = "--internal-address --address <redacted>",
    "6" = "--internal-address --address <redacted>",
    "7" = "--internal-address --address <redacted>",
    "8" = "--internal-address --address <redacted>",
    "9" = "--internal-address --address <redacted>"
  }
}

variable "vm_name" {
  description = "Node type flag"
  default = {
    "1" = "OVBH-VPROD-K8S01-MASTER01",
    "2" = "OVBH-VPROD-K8S01-MASTER02",
    "3" = "OVBH-VPROD-K8S01-MASTER03",
    "4" = "OVBH-VPROD-K8S01-WORKER01",
    "5" = "OVBH-VPROD-K8S01-WORKER02",
    "6" = "OVBH-VPROD-K8S01-WORKER03",
    "7" = "OVBH-VPROD-K8S01-WORKER04",
    "8" = "OVBH-VPROD-K8S01-WORKER05",
    "9" = "OVBH-VPROD-K8S01-WORKER06"
  }
}

variable "preferred_host" {
  default = {
    "1" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
    "2" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
    "3" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866",
    "4" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
    "5" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
    "6" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866",
    "7" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
    "8" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
    "9" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866"
  }
}

variable "xoa_admin_password" {
}

variable "host_count" {
  description = "All drives go to xostor"
  default = {
    "1" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "2" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "3" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "4" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "5" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "6" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "7" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "8" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
    "9" = "479ca676-20a1-4051-7189-a4a9ca47e00d"
  }
}

variable "network1_ip_mapping" {
  description = "Mapping for network1 ips, vlan80"
  default = {
    "1" = "",
    "2" = "",
    "3" = "",
    "4" = "",
    "5" = "",
    "6" = "",
    "7" = "",
    "8" = "",
    "9" = ""
  }
}

variable "network1_gateway" {
  description = "Mapping for public ip gateways, from hosts"
  default = ""
}

variable "network1_prefix" {
  description = "Prefix for the network used"
  default = "22"
}

variable "network2_ip_mapping" {
  description = "Mapping for network2 ips, VLAN5"
  default = {
    "1" = "",
    "2" = "",
    "3" = "",
    "4" = "",
    "5" = "",
    "6" = "",
    "7" = "",
    "8" = "",
    "9" = ""
  }
}

variable "network2_prefix" {
  description = "Prefix for the network used"
  default = "22"
}

variable "network0_ip_mapping" {
  description = "Mapping for network0 ips, public"
  default = { <redacted> }
}

variable "network0_gateway" {
  description = "Mapping for public ip gateways, from hosts"
  default = { <redacted> }
}

variable "network0_prefix" {
  description = "Prefix for the network used"
  default = { <redacted> }
}

# Instruct terraform to download the provider on `terraform init`
terraform {
  required_providers {
    xenorchestra = {
      source = "vatesfr/xenorchestra"
      version = "~> 0.29.0"
    }
  }
}

# Configure the XenServer Provider
provider "xenorchestra" {
  # Must be ws or wss
  url = "ws://" # Or set XOA_URL environment variable
  username = "" # Or set XOA_USER environment variable
  password = var.xoa_admin_password # Or set XOA_PASSWORD environment variable
}

data "xenorchestra_pool" "pool" {
  name_label = var.pool
}

data "xenorchestra_template" "template" {
  name_label = "Rocky Linux 9 Template"
  pool_id =
}

data "xenorchestra_network" "net1" {
  name_label = var.network1
  pool_id =
}

data "xenorchestra_network" "net2" {
  name_label = var.network2
  pool_id =
}

data "xenorchestra_network" "net0" {
  name_label = var.network0
  pool_id =
}

resource "xenorchestra_cloud_config" "node" {
  count = 9
  name = "${lower(lookup(var.vm_name, count.index + 1))}_cloud_config"
  template = <<EOF
#cloud-config
ssh_authorized_keys:
  - ssh-rsa <redacted>
write_files:
  - path: /etc/NetworkManager/conf.d/rke2-canal.conf
    permissions: '0755'
    owner: root
    content: |
      [keyfile]
      unmanaged-devices=interface-name:cali*;interface-name:flannel*
  - path: /tmp/selinux_kmod_drbd.log
    permissions: '0640'
    owner: root
    content: |
      type=AVC msg=audit(1661803314.183:778): avc: denied { module_load } for pid=148256 comm="insmod" path="/tmp/ko/drbd.ko" dev="overlay" ino=101839829 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=system_u:object_r:var_lib_t:s0 tclass=system permissive=0
      type=AVC msg=audit(1661803314.185:779): avc: denied { module_load } for pid=148257 comm="insmod" path="/tmp/ko/drbd_transport_tcp.ko" dev="overlay" ino=101839831 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=system_u:object_r:var_lib_t:s0 tclass=system permissive=0
  - path: /etc/sysconfig/modules/ipvs.modules
    permissions: 0755
    owner: root
    content: |
      #!/bin/bash
      modprobe -- ip_vs
      modprobe -- ip_vs_rr
      modprobe -- ip_vs_wrr
      modprobe -- ip_vs_sh
      modprobe -- nf_conntrack
  - path: /etc/modules-load.d/ipvs.conf
    permissions: 0755
    owner: root
    content: |
      ip_vs
      ip_vs_rr
      ip_vs_wrr
      ip_vs_sh
      nf_conntrack
#cloud-init
runcmd:
  - sudo hostnamectl set-hostname --static ${lower(lookup(var.vm_name, count.index + 1))}.<redacted>.com
  - sudo hostnamectl set-hostname ${lower(lookup(var.vm_name, count.index + 1))}.<redacted>.com
  - nmcli -t -f NAME con show | xargs -d '\n' -I {} nmcli con delete "{}"
  - nmcli con add type ethernet con-name public ifname enX0
  - nmcli con mod public ipv4.address '${lookup(var.network0_ip_mapping, count.index + 1)}/${lookup(var.network0_prefix, count.index + 1)}'
  - nmcli con mod public ipv4.method manual
  - nmcli con mod public ipv4.ignore-auto-dns yes
  - nmcli con mod public ipv4.gateway '${lookup(var.network0_gateway, count.index + 1)}'
  - nmcli con mod public ipv4.dns ""
  - nmcli con mod public connection.autoconnect true
  - nmcli con up public
  - nmcli con add type ethernet con-name vlan80 ifname enX1
  - nmcli con mod vlan80 ipv4.address '${lookup(var.network1_ip_mapping, count.index + 1)}/${var.network1_prefix}'
  - nmcli con mod vlan80 ipv4.method manual
  - nmcli con mod vlan80 ipv4.ignore-auto-dns yes
  - nmcli con mod vlan80 ipv4.ignore-auto-routes yes
  - nmcli con mod vlan80 ipv4.gateway '${var.network1_gateway}'
  - nmcli con mod vlan80 ipv4.dns "${var.network1_gateway}"
  - nmcli con mod vlan80 connection.autoconnect true
  - nmcli con mod vlan80 ipv4.never-default true
  - nmcli con mod vlan80 ipv6.never-default true
  - nmcli con mod vlan80 ipv4.routes " ${var.network1_gateway}"
  - nmcli con up vlan80
  - nmcli con add type ethernet con-name vlan5 ifname enX2
  - nmcli con mod vlan5 ipv4.address '${lookup(var.network2_ip_mapping, count.index + 1)}/${var.network2_prefix}'
  - nmcli con mod vlan5 ipv4.method manual
  - nmcli con mod vlan5 ipv4.ignore-auto-dns yes
  - nmcli con mod vlan5 ipv4.ignore-auto-routes yes
  - nmcli con mod vlan5 connection.autoconnect true
  - nmcli con mod vlan5 ipv4.never-default true
  - nmcli con mod vlan5 ipv6.never-default true
  - nmcli con up vlan5
  - systemctl restart NetworkManager
  - dnf upgrade -y
  - dnf install ipset ipvsadm -y
  - bash /etc/sysconfig/modules/ipvs.modules
  - dnf install chrony -y
  - sudo systemctl enable --now chronyd
  - yum install kernel-devel kernel-headers -y
  - yum install elfutils-libelf-devel -y
  - swapoff -a
  - modprobe -- ip_tables
  - systemctl disable --now firewalld.service
  - systemctl disable --now rngd
  - dnf config-manager --add-repo=
  - dnf install tar -y
  - dnf install policycoreutils-python-utils -y
  - cat /tmp/selinux_kmod_drbd.log | sudo audit2allow -M insmoddrbd
  - sudo semodule -i insmoddrbd.pp
  - ${var.enrollment_command} ${lookup(var.node_type, count.index + 1)} ${lookup(var.node_networks, count.index + 1)}
bootcmd:
  - swapoff -a
  - modprobe -- ip_tables
EOF
}

resource "xenorchestra_vm" "master" {
  count = 3
  cpus = 4
  memory_max = 8589934592
  cloud_config = xenorchestra_cloud_config.node[count.index].template
  name_label = lookup(var.vm_name, count.index + 1)
  name_description = "${var.cluster_name} master"
  template =
  auto_poweron = true
  affinity_host = lookup(var.preferred_host, count.index + 1)

  network {
    network_id =
  }

  network {
    network_id =
  }

  network {
    network_id =
  }

  disk {
    sr_id = lookup(var.host_count, count.index + 1)
    name_label = "Terraform_disk_imavo"
    size = 107374182400
  }
}

resource "xenorchestra_vm" "worker" {
  count = 3
  cpus = 32
  memory_max = 68719476736
  cloud_config = xenorchestra_cloud_config.node[count.index + 3].template
  name_label = lookup(var.vm_name, count.index + 3 + 1)
  name_description = "${var.cluster_name} worker"
  template =
  auto_poweron = true
  affinity_host = lookup(var.preferred_host, count.index + 3 + 1)

  network {
    network_id =
  }

  network {
    network_id =
  }

  network {
    network_id =
  }

  disk {
    sr_id = lookup(var.host_count, count.index + 3 + 1)
    name_label = "Terraform_disk_imavo"
    size = 322122547200
  }
}

resource "xenorchestra_vm" "smtp" {
  count = 3
  cpus = 4
  memory_max = 8589934592
  cloud_config = xenorchestra_cloud_config.node[count.index + 6].template
  name_label = lookup(var.vm_name, count.index + 6 + 1)
  name_description = "${var.cluster_name} smtp worker"
  template =
  auto_poweron = true
  affinity_host = lookup(var.preferred_host, count.index + 6 + 1)

  network {
    network_id =
  }

  network {
    network_id =
  }

  network {
    network_id =
  }

  disk {
    sr_id = lookup(var.host_count, count.index + 6 + 1)
    name_label = "Terraform_disk_imavo"
    size = 53687091200
  }
}
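For anyone adapting the plan above, the index arithmetic is the part that is easiest to get wrong. Each of the three `xenorchestra_vm` resources has `count = 3` and adds an offset (0, 3, or 6) to `count.index` to pick its cloud-config, plus 1 more to build the lookup key into the `"1"`..`"9"` maps. The small Python sketch below only mirrors that arithmetic, so you can check that all nine nodes are covered exactly once. The function and variable names are mine and are not part of the plan.

```python
from typing import Dict, List, Tuple


def node_indices(group_offset: int, count: int = 3) -> List[Tuple[int, str]]:
    """Mirror Terraform's count.index for one resource block.

    Returns (cloud_config_index, lookup_key) pairs, where lookup_key is the
    key used in maps such as vm_name or preferred_host.
    """
    return [(i + group_offset, str(i + group_offset + 1)) for i in range(count)]


# Offsets taken from the plan: master uses +0, worker +3, smtp +6.
groups: Dict[str, int] = {"master": 0, "worker": 3, "smtp": 6}
plan = {name: node_indices(offset) for name, offset in groups.items()}

for name, pairs in plan.items():
    print(name, pairs)

# Every key "1".."9" should be used exactly once across the three groups.
used_keys = [key for pairs in plan.values() for _, key in pairs]
assert sorted(used_keys) == [str(n) for n in range(1, 10)]
```

If you change the number of nodes in one group, the offsets of the later groups have to move with it, which is the main maintenance cost of this layout.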
@nathanael-h Nice
If you have any questions let me know, I have been using this for all our on prem clusters for a while now.
It would be nice to be able to create a schedule, or have some other way of automatically cleaning up VM templates within XO/XCP-ng, with the ability to set a retention policy like for backups.
For example, we have a GitHub workflow that runs daily, weekly, and monthly to create base Ubuntu/Windows Server images with the latest updates. Those templates have a tag with the build date. We then use those templates in other pipelines to test k3s cluster updates with the Terraform provider.
So far I have not been able to find any way to automate this within XO or XCP-ng without external systems. Currently I go in daily and clean up the templates that exceed our desired retention policy (similar to our backup retention policy). I would also like to note that we have the Vates VMS Enterprise plan through my work account.
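As a rough sketch of the retention logic described above: given templates and the build date from their tag, keep the newest N and list the rest for deletion. The template names and the idea of passing the parsed dates in as a dict are assumptions for the example. Listing and deleting templates in XO is left out.

```python
from datetime import date
from typing import Dict, List


def templates_to_delete(templates: Dict[str, date], keep: int) -> List[str]:
    """Return the names of templates beyond the `keep` newest, oldest first."""
    if keep < 0:
        raise ValueError("keep must be zero or positive")
    newest_first = sorted(templates, key=templates.get, reverse=True)
    return sorted(newest_first[keep:], key=templates.get)


# Hypothetical templates, keyed by name, with the build date from their tag.
templates = {
    "ubuntu-base-2024-05-01": date(2024, 5, 1),
    "ubuntu-base-2024-05-02": date(2024, 5, 2),
    "ubuntu-base-2024-05-03": date(2024, 5, 3),
    "ubuntu-base-2024-05-04": date(2024, 5, 4),
}

print(templates_to_delete(templates, keep=2))
# -> ['ubuntu-base-2024-05-01', 'ubuntu-base-2024-05-02']
```

The same function could run at the end of the image build workflow, so the cleanup happens right after a new template is published.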
I would also love to see some more work on the Packer provider, mainly on the XVA builder. We have our base Ubuntu templates, but we would like to be able to take such a template and build another template on top of it, for example with the k3s binary installed, so that we don't have to download and install the binary or other tooling on each VM in a cluster using Terraform or Ansible.
Lastly it would be nice to have some more frequent updates to the Terraform provider. I am aware that there are updates still being pushed to the main branch but the last release was published on Mar 20, 2024.
@bufanda said in DevOps Megathread: what you need and how we can help!:
Backup management with the Terraform provider would be a great feature. Maybe also for an upcoming Ansible module. I always struggle to find the right backup for a VM, since I grouped them into logical groups, so one backup job may handle multiple VMs. Sometimes it would just be easier to edit some IaC than to use the GUI, especially when I destroy a VM: I always forget to check whether backups exist for it.
@nathanael-h said in DevOps Megathread: what you need and how we can help!:
@bufanda I think we'll be able to add backup support to Terraform once 1. the provider uses the new REST API, and 2. this API offers endpoints for backup management. I took note. (This won't be done in minutes.)
About Ansible, it will also depend on if/when we start work on it.
+1 to backup management through Terraform. It would be great to be able to manage backup jobs and sequences through Terraform.