XCP-ng

    Jonathon

    @Jonathon


    Best posts made by Jonathon

    • RE: XOSTOR hyperconvergence preview

      I have amazing news!

      After the upgrade to xcp-ng 8.3, I retested velero backup, and it all just works 😁

      Completed Backup

      jonathon@jonathon-framework:~$ velero --kubeconfig k8s_configs/production.yaml backup describe grafana-test
      Name:         grafana-test
      Namespace:    velero
      Labels:       objectset.rio.cattle.io/hash=c2b5f500ab5d9b8ffe14f2c70bf3742291df565c
                    velero.io/storage-location=default
      Annotations:  objectset.rio.cattle.io/applied=H4sIAAAAAAAA/4SSQW/bPgzFvwvPtv9OajeJj/8N22HdBqxFL0MPlEQlWmTRkOhgQ5HvPsixE2yH7iji8ffIJ74CDu6ZYnIcoIMTeYpcOf7vtIICji4Y6OB/1MdxgAJ6EjQoCN0rYAgsKI5Dyk9WP0hLIqmi40qjiKfMcRlAq7pBY+py26qmbEi15a5p78vtaqe0oqbVVsO5AI+K/Ju4A6YDdKDXqrVtXaNqzU5traVVY9d6Uyt7t2nW693K2Pa+naABe4IO9hEtBiyFksClmgbUdN06a9NAOtvr5B4DDunA8uR64lGgg7u6rxMUYMji6OWZ/dhTeuIPaQ6os+gTFUA/tR8NmXd+TELxUfNA5hslHqOmBN13OF16ZwvNQShIqpZClYQj7qk6blPlGF5uzC/L3P+kvok7MB9z0OcCXPiLPLHmuLLWCfVfB4rTZ9/iaA5zHovNZz7R++k6JI50q89BXcuXYR5YT0DolkChABEPHWzW9cK+rPQx8jgsH/KQj+QT/frzXCdduc/Ca9u1Y7aaFvMu5Ang5Xz+HQAA//8X7Fu+/QIAAA
                    objectset.rio.cattle.io/id=e104add0-85b4-4eb5-9456-819bcbe45cfc
                    velero.io/resource-timeout=10m0s
                    velero.io/source-cluster-k8s-gitversion=v1.33.4+rke2r1
                    velero.io/source-cluster-k8s-major-version=1
                    velero.io/source-cluster-k8s-minor-version=33
      
      Phase:  Completed
      
      
      Namespaces:
        Included:  grafana
        Excluded:  <none>
      
      Resources:
        Included cluster-scoped:    <none>
        Excluded cluster-scoped:    volumesnapshotcontents.snapshot.storage.k8s.io
        Included namespace-scoped:  *
        Excluded namespace-scoped:  volumesnapshots.snapshot.storage.k8s.io
      
      Label selector:  <none>
      
      Or label selector:  <none>
      
      Storage Location:  default
      
      Velero-Native Snapshot PVs:  true
      Snapshot Move Data:          true
      Data Mover:                  velero
      
      TTL:  720h0m0s
      
      CSISnapshotTimeout:    30m0s
      ItemOperationTimeout:  4h0m0s
      
      Hooks:  <none>
      
      Backup Format Version:  1.1.0
      
      Started:    2025-10-15 15:29:52 -0700 PDT
      Completed:  2025-10-15 15:31:25 -0700 PDT
      
      Expiration:  2025-11-14 14:29:52 -0800 PST
      
      Total items to be backed up:  35
      Items backed up:              35
      
      Backup Item Operations:  1 of 1 completed successfully, 0 failed (specify --details for more information)
      Backup Volumes:
        Velero-Native Snapshots: <none included>
      
        CSI Snapshots:
          grafana/central-grafana:
            Data Movement: included, specify --details for more information
      
        Pod Volume Backups: <none included>
      
      HooksAttempted:  0
      HooksFailed:     0
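The timestamps in the backup output above can be cross-checked with a little arithmetic. This is a minimal sketch using only the Python standard library; the fixed UTC offsets for PDT and PST are assumptions standing in for a real time zone database, and the values are copied from the output. It shows that the backup took about a minute and a half, and that the expiration is the start time plus the 720h TTL, shown one hour "earlier" on the clock because of the PDT to PST change.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are an assumption; a real tool would use a tz database.
PDT = timezone(timedelta(hours=-7), "PDT")
PST = timezone(timedelta(hours=-8), "PST")

# Values copied from the `velero backup describe` output.
started = datetime(2025, 10, 15, 15, 29, 52, tzinfo=PDT)
completed = datetime(2025, 10, 15, 15, 31, 25, tzinfo=PDT)
ttl = timedelta(hours=720)  # TTL: 720h0m0s, i.e. 30 days

# Wall-clock duration of the backup.
duration = completed - started

# Expiration = start + TTL, expressed in PST like the Velero output.
expiration = (started + ttl).astimezone(PST)

print("duration:  ", duration)
print("expiration:", expiration.isoformat())
```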
      

      Completed Restore

      jonathon@jonathon-framework:~$ velero --kubeconfig k8s_configs/production.yaml restore describe restore-grafana-test --details
      Name:         restore-grafana-test
      Namespace:    velero
      Labels:       objectset.rio.cattle.io/hash=252addb3ed156c52d9fa9b8c045b47a55d66c0af
      Annotations:  objectset.rio.cattle.io/applied=H4sIAAAAAAAA/3yRTW7zIBBA7zJrO5/j35gzfE2rtsomymIM45jGBgTjbKLcvaKJm6qL7kDwnt7ABdDpHfmgrQEBZxrJ25W2/85rSOCkjQIBrxTYeoIEJmJUyAjiAmiMZWRtTYhb232Q5EC88tquJDKPFEU6GlpUG5UVZdpUdZ6WZZ+niOtNWtR1SypvqC8buCYwYkfjn7oBwwAC8ipHpbqC1LqqZZWrtse228isrLqywapSdS0z7KPU4EQgwN+mSI8eezSYMgWG22lwKOl7/MgERzJmdChPs9veDL9IGfSbQRcGy+96IjszCCiyCRLQRo6zIrVd5AHEfuHhkIBmmp4d+a/3e9Dl8LPoCZ3T5hg7FvQRcR8nxt6XL7sAgv1MCZztOE+01P23cvmnPYzaxNtwuF4/AwAA//8k6OwC/QEAAA
                    objectset.rio.cattle.io/id=9ad8d034-7562-44f2-aa18-3669ed27ef47
      
      Phase:                       Completed
      Total items to be restored:  33
      Items restored:              33
      
      Started:    2025-10-15 15:35:26 -0700 PDT
      Completed:  2025-10-15 15:36:34 -0700 PDT
      
      Warnings:
        Velero:     <none>
        Cluster:    <none>
        Namespaces:
          grafana-restore:  could not restore, ConfigMap:elasticsearch-es-transport-ca-internal already exists. Warning: the in-cluster version is different than the backed-up version
                            could not restore, ConfigMap:kube-root-ca.crt already exists. Warning: the in-cluster version is different than the backed-up version
      
      Backup:  grafana-test
      
      Namespaces:
        Included:  grafana
        Excluded:  <none>
      
      Resources:
        Included:        *
        Excluded:        nodes, events, events.events.k8s.io, backups.velero.io, restores.velero.io, resticrepositories.velero.io, csinodes.storage.k8s.io, volumeattachments.storage.k8s.io, backuprepositories.velero.io
        Cluster-scoped:  auto
      
      Namespace mappings:  grafana=grafana-restore
      
      Label selector:  <none>
      
      Or label selector:  <none>
      
      Restore PVs:  true
      
      CSI Snapshot Restores:
        grafana-restore/central-grafana:
          Data Movement:
            Operation ID: dd-ffa56e1c-9fd0-44b4-a8bb-8163f40a49e9.330b82fc-ca6a-423217ee5
            Data Mover: velero
            Uploader Type: kopia
      
      Existing Resource Policy:   <none>
      ItemOperationTimeout:       4h0m0s
      
      Preserve Service NodePorts:  auto
      
      Restore Item Operations:
        Operation for persistentvolumeclaims grafana-restore/central-grafana:
          Restore Item Action Plugin:  velero.io/csi-pvc-restorer
          Operation ID:                dd-ffa56e1c-9fd0-44b4-a8bb-8163f40a49e9.330b82fc-ca6a-423217ee5
          Phase:                       Completed
          Progress:                    856284762 of 856284762 complete (Bytes)
          Progress description:        Completed
          Created:                     2025-10-15 15:35:28 -0700 PDT
          Started:                     2025-10-15 15:36:06 -0700 PDT
          Updated:                     2025-10-15 15:36:26 -0700 PDT
      
      HooksAttempted:   0
      HooksFailed:      0
      
      Resource List:
        apps/v1/Deployment:
          - grafana-restore/central-grafana(created)
          - grafana-restore/grafana-debug(created)
        apps/v1/ReplicaSet:
          - grafana-restore/central-grafana-5448b9f65(created)
          - grafana-restore/central-grafana-56887c6cb6(created)
          - grafana-restore/central-grafana-56ddd4f497(created)
          - grafana-restore/central-grafana-5f4757844b(created)
          - grafana-restore/central-grafana-5f69f86c85(created)
          - grafana-restore/central-grafana-64545dcdc(created)
          - grafana-restore/central-grafana-69c66c54d9(created)
          - grafana-restore/central-grafana-6c8d6f65b8(created)
          - grafana-restore/central-grafana-7b479f79ff(created)
          - grafana-restore/central-grafana-bc7d96cdd(created)
          - grafana-restore/central-grafana-cb88bd49c(created)
          - grafana-restore/grafana-debug-556845ff7b(created)
          - grafana-restore/grafana-debug-6fb594cb5f(created)
          - grafana-restore/grafana-debug-8f66bfbf6(created)
        discovery.k8s.io/v1/EndpointSlice:
          - grafana-restore/central-grafana-hkgd5(created)
        networking.k8s.io/v1/Ingress:
          - grafana-restore/central-grafana(created)
        rbac.authorization.k8s.io/v1/Role:
          - grafana-restore/central-grafana(created)
        rbac.authorization.k8s.io/v1/RoleBinding:
          - grafana-restore/central-grafana(created)
        v1/ConfigMap:
          - grafana-restore/central-grafana(created)
          - grafana-restore/elasticsearch-es-transport-ca-internal(failed)
          - grafana-restore/kube-root-ca.crt(failed)
        v1/Endpoints:
          - grafana-restore/central-grafana(created)
        v1/PersistentVolume:
          - pvc-e3f6578f-08b2-4e79-85f0-76bbf8985b55(skipped)
        v1/PersistentVolumeClaim:
          - grafana-restore/central-grafana(created)
        v1/Pod:
          - grafana-restore/central-grafana-cb88bd49c-fc5br(created)
        v1/Secret:
          - grafana-restore/fpinfra-net-cf-cert(created)
          - grafana-restore/grafana(created)
        v1/Service:
          - grafana-restore/central-grafana(created)
        v1/ServiceAccount:
          - grafana-restore/central-grafana(created)
          - grafana-restore/default(skipped)
        velero.io/v2alpha1/DataUpload:
          - velero/grafana-test-nw7zj(skipped)
      

      [Image: working restore pod, with correct data in PV (34d87db1-19ae-4348-8d4e-6599375d7634-image.png)]

      Velero installed from helm: https://vmware-tanzu.github.io/helm-charts
      Version: velero:11.1.0
      Values

      ---
      image:
        repository: velero/velero
        tag: v1.17.0
      
      # Whether to deploy the restic daemonset.
      deployNodeAgent: true
      
      initContainers:
         - name: velero-plugin-for-aws
           image: velero/velero-plugin-for-aws:latest
           imagePullPolicy: IfNotPresent
           volumeMounts:
             - mountPath: /target
               name: plugins
      
      configuration:
        defaultItemOperationTimeout: 2h
        features: EnableCSI
        defaultSnapshotMoveData: true
      
        backupStorageLocation:
          - name: default
            provider: aws
            bucket: velero
            config:
              region: us-east-1
              s3ForcePathStyle: true
              s3Url: https://s3.location
      
        # Destination VSL points to LINSTOR snapshot class
        volumeSnapshotLocation:
          - name: linstor
            provider: velero.io/csi
            config:
              snapshotClass: linstor-vsc
      
      credentials:
        useSecret: true
        existingSecret: velero-user
      
      
      metrics:
        enabled: true
      
        serviceMonitor:
          enabled: true
      
        prometheusRule:
          enabled: true
          # Additional labels to add to deployed PrometheusRule
          additionalLabels: {}
          # PrometheusRule namespace. Defaults to Velero namespace.
          # namespace: ""
          # Rules to be deployed
          spec:
            - alert: VeleroBackupPartialFailures
              annotations:
                message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} partially failed backups.
              expr: |-
                velero_backup_partial_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
              for: 15m
              labels:
                severity: warning
            - alert: VeleroBackupFailures
              annotations:
                message: Velero backup {{ $labels.schedule }} has {{ $value | humanizePercentage }} failed backups.
              expr: |-
                velero_backup_failure_total{schedule!=""} / velero_backup_attempt_total{schedule!=""} > 0.25
              for: 15m
              labels:
                severity: warning
      

      Also create the following.

      apiVersion: snapshot.storage.k8s.io/v1
      kind: VolumeSnapshotClass
      metadata:
        name: linstor-vsc
        labels:
          velero.io/csi-volumesnapshot-class: "true"
      driver: linstor.csi.linbit.com
      deletionPolicy: Delete
      

      We are using the Piraeus operator to use xostor in k8s
      https://github.com/piraeusdatastore/piraeus-operator.git
      Version: v2.9.1
      Values:

      ---
      operator: 
        resources:
          requests:
            cpu: 250m
            memory: 500Mi
          limits:
            memory: 1Gi
      installCRDs: true
      imageConfigOverride:
      - base: quay.io/piraeusdatastore
        components:
          linstor-satellite:
            image: piraeus-server
            tag: v1.29.0
      tls:
        certManagerIssuerRef:
          name: step-issuer
          kind: StepClusterIssuer
          group: certmanager.step.sm
      

      Then we just connect to the xostor cluster as an external linstor controller.

      posted in XOSTOR
      J
      Jonathon
    • RE: XOSTOR hyperconvergence preview

      @stormi

      The problem was the yum cache. Running yum update right after yum update xcp-ng-release-linstor would still fail. To get it working right away, I did the following:

      yum update xcp-ng-release-linstor
      yum clean all
      yum update
      
      posted in XOSTOR
    • RE: XOSTOR hyperconvergence preview

      OK, I figured it out! I made an init container that reads a manually created node label from the node the pod is running on. The label's value is the bare metal host for that k8s node. The init container then takes that value, generates a wrapper script, and the wrapper calls linstor-csi with the correct values. After making these changes, all the linstor csi containers are running with no errors.
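The wrapper-script step can be sketched roughly as follows. This is not the actual init container from the post: the label key `example.com/xcp-host`, the `/linstor-csi` binary path, and the endpoint values are all hypothetical, and the `--node`, `--linstor-endpoint` and `--csi-endpoint` flags are assumed to match the linstor-csi build in use. The node labels are passed in as a plain dictionary, so fetching them from the Kubernetes API is left out.

```python
import shlex

# Hypothetical label key; the post does not say which label was used.
HOST_LABEL = "example.com/xcp-host"


def render_wrapper(node_labels: dict, linstor_endpoint: str, csi_endpoint: str) -> str:
    """Return a shell wrapper that starts linstor-csi under the bare metal host name."""
    host = node_labels.get(HOST_LABEL)
    if not host:
        raise ValueError(f"node is missing the {HOST_LABEL} label")
    args = [
        "/linstor-csi",  # assumed binary path inside the container
        f"--node={host}",
        f"--linstor-endpoint={linstor_endpoint}",
        f"--csi-endpoint={csi_endpoint}",
    ]
    # Quote every argument so odd characters in the label cannot break the script.
    command = " ".join(shlex.quote(arg) for arg in args)
    return f'#!/bin/sh\nexec {command} "$@"\n'


if __name__ == "__main__":
    labels = {HOST_LABEL: "xen01"}  # example value only
    print(render_wrapper(labels, "http://10.0.0.1:3370", "unix:///csi/csi.sock"))
```

The init container would write the returned text to a shared volume and mark it executable, and the main container would run the wrapper instead of calling linstor-csi directly.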

      The current problem comes from deploying and using a storage class. I started with a basic one that failed, and noticed I did not know what the correct storage_pool_name was, so I went to http://IP:3370/v1/nodes/NODE/storage-pools and http://IP:3370/v1/nodes/NODE to get information.
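Looking up the pool names through the REST endpoint mentioned above can be sketched like this. It is a minimal example and not from the original post: it assumes the endpoint returns a JSON list of objects that each carry a `storage_pool_name` field, and the sample payload (including the `example_pool` name) is made up for illustration.

```python
import json
from urllib.request import urlopen


def storage_pool_names(payload: str) -> list:
    """Extract the distinct storage pool names from a storage-pools response body."""
    pools = json.loads(payload)
    return sorted({pool["storage_pool_name"] for pool in pools})


def fetch_storage_pool_names(ip: str, node: str) -> list:
    """Query one node's storage pools from the LINSTOR controller REST API."""
    url = f"http://{ip}:3370/v1/nodes/{node}/storage-pools"
    with urlopen(url, timeout=10) as response:
        return storage_pool_names(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Made-up sample response; real output depends on the cluster.
    sample = json.dumps([
        {"storage_pool_name": "DfltDisklessStorPool", "node_name": "NODE"},
        {"storage_pool_name": "example_pool", "node_name": "NODE"},
    ])
    print(storage_pool_names(sample))
```

One of the returned names is what would go into the storage class parameter for the pool.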

      Still troubleshooting, but wanted to provide info.

      posted in XOSTOR
    • RE: DevOps Megathread: what you need and how we can help!

      @andrewperry I myself migrated our rancher management cluster from the original rke to a new rke2 cluster using this plan not too long ago, so you should not have much trouble. Feel free to ask questions 🙂

      posted in Infrastructure as Code
    • RE: DevOps Megathread: what you need and how we can help!

      @nathanael-h Nice 😄

      If you have any questions let me know, I have been using this for all our on prem clusters for a while now.

      posted in Infrastructure as Code
    • RE: DevOps Megathread: what you need and how we can help!

      I do not have any asks ATM, but I thought I would just share my plan that I use to create k8s clusters that we have been using for a while now.

      It has grown over time and may be a bit messy, but I figured it was better than nothing. We use this for rke2 rancher k8s clusters deployed onto our xcp-ng cluster. We use xostor for drives, and the vlan5 network is for the piraeus operator to use for PVs. We also use IPVS. We are using a rocky linux 9 vm template.

      If these are useful to anyone and they have questions I will do my best to answer.

      variable "pool" {
        default = "OVBH-PROD-XENPOOL04"
      }
      
      variable "network0" {
        default = "Native vRack"
      }
      variable "network1" {
        default = "VLAN80"
      }
      variable "network2" {
        default = "VLAN5"
      }
      
      variable "cluster_name" {
        default = "Production K8s Cluster"
      }
      
      variable "enrollment_command" {
        default = "curl -fL https://rancher.<redacted>.net/system-agent-install.sh | sudo  sh -s - --server https://rancher.<redacted>.net --label 'cattle.io/os=linux' --token <redacted>"
      }
      
      
      variable "node_type" {
        description = "Node type flag"
        default = {
          "1" = "--etcd --controlplane",
          "2" = "--etcd --controlplane",
          "3" = "--etcd --controlplane",
          "4" = "--worker",
          "5" = "--worker",
          "6" = "--worker",
          "7" = "--worker --taints smtp=true:NoSchedule",
          "8" = "--worker --taints smtp=true:NoSchedule",
          "9" = "--worker --taints smtp=true:NoSchedule"
        }
      }
      variable "node_networks" {
        description = "Node network flag"
        default = {
          "1" = "--internal-address 10.1.8.100 --address <redacted>",
          "2" = "--internal-address 10.1.8.101 --address <redacted>",
          "3" = "--internal-address 10.1.8.102 --address <redacted>",
          "4" = "--internal-address 10.1.8.103 --address <redacted>",
          "5" = "--internal-address 10.1.8.104 --address <redacted>",
          "6" = "--internal-address 10.1.8.105 --address <redacted>",
          "7" = "--internal-address 10.1.8.106 --address <redacted>",
          "8" = "--internal-address 10.1.8.107 --address <redacted>",
          "9" = "--internal-address 10.1.8.108 --address <redacted>"
        }
      }
      
      
      variable "vm_name" {
        description = "VM name per node"
        default = {
          "1" = "OVBH-VPROD-K8S01-MASTER01",
          "2" = "OVBH-VPROD-K8S01-MASTER02",
          "3" = "OVBH-VPROD-K8S01-MASTER03",
          "4" = "OVBH-VPROD-K8S01-WORKER01",
          "5" = "OVBH-VPROD-K8S01-WORKER02",
          "6" = "OVBH-VPROD-K8S01-WORKER03",
          "7" = "OVBH-VPROD-K8S01-WORKER04",
          "8" = "OVBH-VPROD-K8S01-WORKER05",
          "9" = "OVBH-VPROD-K8S01-WORKER06"
        }
      }
      
      variable "preferred_host" {
        default = {
          "1" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
          "2" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
          "3" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866",
          "4" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
          "5" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
          "6" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866",
          "7" = "85838113-e4b8-4520-9f6d-8f3cf554c8f1",
          "8" = "783c27ac-2dcb-4798-9ca8-27f5f30791f6",
          "9" = "c03e1a45-4c4c-46f5-a2a1-d8de2e22a866"
        }
      }
      
      variable "xoa_admin_password" {
      }
      
      variable "host_count" {
        description = "All drives go to xostor"
        default = {
          "1" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "2" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "3" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "4" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "5" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "6" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "7" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "8" = "479ca676-20a1-4051-7189-a4a9ca47e00d",
          "9" = "479ca676-20a1-4051-7189-a4a9ca47e00d"
        }
      }
      
      variable "network1_ip_mapping" {
        description = "Mapping for network1 ips, vlan80"
        default = {
          "1" = "10.1.8.100",
          "2" = "10.1.8.101",
          "3" = "10.1.8.102",
          "4" = "10.1.8.103",
          "5" = "10.1.8.104",
          "6" = "10.1.8.105",
          "7" = "10.1.8.106",
          "8" = "10.1.8.107",
          "9" = "10.1.8.108"
        }
      }
      
      variable "network1_gateway" {
        description = "Gateway for network1, vlan80"
        default     = "10.1.8.1"
      }
      
      variable "network1_prefix" {
        description = "Prefix for the network used"
        default     = "22"
      }
      
      variable "network2_ip_mapping" {
        description = "Mapping for network2 ips, VLAN5"
        default = {
          "1" = "10.2.5.30",
          "2" = "10.2.5.31",
          "3" = "10.2.5.32",
          "4" = "10.2.5.33",
          "5" = "10.2.5.34",
          "6" = "10.2.5.35",
          "7" = "10.2.5.36",
          "8" = "10.2.5.37",
          "9" = "10.2.5.38"
        }
      }
      
      
      variable "network2_prefix" {
        description = "Prefix for the network used"
        default     = "22"
      }
      
      variable "network0_ip_mapping" {
        description = "Mapping for network0 ips, public"
        default = {
      <redacted>
        }
      }
      
      variable "network0_gateway" {
        description = "Mapping for public ip gateways, from hosts"
        default = {
      <redacted>
        }
      }
      
      variable "network0_prefix" {
        description = "Prefix for the network used"
        default = {
      <redacted>
        }
      }
      
      # Instruct terraform to download the provider on `terraform init`
      terraform {
        required_providers {
          xenorchestra = {
            source  = "vatesfr/xenorchestra"
            version = "~> 0.29.0"
          }
        }
      }
      
      # Configure the Xen Orchestra provider
      provider "xenorchestra" {
        # Must be ws or wss
        url      = "ws://10.2.0.5"        # Or set XOA_URL environment variable
        username = "admin@admin.net"      # Or set XOA_USER environment variable
        password = var.xoa_admin_password # Or set XOA_PASSWORD environment variable
      }
      
      data "xenorchestra_pool" "pool" {
        name_label = var.pool
      }
      
      data "xenorchestra_template" "template" {
        name_label = "Rocky Linux 9 Template"
        pool_id    = data.xenorchestra_pool.pool.id
      }
      
      data "xenorchestra_network" "net1" {
        name_label = var.network1
        pool_id    = data.xenorchestra_pool.pool.id
      }
      data "xenorchestra_network" "net2" {
        name_label = var.network2
        pool_id    = data.xenorchestra_pool.pool.id
      }
      data "xenorchestra_network" "net0" {
        name_label = var.network0
        pool_id    = data.xenorchestra_pool.pool.id
      }
      
      resource "xenorchestra_cloud_config" "node" {
        count    = 9
        name     = "${lower(lookup(var.vm_name, count.index + 1))}_cloud_config"
        template = <<EOF
      
      #cloud-config
      ssh_authorized_keys:
        - ssh-rsa <redacted>
      
      write_files:
        - path: /etc/NetworkManager/conf.d/rke2-canal.conf
          permissions: '0755'
          owner: root
          content: |
            [keyfile]
            unmanaged-devices=interface-name:cali*;interface-name:flannel*
        - path: /tmp/selinux_kmod_drbd.log
          permissions: '0640'
          owner: root
          content: |
            type=AVC msg=audit(1661803314.183:778): avc:  denied  { module_load } for  pid=148256 comm="insmod" path="/tmp/ko/drbd.ko" dev="overlay" ino=101839829 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=system_u:object_r:var_lib_t:s0 tclass=system permissive=0
            type=AVC msg=audit(1661803314.185:779): avc:  denied  { module_load } for  pid=148257 comm="insmod" path="/tmp/ko/drbd_transport_tcp.ko" dev="overlay" ino=101839831 scontext=system_u:system_r:unconfined_service_t:s0 tcontext=system_u:object_r:var_lib_t:s0 tclass=system permissive=0
        - path: /etc/sysconfig/modules/ipvs.modules
          permissions: '0755'
          owner: root
          content: |
            #!/bin/bash
            modprobe -- ip_vs
            modprobe -- ip_vs_rr
            modprobe -- ip_vs_wrr
            modprobe -- ip_vs_sh
            modprobe -- nf_conntrack
        - path: /etc/modules-load.d/ipvs.conf
          permissions: '0755'
          owner: root
          content: |
            ip_vs
            ip_vs_rr
            ip_vs_wrr
            ip_vs_sh
            nf_conntrack
      
      #cloud-init
      runcmd:
        - sudo hostnamectl set-hostname --static ${lower(lookup(var.vm_name, count.index + 1))}.<redacted>.com
        - sudo hostnamectl set-hostname ${lower(lookup(var.vm_name, count.index + 1))}.<redacted>.com
        - nmcli -t -f NAME con show | xargs -d '\n' -I {} nmcli con delete "{}"
        - nmcli con add type ethernet con-name public ifname enX0
        - nmcli con mod public ipv4.address '${lookup(var.network0_ip_mapping, count.index + 1)}/${lookup(var.network0_prefix, count.index + 1)}'
        - nmcli con mod public ipv4.method manual
        - nmcli con mod public ipv4.ignore-auto-dns yes
        - nmcli con mod public ipv4.gateway '${lookup(var.network0_gateway, count.index + 1)}'
        - nmcli con mod public ipv4.dns "8.8.8.8 8.8.4.4"
        - nmcli con mod public connection.autoconnect true
        - nmcli con up public
        - nmcli con add type ethernet con-name vlan80 ifname enX1
        - nmcli con mod vlan80 ipv4.address '${lookup(var.network1_ip_mapping, count.index + 1)}/${var.network1_prefix}'
        - nmcli con mod vlan80 ipv4.method manual
        - nmcli con mod vlan80 ipv4.ignore-auto-dns yes
        - nmcli con mod vlan80 ipv4.ignore-auto-routes yes
        - nmcli con mod vlan80 ipv4.gateway '${var.network1_gateway}'
        - nmcli con mod vlan80 ipv4.dns "${var.network1_gateway}"
        - nmcli con mod vlan80 connection.autoconnect true
        - nmcli con mod vlan80 ipv4.never-default true
        - nmcli con mod vlan80 ipv6.never-default true
        - nmcli con mod vlan80 ipv4.routes "10.0.0.0/8 ${var.network1_gateway}"
        - nmcli con up vlan80
        - nmcli con add type ethernet con-name vlan5 ifname enX2
        - nmcli con mod vlan5 ipv4.address '${lookup(var.network2_ip_mapping, count.index + 1)}/${var.network2_prefix}'
        - nmcli con mod vlan5 ipv4.method manual
        - nmcli con mod vlan5 ipv4.ignore-auto-dns yes
        - nmcli con mod vlan5 ipv4.ignore-auto-routes yes
        - nmcli con mod vlan5 connection.autoconnect true
        - nmcli con mod vlan5 ipv4.never-default true
        - nmcli con mod vlan5 ipv6.never-default true
        - nmcli con up vlan5
        - systemctl restart NetworkManager
        - dnf upgrade -y
        - dnf install ipset ipvsadm -y
        - bash /etc/sysconfig/modules/ipvs.modules
        - dnf install chrony -y
        - sudo systemctl enable --now chronyd
        - yum install kernel-devel kernel-headers -y
        - yum install elfutils-libelf-devel -y
        - swapoff -a
        - modprobe -- ip_tables
        - systemctl disable --now firewalld.service
        - systemctl disable --now rngd
        - dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
        - dnf install containerd.io tar -y
        - dnf install policycoreutils-python-utils -y
        - cat /tmp/selinux_kmod_drbd.log | sudo audit2allow -M insmoddrbd
        - sudo semodule -i insmoddrbd.pp
        - ${var.enrollment_command} ${lookup(var.node_type, count.index + 1)} ${lookup(var.node_networks, count.index + 1)}
      
      bootcmd:
        - swapoff -a
        - modprobe -- ip_tables
      EOF
      }
      
      resource "xenorchestra_vm" "master" {
        count            = 3
        cpus             = 4
        memory_max       = 8589934592
        cloud_config     = xenorchestra_cloud_config.node[count.index].template
        name_label       = lookup(var.vm_name, count.index + 1)
        name_description = "${var.cluster_name} master"
        template         = data.xenorchestra_template.template.id
        auto_poweron     = true
        affinity_host    = lookup(var.preferred_host, count.index + 1)
      
        network {
          network_id = data.xenorchestra_network.net0.id
        }
        network {
          network_id = data.xenorchestra_network.net1.id
        }
        network {
          network_id = data.xenorchestra_network.net2.id
        }
        disk {
          sr_id      = lookup(var.host_count, count.index + 1)
          name_label = "Terraform_disk_imavo"
          size       = 107374182400
        }
      }
      
      
      resource "xenorchestra_vm" "worker" {
        count            = 3
        cpus             = 32
        memory_max       = 68719476736
        cloud_config     = xenorchestra_cloud_config.node[count.index + 3].template
        name_label       = lookup(var.vm_name, count.index + 3 + 1)
        name_description = "${var.cluster_name} worker"
        template         = data.xenorchestra_template.template.id
        auto_poweron     = true
        affinity_host    = lookup(var.preferred_host, count.index + 3 + 1)
        
        network {
          network_id = data.xenorchestra_network.net0.id
        }
        network {
          network_id = data.xenorchestra_network.net1.id
        }
        network {
          network_id = data.xenorchestra_network.net2.id
        }
        disk {
          sr_id      = lookup(var.host_count, count.index + 3 + 1)
          name_label = "Terraform_disk_imavo"
          size       = 322122547200
        }
      }
      
      resource "xenorchestra_vm" "smtp" {
        count            = 3
        cpus             = 4
        memory_max       = 8589934592
        cloud_config     = xenorchestra_cloud_config.node[count.index + 6].template
        name_label       = lookup(var.vm_name, count.index + 6 + 1)
        name_description = "${var.cluster_name} smtp worker"
        template         = data.xenorchestra_template.template.id
        auto_poweron     = true
        affinity_host    = lookup(var.preferred_host, count.index + 6 + 1)
        
        network {
          network_id = data.xenorchestra_network.net0.id
        }
        network {
          network_id = data.xenorchestra_network.net1.id
        }
        network {
          network_id = data.xenorchestra_network.net2.id
        }
        disk {
          sr_id      = lookup(var.host_count, count.index + 6 + 1)
          name_label = "Terraform_disk_imavo"
          size       = 53687091200
        }
      }
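The index arithmetic in the plan above is easy to get wrong when adding node groups, so here is a small sketch that mirrors it. It is only an illustration in Python, not part of the Terraform plan: the offsets 0, 3 and 6 are the ones added to count.index in the master, worker and smtp resources, the names come from the vm_name variable, and the helper names are made up. It also converts the byte sizes used for memory_max and disk size into GiB.

```python
# Values copied from the vm_name variable in the plan.
VM_NAMES = {
    "1": "OVBH-VPROD-K8S01-MASTER01",
    "2": "OVBH-VPROD-K8S01-MASTER02",
    "3": "OVBH-VPROD-K8S01-MASTER03",
    "4": "OVBH-VPROD-K8S01-WORKER01",
    "5": "OVBH-VPROD-K8S01-WORKER02",
    "6": "OVBH-VPROD-K8S01-WORKER03",
    "7": "OVBH-VPROD-K8S01-WORKER04",
    "8": "OVBH-VPROD-K8S01-WORKER05",
    "9": "OVBH-VPROD-K8S01-WORKER06",
}

# Offsets added to count.index by each xenorchestra_vm resource.
OFFSETS = {"master": 0, "worker": 3, "smtp": 6}


def lookup_key(resource: str, count_index: int) -> str:
    """Key used in lookup(var.vm_name, ...): count.index + offset + 1."""
    return str(count_index + OFFSETS[resource] + 1)


def cloud_config_index(resource: str, count_index: int) -> int:
    """Index into xenorchestra_cloud_config.node[...]: count.index + offset."""
    return count_index + OFFSETS[resource]


def to_gib(size_bytes: int) -> float:
    """Convert a byte count such as memory_max or disk size to GiB."""
    return size_bytes / 2**30


if __name__ == "__main__":
    for resource in OFFSETS:
        for i in range(3):
            key = lookup_key(resource, i)
            print(resource, i, VM_NAMES[key], "cloud_config index", cloud_config_index(resource, i))
    print("worker memory_max GiB:", to_gib(68719476736))
    print("worker disk GiB:", to_gib(322122547200))
```

Because each cloud config is named from key `count.index + 1` and each VM uses the cloud config at `count.index + offset`, the two always refer to the same node as long as the offsets stay in step with the counts.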
      
      posted in Infrastructure as Code

    Latest posts made by Jonathon

    • RE: Ran into a new auth issue with xostor?

      @Mathieu-L

      linstor n l was included in my original post.
      All nodes were updated to May 2026 Security and Maintenance Updates for XCP-ng 8.3 LTS, all nodes were restarted.
      May 2026 Updates #2 for XCP-ng 8.3 LTS was released, and a couple of days later I installed it on all hosts. No host was restarted.

      When xen04 was restarted, that is when this issue happened.
      I had used systemctl restart linstor-controller here (https://xcp-ng.org/forum/post/105309) to restart the controller.

      posted in XOSTOR
    • RE: Ran into a new auth issue with xostor?

      After looking at things some more and not seeing anything else I could do, I restarted the controller and satellites. This allowed things to recover.

      posted in XOSTOR
    • Ran into a new auth issue with xostor?

      Something went wrong with a host (xen04) and it rebooted. After the reboot it is behaving weirdly. Rebooting again does not resolve the issue.

      Attempting to start a VM with an xostor VDI results in the following:

      vm.start
      {
        "id": "3db40547-fcbf-35b1-4f1d-fc29ca851a57",
        "bypassMacAddressesCheck": false,
        "force": false,
        "host": "3aa66f69-ea6f-465a-83a7-c2c1c43eb3e3"
      }
      {
        "code": "SR_BACKEND_FAILURE_1200",
        "params": [
          "",
          "[Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0'",
          ""
        ],
        "task": {
          "uuid": "484997e0-b959-3f38-5711-0e1f14031fea",
          "name_label": "Async.VM.start_on",
          "name_description": "",
          "allowed_operations": [],
          "current_operations": {},
          "created": "20260511T18:43:18Z",
          "finished": "20260511T18:44:44Z",
          "status": "failure",
          "resident_on": "OpaqueRef:1f61b22b-05b3-4724-9805-284d1079c6f7",
          "progress": 1,
          "type": "<none/>",
          "result": "",
          "error_info": [
            "SR_BACKEND_FAILURE_1200",
            "",
            "[Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0'",
            ""
          ],
          "other_config": {
            "debug_info:cancel_points_seen": "1"
          },
          "subtask_of": "OpaqueRef:NULL",
          "subtasks": [],
          "backtrace": "(((process xapi)(filename ocaml/xapi-client/client.ml)(line 7))((process xapi)(filename ocaml/xapi-client/client.ml)(line 19))((process xapi)(filename ocaml/xapi-client/client.ml)(line 7879))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 144))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 1990))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 1974))((process xapi)(filename ocaml/xapi/rbac.ml)(line 228))((process xapi)(filename ocaml/xapi/rbac.ml)(line 238))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 78)))"
        },
        "message": "SR_BACKEND_FAILURE_1200(, [Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0', )",
        "name": "XapiError",
        "stack": "XapiError: SR_BACKEND_FAILURE_1200(, [Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0', )
          at XapiError.wrap (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/_XapiError.mjs:16:12)
          at default (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/_getTaskResult.mjs:13:29)
          at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1078:24)
          at file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1112:14
          at Array.forEach (<anonymous>)
          at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1102:12)
          at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1275:14)"
      }
      

      However, another VM with an XOSTOR VDI did start.
      7af8a0b6-3276-4fd0-81f0-f669ff93d5aa-image.jpeg

      When I look at that resource in linstor/xostor

      jonathon@jonathon-framework:~$ linstor --controllers=10.2.0.11 r l | grep -e 'xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c'
      | xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen01                         | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04)                                                           |     UpToDate | 2025-05-23 13:49:57 |
      | xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen02                         | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04)                                                           |     UpToDate | 2025-05-23 13:49:57 |
      | xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen04                         | DRBD,STORAGE | Unused | StandAlone(ovbh-pprod-xen02,ovbh-pprod-xen01)                                          |     UpToDate | 2025-05-23 13:49:57 |
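For anyone checking many resources at once, here is a rough Python sketch (not part of the original post) that pulls the connection column out of `linstor r l` rows like the ones above. The function names are made up for illustration, the column order is assumed from this output, and it assumes healthy rows report `Ok` in that column.

```python
# Sketch only: column layout is assumed from the `linstor r l` rows shown above.
SAMPLE = """\
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen01 | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04) | UpToDate | 2025-05-23 13:49:57 |
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen02 | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04) | UpToDate | 2025-05-23 13:49:57 |
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen04 | DRBD,STORAGE | Unused | StandAlone(ovbh-pprod-xen02,ovbh-pprod-xen01) | UpToDate | 2025-05-23 13:49:57 |
"""


def parse_resource_rows(text):
    """Turn pipe-delimited resource rows into dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip borders, headers, and anything that is not an XOSTOR volume row.
        if len(cells) < 6 or not cells[0].startswith("xcp-"):
            continue
        rows.append({
            "resource": cells[0],
            "node": cells[1],
            "usage": cells[3],
            "conns": cells[4],
            "state": cells[5],
        })
    return rows


def broken_connections(rows):
    """Return (node, conns) pairs whose connection column is not 'Ok'."""
    return [(r["node"], r["conns"]) for r in rows if r["conns"] not in ("", "Ok")]


if __name__ == "__main__":
    for node, conns in broken_connections(parse_resource_rows(SAMPLE)):
        print(node, "->", conns)
```

Run against the sample, this lists all three nodes, with xen04 being the only one in StandAlone.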
      

      Restarting the satellite on xen04 does not help.

      jonathon@jonathon-framework:~$ linstor --controllers=10.2.0.11 n l
      ╭──────────────────────────────────────────────────────────────────────────────────────────╮
      ┊ Node                                     ┊ NodeType  ┊ Addresses               ┊ State   ┊
      ╞══════════════════════════════════════════════════════════════════════════════════════════╡
      ┊ ovbh-pprod-xen01                         ┊ COMBINED  ┊ 10.2.0.10:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-pprod-xen02                         ┊ COMBINED  ┊ 10.2.0.11:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-pprod-xen03                         ┊ COMBINED  ┊ 10.2.0.12:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-pprod-xen04                         ┊ COMBINED  ┊ 10.2.0.13:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-pprod-xen05                         ┊ COMBINED  ┊ 10.2.0.14:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vprod-k8s01-worker01.example.com    ┊ SATELLITE ┊ 10.1.8.103:3366 (PLAIN) ┊ Online  ┊
      ┊ ovbh-vprod-k8s01-worker02.example.com    ┊ SATELLITE ┊ 10.1.8.104:3366 (PLAIN) ┊ Online  ┊
      ┊ ovbh-vprod-k8s01-worker03.example.com    ┊ SATELLITE ┊ 10.1.8.105:3366 (PLAIN) ┊ Online  ┊
      ┊ ovbh-vprod-k8s01-worker10.example.com    ┊ SATELLITE ┊ 10.1.8.112:3366 (PLAIN) ┊ OFFLINE ┊
      ┊ ovbh-vprod-k8s01-worker13.example.com    ┊ SATELLITE ┊ 10.1.8.115:3366 (PLAIN) ┊ Online  ┊
      ┊ ovbh-vprod-rancher01.example.com         ┊ SATELLITE ┊ 10.1.8.41:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vprod-rancher02.example.com         ┊ SATELLITE ┊ 10.1.8.42:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vprod-rancher03.example.com         ┊ SATELLITE ┊ 10.1.8.43:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vtest-k8s01-worker01.example.com    ┊ SATELLITE ┊ 10.1.8.64:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vtest-k8s01-worker02.example.com    ┊ SATELLITE ┊ 10.1.8.65:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vtest-k8s01-worker03.example.com    ┊ SATELLITE ┊ 10.1.8.66:3366 (PLAIN)  ┊ Online  ┊
      ┊ ovbh-vtest-k8s01-worker04.example.com    ┊ SATELLITE ┊ 10.1.8.60:3366 (PLAIN)  ┊ OFFLINE ┊
      ┊ ovbh-vtest-k8s01-worker05.example.com    ┊ SATELLITE ┊ 10.1.8.59:3366 (PLAIN)  ┊ Online  ┊
      ╰──────────────────────────────────────────────────────────────────────────────────────────╯
      

      Looking at logs on xen04

      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( StandAlone -> Unconnected ) [connect]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Starting receiver thread (peer-node-id 0)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Unconnected -> Connecting ) [connecting]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( StandAlone -> Unconnected ) [connect]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Starting receiver thread (peer-node-id 1)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Unconnected -> Connecting ) [connecting]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Handshake to peer 0 successful: Agreed network protocol version 123
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: expected AuthChallenge packet, received: P_PROTOCOL (0x000b)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Authentication of peer failed
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Connecting -> Disconnecting )
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Terminating sender thread
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Starting sender thread (peer-node-id 0)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Handshake to peer 1 successful: Agreed network protocol version 123
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: expected AuthChallenge packet, received: P_PROTOCOL (0x000b)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Authentication of peer failed
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Connecting -> Disconnecting )
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Terminating sender thread
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Starting sender thread (peer-node-id 1)
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Connection closed
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: helper command: /sbin/drbdadm disconnected
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Connection closed
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: helper command: /sbin/drbdadm disconnected
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: helper command: /sbin/drbdadm disconnected exit code 0
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Disconnecting -> StandAlone ) [disconnected]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Terminating receiver thread
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: helper command: /sbin/drbdadm disconnected exit code 0
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Disconnecting -> StandAlone ) [disconnected]
      [Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Terminating receiver thread
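The log above shows the handshake succeeding and authentication failing immediately afterwards for both peers. As a rough aid (not from the original post), the Python sketch below counts "Authentication of peer failed" lines per resource and peer in kernel log text. The regular expression is an assumption based only on the line format seen here, and `auth_failures` is a made-up name.

```python
import re
from collections import Counter

# Assumed line shape: "[timestamp] drbd <resource> <peer>: <message>"
LINE_RE = re.compile(r"drbd (?P<res>\S+) (?P<peer>\S+): (?P<msg>.*)$")

SAMPLE = """\
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: expected AuthChallenge packet, received: P_PROTOCOL (0x000b)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Authentication of peer failed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Authentication of peer failed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: helper command: /sbin/drbdadm disconnected
"""


def auth_failures(log_text):
    """Count authentication failures per (resource, peer) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("msg").startswith("Authentication of peer failed"):
            counts[(match.group("res"), match.group("peer"))] += 1
    return counts


if __name__ == "__main__":
    for (res, peer), n in auth_failures(SAMPLE).items():
        print(f"{res} -> {peer}: {n} failure(s)")
```

Feeding it the full `dmesg` output would show whether the failures are limited to one resource or affect every DRBD connection on the host.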
      
      [00:44 ovbh-pprod-xen04 ~]# drbdadm status xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c
      xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c role:Secondary suspended:quorum
        disk:UpToDate quorum:no open:no blocked:upper
        ovbh-pprod-xen01 connection:StandAlone
        ovbh-pprod-xen02 connection:StandAlone
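A similar rough sketch (again not from the original post) for the `drbdadm status` text above: it reports the `quorum:` value and which peers are StandAlone. It only handles the layout shown here, and `summarize_status` is a hypothetical helper name.

```python
SAMPLE = """\
xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c role:Secondary suspended:quorum
  disk:UpToDate quorum:no open:no blocked:upper
  ovbh-pprod-xen01 connection:StandAlone
  ovbh-pprod-xen02 connection:StandAlone
"""


def summarize_status(status_text):
    """Extract the quorum flag and StandAlone peers from `drbdadm status` text."""
    summary = {"quorum": None, "standalone_peers": []}
    for line in status_text.splitlines():
        tokens = line.split()
        for token in tokens:
            # Only the literal "quorum:<value>" token counts, not "suspended:quorum".
            if token.startswith("quorum:"):
                summary["quorum"] = token.split(":", 1)[1]
        if len(tokens) >= 2 and "connection:StandAlone" in tokens[1:]:
            summary["standalone_peers"].append(tokens[0])
    return summary


if __name__ == "__main__":
    print(summarize_status(SAMPLE))
```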
      

      I did update all servers to this patch: https://xcp-ng.org/blog/2026/05/05/april-2026-security-and-maintenance-updates-for-xcp-ng-8-3-lts-2/
      Everything got restarted and was happy. Shortly after, I saw this, https://xcp-ng.org/blog/2026/05/07/may-2026-updates-2-for-xcp-ng-8-3-lts/, and installed it on all hosts.
      b6333da8-1667-4863-9a15-1452d9803dd0-image.jpeg
      xen01 is the current master
      61a98b9d-ac8d-4493-9e59-dcf5883a2a0b-image.jpeg

      Has anyone seen this before?

      posted in XOSTOR
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      xe pool-enable-tls-verification was exactly what I needed, thanks! It worked after that.

      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      @psafont Sorry, I was swamped with other things. As listed above, I get the same error, forced or not, from XCP-ng Center, the XCP-ng host, or XOA.

      1fdda333-0842-4281-ae69-e6c886ec1542-image.png
      TLS verification has always been off, and in the past we have not had any issues adding a new host to the pool.

      I have taken no other actions since my last posting.

      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      I see, it also says:
      name ( RO): sdn-controller-ca.pem
      host ( RO): <not in database>
      Like in the issue, but the file exists.

      [11:28 ovbh-pprod-xen05 ~]# xe certificate-list
      uuid ( RO)           : afdd9c8e-dcae-17c7-c35c-0fd7cebd387a
                 type ( RO): host
                 name ( RO): 
                 host ( RO): f0cec10f-ad05-48e4-893c-414b3a3e15be
           not-before ( RO): 20251110T23:15:51Z
            not-after ( RO): 20351108T23:15:51Z
          fingerprint ( RO): BF:83:23:BB:7B:E9:30:DE:86:EA:9D:AF:DF:F8:BA:34:39:D0:81:AD:34:E5:C6:AB:0C:49:41:7B:4A:3C:8B:9E
      
      
      uuid ( RO)           : b8dcd1f0-ef65-e762-f189-46bb78766c6b
                 type ( RO): ca
                 name ( RO): sdn-controller-ca.pem
                 host ( RO): <not in database>
           not-before ( RO): 20200416T00:17:31Z
            not-after ( RO): 20470901T00:17:31Z
          fingerprint ( RO): 63:1F:89:3F:0E:1F:86:52:34:95:3C:6C:3F:9C:C8:B3:5A:61:6B:4D:EE:8F:A7:11:F0:BA:79:8B:C7:15:A0:E0
      
      
      uuid ( RO)           : e7daedf2-7f35-ba40-093a-e0c011d91633
                 type ( RO): host_internal
                 name ( RO): 
                 host ( RO): f0cec10f-ad05-48e4-893c-414b3a3e15be
           not-before ( RO): 20251110T23:15:46Z
            not-after ( RO): 20351108T23:15:46Z
          fingerprint ( RO): 71:41:B0:25:88:AA:E4:56:EE:F7:A9:8E:0A:A9:FE:C5:6A:0D:D5:37:30:BF:C8:81:C2:D7:B8:20:7A:6C:7F:B7
      
      
      [13:50 ovbh-pprod-xen05 ~]# ll /etc/stunnel/certs/sdn-controller-ca.pem
      -rw-r--r-- 1 root root 1907 Nov 12 09:45 /etc/stunnel/certs/sdn-controller-ca.pem
      

      Removing it did not help; I get the same error.

      [13:54 ovbh-pprod-xen05 ~]# xe certificate-list
      uuid ( RO)           : afdd9c8e-dcae-17c7-c35c-0fd7cebd387a
                 type ( RO): host
                 name ( RO): 
                 host ( RO): f0cec10f-ad05-48e4-893c-414b3a3e15be
           not-before ( RO): 20251110T23:15:51Z
            not-after ( RO): 20351108T23:15:51Z
          fingerprint ( RO): BF:83:23:BB:7B:E9:30:DE:86:EA:9D:AF:DF:F8:BA:34:39:D0:81:AD:34:E5:C6:AB:0C:49:41:7B:4A:3C:8B:9E
      
      
      uuid ( RO)           : e7daedf2-7f35-ba40-093a-e0c011d91633
                 type ( RO): host_internal
                 name ( RO): 
                 host ( RO): f0cec10f-ad05-48e4-893c-414b3a3e15be
           not-before ( RO): 20251110T23:15:46Z
            not-after ( RO): 20351108T23:15:46Z
          fingerprint ( RO): 71:41:B0:25:88:AA:E4:56:EE:F7:A9:8E:0A:A9:FE:C5:6A:0D:D5:37:30:BF:C8:81:C2:D7:B8:20:7A:6C:7F:B7
      

      I also confirmed that all the certs for the hosts are current and not expired.
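For reference, here is a hedged Python sketch of one way to do that expiry check from `xe certificate-list` output instead of reading the dates by eye. It is not what was actually run here. The field layout and timestamp format are assumed from the listing above, and the function names are invented for illustration.

```python
import re
from datetime import datetime, timezone

# Assumed field shape: "<key> ( RO): <value>" as printed by `xe certificate-list`.
FIELD_RE = re.compile(r"^\s*(?P<key>[\w-]+)\s*\(\s*R[OW]\)\s*:\s?(?P<value>.*)$")

SAMPLE = """\
uuid ( RO)           : afdd9c8e-dcae-17c7-c35c-0fd7cebd387a
           type ( RO): host
     not-before ( RO): 20251110T23:15:51Z
      not-after ( RO): 20351108T23:15:51Z

uuid ( RO)           : b8dcd1f0-ef65-e762-f189-46bb78766c6b
           type ( RO): ca
           name ( RO): sdn-controller-ca.pem
     not-before ( RO): 20200416T00:17:31Z
      not-after ( RO): 20470901T00:17:31Z
"""


def parse_certificates(text):
    """Group key/value lines into one dict per certificate (new dict at each uuid)."""
    certs, current = [], None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if not match:
            continue
        key, value = match.group("key"), match.group("value").strip()
        if key == "uuid":
            current = {"uuid": value}
            certs.append(current)
        elif current is not None:
            current[key] = value
    return certs


def parse_timestamp(value):
    """Parse timestamps such as 20351108T23:15:51Z as UTC."""
    return datetime.strptime(value, "%Y%m%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def expired(certs, now):
    """Return uuids of certificates whose not-after is at or before `now`."""
    return [c["uuid"] for c in certs if parse_timestamp(c["not-after"]) <= now]


if __name__ == "__main__":
    print(expired(parse_certificates(SAMPLE), datetime.now(timezone.utc)))
```

This only covers validity dates. It says nothing about whether the pool actually trusts the certificate, which is what the stunnel verify error is about.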

      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      eee8bee1-ce6f-47c2-b5f0-1cd9b942db79-image.png
      9eea1860-e725-4e3c-85ff-0c3351beff45-image.png

      Boo

      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      Bummer
      957a5e9d-7f52-42a6-9105-c4772cd4e6e2-image.png

      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      After installing packages: https://docs.xcp-ng.org/xostor/#how-to-add-a-new-host-or-fix-a-badly-configured-host

      Now I am getting the following on official:

      pool.mergeInto
      {
        "sources": [
          "e4cf2039-3547-6574-0e10-96f9d91316f0"
        ],
        "target": "38aea760-cf23-927c-ccf5-90969681e04b",
        "force": true
      }
      {
        "code": "INTERNAL_ERROR",
        "params": [
          "Stunnel.Stunnel_verify_error(\"1416F086:SSL routines:tls_process_server_certificate:certificate verify failed\")"
        ],
        "call": {
          "duration": 3104,
          "method": "pool.join_force",
          "params": [
            "* session id *",
            "10.2.0.10",
            "root",
            "* obfuscated *"
          ]
        },
        "message": "INTERNAL_ERROR(Stunnel.Stunnel_verify_error(\"1416F086:SSL routines:tls_process_server_certificate:certificate verify failed\"))",
        "name": "XapiError",
        "stack": "XapiError: INTERNAL_ERROR(Stunnel.Stunnel_verify_error(\"1416F086:SSL routines:tls_process_server_certificate:certificate verify failed\"))
          at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)
          at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/transports/json-rpc.mjs:38:21
          at runNextTicks (node:internal/process/task_queues:60:5)
          at processImmediate (node:internal/timers:454:9)
          at process.callbackTrampoline (node:internal/async_hooks:130:17)"
      }
      

      And I am still getting this on the source install:

      pool.mergeInto
      {
        "sources": [
          "e4cf2039-3547-6574-0e10-96f9d91316f0"
        ],
        "target": "38aea760-cf23-927c-ccf5-90969681e04b",
        "force": true
      }
      {
        "message": "app.getLicenses is not a function",
        "name": "TypeError",
        "stack": "TypeError: app.getLicenses is not a function
          at enforceHostsHaveLicense (file:///opt/xen-orchestra/packages/xo-server/src/xo-mixins/pool.mjs:15:30)
          at Pools.apply (file:///opt/xen-orchestra/packages/xo-server/src/xo-mixins/pool.mjs:80:13)
          at Pools.mergeInto (/opt/xen-orchestra/node_modules/golike-defer/src/index.js:85:19)
          at Xo.mergeInto (file:///opt/xen-orchestra/packages/xo-server/src/api/pool.mjs:314:15)
          at Task.runInside (/opt/xen-orchestra/@vates/task/index.js:175:22)
          at Task.run (/opt/xen-orchestra/@vates/task/index.js:159:20)
          at Api.#callApiMethod (file:///opt/xen-orchestra/packages/xo-server/src/xo-mixins/api.mjs:469:18)"
      }
      
      posted in Management
      Jonathon
    • RE: Attempting to add new host fail on xoa and on server, worked on xcp-ng center

      @olivierlambert

      Just tried after doing a forced clean install; still getting the same error. Going to look into it more if there is not any

      root@xoa:/home/fpcuser# sudo curl https://raw.githubusercontent.com/Jarli01/xenorchestra_updater/master/xo-update.sh | bash -s -- -f | tee xenrebuild.log
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
      100  6896  100  6896    0     0  39116      0 --:--:-- --:--:-- --:--:-- 39181
         installed : v24.11.1 (with npm 11.6.2)
      Stopping xo-server...
      Checking for Yarn package...
      Checking for Yarn update...
      E: Malformed entry 1 in list file /etc/apt/sources.list.d/yarn.list (URI parse)
      E: The list of sources could not be read.
      E: Malformed entry 1 in list file /etc/apt/sources.list.d/yarn.list (URI parse)
      E: The list of sources could not be read.
      Checking for missing dependencies...
      Checking for Repo change...
      Checking xen-orchestra...
      Current branch master
      Current version 5.192.1 / 5.189.0
      Current commit 6cfefc91e47db7fb264705bc2def1f1b70bc537b 2025-11-12 18:01:41 +0100
      0 updates available
      Updating from source...
      No local changes to save
      No stash entries found.
      Already up to date.
      Clearing directories...
      Installing...
      yarn install v1.22.22
      (node:1226553) [DEP0169] DeprecationWarning: `url.parse()` behavior is not standardized and prone to errors that have security implications. Use the WHATWG URL API instead. CVEs are not issued for `url.parse()` vulnerabilities.
      (Use `node --trace-deprecation ...` to show where the warning was created)
      [1/5] Validating package.json...
      [2/5] Resolving packages...
      success Already up-to-date.
      $ husky install
      husky - Git hooks installed
      Done in 1.57s.
      yarn run v1.22.22
      $ TURBO_TELEMETRY_DISABLED=1 turbo run build --filter xo-server --filter xo-server-'*' --filter xo-web
      turbo 2.5.8
      
      • Packages in scope: xo-server, xo-server-audit, xo-server-auth-github, xo-server-auth-google, xo-server-auth-ldap, xo-server-auth-oidc, xo-server-auth-saml, xo-server-backup-reports, xo-server-load-balancer, xo-server-netbox, xo-server-perf-alert, xo-server-sdn-controller, xo-server-test-plugin, xo-server-transport-email, xo-server-transport-icinga2, xo-server-transport-nagios, xo-server-transport-slack, xo-server-transport-xmpp, xo-server-usage-report, xo-server-web-hooks, xo-web
      • Running build in 21 packages
      • Remote caching disabled
      
       Tasks:    30 successful, 30 total
      Cached:    30 cached, 30 total
        Time:    1.347s >>> FULL TURBO
      
      Done in 1.55s.
      Updated version 5.192.1 / 5.189.0
      Updated commit 6cfefc91e47db7fb264705bc2def1f1b70bc537b 2025-11-12 18:01:41 +0100
      Checking plugins...
      Ignoring xo-server-test plugin
      Cleanup plugins...
      Restarting xo-server...
      

      So then I updated our separate VM for XOA, which we have used in the past for requests like this, and I am getting this behavior:
      48c1fd0d-d434-4fb0-9ee0-5bc6756b3875-image.png

      pool.mergeInto
      {
        "sources": [
          "e4cf2039-3547-6574-0e10-96f9d91316f0"
        ],
        "target": "38aea760-cf23-927c-ccf5-90969681e04b",
        "force": true
      }
      {
        "code": "POOL_JOINING_SM_FEATURES_INCOMPATIBLE",
        "params": [
          "OpaqueRef:151858ec-cd9b-44f5-9cc5-f053685b1b8e",
          ""
        ],
        "call": {
          "duration": 2049,
          "method": "pool.join_force",
          "params": [
            "* session id *",
            "10.2.0.10",
            "root",
            "* obfuscated *"
          ]
        },
        "message": "POOL_JOINING_SM_FEATURES_INCOMPATIBLE(OpaqueRef:151858ec-cd9b-44f5-9cc5-f053685b1b8e, )",
        "name": "XapiError",
        "stack": "XapiError: POOL_JOINING_SM_FEATURES_INCOMPATIBLE(OpaqueRef:151858ec-cd9b-44f5-9cc5-f053685b1b8e, )
          at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)
          at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/transports/json-rpc.mjs:38:21
          at runNextTicks (node:internal/process/task_queues:60:5)
          at processImmediate (node:internal/timers:454:9)
          at process.callbackTrampoline (node:internal/async_hooks:130:17)"
      }
      

      5bc0b839-46d1-4387-aa73-5a1df07c9bfe-image.png

      posted in Management
      Jonathon