XCP-ng
    jimmymiller

    Posts

    • RE: Backups started to fail again (overall status: failure, but both snapshot and transfer returns success)

      @marcoi I've definitely been through all those steps before, but I'll try them again. The VM in question is a test machine I built specifically for this troubleshooting, and I've rebuilt this XO three times, including not bringing over any exports/configurations/backups from previous installations -- I simply pointed a vanilla XO (running similar code) at the same XCP-ng pool. I've created a different volume and NFS export on the Synology, so even that started blank, and each time I've gone through the test I've preemptively emptied the directories before connecting the XO to the NFS export. The only two things I haven't tried are rebuilding the XCP-ng host from scratch and sending to a different physical remote. Guess it's time to start down that path.

      The fact that I can write data to the NFS mount from within the XO VM even while the XCP-ng export is in progress is the part that gets me and makes me think this might be an XO problem vs. an underlying infrastructure problem. Oh well, time to try rebuilding these.

      Hammer down!

      posted in Backup
      jimmymiller
    • RE: Backups started to fail again (overall status: failure, but both snapshot and transfer returns success)

      @olivierlambert & @marcoi The option is 'vers=3' w/o quotes.

      I tried this recommendation and made some progress in that 'ls' and 'df' don't lock up, but the export still seems to hang.
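
      For what it's worth, a quick way to double-check which NFS version the XO VM actually negotiated for the remote -- a small sanity check, assuming nfs-common's nfsstat is available (it should be, since the mount works at all):

      # Show the negotiated NFS version and mount options for the remote
      nfsstat -m
      # Or read the options straight from the kernel's mount table
      grep nfs /proc/mounts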

      Below is a screenshot of network throughput on the XO. The plateau on the left is a 'dd if=/dev/random' to a 'dummyData.txt' before the backup is kicked off. I then stop the dd and kick off the backup job (middle). After about a minute, the job hangs. I can still 'ls' and 'df' at this point, which is good, but the export job just seems stuck at 22% ('xe task-list' below). For the heck of it I tried another 'dd if=/dev/random', this time to a dummyData2.txt, which is the plateau on the right. While this dd is underway the backup is still supposed to be ongoing. So the directory is still writeable, but the job is just hanging up?

      The number where the job hangs seems to vary too -- sometimes it will go ~30%, other times it stops at 10%. I left a job of this exact VM running last night and it actually finished, but it took 5 hrs to move 25G according to the backup report. It took <6 minutes to move 40G with the simple, inefficient 'dd' command. Granted, XO might be pulling (from the host) and pushing (to the NFS remote) simultaneously, but 5 hours doesn't seem right.

      Screenshot 2025-08-18 at 13.27.11.png

      XO output before and while backup is ongoing:

      root@xo:/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7# ls -lha
      total 0
      drwxrwxrwx 1 1027 users 56 Aug 18 13:12 .
      drwxr-xr-x 3 root root  60 Aug 18 13:11 ..
      -rwxrwxrwx 1 1024 users  0 Aug 18 13:11 .nfs000000000000017400000001
      root@xo:/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7# dd if=/dev/random of=/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7/dummyData.txt bs=1M
      ^C36868+0 records in
      36868+0 records out
      38658899968 bytes (39 GB, 36 GiB) copied, 368.652 s, 105 MB/s
      
      root@xo:/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7# df -h
      Filesystem                                    Size  Used Avail Use% Mounted on
      udev                                          3.9G     0  3.9G   0% /dev
      tmpfs                                         794M  584K  793M   1% /run
      /dev/mapper/xo--vg-root                 28G  5.3G   22G  20% /
      tmpfs                                         3.9G     0  3.9G   0% /dev/shm
      tmpfs                                         5.0M     0  5.0M   0% /run/lock
      /dev/xvda2                                    456M  119M  313M  28% /boot
      /dev/mapper/xo--vg-var                  15G  403M   14G   3% /var
      /dev/xvda1                                    511M  5.9M  506M   2% /boot/efi
      tmpfs                                         794M     0  794M   0% /run/user/1000
      192.168.32.10:/volume12/XCPBackups/VMBackups  492G   42G  451G   9% /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7
      root@xo:/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7# ls -lha
      total 37G
      drwxrwxrwx 1 1027 users 164 Aug 18 13:20 .
      drwxr-xr-x 3 root root   60 Aug 18 13:11 ..
      -rwxrwxrwx 1 1024 users 37G Aug 18 13:20 dummyData.txt
      -rwxrwxrwx 1 1024 users   0 Aug 18 13:11 .nfs000000000000017400000001
      -rwxrwxrwx 1 1024 users   0 Aug 18 13:20 .nfs000000000000017700000002
      drwxrwxrwx 1 1024 users 154 Aug 18 13:20 xo-vm-backups
      root@xo:/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7# cd
      root@xo:~# df -h
      Filesystem                                    Size  Used Avail Use% Mounted on
      udev                                          3.9G     0  3.9G   0% /dev
      tmpfs                                         794M  584K  793M   1% /run
      /dev/mapper/xo--vg-root                 28G  5.3G   22G  20% /
      tmpfs                                         3.9G     0  3.9G   0% /dev/shm
      tmpfs                                         5.0M     0  5.0M   0% /run/lock
      /dev/xvda2                                    456M  119M  313M  28% /boot
      /dev/mapper/xo--vg-var                  15G  404M   14G   3% /var
      /dev/xvda1                                    511M  5.9M  506M   2% /boot/efi
      tmpfs                                         794M     0  794M   0% /run/user/1000
      192.168.32.10:/volume12/XCPBackups/VMBackups  492G   42G  450G   9% /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7
      
      root@xo:~# ls -lha /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7
      total 37G
      drwxrwxrwx 1 1027 users 164 Aug 18 13:20 .
      drwxr-xr-x 3 root root   60 Aug 18 13:11 ..
      -rwxrwxrwx 1 1024 users 37G Aug 18 13:20 dummyData.txt
      -rwxrwxrwx 1 1024 users   0 Aug 18 13:11 .nfs000000000000017400000001
      -rwxrwxrwx 1 1024 users   0 Aug 18 13:20 .nfs000000000000017700000002
      drwxrwxrwx 1 1024 users 154 Aug 18 13:20 xo-vm-backups
      root@xo:~# dd if=/dev/random of=/run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7/dummyData2.txt bs=1M
      ^C7722+0 records in
      7722+0 records out
      8097103872 bytes (8.1 GB, 7.5 GiB) copied, 78.9522 s, 103 MB/s
      
      root@conductor:~# ls /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7
      dummyData2.txt	dummyData.txt  xo-vm-backups
      
      [13:32 xcp05 ~]# xe task-list
      uuid ( RO)                : ba902e8a-7f08-9de9-8ade-e879ffb35e11
                name-label ( RO): Exporting content of VDI shifter through NBD
          name-description ( RO):
                    status ( RO): pending
                  progress ( RO): 0.222
      

      Any other ideas before I go down the route of trying to carve out a virtual NAS instance?

      Thanks again for your input.

      posted in Backup
      jimmymiller
    • RE: Backups started to fail again (overall status: failure, but both snapshot and transfer returns success)

      @olivierlambert This particular instance is a compiled version of XO, but yes, it's XO that locks up. I'd be all over submitting a ticket if this was in XOA.

      Mind you, I can trigger the lockup immediately after sending 250G of dummy data (using dd if=/dev/random) to the exact same directory using the exact same XO. I simply deleted that dummy data, kicked off the backup job, and boom: ls & df lock up after just a few minutes.

      Separately, on the host, the xe export task will just sit there. Sometimes some progress happens, other times it will sit for days w/o any progress.

      I've tried rebuilding XO from scratch (without importing an older XO config) and it happens the exact same way on the newer XO. I tried creating a separate empty volume on the Synology and a different NFS export/remote -- same problem. I'm at the point of trying to rebuild the XCP-ng hosts, but I'm not really sure that's the problem, because the hosts seem happy up until I kick off the backup job. VMs can migrate, and there are only local SRs, so we aren't dealing with any type of storage connection problem apart from the NFS used for backups, and these are two different hosts (Dell PowerEdge R450s). The only common things I can find are that the hosts are in the same pool, the backup target hardware has remained unchanged, and they reside on the same physical network (Ubiquiti-based switches). I might try rebuilding these hosts and carving out a dummy virtual TrueNAS instance just to see if I can get a different result, but I'm out of ideas on things to try after that.

      posted in Backup
      jimmymiller
    • RE: Backups started to fail again (overall status: failure, but both snapshot and transfer returns success)

      Did anyone ever find a true resolution to this? I'm seeing a similar situation where a remote works just fine until I initiate a backup job.

      I mounted a remote via NFS (v4) and generated 260G of random data (using a simple dd command just to prove I can write something to the share), but when I initiate a backup job it hangs after <10G of backup data is transferred. Sometimes it will pause for several minutes and then allow a little bit more through, but it will also sometimes just hang there for hours.

      Latest XO commit (70d59) on a fully updated but mostly vanilla Debian 12, XCP-ng 8.3 with latest patches as of today (Aug 17, 2025). Remote target is a Synology DS1520+ with latest patches applied. Only a 1G connection, but the network is not busy by any means.

      Moments before the backup, 'df' and 'ls' of the respective directory worked fine. After the backup is initiated and appears to pause with <10G transferred, both commands lock up. The job in XCP-ng also seems to not want to let go.

      root@xo:~# date ; df -h
      Sun Aug 17 10:20:45 PM EDT 2025
      Filesystem                                    Size  Used Avail Use% Mounted on
      udev                                          3.9G     0  3.9G   0% /dev
      tmpfs                                         794M  572K  793M   1% /run
      /dev/mapper/xo--vg-root                 28G  5.3G   22G  20% /
      tmpfs                                         3.9G     0  3.9G   0% /dev/shm
      tmpfs                                         5.0M     0  5.0M   0% /run/lock
      /dev/xvda2                                    456M  119M  313M  28% /boot
      /dev/mapper/xo--vg-var                  15G  403M   14G   3% /var
      /dev/xvda1                                    511M  5.9M  506M   2% /boot/efi
      tmpfs                                         794M     0  794M   0% /run/user/1000
      192.168.32.10:/volume12/XCPBackups/VMBackups  492G   17M  492G   1% /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7
      root@xo:~# date ; ls -lha /run/xo-server/mounts/1f33cf95-36d6-4c5b-b99f-71bb47008dd7/
      Sun Aug 17 10:21:23 PM EDT 2025
      total 0
      drwxrwxrwx 1 1027 users 56 Aug 17 22:20 .
      drwxr-xr-x 3 root root  60 Aug 17 22:20 ..
      -rwxrwxrwx 1 1024 users  0 Aug 17 22:20 .nfs000000000000016400000001
      root@xo:~# date ; df -h
      Sun Aug 17 10:22:28 PM EDT 2025
      ^C
      root@xo:~#
      
      [22:19 xcp05 ~]# date ; xe task-list
      Sun Aug 17 22:20:00 EDT 2025
      [22:20 xcp05 ~]# date ; xe task-list
      Sun Aug 17 22:22:49 EDT 2025
      uuid ( RO)                : 8c7cd101-9b5e-3769-0383-60beea86a272
                name-label ( RO): Exporting content of VDI shifter through NBD
          name-description ( RO):
                    status ( RO): pending
                  progress ( RO): 0.101
      
      
      [22:22 xcp05 ~]# date ; xe task-cancel uuid=8c7cd101-9b5e-3769-0383-60beea86a272
      Sun Aug 17 22:23:13 EDT 2025
      [22:23 xcp05 ~]# date ; xe task-list
      Sun Aug 17 22:23:28 EDT 2025
      uuid ( RO)                : 8c7cd101-9b5e-3769-0383-60beea86a272
                name-label ( RO): Exporting content of VDI shifter through NBD
          name-description ( RO):
                    status ( RO): pending
                  progress ( RO): 0.101
      

      I'm only able to clear this task by restarting the toolstack (or host), but the issue returns as soon as I try to initiate another backup.
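
      For reference, a minimal sketch of what clearing the stuck task looks like on the host, assuming the cancel doesn't take effect on its own (the task UUID is a placeholder; xe-toolstack-restart is the stock XCP-ng toolstack restart script):

      # Try to cancel the stuck export task first
      xe task-cancel uuid=<task-uuid>
      # If the task stays pending, restart the toolstack on the host
      xe-toolstack-restart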

      Host where VM resides network throughput:
      Screenshot 2025-08-17 at 22.30.33.png

      Nothing shows up in dmesg, and journalctl on XO just shows the backup starting:
      Aug 17 22:21:56 xo xo-server[724]: 2025-08-18T02:21:56.031Z xo:backups:worker INFO starting backup
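
      (For anyone reproducing this, the snippet below is roughly how both logs can be followed live while a backup runs -- a sketch, assuming xo-server logs to the journal under the "xo-server" identifier, as the line above suggests:)

      # Follow xo-server output on the XO VM while the backup runs
      journalctl -f -t xo-server
      # Watch for kernel/NFS complaints at the same time
      dmesg -wT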

      Last bit of the VM host SMlog:

      Aug 17 22:22:02 xcp05 SM: [10246] lock: released /var/lock/sm/3a1ad8c0-7f3a-4c16-9ec1-e8c315ac0c31/vdi
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/sr
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/3a1ad8c0-7f3a-4c16-9ec1-e8c315ac0c31/vdi
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/lvm-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/3a1ad8c0-7f3a-4c16-9ec1-e8c315ac0c31
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/lvm-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/1fb38e78-4700-450e-800d-2d8c94158046
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/lvm-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/2ef8ae60-8beb-47fc-9cc2-b13e91192b14
      Aug 17 22:22:02 xcp05 SM: [10246] lock: closed /var/lock/sm/lvm-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/lvchange-p
      Aug 17 22:22:36 xcp05 SM: [10559] lock: opening lock file /var/lock/sm/b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/sr
      Aug 17 22:22:36 xcp05 SM: [10559] LVMCache created for VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c
      Aug 17 22:22:36 xcp05 fairlock[3358]: /run/fairlock/devicemapper acquired
      Aug 17 22:22:36 xcp05 fairlock[3358]: /run/fairlock/devicemapper sent '10559 - 625.897991605'
      Aug 17 22:22:36 xcp05 SM: [10559] ['/sbin/vgs', '--readonly', 'VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c']
      Aug 17 22:22:36 xcp05 SM: [10559]   pread SUCCESS
      Aug 17 22:22:36 xcp05 fairlock[3358]: /run/fairlock/devicemapper released
      Aug 17 22:22:36 xcp05 SM: [10559] Entering _checkMetadataVolume
      Aug 17 22:22:36 xcp05 SM: [10559] LVMCache: will initialize now
      Aug 17 22:22:36 xcp05 SM: [10559] LVMCache: refreshing
      Aug 17 22:22:36 xcp05 fairlock[3358]: /run/fairlock/devicemapper acquired
      Aug 17 22:22:36 xcp05 fairlock[3358]: /run/fairlock/devicemapper sent '10559 - 625.93107908'
      Aug 17 22:22:36 xcp05 SM: [10559] ['/sbin/lvs', '--noheadings', '--units', 'b', '-o', '+lv_tags', '/dev/VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c']
      Aug 17 22:22:37 xcp05 SM: [10559]   pread SUCCESS
      Aug 17 22:22:37 xcp05 fairlock[3358]: /run/fairlock/devicemapper released
      Aug 17 22:22:37 xcp05 SM: [10559] lock: closed /var/lock/sm/b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/sr
      Aug 17 22:32:51 xcp05 SM: [14729] lock: opening lock file /var/lock/sm/b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/sr
      Aug 17 22:32:51 xcp05 SM: [14729] LVMCache created for VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c
      Aug 17 22:32:51 xcp05 fairlock[3358]: /run/fairlock/devicemapper acquired
      Aug 17 22:32:51 xcp05 SM: [14729] ['/sbin/vgs', '--readonly', 'VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c']
      Aug 17 22:32:51 xcp05 fairlock[3358]: /run/fairlock/devicemapper sent '14729 - 1240.916404088'
      Aug 17 22:32:51 xcp05 SM: [14729]   pread SUCCESS
      Aug 17 22:32:51 xcp05 fairlock[3358]: /run/fairlock/devicemapper released
      Aug 17 22:32:51 xcp05 SM: [14729] Entering _checkMetadataVolume
      Aug 17 22:32:51 xcp05 SM: [14729] LVMCache: will initialize now
      Aug 17 22:32:51 xcp05 SM: [14729] LVMCache: refreshing
      Aug 17 22:32:51 xcp05 fairlock[3358]: /run/fairlock/devicemapper acquired
      Aug 17 22:32:51 xcp05 fairlock[3358]: /run/fairlock/devicemapper sent '14729 - 1240.953398589'
      Aug 17 22:32:51 xcp05 SM: [14729] ['/sbin/lvs', '--noheadings', '--units', 'b', '-o', '+lv_tags', '/dev/VG_XenStorage-b6ee0b6d-5093-b462-594f-ba2a3c3bd14c']
      Aug 17 22:32:52 xcp05 SM: [14729]   pread SUCCESS
      Aug 17 22:32:52 xcp05 fairlock[3358]: /run/fairlock/devicemapper released
      Aug 17 22:32:52 xcp05 SM: [14729] lock: closed /var/lock/sm/b6ee0b6d-5093-b462-594f-ba2a3c3bd14c/sr
      

      Looking for other ideas.

      posted in Backup
      jimmymiller
    • RE: CBT: the thread to centralize your feedback

      Has anyone seen issues migrating VDIs once CBT is enabled? We're seeing VDI_CBT_ENABLED errors when we try to live migrate disks between SRs. Disabling CBT on the disk obviously allows the migration to move forward. 'Users' with limited access don't seem to see the specifics of the error, but as admins we get a VDI_CBT_ENABLED error. Ideally we'd want to be able to migrate VDIs with CBT still enabled, or maybe the VDI migration process could disable CBT temporarily, migrate, then re-enable it?
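
      (For now the manual workaround looks roughly like this -- a sketch, with the UUIDs as placeholders:)

      # Temporarily disable CBT on the VDI, migrate it, then re-enable it
      xe vdi-disable-cbt uuid=<vdi-uuid>
      xe vdi-pool-migrate uuid=<vdi-uuid> sr-uuid=<destination-sr-uuid>
      xe vdi-enable-cbt uuid=<vdi-uuid>
      # Note: if backups rely on CBT, the next delta will likely fall back to a full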

      User errors:
      Screenshot 2024-08-07 at 17.42.07.png

      Admins see:

      {
        "id": "7847a7c3-24a3-4338-ab3a-0c1cdbb3a12a",
        "resourceSet": "q0iE-x7MpAg",
        "sr_id": "5d671185-66f6-a292-e344-78e5106c3987"
      }
      {
        "code": "VDI_CBT_ENABLED",
        "params": [
          "OpaqueRef:aeaa21fc-344d-45f1-9409-8e1e1cf3f515"
        ],
        "task": {
          "uuid": "9860d266-d91a-9d0e-ec2a-a7752fa01a6d",
          "name_label": "Async.VDI.pool_migrate",
          "name_description": "",
          "allowed_operations": [],
          "current_operations": {},
          "created": "20240807T21:33:29Z",
          "finished": "20240807T21:33:29Z",
          "status": "failure",
          "resident_on": "OpaqueRef:8d372a96-f37c-4596-9610-1beaf26af9db",
          "progress": 1,
          "type": "<none/>",
          "result": "",
          "error_info": [
            "VDI_CBT_ENABLED",
            "OpaqueRef:aeaa21fc-344d-45f1-9409-8e1e1cf3f515"
          ],
          "other_config": {},
          "subtask_of": "OpaqueRef:NULL",
          "subtasks": [],
          "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_vdi.ml)(line 470))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 4696))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 199))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 203))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 42))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 51))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 4708))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 4711))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/helpers.ml)(line 1503))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 4705))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
        },
        "message": "VDI_CBT_ENABLED(OpaqueRef:aeaa21fc-344d-45f1-9409-8e1e1cf3f515)",
        "name": "XapiError",
        "stack": "XapiError: VDI_CBT_ENABLED(OpaqueRef:aeaa21fc-344d-45f1-9409-8e1e1cf3f515)
          at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)
          at default (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_getTaskResult.mjs:13:29)
          at Xapi._addRecordToCache (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1033:24)
          at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1067:14
          at Array.forEach (<anonymous>)
          at Xapi._processEvents (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1057:12)
          at Xapi._watchEvents (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1230:14)"
      }
      posted in Backup
      jimmymiller
    • RE: Shared SR (two pools)

      @olivierlambert Okay. I'll give it a shot.

      posted in Xen Orchestra
      jimmymiller
    • RE: Shared SR (two pools)

      @HolgiB For this use, it's actually a virtual TrueNAS instance sitting on a LUN mapped to the source XCP-ng pool. I know there are in-OS options using zfs send|receive, but the point is to get an understanding of what we would do without that convenience.

      I know Xen and VMware do things differently, but having VMFS in the mix allowed us to unmount a datastore, move the mapping to a new host, mount that datastore, then just point that host at the existing LUN and quickly import the VMX (for a full VM config) or the VMDKs (by configuring a new VM to use those existing disks). This completely eliminated the need to truly copy the data -- we were just changing which host had access to it. We didn't use it very often because VMware handled moving VMs with big disks pretty well, but it was our ace-in-the-hole if storage vMotion wasn't an option.

      posted in Xen Orchestra
      jimmymiller
    • RE: Shared SR (two pools)

      @olivierlambert Well even a cold migration seemed to fail. Bah!

      The LUN/SR is dedicated to just the one VM: 1 x 16G disk for the OS and 20 x 1.5T disks for data. Inside the VM, I'm using ZFS to stripe them all together into a single zpool. Because this is ZFS, I know I could theoretically do a ZFS replication job to another VM, but I'm also using this as a test to figure out how I'm going to move the larger VMs we have that don't have the convenience of an in-OS replication option. For our larger VMs we almost always dedicate LUNs specifically to them, and we have block-based replication options on our arrays, so in theory we should be able to fool any device into thinking the replica is a legit pre-existing SR.

      No snaps -- the data on this VM is purely an offsite backup target so we didn't feel the need to backup the backup of the backup.

      Let me try testing the forget SR + connect to a different pool approach (rough sketch below). I swear I tried this before, but when I went to create the SR it flagged the LUN as having a pre-existing SR and would only let me reprovision a new SR rather than map the existing one.
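
      A rough outline of that sequence on the CLI, assuming an iSCSI LUN (lvmoiscsi) -- all UUIDs and device-config values are placeholders, and on a multi-host destination pool the pbd-create/pbd-plug pair is needed once per host:

      # On the source pool: detach the SR and forget it (the data on the LUN is preserved)
      xe pbd-unplug uuid=<pbd-uuid>
      xe sr-forget uuid=<sr-uuid>

      # On the destination pool: re-introduce the existing SR instead of creating a new one
      xe sr-introduce uuid=<sr-uuid> type=lvmoiscsi name-label="Moved SR" shared=true content-type=user
      xe pbd-create sr-uuid=<sr-uuid> host-uuid=<host-uuid> \
        device-config:target=<iscsi-target-ip> device-config:targetIQN=<target-iqn> device-config:SCSIid=<scsi-id>
      xe pbd-plug uuid=<new-pbd-uuid>   # UUID returned by pbd-create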

      posted in Xen Orchestra
      jimmymiller
    • Shared SR (two pools)

      Re: Shared SR between two pools?

      I need to move a sizable (~30T) VM between two pools of compute nodes. I can do this move cold, but I'd rather not have the VM offline for several days, which is what it's going to look like if I do a standard cold VM migration.

      As I understand it, SRs are essentially locked to a specific pool (particularly the pool master). Is it possible to basically unmount (i.e. forget) the SR on one pool, remount it on the target pool, and then just import the VM while it continues to reside on the same LUN?

      VMware made this pretty easy with VMFS/VMX/VMDKs, but it seems like Xen may not be as flexible.

      posted in Xen Orchestra
      jimmymiller
    • RE: Migrating VM Status

      Well, I guess that was the coalesce process, because now that one has just stopped. Any ideas on how to find out why it worked fine for essentially 2 days and then stopped?

      I get the impression a live migration may not be the best way to move a 30T VM between pools? Maybe a warm migration will work better, but I'm also curious how much capacity we're going to need on the source in order to complete this move.

      posted in Management
      jimmymiller
    • RE: Migrating VM Status

      @Danp

      Hrm. xe task-list is showing nothing, but there is clearly something still happening based on the stats.

      Screenshot 2024-06-19 at 11.47.39.png

      posted in Management
      jimmymiller
    • Migrating VM Status

      I'm in the process of live migrating a large VM (~30T) from one host to another. The process had been going smoothly for the last 2 days, but now the task no longer appears in XO. The VM status shows "Busy (migrate_send)", but the task isn't visible in the XO task list. Is there a timeout in XO for long-running tasks? Is there a way to actually verify the status of the task and whether the VM is still moving? According to the stats, there is still IO on the SR, so it appears to still be in progress.
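
      (A couple of host-side queries that should show whether the migration is still alive -- a sketch, with the VM UUID as a placeholder:)

      # Check whether the migrate still shows up as an in-flight operation on the VM
      xe vm-param-get uuid=<vm-uuid> param-name=current-operations
      # List all tasks with their progress, in case the task simply isn't surfaced in XO
      xe task-list params=uuid,name-label,status,progress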

      posted in Management
      jimmymiller
    • OIDC with EntraID

      Has anyone out there gotten the XO OIDC plugin to work with EntraID? My security folks are looking for any documentation that could help them configure things on their end to work with XO. Tech support folks are looking into this as well, but I figured I'd put something out there to see if the broader community has made it work.

      posted in Management
      jimmymiller
    • RE: Installing Guest Tools on AlamaLinux 9 Issue

      @stevewest15 Not to call this a "fix" because I know they aren't always the latest 'n greatest, but have you tried just installing from the EPEL repo? Seems to work for us.

      # Enable the EPEL repository (it provides the xe-guest-utilities-latest package)
      dnf install -y epel-release
      # Install the guest tools and enable/start the service
      dnf install -y xe-guest-utilities-latest
      systemctl enable xe-linux-distribution.service
      systemctl start xe-linux-distribution.service
      
      posted in Management
      jimmymiller
    • XO token expiration (Web)

      We have a request from our security folks to force expiration of tokens if users go idle for a specific amount of time. Is there a means in XO to do this?

      posted in Management
      jimmymiller
    • LDAP Sync Cron?

      Is there a means of croning the LDAP group sync process? We'd obviously prefer any LDAP changes to be instant, but because XO uses a manual sync, it'd be nice if we could tell our customers "it will happen at x & y each day."

      posted in Management
      jimmymiller
    • RE: Xcp-ng 8.2 no more recognized with Cloudstack since last update

      @AlexanderK Yes. The hosts and the CS management server are on the same L2 with no firewalls. I think it has to do with me wanting to use an "L2 network topology" out of the gate, but from the reading I'm doing, that's definitely atypical. I've blown away the zone and reconstructed it in what I think is a "normal" CS config now.

      I've gotten the hosts attached at least, but now I'm seeing issues getting the SystemVMs up and running. I did check the 'use local host storage' option in the zone config, and I can see CS is at least touching/renaming the local SR.

      Secondary Storage Vm creation failure in zone [xxxxxx]. Error details: Unable to allocate capacity on zone [2] due to [null].
      Console proxy creation failure. Zone [xxxxxx]. Error details: Unable to orchestrate start VM instance {"id":20,"instanceName":"v-20-VM","type":"ConsoleProxy","uuid":"f138df9e-92c4-4cfd-9333-7d8396c436e6"} due to [Unable to acquire lock on VMTemplateStoragePool: 31].
      

      My storage network is completely private and everything is hanging off a single bond0 where I have VLANs carved down to the XCP-ng host, but I haven't configured any networks within XO. I assume CS was supposed to take care of that?

      I'm also not seeing the VMs show up in XO. Am I supposed to see instances and SystemVMs in XO?

      Thanks for any help.

      posted in Compute
      jimmymiller
    • RE: Xcp-ng 8.2 no more recognized with Cloudstack since last update

      @AlexanderK Just started playing with CS. I'm trying to see if this might be a better fit for our end users than the current XOA. XOA is okay, but it's definitely more focused on admins of the full XCP-ng stack than on users who just want to deploy VMs. It's possible XO 6 might solve a lot of my issues, but we're looking around for other options that use XCP-ng under the covers while maybe giving customers a better frontend experience.

      At this point I'm just trying to get a rather vanilla XCP-ng host in my lab connected to CS so I can learn the architecture better, but I'm running into issues. I won't deny that I'm a noob with CS, so some of the components are still foreign to me, but simply connecting a host is giving me trouble and the documentation seems a little spotty. Any ideas? This might need to go into a new thread.

      Error:
      "Cannot transit agent status with event AgentDisconnected for host....Unable to transition to a new state from Creating via AgentDisconnected"

      Screenshot 2024-05-12 at 12.52.47.png

      posted in Compute
      jimmymiller
    • Understanding templates

      How can we create a template from a VM that provides the same install options as the default XO templates? In our case we usually create empty shells and then roll with PXE to install the OS, whether it's Windows, Linux, etc. I'm trying to find a way to take a default template, tweak it a little (mainly to enable HA as a default), and then redeploy it as a template that offers the exact same options during the VM create process as the built-in templates. In particular, the "install settings" are different, and I also want to deselect the "fast clone" option as a default.
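
      (A sketch of the direction I'd look in for the install-settings part: the built-in templates carry an other-config:install-methods key that appears to drive those choices, so comparing it and copying it over to the custom template might help. All UUIDs are placeholders, and this doesn't address the fast-clone default:)

      # Compare the metadata of the built-in template and the custom one
      xe template-param-get uuid=<builtin-template-uuid> param-name=other-config
      xe template-param-get uuid=<custom-template-uuid> param-name=other-config
      # Copy the key that (presumably) drives the "install settings" list
      xe template-param-set uuid=<custom-template-uuid> other-config:install-methods=cdrom,nfs,http,ftp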

      Default template replica:
      Screenshot 2024-05-04 at 09.41.41.png

      Template created from a VM using the same template:Screenshot 2024-05-04 at 09.42.27.png

      posted in Management
      jimmymiller
    • RE: Backup Health Check Procedure

      @olivierlambert Same behavior. Should I open a case with support?

      I have noticed that if I do a manual health check outside of the backup job, it behaves as I'd expect: restore the VM >> power on >> management tools detected >> power off >> destroy.

      posted in Backup
      jimmymiller