XCP-ng

    Posts

    • HTTP connection has timed out and causes

      I already have a ticket in for this, but thought I'd post here to draw on the knowledge of the community as well.

      We have a 5-host pool with several backup and replication jobs configured in XOA. Randomly, a job will stall with some VMs partially backed up (the estimated end shoots up to 2-3 days and progress stops). The only way I am able to cancel it is to restart the toolstack on the pool master and restart the XOA service on the XOA appliance, then let GC run and retry the failed VMs. This usually results in a full backup of the failed VM. It can happen to both CR jobs and regular backup jobs.

      The log I see in XOA for the failed job is:

      "message": "HTTP connection has timed out",
      "name": "Error",
      "stack": "Error: HTTP connection has timed out\n at ClientRequest.<anonymous> (/usr/local/lib/node_modules/xo-server/node_modules/http-request-plus/index.js:61:25)\n at ClientRequest.emit (node:events:518:28)\n at ClientRequest.patchedEmit [as emit] (/usr/local/lib/node_modules/xo-server/node_modules/@xen-orchestra/log/configure.js:52:17)\n at TLSSocket.emitRequestTimeout (node:_http_client:849:9)\n at Object.onceWrapper (node:events:632:28)\n at TLSSocket.emit (node:events:530:35)\n at TLSSocket.patchedEmit [as emit] (/usr/local/lib/node_modules/xo-server/node_modules/@xen-orchestra/log/configure.js:52:17)\n at Socket._onTimeout (node:net:595:8)\n at listOnTimeout (node:internal/timers:581:17)\n at process.processTimers (node:internal/timers:519:7)"
      }
      

      Now I understand I can increase the timeout value; that is what support has suggested, and it is also what the XO documentation recommends here:
      https://docs.xen-orchestra.com/backup_troubleshooting#error-http-connection-has-timed-out

      What I can't figure out is why this timeout would occur in the first place.
      I have ensured the XOA appliance is running on the pool master at all times. I have also ensured the pool master is the least loaded of all 5 hosts in our production pool, in an attempt to mitigate this in the past, with no success. The pool master is only running 17 VMs and has 32 cores and 384 GB of RAM, with 8 GB assigned to dom0.

      Each host has 4x 10GbE SFP+ ports: 2 are in a multi-chassis LAG (LACP) dedicated to storage (NFSv3), and the other 2 are in another multi-chassis LAG (LACP) dedicated to VM, management and backup traffic.

      So I don't think anything is overloaded that would cause this. Any suggestions or something to look for? I have increased the timeout value for now, but the docs seem to imply I need to get to the bottom of what is causing it for a longer-term solution.
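
One way to start narrowing this down is to tally how often the error appears in each job log, so stalls can be lined up against times and VMs. The following is only a minimal sketch: it assumes the job log has been saved from XOA to a local file (the file path is hypothetical) and that the error appears in the same `"message": "HTTP connection has timed out"` form as in the excerpt above. The function name `count_http_timeouts` is made up for illustration.

```shell
#!/bin/sh
# Count occurrences of the timeout error in a saved backup job log.
# Assumption: the log is a text/JSON file containing entries shaped like
#   "message": "HTTP connection has timed out"
count_http_timeouts() {
  # -o prints every match on its own line, so wc -l counts occurrences
  # even if several entries share a single (minified) line.
  grep -oE '"message":[[:space:]]*"HTTP connection has timed out"' "$1" \
    | wc -l | tr -d '[:space:]'
}

# Example (hypothetical path):
#   count_http_timeouts ./backup-job-log.json
```

Running it over the logs of several failed runs would at least show whether the timeouts cluster on particular jobs or times of day.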

      posted in Backup
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @stormi Installed on the same test machines I have the other batch of updates installed on. No issues after a reboot.

      posted in News
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @stormi Updated both of my test hosts. Everything rebooted and came up fine.

      I still see no VM stats in XO / XOA. I will be curious to see if this round of updates fixes my EFI / Windows Server reboot hangs.

      posted in News
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @Greg_E I do not have anything older like a K620. I do have a GeForce GTX 1650 in another machine that is also doing this, but I believe that's the same generation as the T1000 (Turing). I agree, I think ARC could be a great replacement for this application in the future.

      Since this is occurring at boot, before the OS loads, I'm not sure this would be management agent related.

      posted in News
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @Greg_E

      The VM is UEFI with no vTPM or Secure Boot enabled. The CPU is a "Xeon E-2336 CPU @ 2.90GHz" running in a Supermicro server. We use these servers at remote sites to run a number of VMs, including a "Blue Iris" server with an Nvidia T1000 GPU passed through to it; I have one such server as a test server as well. The second machine doing it is a Minisforum MS-01 in my home lab, also with a T1000 GPU passed through. The OS in both cases is Windows Server 2025. Over the weekend I cloned a fresh copy without GPU passthrough to see if it occurs with no GPU. We have about 15 of these VMs running at remote locations on the stable 8.3 patch branch that are not doing this.

      It should be noted, though, that these VMs did need to be customized to allow Blue Iris to run without BSODing the VM on Intel CPUs. The thread can be found here: https://xcp-ng.org/forum/topic/8873/windows-blue-iris-xcp-ng-8-3/35?_=1746455850378 but in the end the following needed to be applied to a VM to keep it from BSODing when running Blue Iris on new Intel CPUs:

      xe vm-param-add uuid=... param-name=platform msr-relaxed=true
      

      @stormi It does not seem to do it every time; it seems the VM must run for some time and then be rebooted to cause it to happen. I have created a bug report file from the host as requested and will DM you a link to it! The last time I experienced this would have been at "May 2, 2025 at 4:18 PM (3 days ago)" according to our XOA appliance.
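
For applying the `msr-relaxed` platform flag mentioned above to several VMs, a dry-run helper along these lines may be handy. It is only a sketch: the function name `msr_relaxed_cmds` and the placeholder UUIDs are made up for illustration, and it only prints the `xe` commands (the `vm-param-add` call from the post plus a `vm-param-get` read-back) so they can be reviewed before being run on a host.

```shell
#!/bin/sh
# Print, without executing, the xe commands that would set and then read back
# platform:msr-relaxed for each VM UUID given as an argument.
msr_relaxed_cmds() {
  for uuid in "$@"; do
    # Same command as in the post, with the UUID filled in.
    printf 'xe vm-param-add uuid=%s param-name=platform msr-relaxed=true\n' "$uuid"
    # Read the key back to confirm it was stored.
    printf 'xe vm-param-get uuid=%s param-name=platform param-key=msr-relaxed\n' "$uuid"
  done
}

# Example (placeholder UUIDs, output to be reviewed before running it on the host):
#   msr_relaxed_cmds VM_UUID_1 VM_UUID_2
```

Printing rather than executing keeps the helper safe to try anywhere; the VM would presumably still need a full shutdown and start for a platform change to take effect.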

      posted in News
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      This happened again today when rebooting a Windows Server on my test host running these updates:

      50334e3a-39ea-4d04-a91c-c6c988ff41fc-image.png

      A "Force Reboot" does not solve it; I have to do a "Force Shutdown" and power on again, otherwise it will hang at that screen indefinitely. I can reproduce this on two hosts now. If I reboot again immediately afterwards it is fine, but after the VM has been running for a while it always occurs. Not sure if it may be GPU passthrough related. I can clone another Windows VM that has no passthrough device enabled if that would help.

      posted in News
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @gduperrey

      After running this for a week now, I have come across an issue with EFI hosts with these testing updates. Sometimes when booting a VM, or rebooting after a Windows update, the VM will hang at "Guest has not initialized display yet" and not boot. Doing a force shutdown and powering on again solves it. It has happened twice so far, to unrelated Windows VMs, both in my homelab and in our test lab at work. I'm not sure how to provide more info or logs when it happens, however.

      Edit: One common thing between these VMs is that they both use GPU passthrough, if that could matter at all.

      posted in News
      flakpyro
    • RE: [dedicated thread] Dell Open Manage Appliance (OME)

      @AtaxyaNetwork Are you able to list the files that require these drive name modifications? I can try changing them on the latest updated version of the appliance.

      posted in Compute
      flakpyro
    • RE: [dedicated thread] Dell Open Manage Appliance (OME)

      @AtaxyaNetwork I understand it would be a lot of work to make a new XVA every time Dell updates the appliance. Would it be possible to write a shell script that we could copy to the appliance and run after the fact, to change /dev/sda to /dev/xvda in the needed files? I can try running it on the updated appliance you sent me; I'm just not sure where the drives are being referenced.

      Something like

      sed -i -e 's/sda/xvda/g' /etc/whatever.conf? 
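
Fleshing that idea out a little, a more cautious version might look like the sketch below. It is an illustration only: the function name `rewrite_disk_refs` is made up, which files on the appliance actually reference the disk is not known here, and the pattern assumes references are written as `/dev/sda` optionally followed by a partition number. Unlike a bare `s/sda/xvda/g`, it only touches full `/dev/sda` device paths and keeps a `.bak` copy of each file.

```shell
#!/bin/sh
# Rewrite /dev/sda and /dev/sdaN references to /dev/xvda and /dev/xvdaN
# in the files given as arguments, keeping a .bak copy of each original.
rewrite_disk_refs() {
  for f in "$@"; do
    # Skip anything that is not a regular file.
    [ -f "$f" ] || { echo "skipping $f (not a regular file)" >&2; continue; }
    # Match /dev/sda plus optional partition digits, followed by a
    # non-alphanumeric character or end of line, so that other devices
    # such as /dev/sdab are left alone.
    sed -i.bak -E 's#/dev/sda([0-9]*)([^[:alnum:]]|$)#/dev/xvda\1\2#g' "$f"
  done
}

# Example (file name is a placeholder, as in the post):
#   rewrite_disk_refs /etc/whatever.conf
```

Candidate files could be located first with something like `grep -rl '/dev/sda' /etc`, and the `.bak` copies make it easy to revert if the appliance does not come back up cleanly.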
      
      
      posted in Compute
      flakpyro
    • RE: [dedicated thread] Dell Open Manage Appliance (OME)

      I have been testing the appliance, and here is what I have encountered so far:

      It boots up with no problem, XO sees the tools as installed, and all stats work.

      It took me two attempts to upgrade it to 4.4.0.75. The first attempt failed; I reverted to a snapshot taken beforehand, and the second attempt was successful.

      However, I can no longer install or update plugins: it is back to throwing disk space capacity errors. I'm guessing whatever was modified in the appliance you shared was reverted by the update, which now causes it to think there is no disk space again.

      posted in Compute
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @gduperrey

      installed on 2 test machines

      Machine 1:
      Intel Xeon E-2336
      SuperMicro board.

      Machine 2:
      Minisforum MS-01
      i9-13900H
      32 GB RAM
      Using Intel X710 onboard NIC

      Both machines installed fine and all VMs came up without issue afterwards. My one test backup job also ran without any issues.

      posted in News
      flakpyro
    • RE: [dedicated thread] Dell Open Manage Appliance (OME)

      I'd love to see support added for XCP-ng by Dell as well. I also run Lenovo XClarity Administrator, and their appliance works in XCP-ng without any modification!

      posted in Compute
      flakpyro
    • RE: Bonded interface viewing support in XO

      @lsouai-vates

      While we run XOA in production, our DR site runs a from-sources XO in case of emergencies. I updated it this morning and took a look at the latest changes, and it looks much improved: you can now see a bond's parent device:
      29f1f7cf-b0c9-4e07-81dd-3300b5f9f5b5-image.png

      I think it would also be useful to show the type of bond (LACP, balance-slb, etc.), but this is a great improvement.

      posted in Xen Orchestra
      flakpyro
    • RE: Our future backup code: test it!

      @CodeMercenary I think what you're asking for is already possible using [nobak] and [nosnap]:
      https://xen-orchestra.com/blog/xen-orchestra-5-96/

      posted in Backup
      flakpyro
    • RE: Our future backup code: test it!

      @olivierlambert @stormi this looks interesting! What will be the main advantages of using "Node generators" instead of streams for backups?

      posted in Backup
      flakpyro
    • RE: CBT: the thread to centralize your feedback

      After testing through a few more rounds of updates, it appears I'm still having the same issue. If I have CBT with snapshot removal enabled, take a manual snapshot of a VM (say, for running maintenance), and then remove it after some time, CBT is reset and a full backup runs at the next backup schedule. This is fine for local backups, where I have 10GbE between the ZFS backup server and the pool, but it is not ideal for offsite replication. I see there are some big changes coming with the backup code, which is great news, but I'd REALLY like to be able to use CBT with snapshot deletion enabled!

      posted in Backup
      flakpyro
    • RE: Bonded interface viewing support in XO

      I agree this is something that would be very useful to have in the web UI when reviewing pool config to ensure everything is set up properly. It may be too late for XO5, but perhaps it is something that could be added in XO6? From looking at the GitHub, I see there has been lots of work happening on the network tab in XO6.

      posted in Xen Orchestra
      flakpyro
    • RE: XCP-ng 8.3 updates announcements and testing

      @gduperrey

      installed on 2 test machines

      Machine 1:
      Intel Xeon E-2336
      SuperMicro board.

      Machine 2:
      Minisforum MS-01
      i9-13900H
      32 GB RAM
      Using Intel X710 onboard NIC

      Both machines installed fine and all VMs came up without issue after.

      posted in News
      flakpyro
    • RE: CBT: the thread to centralize your feedback

      @Andrw0830 Can you also confirm whether taking a regular snapshot from XO, then deleting it some time later, causes CBT to reset, as I have run into as well (see above)?

      posted in Backup
      flakpyro
    • RE: Win11 VM update 23H2 -> 24H2 fail

      Try removing the Xen tools, running the update, and then reinstalling the tools afterwards. I have had luck doing this when upgrading Windows Server from 2022 to 2025; it may apply here as well.

      posted in Compute
      flakpyro