XCP-ng
    MajorP93


    Best posts made by MajorP93

    • Xen Orchestra OpenMetrics Plugin - Grafana Dashboard

      Hello XCP-ng community!

      Since Vates released the new OpenMetrics plugin for Xen Orchestra we now have an official, built-in exporter for Prometheus metrics!

      Previously I used xen-exporter to expose the hypervisors' internal RRD databases as Prometheus metrics.
      I have since migrated to the new plugin, which works just fine.

      I updated the Grafana dashboard I was using to make it compatible with the official OpenMetrics plugin and thought, "Why not share it with other users?"

      In case you are interested you can find my dashboard JSON here: https://gist.github.com/MajorP93/3a933a6f03b4c4e673282fb54a68474b

      It is based on the xen-exporter dashboard made by MikeDombo: https://grafana.com/grafana/dashboards/16588-xen/

      If you also use Prometheus to scrape the Xen Orchestra OpenMetrics plugin and visualize the data in Grafana, you can copy the JSON from my gist, import it, and you are ready to go!

      Hope it helps!

      Might even be a good idea to include the dashboard as an example in the Xen Orchestra documentation. 🙂

      Best regards

      posted in Infrastructure as Code
      MajorP93
    • RE: XO5 breaks after defaulting to XO6 (from source)

      @MathieuRA I disabled Traefik, reverted to my old XO config (port 443, SSL encryption, HTTP to HTTPS redirection), rebuilt the Docker container using your branch, and tested:

      it is working fine on my end now 🙂

      Thank you very much!

      I did not expect this to get fixed so fast!

      posted in Xen Orchestra
    • RE: [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      @florent said in [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs:

      @MajorP93 the size are different between the disks, did you modify it since the snapshots ?

      would it be possible to take one new snapshot with the same disk structure ?

      Sorry, that was indeed my mistake.
      On the VMware side there are two VMs with almost exactly the same name.
      When I checked the disk layout to confirm the issue, I looked at the wrong VM. 🤦

      I checked again and can confirm that the VM in question has 1x 60GiB and 1x 25GiB VMDK.

      So this is not an issue. It is working as intended.

      Thread can be closed / deleted.
      Sorry again and thanks for the replies.

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: Xen Orchestra Node 24 compatibility

      said in Xen Orchestra Node 24 compatibility:

      After moving from Node 22 to Node 24 on my XO instance I started to see more "Error: ENOMEM: not enough memory, close" for my backup jobs even though my XO VM has 8GB of RAM...

      I will revert back to Node 22 for now.

      I did some further troubleshooting and was able to pin it down to SMB encryption on the Xen Orchestra backup remotes (the "seal" CIFS mount flag).
      The "ENOMEM" errors seem to occur only when that option is enabled.
      It seems to be related to buffering in the Linux kernel's CIFS implementation, which fails when SMB encryption is in use:
      the CIFS operation gets killed due to buffer exhaustion caused by the encryption, and Xen Orchestra shows "ENOMEM".
      Somehow the issue is more visible with Node 24 than with Node 22, which is why I first blamed the combination of Node version and XO version. I had switched the Node version at the same time as I enabled SMB encryption.
      So this does not seem to be directly related to Xen Orchestra; it looks more like a Node / Linux kernel CIFS issue.
      Apparently it is not a Xen Orchestra bug per se.
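
To check which remotes are actually mounted with the flag, something like the following could help. It is only a sketch: it assumes the kernel lists "seal" among the mount options in /proc/mounts, and the sample lines (server name, mount points) are made up for illustration.

```python
def cifs_mounts_with_seal(mounts_text: str) -> list[str]:
    """Return mount points of CIFS mounts whose option list contains 'seal'."""
    sealed = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] != "cifs":
            continue
        if "seal" in fields[3].split(","):
            sealed.append(fields[1])
    return sealed


# Hypothetical sample; on a real host, read the text from /proc/mounts instead.
SAMPLE = (
    "//nas.example/backups /mnt/xo-remote cifs rw,relatime,vers=3.1.1,seal 0 0\n"
    "nas.example:/export /mnt/nfs nfs4 rw,relatime 0 0\n"
)

if __name__ == "__main__":
    print(cifs_mounts_with_seal(SAMPLE))
```

On the XO host itself, pass the contents of /proc/mounts to the function instead of the sample string.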

      posted in Xen Orchestra
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      Hey,
      a small update:
      Adding the backup section and the "diskPerVmConcurrency" option to "/etc/xo-server/config.diskConcurrency.toml" or "~/.config/xo-server/config.diskConcurrency.toml" had no effect for me. It only took effect once I added it at the end of my main XO config file, "/etc/xo-server/config.toml".

      Best regards

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I worked around this issue by changing my full backup job to "delta backup" and enabling "force full backup" in the schedule options.

      Delta backup seems more reliable as of now.

      Looking forward to a fix as Zstd compression is an appealing feature of the full backup method.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I can imagine that a fix could be to send "keepalive" packets in addition to the XCP-ng export-VM-data-stream so that the timeout on XO side does not occur 🤔

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @magicker said in "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update:

      @olivierlambert said in "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update:

      Because doing an update without rebooting doesn't reload the updated main programs, like XAPI. A host in only updated after a full reboot.

      Hi there
      Is it just me or is this a chicken and egg situation.

      you upgrade the master... how the pool is in NOT_SUPPORTED_DURING_UPGRADE() stage. You cant move vms off the master so all you can do is close down vms.. reboot.. pray

      then move the a non master.. you cant move the vms off here either NOT_SUPPORTED_DURING_UPGRADE(). So you have do the same..

      needless to say I hit issues on each reboot which caused 30- 60 min delays in getting vms back up and running.

      can you Warm migrate or is this dead also (to scared to test)

      For me, this workflow has worked every time upgrades were available:

      - disable HA on pool level
      - disable load balancer plugin
      - upgrade master
      - upgrade all other nodes
      - restart toolstack on master
      - restart toolstack on all other nodes
      - live migrate all VMs running on master to other node(s)
      - reboot master
      - reboot next node (live migrate all VMs running on that particular node away before doing so)
      - repeat until all nodes have been rebooted (one node at a time)
      - re-enable HA on pool level
      - re-enable load balancer plugin

      I never had any issues with that, and none of the VMs had any downtime.
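
The ordering above can be written down as a small helper that prints a checklist for a given pool. This is just a sketch: the host names are hypothetical, and it only generates text; it does not call xe or the Xen Orchestra API.

```python
def rolling_upgrade_plan(master: str, others: list[str]) -> list[str]:
    """Return the workflow steps in order: master first, then one node at a time."""
    hosts = [master, *others]
    steps = ["disable HA on pool level", "disable load balancer plugin"]
    # Install updates everywhere before restarting or rebooting anything.
    steps += [f"upgrade {host}" for host in hosts]
    steps += [f"restart toolstack on {host}" for host in hosts]
    # Evacuate and reboot one host at a time, starting with the master.
    for host in hosts:
        steps.append(f"live migrate all VMs off {host}")
        steps.append(f"reboot {host}")
    steps += ["re-enable HA on pool level", "re-enable load balancer plugin"]
    return steps


if __name__ == "__main__":
    # Hypothetical host names.
    for number, step in enumerate(rolling_upgrade_plan("xcp01", ["xcp02", "xcp03"]), 1):
        print(f"{number}. {step}")
```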

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @andriy.sultanov said in Potential bug with Windows VM backup: "Body Timeout Error":

      xe-toolstack-restart

      Okay I was able to replicate the issue.
      This is the setup that I used and that resulted in the "body timeout error" previously discussed in this thread:

      OS: Windows Server 2019 Datacenter
      (screenshots: 1.png, 2.png)

      The versions of the packages in question that were used in order to replicate the issue (XCP-ng 8.3, fully upgraded):

      [11:58 dat-xcpng-test01 ~]# rpm -q xapi-core
      xapi-core-25.27.0-2.2.xcpng8.3.x86_64
      [11:59 dat-xcpng-test01 ~]# rpm -q qcow-stream-tool
      qcow-stream-tool-25.27.0-2.2.xcpng8.3.x86_64
      [11:59 dat-xcpng-test01 ~]# rpm -q vhd-tool
      vhd-tool-25.27.0-2.2.xcpng8.3.x86_64
      

      Result:
      3.png
      Backup log:

      {
        "data": {
          "mode": "full",
          "reportWhen": "failure"
        },
        "id": "1764585634255",
        "jobId": "b19ed05e-a34f-4fab-b267-1723a7195f4e",
        "jobName": "Full-Backup-Test",
        "message": "backup",
        "scheduleId": "579d937a-cf57-47b2-8cde-4e8325422b15",
        "start": 1764585634255,
        "status": "failure",
        "infos": [
          {
            "data": {
              "vms": [
                "36c492a8-e321-ef2b-94dc-a14e5757d711"
              ]
            },
            "message": "vms"
          }
        ],
        "tasks": [
          {
            "data": {
              "type": "VM",
              "id": "36c492a8-e321-ef2b-94dc-a14e5757d711",
              "name_label": "Win2019_EN_DC_TEST"
            },
            "id": "1764585635692",
            "message": "backup VM",
            "start": 1764585635692,
            "status": "failure",
            "tasks": [
              {
                "id": "1764585635919",
                "message": "snapshot",
                "start": 1764585635919,
                "status": "success",
                "end": 1764585644161,
                "result": "0f548c1f-ce5c-56e3-0259-9c59b7851a17"
              },
              {
                "data": {
                  "id": "f1bc8d14-10dd-4440-bb1d-409b91f3b550",
                  "type": "remote",
                  "isFull": true
                },
                "id": "1764585644192",
                "message": "export",
                "start": 1764585644192,
                "status": "failure",
                "tasks": [
                  {
                    "id": "1764585644201",
                    "message": "transfer",
                    "start": 1764585644201,
                    "status": "failure",
                    "end": 1764586308921,
                    "result": {
                      "name": "BodyTimeoutError",
                      "code": "UND_ERR_BODY_TIMEOUT",
                      "message": "Body Timeout Error",
                      "stack": "BodyTimeoutError: Body Timeout Error\n    at FastTimer.onParserTimeout [as _onTimeout] (/opt/xo/xo-builds/xen-orchestra-202511080402/node_modules/undici/lib/dispatcher/client-h1.js:646:28)\n    at Timeout.onTick [as _onTimeout] (/opt/xo/xo-builds/xen-orchestra-202511080402/node_modules/undici/lib/util/timers.js:162:13)\n    at listOnTimeout (node:internal/timers:588:17)\n    at process.processTimers (node:internal/timers:523:7)"
                    }
                  }
                ],
                "end": 1764586308922,
                "result": {
                  "name": "BodyTimeoutError",
                  "code": "UND_ERR_BODY_TIMEOUT",
                  "message": "Body Timeout Error",
                  "stack": "BodyTimeoutError: Body Timeout Error\n    at FastTimer.onParserTimeout [as _onTimeout] (/opt/xo/xo-builds/xen-orchestra-202511080402/node_modules/undici/lib/dispatcher/client-h1.js:646:28)\n    at Timeout.onTick [as _onTimeout] (/opt/xo/xo-builds/xen-orchestra-202511080402/node_modules/undici/lib/util/timers.js:162:13)\n    at listOnTimeout (node:internal/timers:588:17)\n    at process.processTimers (node:internal/timers:523:7)"
                }
              },
              {
                "id": "1764586443440",
                "message": "clean-vm",
                "start": 1764586443440,
                "status": "success",
                "end": 1764586443459,
                "result": {
                  "merge": false
                }
              },
              {
                "id": "1764586443624",
                "message": "snapshot",
                "start": 1764586443624,
                "status": "success",
                "end": 1764586451966,
                "result": "c3e9736e-d6eb-3669-c7b8-f603333a83bf"
              },
              {
                "data": {
                  "id": "f1bc8d14-10dd-4440-bb1d-409b91f3b550",
                  "type": "remote",
                  "isFull": true
                },
                "id": "1764586452003",
                "message": "export",
                "start": 1764586452003,
                "status": "success",
                "tasks": [
                  {
                    "id": "1764586452008",
                    "message": "transfer",
                    "start": 1764586452008,
                    "status": "success",
                    "end": 1764586686887,
                    "result": {
                      "size": 10464489322
                    }
                  }
                ],
                "end": 1764586686900
              },
              {
                "id": "1764586690122",
                "message": "clean-vm",
                "start": 1764586690122,
                "status": "success",
                "end": 1764586690140,
                "result": {
                  "merge": false
                }
              }
            ],
            "warnings": [
              {
                "data": {
                  "attempt": 1,
                  "error": "Body Timeout Error"
                },
                "message": "Retry the VM backup due to an error"
              }
            ],
            "end": 1764586690142
          }
        ],
        "end": 1764586690143
      }
      

      I then enabled your test repository and installed the packages that you mentioned:

      [12:01 dat-xcpng-test01 ~]# rpm -q xapi-core
      xapi-core-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64
      [12:08 dat-xcpng-test01 ~]# rpm -q vhd-tool
      vhd-tool-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64
      [12:08 dat-xcpng-test01 ~]# rpm -q qcow-stream-tool
      qcow-stream-tool-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64
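
To double-check that all three packages really come from the same build, the `rpm -q` lines can be split into name, version, release and architecture. A minimal sketch, assuming the usual `name-version-release.arch` layout of the output; the function names are my own.

```python
def split_rpm_name(package: str) -> tuple[str, str, str, str]:
    """Split 'name-version-release.arch' as printed by `rpm -q`."""
    name, version, release_arch = package.rsplit("-", 2)
    release, arch = release_arch.rsplit(".", 1)
    return name, version, release, arch


def same_build(packages: list[str]) -> bool:
    """True if every package shares one version and release."""
    builds = {split_rpm_name(package)[1:3] for package in packages}
    return len(builds) == 1


# The three lines from the test repository output above.
TEST_REPO = [
    "xapi-core-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64",
    "vhd-tool-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64",
    "qcow-stream-tool-25.27.0-2.3.0.xvafix.1.xcpng8.3.x86_64",
]

if __name__ == "__main__":
    print(split_rpm_name(TEST_REPO[0]))
    print(same_build(TEST_REPO))
```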
      

      I restarted the toolstack and re-ran the backup job.
      Unfortunately, it did not solve the issue and made the backup behave very strangely:
      9c9e9fdc-8385-4df2-9d23-7b0e4ecee0cd-grafik.png
      The backup job ran for only a few seconds and reported that it was "successful", but only 10.83 KiB were transferred. This VM has 18 GB of used space, so the data was not actually transferred by the backup job.

      25deccb4-295e-4ce1-a015-159780536122-grafik.png

      Here is the backup log:

      {
        "data": {
          "mode": "full",
          "reportWhen": "failure"
        },
        "id": "1764586964999",
        "jobId": "b19ed05e-a34f-4fab-b267-1723a7195f4e",
        "jobName": "Full-Backup-Test",
        "message": "backup",
        "scheduleId": "579d937a-cf57-47b2-8cde-4e8325422b15",
        "start": 1764586964999,
        "status": "success",
        "infos": [
          {
            "data": {
              "vms": [
                "36c492a8-e321-ef2b-94dc-a14e5757d711"
              ]
            },
            "message": "vms"
          }
        ],
        "tasks": [
          {
            "data": {
              "type": "VM",
              "id": "36c492a8-e321-ef2b-94dc-a14e5757d711",
              "name_label": "Win2019_EN_DC_TEST"
            },
            "id": "1764586966983",
            "message": "backup VM",
            "start": 1764586966983,
            "status": "success",
            "tasks": [
              {
                "id": "1764586967194",
                "message": "snapshot",
                "start": 1764586967194,
                "status": "success",
                "end": 1764586975429,
                "result": "ebe5c4e2-5746-9cb3-7df6-701774a679b5"
              },
              {
                "data": {
                  "id": "f1bc8d14-10dd-4440-bb1d-409b91f3b550",
                  "type": "remote",
                  "isFull": true
                },
                "id": "1764586975453",
                "message": "export",
                "start": 1764586975453,
                "status": "success",
                "tasks": [
                  {
                    "id": "1764586975473",
                    "message": "transfer",
                    "start": 1764586975473,
                    "status": "success",
                    "end": 1764586981992,
                    "result": {
                      "size": 11093
                    }
                  }
                ],
                "end": 1764586982054
              },
              {
                "id": "1764586985271",
                "message": "clean-vm",
                "start": 1764586985271,
                "status": "success",
                "end": 1764586985290,
                "result": {
                  "merge": false
                }
              }
            ],
            "end": 1764586985291
          }
        ],
        "end": 1764586985292
      }
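
Since a job can report "success" with almost nothing transferred, it may be worth walking the nested "tasks" in the log and listing every "transfer" with its duration and size. The sketch below is under some assumptions: timestamps are treated as milliseconds, the 1 MiB threshold for "suspiciously small" is arbitrary, and the sample is trimmed and combined from the two logs above.

```python
def iter_tasks(node: dict):
    """Yield every task below `node`, depth first."""
    for task in node.get("tasks", []):
        yield task
        yield from iter_tasks(task)


def summarize_transfers(log: dict, min_bytes: int = 1024 * 1024) -> list[dict]:
    """Collect status, duration, size and error code of each 'transfer' task."""
    rows = []
    for task in iter_tasks(log):
        if task.get("message") != "transfer":
            continue
        result = task.get("result") or {}
        size = result.get("size")  # missing when the transfer failed
        rows.append({
            "status": task["status"],
            "seconds": (task["end"] - task["start"]) / 1000,  # assumes ms timestamps
            "size": size,
            "error": result.get("code"),
            "suspicious": size is not None and size < min_bytes,
        })
    return rows


# Trimmed sample: the failed transfer from the first log, the tiny one from the second.
SAMPLE = {
    "tasks": [{
        "message": "backup VM",
        "tasks": [
            {"message": "export", "tasks": [{
                "message": "transfer", "status": "failure",
                "start": 1764585644201, "end": 1764586308921,
                "result": {"code": "UND_ERR_BODY_TIMEOUT"},
            }]},
            {"message": "export", "tasks": [{
                "message": "transfer", "status": "success",
                "start": 1764586975473, "end": 1764586981992,
                "result": {"size": 11093},
            }]},
        ],
    }],
}

if __name__ == "__main__":
    for row in summarize_transfers(SAMPLE):
        print(row)
```

With a full log, load the JSON with `json.load` and pass the resulting dict to `summarize_transfers`.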
      

      If you need me to test something else or if I should provide some log file from the XCP-ng system please let me know.

      Best regards

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @andriy.sultanov I created a small test setup in our lab: a Windows VM with a lot of free disk space (2 virtual disks, 2.5 TB of free space in total). Hopefully this will let me replicate the full backup timeout for VMs with a lot of free space that occurred in our production environment.
      The backup job is currently running. I will report back once it has failed and I have had a chance to test whether your fix solves the issue.

      posted in Backup

    Latest posts made by MajorP93

    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @shorian The documentation never stated otherwise… https://docs.xcp-ng.org/management/updates/#-how-to-apply-the-updates

      The steps I mentioned earlier in this thread were taken from the official XCP-ng documentation. If you follow the numbered steps in the document I just linked in order, you will end up with exactly my routine.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert @andriy.sultanov I saw that the linked PR got merged, which is awesome! I was wondering: do you plan to release this with the next round of patches, or even before that as a hotfix?
      Either way, if you have a test build, I am happy to try it out.

      Thank you and best regards

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      This is the PR: https://github.com/xapi-project/xen-api/pull/6786

      It's ready/reviewed, we are waiting for upstream merge. We'll make sure to re-ask for a merge ASAP.

      Stay tuned!

      Thank you very much for the update!

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      Hey @andriy.sultanov @olivierlambert

      Any news on this?
      If I recall correctly, some work towards fixing the "compressed full backup of VMs with lots of free space" issue has been done, but a fix has not (yet) been pushed to the package repositories.

      Is there a new test build that I could try?
      I would love to be able to use the full backup method again.

      Thanks!

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw Ohh, I get it now! "Rebooting" instead of "rebooted" can be read as "did applying the patches cause the systems to reboot", as in a system crash or similar.
      Gotcha!
      I understand what you meant now.

      //EDIT: anyway, back to the topic. If the systems were already rebooted after applying the updates, I currently have no idea what might cause this...

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw Mhh, Danp asked if you rebooted your hosts after applying the patches and you said nope.

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw I'd advise reading the part of the documentation about rebooting after package upgrades: https://docs.xcp-ng.org/management/updates/#-when-to-reboot

      In this case, "xen" updates were included in the December package updates, which meets one of the criteria for having to reboot.

      Personally, I reboot every time package updates are installed. That keeps you on the safe side.

      posted in Backup
    • RE: Xen Orchestra Node 24 compatibility

      @olivierlambert said in Xen Orchestra Node 24 compatibility:

      Can you reproduce the issue on XOA? Or it's only on the sources + your current OS?

      We do not have an XOA license (yet), which is why I am currently using only XO from sources. I am therefore not able to reproduce this on XOA at the moment. The OS is Debian 13.

      posted in Xen Orchestra