XCP-ng
    MajorP93

    @MajorP93

    Reputation: 33
    Profile views: 10
    Topics: 6
    Posts: 92
    Followers: 0
    Following: 0
    Groups: 0

    Best posts made by MajorP93

    • Xen Orchestra OpenMetrics Plugin - Grafana Dashboard

      Hello XCP-ng community!

      Since Vates released the new OpenMetrics plugin for Xen Orchestra, we now have an official, built-in exporter for Prometheus metrics!

      I previously used xen-exporter to expose the hypervisor's internal RRD database as Prometheus metrics.
      I migrated to the new plugin, which works just fine.

      I updated the Grafana dashboard I was using to make it compatible with the official OpenMetrics plugin and thought, "why not share it with other users?"

      In case you are interested you can find my dashboard JSON here: https://gist.github.com/MajorP93/3a933a6f03b4c4e673282fb54a68474b

      It is based on the xen-exporter dashboard made by MikeDombo: https://grafana.com/grafana/dashboards/16588-xen/

      If you also scrape the Xen Orchestra OpenMetrics plugin with Prometheus and use Grafana, you can copy the JSON from my gist, import it and you are ready to go!

      Hope it helps!

      Might even be a good idea to include the dashboard as an example in the Xen Orchestra documentation. 🙂

      Best regards

      posted in Infrastructure as Code
      MajorP93
    • RE: XO5 breaks after defaulting to XO6 (from source)

      @MathieuRA I disabled Traefik, reverted to my old XO config (port 443, SSL encryption, HTTP to HTTPS redirection), rebuilt the Docker container using your branch and tested:

      it is working fine on my end now 🙂

      Thank you very much!

      I did not expect this to get fixed so fast!

      posted in Xen Orchestra
    • RE: backup mail report says INTERRUPTED but it's not ?

      @Pilow said in backup mail report says INTERRUPTED but it's not ?:

      @MajorP93 you say you have 8GB of RAM on XO, but it gets OOM-killed at 5GB of used RAM.

      Did you do these additional steps in your XO config?

      You can increase the memory allocated to the XOA VM (from 2GB to 4GB or 8GB).
      Note that simply increasing the RAM for the VM is not enough.
      You must also edit the service file (/etc/systemd/system/xo-server.service) 
      to increase the memory allocated to the xo-server process itself.
      
      You should leave ~512MB for the debian OS itself. Meaning if your VM has 4096MB total RAM, you should use 3584 for the memory value below.
      
      - ExecStart=/usr/local/bin/xo-server
      + ExecStart=/usr/local/bin/node --max-old-space-size=3584 /usr/local/bin/xo-server
      The last step is to refresh and restart the service:
      
      $ systemctl daemon-reload
      $ systemctl restart xo-server
      

      Interesting!
      I did not know that it is recommended to start Node.js with "--max-old-space-size=" set to (total system RAM - 512MB).
      I added that, restarted XO and my backup job.

      I will test if that gives my backup jobs more stability.
      Thank you very much for taking the time and recommending the parameter.
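The arithmetic from the quoted instructions can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the 512MB reserve is the figure from the quote, and the paths are the ones from the quoted XOA service file (an installation from source may use different paths).

```python
def max_old_space_size_mb(total_ram_mb: int, os_reserve_mb: int = 512) -> int:
    """Return the Node heap limit in MB: total VM RAM minus the OS reserve."""
    if total_ram_mb <= os_reserve_mb:
        raise ValueError("VM RAM must be larger than the OS reserve")
    return total_ram_mb - os_reserve_mb


def exec_start_line(total_ram_mb: int) -> str:
    """Build the ExecStart line in the form shown in the quoted instructions."""
    heap_mb = max_old_space_size_mb(total_ram_mb)
    return (
        "ExecStart=/usr/local/bin/node "
        f"--max-old-space-size={heap_mb} /usr/local/bin/xo-server"
    )


if __name__ == "__main__":
    print(exec_start_line(4096))  # the 4096MB case from the quote
    print(exec_start_line(8192))  # an 8GB VM
```

For an 8GB VM this gives 7680 as the value for --max-old-space-size.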

      posted in Backup
    • RE: Xen Orchestra OpenMetrics Plugin - Grafana Dashboard

      @Mang0Musztarda said in Xen Orchestra OpenMetrics Plugin - Grafana Dashboard:

      @MajorP93 Hi, how can I scrape the OpenMetrics endpoint?
      I set up the OpenMetrics plugin's Prometheus secret, enabled it, and then tried to use curl like this: curl -H "Authorization: Bearer abc123" http://localhost:9004
      but the response I got was
      {"error":"Query authentication does not match server setting"}
      What am I doing wrong?

      Hey!
      I scrape it like so:

      root@prometheus01:~# cat /etc/prometheus/scrape_configs/xen-orchestra-openmetrics.yml 
      scrape_configs:
        - job_name: xen-orchestra
          honor_labels: true
          scrape_interval: 30s
          scrape_timeout: 20s
          scheme: https
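          tls_config:
            # Skips TLS certificate verification, e.g. for a self-signed certificate.
            insecure_skip_verify: true
          # File containing the secret configured in the XO OpenMetrics plugin.
          bearer_token_file: /etc/prometheus/bearer.token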
          metrics_path: /openmetrics/metrics
          static_configs:
          - targets:
            - xen-orchestra.domain.local
      

      The file /etc/prometheus/bearer.token contains the bearer token as configured in the Xen Orchestra OpenMetrics plugin.
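For a quick manual test outside Prometheus, the same request can be sketched in Python. This is a minimal sketch, not an official client: it assumes the endpoint is reached through Xen Orchestra's HTTPS port at the path from the scrape config above, and the function names are made up for illustration. Certificate verification is disabled to mirror insecure_skip_verify, which only makes sense for self-signed setups.

```python
import ssl
import urllib.request


def build_metrics_request(host: str, token: str) -> urllib.request.Request:
    """Build a GET request for the metrics path with a bearer token header."""
    url = f"https://{host}/openmetrics/metrics"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_metrics(host: str, token: str, timeout: float = 20.0) -> str:
    """Fetch the metrics text without verifying the TLS certificate."""
    context = ssl.create_default_context()
    context.check_hostname = False  # has to be disabled before CERT_NONE is set
    context.verify_mode = ssl.CERT_NONE
    request = build_metrics_request(host, token)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.read().decode("utf-8")
```

Calling fetch_metrics("xen-orchestra.domain.local", token), with the token read from the bearer token file, should return the same text that Prometheus scrapes.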

      Best regards

      posted in Infrastructure as Code
    • RE: Remote syslog broken after update/reboot? - Changing it away, then back fixes.

      @rzr Thank you very much!

      @michmoor0725 Absolutely! The community is another reason why working with XCP-ng is a lot more fun than working with VMware!

      posted in Compute
    • RE: [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs

      @florent said in [VDDK V2V] Migration of VM that had more than 1 snapshot creates multiple VHDs:

      @MajorP93 the sizes differ between the disks. Did you modify them since the snapshots?

      Would it be possible to take one new snapshot with the same disk structure?

      Sorry, this was indeed my mistake.
      On the VMware side there are 2 VMs that have almost exactly the same name.
      When I checked the disk layout to verify whether this was an issue, I looked at the wrong VM. 🤦

      I checked again and can confirm that the VM in question has 1x 60GiB and 1x 25GiB VMDK.

      So this is not an issue. It is working as intended.

      Thread can be closed / deleted.
      Sorry again and thanks for the replies.

      Best regards
      MajorP

      posted in Xen Orchestra
    • RE: Xen Orchestra Node 24 compatibility

      said in Xen Orchestra Node 24 compatibility:

      After moving from Node 22 to Node 24 on my XO instance I started to see more "Error: ENOMEM: not enough memory, close" for my backup jobs even though my XO VM has 8GB of RAM...

      I will revert back to Node 22 for now.

      I did some further troubleshooting and was able to pin it down to SMB encryption on Xen Orchestra backup remotes (the "seal" CIFS mount flag).
      The "ENOMEM" errors seem to occur only when I enable that option.
      This seems to be related to buffering in the Linux kernel's CIFS implementation, which fails when SMB encryption is used.
      The CIFS operation gets killed due to buffer exhaustion caused by encryption, and Xen Orchestra shows "ENOMEM".
      Somehow this issue is more visible with Node 24 than with Node 22, which is why I thought it was caused by the combination of Node version and XO version: I switched the Node version at the same time I enabled SMB encryption.
      However, this does not seem to be directly related to Xen Orchestra; it looks more like a Node / Linux kernel CIFS implementation issue.
      Apparently not a Xen Orchestra bug per se.
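To see which CIFS remotes are actually mounted with encryption, the mount options can be checked. The following is a rough sketch that parses text in the /proc/mounts format; the function name and the sample lines (server names, mount points) are hypothetical and only there for illustration.

```python
from typing import Dict

# Hypothetical sample in /proc/mounts format: device, mount point, type, options, dump, pass.
SAMPLE = (
    "//nas01/backup /run/xo-server/mounts/abc cifs rw,relatime,vers=3.1.1,seal,cache=strict 0 0\n"
    "//nas01/plain /mnt/plain cifs rw,relatime,vers=3.1.1 0 0\n"
    "/dev/xvda1 / ext4 rw,relatime 0 0\n"
)


def cifs_mounts_with_seal(mounts_text: str) -> Dict[str, bool]:
    """Map each CIFS mount point to True if the 'seal' option is set."""
    result: Dict[str, bool] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cifs":
            continue  # skip malformed lines and non-CIFS filesystems
        mount_point, options = fields[1], fields[3].split(",")
        result[mount_point] = "seal" in options
    return result


if __name__ == "__main__":
    print(cifs_mounts_with_seal(SAMPLE))
```

On a live system the input would be the content of /proc/mounts instead of the sample string.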

      posted in Xen Orchestra
    • RE: Long backup times via NFS to Data Domain from Xen Orchestra

      Hey,
      small update:
      while adding the backup section and "diskPerVmConcurrency" option to "/etc/xo-server/config.diskConcurrency.toml" or "~/.config/xo-server/config.diskConcurrency.toml" had no effect for me, I was able to get this working by adding it at the end of my main XO config file at "/etc/xo-server/config.toml".

      Best regards

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I worked around this issue by changing my full backup job to "delta backup" and enabling "force full backup" in the schedule options.

      Delta backup seems more reliable as of now.

      Looking forward to a fix as Zstd compression is an appealing feature of the full backup method.

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      I can imagine that a fix could be to send "keepalive" packets in addition to the XCP-ng export-VM-data-stream so that the timeout on XO side does not occur 🤔

      posted in Backup

    Latest posts made by MajorP93

    • RE: backup mail report says INTERRUPTED but it's not ?

      @olivierlambert No.
      I reverted to Node 20 as previously mentioned.
      I was using Node 24 before but reverted to Node 20 as I hoped it would "fix" the issue.
      Using Node 20 it takes longer for these issues to arise but in the end they arise.

      Three of the users in this thread who encountered the issue said that they are using XOA.
      As XOA also uses Node 20, I think most people who reported this issue actually use Node 20.

      posted in Backup
    • RE: backup mail report says INTERRUPTED but it's not ?

      During the past month my backups failed (status "interrupted") 1-2 times per week due to this memory leak.
      Increasing the heap size (Node old space size) delays the problem, but the backup still fails when RAM usage eventually hits 100%.
      I guess I'll go with @Pilow's workaround for now and create a cron job that reboots the XO VM right before the backups start.
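A possible way to derive such a cron entry is sketched below. It assumes a daily backup schedule, a reboot from inside the XO VM via /sbin/shutdown, and a lead time of 15 minutes; all three are assumptions, as is the function name.

```python
def reboot_cron_line(backup_hour: int, backup_minute: int, lead_minutes: int = 15) -> str:
    """Return a daily crontab line that reboots lead_minutes before the backup start."""
    if not (0 <= backup_hour < 24 and 0 <= backup_minute < 60):
        raise ValueError("backup time must be a valid hour and minute")
    # Work in minutes since midnight and wrap around if the reboot falls on the previous day.
    total = (backup_hour * 60 + backup_minute - lead_minutes) % (24 * 60)
    hour, minute = divmod(total, 60)
    return f"{minute} {hour} * * * /sbin/shutdown -r now"


if __name__ == "__main__":
    print(reboot_cron_line(1, 0))  # backups start at 01:00
```

The lead time should be long enough for the VM and xo-server to be fully up again before the first job starts.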

      posted in Backup
    • RE: 🛰️ XO 6: dedicated thread for all your feedback!

      @marcoi Sounds like your issue is related to: https://github.com/vatesfr/xen-orchestra/issues/9500

      Issue #9500 (open) in vatesfr/xen-orchestra, created by ronivay: v5 interface doesn't work if hostname set

      posted in Xen Orchestra
    • RE: backup mail report says INTERRUPTED but it's not ?

      Hi @Bastien-Nollet,

      oh okay, thanks for clarifying!

      posted in Backup
    • RE: backup mail report says INTERRUPTED but it's not ?

      I wonder if this PR https://github.com/vatesfr/xen-orchestra/pull/9506 aims to solve the issue that was discussed in this thread.
      To me it looks like it's the case as the issue seems to be related to RAM used by backup jobs not being freed correctly and the PR seems to add some garbage collection to backup jobs.
      I hope that it will fix the issue and if needed I can test a branch.

      Pull request #9506 (open) in vatesfr/xen-orchestra, opened by b-Nollet: Backup tasks gc

      posted in Backup
    • RE: backup mail report says INTERRUPTED but it's not ?

      After implementing the --max-old-space-size Node parameter as recommended by @pilow, it took longer for the VM to hit the issue.
      Still, the backups went into "interrupted" status.
      The memory leak still seems to be there.
      With each subsequent backup run the memory usage rises further, and after a backup run it does not fully return to "normal".

      6ad321a1-2e39-4bca-9285-062e502a17b2-grafik.png

      After adding the Node parameter there were no more heap size errors from Node, since the heap size had been increased. Instead, the kernel log (dmesg) showed various OOM errors even though not all of the RAM (8GB) was in use.

      This is what htop looks like with 3 backup jobs running:
      68db77eb-26f2-4dbb-b1db-2273984eabb3-grafik.png
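To put numbers on the growth between backup runs, the resident memory of the xo-server process could be logged before and after each run. The following is a minimal sketch for Linux that reads VmRSS from /proc/<pid>/status; the function names are hypothetical, and finding the right PID is left out.

```python
import re
from typing import Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value in kB from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid: int) -> Optional[int]:
    """Read the current resident set size of a process in kB (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_vm_rss_kb(status_file.read())
```

Comparing read_rss_kb(pid) before a backup run and some time after it has finished would show whether memory is being returned.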

      posted in Backup
    • RE: ASUS NUC NUC14MNK-B LAN problems

      @olivierlambert as the driver provided by @andrew is based on upstream r8125 version 9.016.01 and seems to solve the issue of user @paha, I was wondering: are there any plans to update the driver to the latest upstream version in the official XCP-ng package repository?

      posted in Hardware
    • RE: ASUS NUC NUC14MNK-B LAN problems

      @paha Hi

      Try to install "r8125-module" by running

      yum install r8125-module
      

      Reboot your XCP-ng node afterwards

      1ee11405-70a2-4db1-a223-ddf3753c83e1-grafik.png

      //EDIT: Hmm, I just checked one of my XCP-ng nodes, and r8125-module seems to be installed by default (even though I am not using a Realtek NIC).
      So installing it as suggested above will most probably have no effect in your case.

      From my own experience I can say that Realtek NICs have been causing issues on Linux systems for a long time. You could try a different NIC, maybe even via a USB adapter.
      I purchased a NIC with robust Linux driver support on Amazon for $15 for a private machine.
      An Intel-based NIC would be your best bet, I guess.

      If trying a different NIC is not an option, your issue could possibly be fixed by the XCP-ng team updating the r8125 driver to the latest upstream version (which seems to be 9.016.01 as of now).

      //EDIT2: another thing you should check first is disabling all energy-saving options for your NIC in the BIOS, PCI/PCIe ASPM being the most important one. I remember that ASPM caused issues with a Realtek NIC on one of my systems in the past.
      The issue you described (NIC connecting and disconnecting repeatedly) could be caused by a bug in some energy-saving feature.

      posted in Hardware
    • RE: backup mail report says INTERRUPTED but it's not ?

      @olivierlambert Right now I am using Node.js version 20, as I saw that XOA uses that version as well. I thought it might be best to use all dependencies at the versions that XOA uses.

      I was having the issue with the backup job "interrupted" status on Node.js 24 as well, as documented in this thread.

      Actually, since I downgraded to Node 20, total system RAM usage seems to have decreased by a fair bit, which can be seen by comparing the 2 screenshots that I posted in this thread. In the first screenshot I was using Node 24 and in the second screenshot Node 20.
      Despite that, the issue recurred after a few days of XO running.

      I hope that --max-old-space-size Node parameter as suggested by @pilow solves my issue.
      I will report back.

      Best regards

      posted in Backup