XCP-ng

    Posts

    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @shorian The documentation never stated otherwise… https://docs.xcp-ng.org/management/updates/#-how-to-apply-the-updates

      The steps I mentioned previously in this thread were taken from the official XCP-ng documentation. If you follow the numbered steps in the document I just linked in numerical order, you will end up with exactly my routine.

      posted in Backup
      MajorP93
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @magicker said in "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update:

      @olivierlambert said in "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update:

      Because doing an update without rebooting doesn't reload the updated main programs, like XAPI. A host is only updated after a full reboot.


      Hi there,
      Is it just me, or is this a chicken-and-egg situation?

      You upgrade the master... now the pool is in the NOT_SUPPORTED_DURING_UPGRADE() stage. You can't move VMs off the master, so all you can do is shut down VMs, reboot, and pray.

      Then you move to a non-master... you can't move the VMs off there either (NOT_SUPPORTED_DURING_UPGRADE()), so you have to do the same.

      Needless to say, I hit issues on each reboot, which caused 30-60 minute delays in getting VMs back up and running.

      Can you warm migrate, or is this dead also (too scared to test)?

      For me this workflow worked every time there were upgrades available:

      - disable HA on pool level
      - disable load balancer plugin
      - upgrade master
      - upgrade all other nodes
      - restart toolstack on master
      - restart toolstack on all other nodes
      - live migrate all VMs running on master to other node(s)
      - reboot master
      - reboot next node (live migrate all VMs running on that particular node away before doing so)
      - repeat until all nodes have been rebooted (one node at a time)
      - re-enable HA on pool level
      - re-enable load balancer plugin
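      The routine above can be sketched roughly with the standard `xe` CLI. This is only a sketch under assumptions: the `<host-uuid>` / `<sr-uuid>` values are placeholders, the load balancer plugin is toggled inside Xen Orchestra rather than on the CLI, and with `DRY_RUN=1` (the default here) the commands are only printed, never executed.

```shell
#!/bin/sh
# Sketch of the pool update routine described above.
# Placeholders: <host-uuid>, <sr-uuid>. With DRY_RUN=1 (default),
# commands are only printed so the plan can be reviewed first.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "WOULD RUN: $*"
  else
    "$@"
  fi
}

# 1. Disable HA on pool level (the load balancer plugin is disabled in XO itself)
run xe pool-ha-disable

# 2. Install updates on the master first, then on every other host
run yum update -y

# 3. Restart the toolstack so XAPI picks up the updated components
run xe-toolstack-restart

# 4. One host at a time: evacuate (live-migrates all resident VMs) and reboot
run xe host-evacuate uuid="<host-uuid>"
run xe host-reboot uuid="<host-uuid>"

# 5. Re-enable HA once every host is back up
run xe pool-ha-enable heartbeat-sr-uuids="<sr-uuid>"
```

      Running it once with `DRY_RUN=1` prints each step, which makes it easy to verify the order before doing anything for real.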

      Never had any issues with that. No downtime for any of the VMs.

      posted in Backup
    • Xen Orchestra OpenMetrics Plugin - Grafana Dashboard

      Hello XCP-ng community!

      Since Vates released the new OpenMetrics plugin for Xen Orchestra we now have an official, built-in exporter for Prometheus metrics!

      I was using xen-exporter before in order to make the hypervisor's internal RRD database available in the form of Prometheus metrics.
      I migrated to the new plugin, which works just fine.

      I updated the Grafana dashboard I was using to be compatible with the official OpenMetrics plugin and thought, "why not share it with other users?"

      In case you are interested you can find my dashboard JSON here: https://gist.github.com/MajorP93/3a933a6f03b4c4e673282fb54a68474b

      It is based on the xen-exporter dashboard made by MikeDombo: https://grafana.com/grafana/dashboards/16588-xen/

      In case you also use Prometheus to scrape the Xen Orchestra OpenMetrics plugin in combination with Grafana, you can copy the JSON from my gist, import it, and you are ready to go!
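      If it helps, a scrape job could look roughly like this. The metrics path, port, and token-based auth below are my assumptions, so check the plugin settings in your XO instance for the actual endpoint:

```yaml
# Sketch of a Prometheus scrape job for the XO OpenMetrics plugin.
# ASSUMPTIONS: metrics_path, port, and bearer-token auth; verify
# all three against your own XO instance and the plugin settings.
scrape_configs:
  - job_name: "xen-orchestra"
    scheme: https
    metrics_path: /metrics            # assumption, check the plugin settings
    authorization:
      credentials: "<xo-api-token>"   # placeholder
    static_configs:
      - targets: ["xo.example.com:443"]
```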

      Hope it helps!

      Might even be a good idea to include the dashboard as an example in the Xen Orchestra documentation. 🙂

      Best regards

      posted in Infrastructure as Code
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert @andriy.sultanov I saw that the linked PR got merged, which is awesome! I was wondering: do you plan to release this with the next round of patches, or even before that as a hotfix?
      In either case: if you have a test build, I am happy to try it out.

      Thank you and best regards

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      @olivierlambert said in Potential bug with Windows VM backup: "Body Timeout Error":

      This is the PR: https://github.com/xapi-project/xen-api/pull/6786

      It's ready/reviewed, we are waiting for upstream merge. We'll make sure to re-ask for a merge ASAP.

      Stay tuned!

      Thank you very much for the update!

      posted in Backup
    • RE: Potential bug with Windows VM backup: "Body Timeout Error"

      Hey @andriy.sultanov @olivierlambert

      Any news on this?
      If I recall correctly, some work towards fixing the "compressed full backup of VMs with lots of free space" issue has been done, but a fix was not (yet) pushed to the package repositories.

      Is there a new test build that I could test or similar?
      I would love to be able to use full backup method again.

      Thanks!

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw Ohh, I get it now! "Rebooting" instead of "rebooted" can be understood as "did applying the patches cause the systems to reboot", as in a system crash or similar.
      Gotcha, I understand what you meant now.

      //EDIT: anyway, back to topic. In case the systems were already rebooted after applying the updates, I currently have no idea what might cause this...

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw Mhh, Danp asked if you rebooted your hosts after applying the patches and you said nope.

      posted in Backup
    • RE: "NOT_SUPPORTED_DURING_UPGRADE()" error after yesterday's update

      @archw I'd advise reading the documentation section on rebooting after package upgrades: https://docs.xcp-ng.org/management/updates/#-when-to-reboot

      In this case, "xen" updates were included in the December package updates, which means one of the criteria for having to reboot is met.

      Personally, I reboot every time package updates are installed. You stay on the safe side by doing so.

      posted in Backup
    • RE: Xen Orchestra Node 24 compatibility

      @olivierlambert said in Xen Orchestra Node 24 compatibility:

      Can you reproduce the issue on XOA? Or it's only on the sources + your current OS?

      We do not have an XOA license (yet), which is why I am currently only using XO from the sources; therefore I am currently not able to reproduce this on XOA. The OS is Debian 13.

      posted in Xen Orchestra
    • RE: Xen Orchestra Node 24 compatibility

      said in Xen Orchestra Node 24 compatibility:

      After moving from Node 22 to Node 24 on my XO instance I started to see more "Error: ENOMEM: not enough memory, close" for my backup jobs even though my XO VM has 8GB of RAM...

      I will revert back to Node 22 for now.

      I did some further troubleshooting and was able to pin it down to SMB encryption on Xen Orchestra backup remotes (the "seal" CIFS mount flag).
      The "ENOMEM" errors seem to occur only when I enable that option.
      It appears to be related to buffering controlled by the Linux kernel's CIFS implementation, which fails when SMB encryption is used.
      The CIFS operation gets killed due to buffer exhaustion caused by encryption, and Xen Orchestra shows "ENOMEM".
      Somehow the issue becomes more visible on Node 24 than on Node 22, which is why I thought it was caused by the Node + XO version combination; I had switched the Node version at the same time I enabled SMB encryption.
      However, this seems not to be directly related to Xen Orchestra; it is more a Node / Linux kernel CIFS implementation thing.
      Apparently not a Xen Orchestra bug per se.
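      For reference, the option in question is the kernel CIFS "seal" mount option (SMB3 transport encryption). An fstab-style example of what such a mount looks like; the server, share, mount point, and credentials file are placeholders:

```
# /etc/fstab-style example of a CIFS mount with SMB3 encryption ("seal").
# //fileserver/backups, /mnt/backups, and the credentials file are placeholders.
//fileserver/backups  /mnt/backups  cifs  credentials=/root/.smbcred,vers=3.1.1,seal  0  0
```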

      posted in Xen Orchestra
    • RE: XO5 breaks after defaulting to XO6 (from source)

      @MathieuRA I disabled Traefik, reverted to my old XO config (port 443, SSL encryption, HTTP to HTTPS redirection), rebuilt the Docker container using your branch, and tested:

      it is working fine on my end now 🙂

      Thank you very much!

      I did not expect this to get fixed so fast!

      posted in Xen Orchestra
    • RE: XO5 breaks after defaulting to XO6 (from source)

      @MathieuRA said in XO5 breaks after defaulting to XO6 (from source):

      @MajorP93 various fixes have been performed on master.

      Your config file should look like this

      [http.mounts]
      # Uncomment to setup a default version.
      # Otherwise, XO5 will be the default for stable channel and XO6 for latest and source
      # '/' = '../xo-web/dist/'
      
      '/v5' = '../xo-web/dist/'
      '/v6' = '../../@xen-orchestra/web/dist/'
      
      [http.proxies]
      # [port] is used to reuse the same port declared in [http.listen.0]
      '/v5/api' = '[protocol]//localhost:[port]/api'
      '/v5/api/updater' = 'ws://localhost:9001'
      '/v5/rest' = 'http://localhost:[port]/rest'
      

      If you are using a different configuration file that overrides the default configuration, please verify that the paths to xo-5 and xo-6 are correct. You can use absolute paths if necessary.

      Thanks!
      I tried your approach but it did not fix the issue.
      The maintainer of the Docker container (ronivay) was able to narrow the issue down to users that let XO handle http to https redirection and SSL encryption.

      For now I made Xen Orchestra listen on port 80 only and disabled http to https redirection.
      I use Traefik as reverse proxy now and let it handle those parts.

      With that setup it works without any issues.

      Maybe at some point somebody could further investigate whether or not HTTP to HTTPS redirection in combination with SSL encryption is broken after XO 6 became the default.

      Would be interesting to see if this issue occurs for non-Docker users as well.

      posted in Xen Orchestra
    • RE: XO5 breaks after defaulting to XO6 (from source)

      Hey,

      I am not using the XenOrchestraInstallerUpdater script by ronivay but the XO docker container made by the same author: https://github.com/ronivay/xen-orchestra-docker

      When using an XO commit after XO 6 became the default, none of the links to XO 5 work anymore.
      XO 6 is the default and appears to be working (at least the pages that have already been ported to XO 6), but when trying to access XO 5 via url/v5 I get infinite loading / a spinning wheel.

      When following the official documentation steps for making XO 5 default (https://docs.xen-orchestra.com/configuration#making-xo-5-the-default-ui) I get the same issue that was already mentioned in this thread (error "Cannot get /").

      By following the official documentation I mean applying the following:

      [http.mounts]
      '/v6' = '../../@xen-orchestra/web/dist/'
      '/' = '../xo-web/dist/'
      
      [http.proxies]
      '/v5/api' = 'ws://localhost:9000/api'
      '/v5/api/updater' = 'ws://localhost:9001'
      '/v5/rest' = 'http://localhost:9000/rest'
      

      I tried this using latest commit (d9fe9b603fa9c8d668fa90486ae12fe7ad49b257).

      For now I had to revert to the commit before XO 6 was made the default.

      P.S. (in case this is related): I enabled SSL encryption by setting the following in XO configuration:

      [http]
      redirectToHttps = true
      
      [[http.listen]]
      port = 443
      autoCert = true
      cert = '/cert.pem'
      key = '/cert.key'
      
      posted in Xen Orchestra
    • RE: Ansible Role - Install XO from source - Now available

      @probain Your role seems to neither install nor pin Node.js itself, so it does not fully follow the official steps from the Xen Orchestra docs.
      The Xen Orchestra docs state to use LTS Node.js; installing it is something that the ronivay XO script does.
      Because this part is missing, your role cannot be executed on a freshly provisioned VM, which is not really best practice: a role made for deploying a piece of software should also install / manage its dependencies.
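      As a rough sketch of what that missing piece could look like in an Ansible role (assuming a Debian-family target; the NodeSource setup URL and the "22.x" LTS line are my assumptions, pin whatever LTS the XO docs currently require):

```yaml
# Sketch: tasks an XO-from-source role could add to install Node.js LTS,
# similar to what the ronivay script does on Debian-family systems.
# ASSUMPTIONS: NodeSource setup URL and the 22.x LTS line.
- name: Add NodeSource repository for Node.js LTS
  ansible.builtin.shell: curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
  args:
    creates: /etc/apt/sources.list.d/nodesource.list

- name: Install Node.js (provides node and npm)
  ansible.builtin.apt:
    name: nodejs
    state: present
```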

      Other than that, thanks for sharing your role.
      Every contribution towards this community is great IMO.

      posted in Infrastructure as Code
    • RE: log_fs_usage / /var/log directory on pool master filling up constantly

      Another thing I noticed: despite enabling remote syslog (to Graylog) for all XCP-ng hosts in the pool, /var/log still fills up to 100%.
      Adding remote syslog seems not to change the usage of /var/log at all.

      The official XCP-ng documentation states otherwise here: https://docs.xcp-ng.org/installation/install-xcp-ng/#installation-on-usb-drives

      The linked part of the documentation indicates that configuring remote syslog can be a possible solution for /var/log space constraints, which does not seem to be the case.

      I feel like logging could use some investigation by Vates in general.

      posted in XCP-ng
    • RE: Xen Orchestra Node 24 compatibility

      After moving from Node 22 to Node 24 on my XO instance I started to see more "Error: ENOMEM: not enough memory, close" for my backup jobs even though my XO VM has 8GB of RAM...

      I will revert back to Node 22 for now.

      posted in Xen Orchestra
    • RE: log_fs_usage / /var/log directory on pool master filling up constantly

      Well, I am not entirely sure, but in case the effect of SR.scan on logging gets amplified by the size of virtual disks as well (in addition to the number of virtual disks), it might be caused by that. I have a few virtual machines that have a) many disks (up to 9) and b) large disks.
      I know it is rather bad design to run VMs this way (in my case these are file servers); I understand that using a NAS and mounting a share would be better here, but I had to migrate these VMs from the old environment and keep them running the way they are.
      That is the only thing I could think of that could result in SR.scan having this big of an impact on my pool.

      posted in XCP-ng
    • RE: log_fs_usage / /var/log directory on pool master filling up constantly

      @Pilow Correct me if I'm wrong, but I think day-to-day operations like VM start/stop, SR attach, VDI create, etc. perform explicit storage calls anyway, so they should not depend strongly on this periodic SR.scan, which is why I considered applying this safe.

      posted in XCP-ng
    • RE: log_fs_usage / /var/log directory on pool master filling up constantly

      I applied

      xe host-param-set other-config:auto-scan-interval=120 uuid=<Host UUID>
      

      on my pool master as suggested by @flakpyro and it had a direct impact on the frequency of SR.scan tasks popping up and the amount of log output!

      I implemented graylog and remote syslog on my XCP-ng pool after posting the first message of this thread and in the image pasted below you can clearly see the effect of "auto-scan-interval" on the logging output.

      (screenshot: Graylog message volume dropping sharply after the change)

      I will keep monitoring this but it seems to improve things quite substantially!

      Since it appears that multiple users are affected by this, it may be a good idea to change the default value within XCP-ng and/or add this to the official documentation.

      posted in XCP-ng