XCP-ng

    Server will not migrate VMs to enter maintenance mode

    Management
      pmcgrail

      If a server has a running VM, such as the orchestrator, the server throws the error "HOST_NOT_ENOUGH_FREE_MEMORY" when asked to enter maintenance mode.

      Manually migrating the VM solves the issue, but entering maintenance mode should evacuate the host on its own.

      host.setMaintenanceMode
      {
        "id": "68e82a9c-5b0d-497c-98e7-2e8af13100e0",
        "maintenance": true
      }
      {
        "code": "HOST_NOT_ENOUGH_FREE_MEMORY",
        "params": [
          "OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b"
        ],
        "task": {
          "uuid": "fbfbd332-1e40-5b13-36e1-379e9914f6c5",
          "name_label": "Async.host.evacuate",
          "name_description": "",
          "allowed_operations": [],
          "current_operations": {},
          "created": "20250113T20:24:23Z",
          "finished": "20250113T20:24:23Z",
          "status": "failure",
          "resident_on": "OpaqueRef:f557ff05-15ae-ed72-07cd-0837ae050369",
          "progress": 1,
          "type": "<none/>",
          "result": "",
          "error_info": [
            "HOST_NOT_ENOUGH_FREE_MEMORY",
            "OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b"
          ],
          "other_config": {},
          "subtask_of": "OpaqueRef:NULL",
          "subtasks": [],
          "backtrace": "(((process xapi)(filename ocaml/xapi/xapi_host.ml)(line 614))((process xapi)(filename hashtbl.ml)(line 159))((process xapi)(filename hashtbl.ml)(line 165))((process xapi)(filename hashtbl.ml)(line 170))((process xapi)(filename ocaml/xapi/xapi_host.ml)(line 610))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/rbac.ml)(line 191))((process xapi)(filename ocaml/xapi/rbac.ml)(line 200))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 75)))"
        },
        "message": "HOST_NOT_ENOUGH_FREE_MEMORY(OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b)",
        "name": "XapiError",
        "stack": "XapiError: HOST_NOT_ENOUGH_FREE_MEMORY(OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b)
          at Function.wrap (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_XapiError.mjs:16:12)
          at default (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/_getTaskResult.mjs:13:29)
          at Xapi._addRecordToCache (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1047:24)
          at file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1081:14
          at Array.forEach (<anonymous>)
          at Xapi._processEvents (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1071:12)
          at Xapi._watchEvents (file:///usr/local/lib/node_modules/xo-server/node_modules/xen-api/index.mjs:1244:14)"
      }
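
For anyone skimming the log above, here is a minimal Python sketch that pulls out the fields identifying which call failed and why. It uses a trimmed copy of the payload (the long "backtrace" and "stack" fields are dropped), and `summarize_error` is a hypothetical helper name, not something XO or XAPI provides. What it shows is that the failure comes from the XAPI task `Async.host.evacuate`, not from the `host.setMaintenanceMode` call itself.

```python
import json

# Trimmed copy of the error payload from the post above. The "backtrace"
# and "stack" fields are left out because they are long and not needed here.
PAYLOAD = """
{
  "code": "HOST_NOT_ENOUGH_FREE_MEMORY",
  "params": ["OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b"],
  "task": {
    "name_label": "Async.host.evacuate",
    "status": "failure",
    "error_info": [
      "HOST_NOT_ENOUGH_FREE_MEMORY",
      "OpaqueRef:9f140062-18cd-1d9b-1980-35f9c5fa2b7b"
    ]
  },
  "name": "XapiError"
}
"""


def summarize_error(raw: str) -> dict:
    """Extract the fields that identify the failed task and the error code.

    Hypothetical helper for illustration; not part of XO or XAPI.
    """
    err = json.loads(raw)
    task = err.get("task", {})
    return {
        "error_code": err.get("code"),
        "failed_task": task.get("name_label"),
        "task_status": task.get("status"),
        # The params carry opaque references to the objects involved.
        "refs": err.get("params", []),
    }


if __name__ == "__main__":
    for key, value in summarize_error(PAYLOAD).items():
        print(f"{key}: {value}")
```

The same function should work on a full payload as long as it is valid JSON; the multi-line "stack" string as pasted above would need its line breaks escaped first.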
      
        olivierlambert Vates 🪐 Co-Founder CEO

        Hi,

        This means that not all VMs can be evacuated to another host, because the other hosts cannot hold all of the VMs in their own memory (RAM).

          Danp Pro Support Team @olivierlambert

          Could also be due to this: https://github.com/xapi-project/xen-api/issues/4323

          (Open issue #4323 in xapi-project/xen-api, created by olivierlambert: "Mysterious failure in HA with HOST_NOT_ENOUGH_FREE_MEMORY")

            olivierlambert Vates 🪐 Co-Founder CEO

            If HA is enabled, yes.

              DustinB

              In any scenario, the error message isn't clear and doesn't help address the problem. As @Danp pointed out, the issue created by @olivierlambert should at least better define the root cause.

                pmcgrail @olivierlambert

                @olivierlambert said in Server will not migrate VMs to enter maintenance mode:

                host
                I have three hosts with 1.5 TB of memory, and 2 VMs running on them using less than 10 GB of RAM, so memory is not the issue.

                I can manually migrate the VM and the host will go into maintenance mode.

                The error is bogus. The issue may be more related to the XO VM running on the host, with the host failing to suspend the VMs on it.

                  olivierlambert Vates 🪐 Co-Founder CEO

                  The error is coming from XCP-ng (it's in all caps), not from XO.

                    Danp Pro Support Team

                    Unsure why this detail wasn't included here, but his support ticket indicates that HA is enabled on the pool.

                      pmcgrail @Danp

                      @Danp OK, so here is the situation:

                      Manual migrations work regardless of the VM HA state.

                      Automatic migration fails if anything but Restart is selected for the HA mode:

                      Cluster in HA, VMs in best-effort HA mode - memory error is thrown
                      Cluster in HA, VMs in disabled HA mode - memory error is thrown
                      Cluster in HA, VMs in restart HA mode - no memory error is thrown

                        olivierlambert Vates 🪐 Co-Founder CEO

                        Last test: cluster without HA.

                          pmcgrail @olivierlambert

                          @olivierlambert
                          OK, so with both VMs set to Best Effort and HA disabled on the Pool the error does not occur.

                          If the pool is set to HA the VMs need to be set to Restart the error occurs

                          Only one setting works when the pool is in HA mode: the VMs set to Restart.

                          If I disable HA on the pool, the VMs migrate as needed and no error occurs, regardless of the VMs' HA settings.
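
To keep the test results from this thread in one place, here is a small Python sketch that encodes them as a lookup. It only restates what was reported above; it is not XAPI's actual HA planning logic. The mode labels and the function name `evacuation_reported_to_fail` are this sketch's own, and treating a mix of modes as failing whenever any VM is not in restart mode is an assumption, since only uniform settings were tested here.

```python
from itertools import product

# VM HA modes as named in this thread (the labels are this sketch's own).
VM_HA_MODES = ("restart", "best-effort", "disabled")


def evacuation_reported_to_fail(pool_ha_enabled: bool, vm_ha_modes) -> bool:
    """Return True if the thread's observations predict the memory error.

    Encodes the reported test results only; hypothetical helper, not XAPI logic.
    """
    if not pool_ha_enabled:
        # Reported: with pool HA off, evacuation works for any VM HA setting.
        return False
    # Reported: with pool HA on, only VMs in restart mode evacuate cleanly.
    # Assumption: one non-restart VM is enough to trigger the error.
    return any(mode != "restart" for mode in vm_ha_modes)


if __name__ == "__main__":
    # Print the full matrix of pool HA state against a single VM HA mode.
    for pool_ha, mode in product((True, False), VM_HA_MODES):
        failed = evacuation_reported_to_fail(pool_ha, [mode])
        outcome = "memory error" if failed else "ok"
        print(f"pool HA={pool_ha!s:5} VM mode={mode:11} -> {outcome}")
```

Running it prints the six combinations, of which only the two with pool HA enabled and a non-restart VM mode are marked as failing.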

                            pmcgrail @pmcgrail

                            @pmcgrail said in Server will not migrate VMs to enter maintenance mode:

                            If the pool is set to HA the VMs need to be set to Restart the error occurs

                            If the pool is set to HA, the VMs need to be set to Restart, or the error occurs.

                              olivierlambert Vates 🪐 Co-Founder CEO

                              So it's clearly https://github.com/xapi-project/xen-api/issues/4323 as @Danp suggested.

                                olivierlambert Vates 🪐 Co-Founder CEO

                                I just sent a clarification upstream with your report, @pmcgrail.

                                  idar21

                                  Apologies for jumping in, but what are the plans to resolve this issue? Any update would be appreciated.

                                    olivierlambert Vates 🪐 Co-Founder CEO

                                    Hi,

                                    What's your issue functionally speaking?

                                      idar21 @olivierlambert

                                      @olivierlambert - With the host not able to evacuate, I have to manually move VMs to other hosts in the pool and then perform maintenance on the host. Imagine having to do this for a few hundred VMs and multiple physical hosts.

                                      Also, the issue is not just the evacuation. If HA is enabled on the pool and you disable the HA property for all VMs, the host can be put into maintenance mode fine. But when I tried to disable maintenance mode, I got this:
                                      "code": "HA_OPERATION_WOULD_BREAK_FAILOVER_PLAN"

                                      So I disabled HA on the pool, and then disabling maintenance mode on the host worked fine.

                                      I think the whole HA feature needs to be fully validated and every aspect cross-checked; otherwise I don't think it's production ready.

                                      For smaller environments it might not be too much of a pain, but for even a medium-sized environment this needs to be fixed.
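
The one state that worked for both entering and leaving maintenance mode in this thread was HA disabled on the pool. As a rough sketch of that order, the Python below builds (but does not run) a list of `xe` CLI commands: turn off pool HA first, disable and evacuate the host, then re-enable the host. Mapping maintenance mode to `host-disable` plus `host-evacuate` is an assumption of this sketch rather than a statement of what XO runs internally, `maintenance_plan` is a hypothetical helper name, and the UUID in the usage line is the host id from the first post. Re-enabling HA afterwards is left out on purpose because it needs pool-specific settings.

```python
import shlex
from typing import List


def maintenance_plan(host_uuid: str, pool_ha_enabled: bool) -> List[str]:
    """Build, without running, xe commands in the order that worked in this thread.

    Hypothetical helper: it only returns command strings for review.
    """
    steps = []
    if pool_ha_enabled:
        # Reported above: with pool HA off, evacuation and leaving
        # maintenance mode both worked regardless of VM HA settings.
        steps.append(["xe", "pool-ha-disable"])
    # Assumption: maintenance mode is approximated by disable + evacuate.
    steps.append(["xe", "host-disable", f"uuid={host_uuid}"])
    steps.append(["xe", "host-evacuate", f"uuid={host_uuid}"])
    # Maintenance work would happen here, then the host is brought back.
    steps.append(["xe", "host-enable", f"uuid={host_uuid}"])
    return [shlex.join(cmd) for cmd in steps]


if __name__ == "__main__":
    # Host id taken from the first post of this thread.
    for line in maintenance_plan("68e82a9c-5b0d-497c-98e7-2e8af13100e0", True):
        print(line)
```

Printing the plan instead of executing it keeps the sketch safe to try anywhere. Note that the pool has no HA protection between the first step and the moment HA is turned back on.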

                                        olivierlambert Vates 🪐 Co-Founder CEO

                                        Please open a support ticket. Obviously, for a large infrastructure it would be logical to make sure it's not behaving like that, or to make sure XO disables things in the correct order before trying to evacuate a host 🙂
