XCP-ng

    Maintenance mode auto disabled after some time

    Management
    13 Posts 3 Posters 773 Views 3 Watching
    • nick.lloyd @Tristis Oris

      @Tristis-Oris Do you have HA enabled? Also, I believe maintenance mode is supposed to be disabled automatically after a reboot.

      • Tristis Oris Top contributor @nick.lloyd

        @nick-lloyd No HA.

        So that's part of the rolling update logic. In that case the mode should be disabled after a reboot, yes.

        For real hardware maintenance, the best way is to disable the plugin, migrate the VMs once, and then do what I need.

        • olivierlambert Vates 🪐 Co-Founder CEO

          You need to disable the load balancer plugin and/or HA if you have either of them. That's what RPU (Rolling Pool Update) does behind the scenes 🙂

          • Tristis Oris Top contributor @olivierlambert

            @olivierlambert I did more tests.

            The balancer plugin is disabled. In a pool with 3 hosts, I enabled maintenance mode on 2 of them. After 20-30 min the mode was disabled on the first host, and 5 min later on the second, right after one VM migration completed.
            The VMs were not evacuated, only about 1-2 from each host.

            A log entry was created for both hosts at the time the mode was disabled.

            host.setMaintenanceMode
            {
              "id": "85630f99-fabf-4a3c-a75b-b6a7fea42c81",
              "maintenance": true
            }
            {
              "code": "VM_SUSPEND_TIMEOUT",
              "params": [
                "OpaqueRef:7937dfa9-50d9-411e-a1de-f3b51d4763be",
                "1200."
              ],
              "task": {
                "uuid": "36aebea0-8422-68a2-cce9-e88f7e7fbf33",
                "name_label": "Async.host.evacuate",
                "name_description": "",
                "allowed_operations": [],
                "current_operations": {},
                "created": "20240828T07:40:27Z",
                "finished": "20240828T08:02:05Z",
                "status": "failure",
                "resident_on": "OpaqueRef:f4ca6efc-e3c3-4eb2-92d7-e431a7376162",
                "progress": 1,
                "type": "<none/>",
                "result": "",
                "error_info": [
                  "VM_SUSPEND_TIMEOUT",
                  "OpaqueRef:7937dfa9-50d9-411e-a1de-f3b51d4763be",
                  "1200."
                ],
                "other_config": {},
                "subtask_of": "OpaqueRef:NULL",
                "subtasks": [],
                "backtrace": "(((process xapi)(filename ocaml/xapi-client/client.ml)(line 7))((process xapi)(filename ocaml/xapi-client/client.ml)(line 19))((process xapi)(filename ocaml/xapi-client/client.ml)(line 6172))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/xapi_host.ml)(line 612))((process xapi)(filename ocaml/xapi/xapi_host.ml)(line 621))((process xapi)(filename hashtbl.ml)(line 266))((process xapi)(filename hashtbl.ml)(line 272))((process xapi)(filename hashtbl.ml)(line 277))((process xapi)(filename ocaml/xapi/xapi_host.ml)(line 629))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
              },
              "message": "VM_SUSPEND_TIMEOUT(OpaqueRef:7937dfa9-50d9-411e-a1de-f3b51d4763be, 1200.)",
              "name": "XapiError",
              "stack": "XapiError: VM_SUSPEND_TIMEOUT(OpaqueRef:7937dfa9-50d9-411e-a1de-f3b51d4763be, 1200.)
                at Function.wrap (file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/_XapiError.mjs:16:12)
                at default (file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/_getTaskResult.mjs:13:29)
                at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/index.mjs:1041:24)
                at file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/index.mjs:1075:14
                at Array.forEach (<anonymous>)
                at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/index.mjs:1065:12)
                at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202408271341/packages/xen-api/index.mjs:1238:14)"
            }
            
            • olivierlambert Vates 🪐 Co-Founder CEO

              @Tristis-Oris said in Maintenance mode auto disabled after some time:

              VM_SUSPEND_TIMEOUT

              This means we couldn't migrate the VM, because the VM couldn't suspend. Fix that first and the rest should work.
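
For reference, the timestamps in the log above can be checked against the `1200.` parameter of the error. This is only a minimal sketch: the values are copied from the `task` object in that log, and reading `1200.` as a timeout in seconds is an assumption, not something the log states. The helper names are made up for illustration.

```python
from datetime import datetime

# Values copied from the "task" object in the log above.
task = {
    "created": "20240828T07:40:27Z",
    "finished": "20240828T08:02:05Z",
    "error_info": [
        "VM_SUSPEND_TIMEOUT",
        "OpaqueRef:7937dfa9-50d9-411e-a1de-f3b51d4763be",
        "1200.",
    ],
}

# Compact timestamp layout as it appears in the log.
TASK_TIME_FORMAT = "%Y%m%dT%H:%M:%SZ"


def task_duration_seconds(task: dict) -> float:
    """Seconds between the task's 'created' and 'finished' timestamps."""
    created = datetime.strptime(task["created"], TASK_TIME_FORMAT)
    finished = datetime.strptime(task["finished"], TASK_TIME_FORMAT)
    return (finished - created).total_seconds()


def error_timeout_seconds(task: dict) -> float:
    """Third element of error_info, assumed here to be a timeout in seconds."""
    return float(task["error_info"][2])


duration = task_duration_seconds(task)
timeout = error_timeout_seconds(task)
print(f"evacuate task ran for {duration:.0f} s ({duration / 60:.1f} min)")
print(f"timeout reported in the error: {timeout:.0f} s ({timeout / 60:.0f} min)")
```

Under that assumption, the evacuate task failed a little after the 20-minute mark, which would match the 20-30 minutes reported before maintenance mode was switched off again.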

              • Tristis Oris Top contributor @olivierlambert

                @olivierlambert Hmm. Why can't it suspend? As I understand it, that is the last step of the migration, when the data has already been transferred. I never had this problem before.
                It's a huge Docker VM, which is really heavy and needs a long time to shut down. And the migration is still ongoing: the task was not interrupted, but maintenance mode has already been canceled.

                I'm still sure it has the same 20-30 min timeout as the rolling update.

                • Tristis Oris Top contributor

                  Well, the migration is not going very well, but the services are still working.
                  92748428-8204-4f4a-bc98-0c05d31c96bc-image.png

                  • olivierlambert Vates 🪐 Co-Founder CEO

                    So your problem is the following: when you start a migration, Xen will "deflate" the VM by shrinking its memory to the "dynamic min" value. If that is not enough memory for your system to run, the migration will fail. Raise your dynamic min to a higher value to solve this.

                    • Tristis Oris Top contributor @olivierlambert

                      @olivierlambert Hm. I never changed this value, but the database VM with high memory load has always migrated fine.
                      1fcbafe2-e164-49e0-8c32-b981d702457f-image.png

                      So I just need to increase the min memory?

                      • olivierlambert Vates 🪐 Co-Founder CEO

                        Well, if you are already at 32/32/32, it just means you don't have enough RAM in your guest at the moment; you need more. You can see the OOM killer trying to keep up. In short, your guest OS isn't sized properly.
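
The memory reasoning in this part of the thread can be put in numbers. If a migration shrinks the guest down to dynamic min, the most it can lose is the gap between dynamic max and dynamic min. The sketch below assumes the 32/32/32 figures are dynamic min / dynamic max / static max in GiB. The function name and the second VM are hypothetical, for illustration only.

```python
def max_deflation_gib(dynamic_min_gib: float, dynamic_max_gib: float) -> float:
    """Largest amount of memory (GiB) a guest can lose when shrunk to dynamic min."""
    if dynamic_min_gib > dynamic_max_gib:
        raise ValueError("dynamic min cannot exceed dynamic max")
    return dynamic_max_gib - dynamic_min_gib


# 32/32/32 from the thread, read here as dynamic min / dynamic max / static max.
print(max_deflation_gib(32, 32))

# Hypothetical VM whose dynamic min is half of its dynamic max.
print(max_deflation_gib(16, 32))
```

With min and max both at 32 the gap is zero, so there is nothing left to raise on the dynamic min side. Any memory pressure then comes from the workload inside the guest.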

                        • Tristis Oris Top contributor @olivierlambert

                          @olivierlambert This is the DB VM. It has always migrated fine without problems.
                          65f1b385-9826-4f94-bcb0-3bdb45892ea4-image.png

                          The Docker VM can't.
                          cec34af7-43cb-4a63-8121-5adf0bfc8a7e-image.png

                          There is something I don't understand.

                          • olivierlambert Vates 🪐 Co-Founder CEO

                            This is an issue inside your guest, so take the time to read the logs and try to make sense of them. It's possible you don't have enough resources in the VM to complete the migration, but from "outside" it's hard to tell.
