Live Migration in XO Fails
-
I'm having an issue with XO (built from source) where live migrations are failing. In XO, when I select a test VM and the target host, it fails after just a second or two. However, it works perfectly if I do it from the command line while SSH'd into the pool master. The command I'm using is:
[22:06 xcp01 ~]# xe vm-migrate uuid=406bc5e7-e814-dc16-780e-adfc2635dfbe host-uuid=2133772d-f69e-4930-980e-583e81e0afb8
[22:07 xcp01 ~]#
The full details of the error are:
vm.migrate
{
  "vm": "406bc5e7-e814-dc16-780e-adfc2635dfbe",
  "migrationNetwork": "76cfdb59-4a35-9d50-6d86-99d68317d61c",
  "targetHost": "2133772d-f69e-4930-980e-583e81e0afb8"
}
{
  "code": "SR_BACKEND_FAILURE_202",
  "params": [
    "",
    "General backend error [opterr=rc: 21, stdout: , stderr: iscsiadm: No records found ]",
    ""
  ],
  "task": {
    "uuid": "f3e2ae4b-890b-4d1b-ee11-36d151482a0a",
    "name_label": "Async.VM.migrate_send",
    "name_description": "",
    "allowed_operations": [],
    "current_operations": {},
    "created": "20240528T02:01:51Z",
    "finished": "20240528T02:01:55Z",
    "status": "failure",
    "resident_on": "OpaqueRef:cbbc463f-6d3d-4693-b5fe-333944df6766",
    "progress": 1,
    "type": "<none/>",
    "result": "",
    "error_info": [
      "SR_BACKEND_FAILURE_202",
      "",
      "General backend error [opterr=rc: 21, stdout: , stderr: iscsiadm: No records found ]",
      ""
    ],
    "other_config": {},
    "subtask_of": "OpaqueRef:NULL",
    "subtasks": [],
    "backtrace": "(((process xapi)(filename ocaml/xapi/helpers.ml)(line 1690))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 134))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 35))((process xapi)(filename lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/xapi/rbac.ml)(line 205))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 95)))"
  },
  "message": "SR_BACKEND_FAILURE_202(, General backend error [opterr=rc: 21, stdout: , stderr: iscsiadm: No records found ], )",
  "name": "XapiError",
  "stack": "XapiError: SR_BACKEND_FAILURE_202(, General backend error [opterr=rc: 21, stdout: , stderr: iscsiadm: No records found ], )
    at Function.wrap (file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/_XapiError.mjs:16:12)
    at default (file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/_getTaskResult.mjs:11:29)
    at Xapi._addRecordToCache (file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/index.mjs:1035:24)
    at file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/index.mjs:1069:14
    at Array.forEach (<anonymous>)
    at Xapi._processEvents (file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/index.mjs:1059:12)
    at Xapi._watchEvents (file:///data/xo/xo-builds/xen-orchestra-202405272127/packages/xen-api/index.mjs:1232:14)
    at runNextTicks (node:internal/process/task_queues:60:5)
    at processImmediate (node:internal/timers:447:9)
    at process.callbackTrampoline (node:internal/async_hooks:128:17)"
}
The VM is Ubuntu 22, does have XenTools installed, and has been rebooted recently (earlier this afternoon).
Any ideas?
-
Hi,
I doubt the same operation would work with xe yet fail with XO, since the error message comes directly from XCP-ng (and its storage stack). Try restarting iscsiadm; that should do the trick.
-
@olivierlambert Do you mean restart the iscsid service on the XCP host?
-
Yes
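
For anyone following along, restarting the iSCSI daemon on a host might look like the sketch below. It assumes `iscsid` is a systemd-managed unit on the XCP-ng host, which the thread does not state. The `run` helper and the `DRY_RUN` switch are hypothetical names added here so the commands can be previewed before touching a live host. The two `iscsiadm` calls only list what the host currently knows about; they change nothing.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restart the iSCSI daemon, then list node records and active sessions.
# Assumption: the service is a systemd unit named iscsid.
restart_iscsid() {
  run systemctl restart iscsid
  run iscsiadm -m node      # node records known to this host
  run iscsiadm -m session   # currently logged-in sessions
}

# Preview only; drop DRY_RUN=1 to actually run the commands on the host.
DRY_RUN=1 restart_iscsid
```

If `iscsiadm -m node` reports no records, that would be consistent with the `iscsiadm: No records found` text in the error above, though this thread does not confirm which call produced it.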
-
@olivierlambert Sorry, same error. I made sure there were no VMs on 2 different hosts, then restarted iscsid on both, then (via CLI) moved one VM back on. Then I tried migrating it from XO, and got the same error. I also made sure XO was updated to the latest stable release.
Random question: does XO need to be on the same subnet (or broadcast network) as the XCP hosts?
-
@omatsei I found the following error on the source host, if it helps. I rebooted it and restarted iscsid on both the source and destination hosts, just to make sure nothing was pending or hung.
May 28 10:15:32 xcp09 xapi: [error||2507 ||backtrace] SR.scan D:9f4f3c05cc88 failed with exception Storage_error ([S(Redirect);[S(192.168.1.201)]])
May 28 10:15:32 xcp09 xapi: [error||2507 ||backtrace] Raised Storage_error ([S(Redirect);[S(192.168.1.201)]])
May 28 10:15:32 xcp09 xapi: [error||2507 ||backtrace] 1/1 xapi Raised at file (Thread 2507 has no backtrace table. Was with_backtraces called?, line 0
May 28 10:15:32 xcp09 xapi: [error||2507 ||backtrace]
May 28 10:15:32 xcp09 xapi: [error||2507 ||storage_interface] Storage_error ([S(Redirect);[S(192.168.1.201)]]) (File "storage/storage_interface.ml", line 436, characters 51-58)
May 28 10:15:32 xcp09 xapi: [error||2506 HTTP 127.0.0.1->:::80|Querying services D:6b15aa4c5bcd|storage_interface] Storage_error ([S(Redirect);[S(192.168.1.201)]]) (File "storage/storage_interface.ml", line 431, characters 49-56)
May 28 10:15:32 xcp09 xapi: [error||2506 HTTP 127.0.0.1->:::80|Querying services D:6b15aa4c5bcd|storage_interface] Storage_error ([S(Redirect);[S(192.168.1.201)]]) (File "storage/storage_interface.ml", line 436, characters 51-58)
Note that 192.168.1.201 is the pool master. I ended up rebooting the pool master after manually migrating VMs off it, and it seems to have fixed the issue. No idea why, but whatever.
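
In case it helps someone hitting the same thing, here is a rough sketch for pulling the redirect target out of xapi log lines like the ones above, to confirm which address the host is being redirected to. The function name `redirect_targets` is made up for illustration, and the pattern is inferred only from the excerpt in this thread, so other xapi versions may log it differently.

```shell
# Print the unique addresses that Storage_error Redirect entries point to.
# Reads log text on stdin; the pattern is inferred from the excerpt in this thread.
redirect_targets() {
  grep -oE 'Storage_error \(\[S\(Redirect\);\[S\([^)]*\)\]\]\)' |
    sed -E 's/.*\[S\(([^)]*)\)\]\]\)$/\1/' |
    sort -u
}

# Example usage (assumption: the xapi log lives at /var/log/xensource.log):
# redirect_targets < /var/log/xensource.log
```

If the only address printed is the pool master's, as in the log above, that points at the master rather than at the host doing the migration.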