Something went wrong with a host (xen04) and it rebooted. Since the reboot it has been behaving strangely, and rebooting again does not resolve the issue.
Attempting to start a VM with a XOSTOR VDI results in the following:
vm.start
{
"id": "3db40547-fcbf-35b1-4f1d-fc29ca851a57",
"bypassMacAddressesCheck": false,
"force": false,
"host": "3aa66f69-ea6f-465a-83a7-c2c1c43eb3e3"
}
{
"code": "SR_BACKEND_FAILURE_1200",
"params": [
"",
"[Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0'",
""
],
"task": {
"uuid": "484997e0-b959-3f38-5711-0e1f14031fea",
"name_label": "Async.VM.start_on",
"name_description": "",
"allowed_operations": [],
"current_operations": {},
"created": "20260511T18:43:18Z",
"finished": "20260511T18:44:44Z",
"status": "failure",
"resident_on": "OpaqueRef:1f61b22b-05b3-4724-9805-284d1079c6f7",
"progress": 1,
"type": "<none/>",
"result": "",
"error_info": [
"SR_BACKEND_FAILURE_1200",
"",
"[Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0'",
""
],
"other_config": {
"debug_info:cancel_points_seen": "1"
},
"subtask_of": "OpaqueRef:NULL",
"subtasks": [],
"backtrace": "(((process xapi)(filename ocaml/xapi-client/client.ml)(line 7))((process xapi)(filename ocaml/xapi-client/client.ml)(line 19))((process xapi)(filename ocaml/xapi-client/client.ml)(line 7879))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 144))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 1990))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 24))((process xapi)(filename ocaml/libs/xapi-stdext/lib/xapi-stdext-pervasives/pervasiveext.ml)(line 39))((process xapi)(filename ocaml/xapi/message_forwarding.ml)(line 1974))((process xapi)(filename ocaml/xapi/rbac.ml)(line 228))((process xapi)(filename ocaml/xapi/rbac.ml)(line 238))((process xapi)(filename ocaml/xapi/server_helpers.ml)(line 78)))"
},
"message": "SR_BACKEND_FAILURE_1200(, [Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0', )",
"name": "XapiError",
"stack": "XapiError: SR_BACKEND_FAILURE_1200(, [Errno 30] Read-only file system: '/dev/drbd/by-res/xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c/0', )
at XapiError.wrap (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/_XapiError.mjs:16:12)
at default (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/_getTaskResult.mjs:13:29)
at Xapi._addRecordToCache (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1078:24)
at file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1112:14
at Array.forEach (<anonymous>)
at Xapi._processEvents (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1102:12)
at Xapi._watchEvents (file:///opt/xo/xo-builds/xen-orchestra-202605041856/packages/xen-api/index.mjs:1275:14)"
}
However, another VM with a XOSTOR VDI started fine:
[image: 1778525823776-7af8a0b6-3276-4fd0-81f0-f669ff93d5aa-image-resized.jpeg]
When I look at that resource in LINSTOR/XOSTOR:
jonathon@jonathon-framework:~$ linstor --controllers=10.2.0.11 r l | grep -e 'xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c'
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen01 | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04) | UpToDate | 2025-05-23 13:49:57 |
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen02 | DRBD,STORAGE | Unused | Connecting(ovbh-pprod-xen04) | UpToDate | 2025-05-23 13:49:57 |
| xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c | ovbh-pprod-xen04 | DRBD,STORAGE | Unused | StandAlone(ovbh-pprod-xen02,ovbh-pprod-xen01) | UpToDate | 2025-05-23 13:49:57 |
Restarting the LINSTOR satellite on xen04 does not help.
jonathon@jonathon-framework:~$ linstor --controllers=10.2.0.11 n l
╭──────────────────────────────────────────────────────────────────────────────────────────╮
┊ Node ┊ NodeType ┊ Addresses ┊ State ┊
╞══════════════════════════════════════════════════════════════════════════════════════════╡
┊ ovbh-pprod-xen01 ┊ COMBINED ┊ 10.2.0.10:3366 (PLAIN) ┊ Online ┊
┊ ovbh-pprod-xen02 ┊ COMBINED ┊ 10.2.0.11:3366 (PLAIN) ┊ Online ┊
┊ ovbh-pprod-xen03 ┊ COMBINED ┊ 10.2.0.12:3366 (PLAIN) ┊ Online ┊
┊ ovbh-pprod-xen04 ┊ COMBINED ┊ 10.2.0.13:3366 (PLAIN) ┊ Online ┊
┊ ovbh-pprod-xen05 ┊ COMBINED ┊ 10.2.0.14:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-k8s01-worker01.example.com ┊ SATELLITE ┊ 10.1.8.103:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-k8s01-worker02.example.com ┊ SATELLITE ┊ 10.1.8.104:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-k8s01-worker03.example.com ┊ SATELLITE ┊ 10.1.8.105:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-k8s01-worker10.example.com ┊ SATELLITE ┊ 10.1.8.112:3366 (PLAIN) ┊ OFFLINE ┊
┊ ovbh-vprod-k8s01-worker13.example.com ┊ SATELLITE ┊ 10.1.8.115:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-rancher01.example.com ┊ SATELLITE ┊ 10.1.8.41:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-rancher02.example.com ┊ SATELLITE ┊ 10.1.8.42:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vprod-rancher03.example.com ┊ SATELLITE ┊ 10.1.8.43:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vtest-k8s01-worker01.example.com ┊ SATELLITE ┊ 10.1.8.64:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vtest-k8s01-worker02.example.com ┊ SATELLITE ┊ 10.1.8.65:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vtest-k8s01-worker03.example.com ┊ SATELLITE ┊ 10.1.8.66:3366 (PLAIN) ┊ Online ┊
┊ ovbh-vtest-k8s01-worker04.example.com ┊ SATELLITE ┊ 10.1.8.60:3366 (PLAIN) ┊ OFFLINE ┊
┊ ovbh-vtest-k8s01-worker05.example.com ┊ SATELLITE ┊ 10.1.8.59:3366 (PLAIN) ┊ Online ┊
╰──────────────────────────────────────────────────────────────────────────────────────────╯
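In case it helps anyone digging into this, the satellite error reports can also be pulled from the controller with the standard linstor CLI (the report id below is a placeholder you'd take from the list output):

```shell
# List recent error reports from all nodes, via the controller at 10.2.0.11
linstor --controllers=10.2.0.11 error-reports list

# Show a specific report in full; replace <REPORT_ID> with an id
# from the list output above
linstor --controllers=10.2.0.11 error-reports show <REPORT_ID>
```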
Looking at the kernel logs on xen04:
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( StandAlone -> Unconnected ) [connect]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Starting receiver thread (peer-node-id 0)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Unconnected -> Connecting ) [connecting]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( StandAlone -> Unconnected ) [connect]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Starting receiver thread (peer-node-id 1)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Unconnected -> Connecting ) [connecting]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Handshake to peer 0 successful: Agreed network protocol version 123
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: expected AuthChallenge packet, received: P_PROTOCOL (0x000b)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Authentication of peer failed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Connecting -> Disconnecting )
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Terminating sender thread
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Starting sender thread (peer-node-id 0)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Handshake to peer 1 successful: Agreed network protocol version 123
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Feature flags enabled on protocol level: 0x7f TRIM THIN_RESYNC WRITE_SAME WRITE_ZEROES RESYNC_DAGTAG
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: expected AuthChallenge packet, received: P_PROTOCOL (0x000b)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Authentication of peer failed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Connecting -> Disconnecting )
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Terminating sender thread
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Starting sender thread (peer-node-id 1)
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Connection closed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: helper command: /sbin/drbdadm disconnected
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Connection closed
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: helper command: /sbin/drbdadm disconnected
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: helper command: /sbin/drbdadm disconnected exit code 0
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: conn( Disconnecting -> StandAlone ) [disconnected]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen01: Terminating receiver thread
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: helper command: /sbin/drbdadm disconnected exit code 0
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: conn( Disconnecting -> StandAlone ) [disconnected]
[Mon May 11 00:46:28 2026] drbd xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c ovbh-pprod-xen02: Terminating receiver thread
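The "expected AuthChallenge packet, received: P_PROTOCOL" lines look like the peers disagree on whether a shared secret is configured for the connection. One way to compare the effective DRBD net options across hosts (a diagnostic sketch, run on xen04 and on a working peer such as xen01, then diff the output):

```shell
# Dump the effective DRBD configuration for the affected resource and
# pull out the authentication-related net options
drbdadm dump xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c \
  | grep -E 'cram-hmac-alg|shared-secret'
```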
[00:44 ovbh-pprod-xen04 ~]# drbdadm status xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c
xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c role:Secondary suspended:quorum
disk:UpToDate quorum:no open:no blocked:upper
ovbh-pprod-xen01 connection:StandAlone
ovbh-pprod-xen02 connection:StandAlone
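For the record, the reconnect attempt still open to me on xen04 would be a plain disconnect/connect cycle (standard drbdadm, nothing destructive; I have not run this yet):

```shell
RES=xcp-volume-dc9de686-850a-4c6f-896f-0dacc2e1e35c

# Tear down the StandAlone connections and attempt a fresh handshake
drbdadm disconnect "$RES"
drbdadm connect "$RES"

# Watch the connection state afterwards
drbdadm status "$RES"
```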
I had updated all servers with this patch: https://xcp-ng.org/blog/2026/05/05/april-2026-security-and-maintenance-updates-for-xcp-ng-8-3-lts-2/
Everything got restarted and was happy. Shortly after, I saw this: https://xcp-ng.org/blog/2026/05/07/may-2026-updates-2-for-xcp-ng-8-3-lts/ and installed it on all hosts as well.
[image: 1778526243545-b6333da8-1667-4863-9a15-1452d9803dd0-image-resized.jpeg]
xen01 is the current pool master.
[image: 1778526265248-61a98b9d-ac8d-4493-9e59-dcf5883a2a0b-image-resized.jpeg]
Has anyone seen this before?