@ronan-a
Hi,
I did some experiment... and cannot get the missing piece of the puzzle to add my xcp-ng-04 host.
From my last status, I have my new xcp-ng-04 host part of the pool and I have installed all the tools for the linstor.
I checked the services for satellite and controller on the xcp-ng-04 and they are not running. I have no Idea if I need to start something manualy of not.
Here my SMlog on a freshly booted xcp-ng-04
Mar 22 11:54:14 xcp-ng-04 SM: [2685] sr_attach {'sr_uuid': '38c2baa3-bc8f-fbc5-ef5a-42461db92c51', 'subtask_of': 'DummyRef:|b4ea936f-80bb-4a6b-98b9-66a95f86b006|SR.attach', 'args': [],$
Mar 22 11:54:14 xcp-ng-04 SMGC: [2685] === SR 38c2baa3-bc8f-fbc5-ef5a-42461db92c51: abort ===
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: tried lock /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active, acquired: True (exists: True)
Mar 22 11:54:14 xcp-ng-04 SMGC: [2685] abort: releasing the process lock
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/gc_active
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: opening lock file /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: acquired /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: acquired /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
Mar 22 11:54:14 xcp-ng-04 SM: [2685] RESET for SR 38c2baa3-bc8f-fbc5-ef5a-42461db92c51 (master: True)
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/sr
Mar 22 11:54:14 xcp-ng-04 SM: [2685] lock: released /var/lock/sm/38c2baa3-bc8f-fbc5-ef5a-42461db92c51/running
Mar 22 11:54:14 xcp-ng-04 SM: [2685] set_dirty 'OpaqueRef:ea31ae92-4207-4c8b-8db6-da901c6a00a8' succeeded
Mar 22 11:54:14 xcp-ng-04 SM: [2709] sr_update {'sr_uuid': '38c2baa3-bc8f-fbc5-ef5a-42461db92c51', 'subtask_of': 'DummyRef:|ae4505d5-b9fe-4f39-92dc-67ab9d0b579b|SR.stat', 'args': [], '$
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/sr
Mar 22 11:54:15 xcp-ng-04 SM: [2725] sr_attach {'sr_uuid': 'bef191f3-e976-94ec-6bb7-d87529a72dbb', 'subtask_of': 'DummyRef:|ae624d96-e91a-4a12-afd6-35593be4ce51|SR.attach', 'args': [],$
Mar 22 11:54:15 xcp-ng-04 SMGC: [2725] === SR bef191f3-e976-94ec-6bb7-d87529a72dbb: abort ===
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: opening lock file /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: tried lock /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active, acquired: True (exists: True)
Mar 22 11:54:15 xcp-ng-04 SMGC: [2725] abort: releasing the process lock
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/gc_active
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: acquired /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
Mar 22 11:54:15 xcp-ng-04 SM: [2725] RESET for SR bef191f3-e976-94ec-6bb7-d87529a72dbb (master: False)
Mar 22 11:54:15 xcp-ng-04 SM: [2725] lock: released /var/lock/sm/bef191f3-e976-94ec-6bb7-d87529a72dbb/running
Mar 22 11:54:16 xcp-ng-04 SM: [2725] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 0
Mar 22 11:54:19 xcp-ng-04 SM: [2725] Got exception: Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']. Retry number: 1
On xcp-ng-02 (linstor controller)
[11:42 xcp-ng-02 ~]# linstor node list
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā®
ā Node ā NodeType ā Addresses ā State ā
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā”
ā xcp-ng-01 ā COMBINED ā 192.168.2.221:3366 (PLAIN) ā Online ā
ā xcp-ng-02 ā COMBINED ā 192.168.2.222:3366 (PLAIN) ā Online ā
ā xcp-ng-03 ā COMBINED ā 192.168.2.223:3366 (PLAIN) ā Online ā
ā°āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāÆ
I list all the linstor PBD on hosts:
[11:49 xcp-ng-01 ~]# xe pbd-list | grep linstor -3
uuid ( RO) : e75fdc51-29a4-aa57-bc44-459f80a0d230
host-uuid ( RO): ad95c6ca-612d-42af-8909-d4e9dc7645bb
sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
currently-attached ( RO): true
--
uuid ( RO) : 063db650-55b1-a3e0-9d9a-e94ce938988d
host-uuid ( RO): 5747f145-0dc2-4987-a6b9-b6c5a7ed0505
sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
device-config (MRO): hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03; group-name: linstor_group/thin_device; redundancy: 2; provisioning: thin
currently-attached ( RO): false
--
uuid ( RO) : a1f876b1-0568-71ac-9ffb-720e626cb4ab
host-uuid ( RO): e286a04a-69bf-4d59-a0c8-e7338e8c1831
sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
currently-attached ( RO): true
uuid ( RO) : 08564ab5-a518-f709-8527-f592c2592d14
host-uuid ( RO): eb48f91d-9916-4542-9cf4-4a718abdc451
sr-uuid ( RO): bef191f3-e976-94ec-6bb7-d87529a72dbb
device-config (MRO): provisioning: thin; redundancy: 2; group-name: linstor_group/thin_device; hosts: xcp-ng-01,xcp-ng-02,xcp-ng-03
currently-attached ( RO): true
After that, I remove all the PBD to all hosts to be able to recreate the PBDs on all hosts with the new xcp-ng-04
xe pbd-create host-uuid=e286a04a-69bf-4d59-a0c8-e7338e8c1831 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
xe pbd-create host-uuid=ad95c6ca-612d-42af-8909-d4e9dc7645bb sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
xe pbd-create host-uuid=eb48f91d-9916-4542-9cf4-4a718abdc451 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
xe pbd-create host-uuid=5747f145-0dc2-4987-a6b9-b6c5a7ed0505 sr-uuid=bef191f3-e976-94ec-6bb7-d87529a72dbb device-config:provisioning=thin device-config:redundancy=2 device-config:group-name=linstor_group/thin_device device-config:hosts=xcp-ng-01,xcp-ng-02,xcp-ng-03,xcp-ng-04
all succeed. No error.
After that I try to connect all PBDs to the SR
[11:13 xcp-ng-01 ~]# xe pbd-plug uuid=7d588c37-a152-9666-175e-91b2d48c150f
[11:13 xcp-ng-01 ~]# xe pbd-plug uuid=99f76235-1b1a-e5fa-bb19-3883737fcc6d
[11:13 xcp-ng-01 ~]# xe pbd-plug uuid=df727345-f475-b929-ecc1-b506f0053361
[11:13 xcp-ng-01 ~]# xe pbd-plug uuid=8b4ddc6f-e25a-1942-a435-345ccc93551a
Error code: SR_BACKEND_FAILURE_47
Error parameters: , The SR is not available [opterr=Error: Unable to connect to any of the given controller hosts: ['linstor://xcp-ng-02']],
From there, I'm a bit lost...
1- Do I need to add the linstor satellite xcp-ng-04 before doing all the PBDs ?
2- Should I start any of the services on xcp-ng-04 before doing all this ?
Regards