Thank you for the replies
Sorry for all the newb questions - I'm diving into this when time permits. Appreciate the help and understanding.
@olivierlambert said in XOSTOR hyperconvergence preview:
So I imagine a very low latency between the 2 DCs? One pool with 6 hosts total and 3 per DC right?
For now, there's no placement preference, we need to discuss with LINBIT about topology.
And if the 2x DCs are far from each other, I would advise getting 2x pools and using 2x XOSTOR total
This can be done using placement policies, as outlined in the LINSTOR user's guide. It will probably require a bit of extra work in XO to use those properties.
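For anyone curious, here's a rough sketch of what that could look like with the linstor CLI. This is just my reading of the LINSTOR user's guide, not something XOSTOR does today; the node names, property value, and resource-group name are made up:

# Tag each node with its datacenter (Aux/ properties are free-form)
linstor node set-property xcp1 Aux/site dc1
linstor node set-property xcp4 Aux/site dc2
# Ask the auto-placer to keep replicas in different sites
linstor resource-group modify my_rg --replicas-on-different Aux/site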
I ran those commands on xcp1 (the pool master) against the XOSTOR (linstor) SR, then powered off xcp2. At that point the pool disappeared.
Now I'm getting the following on the XCP-ng hosts' consoles:
Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
xapi-nbd[5580]: main: Failed to log in via xapi's Unix domain socket in 300.000000 seconds
Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
xapi-nbd[5580]: main: Caught unexpected exception: (Failure
Broadcast message from systemd-journald@xcp3 (Thu 2024-02-08 14:03:12 EST):
xapi-nbd[5580]: main: "Failed to log in via xapi's Unix domain socket in 300.000000 seconds")
After powering xcp2 back on, the pool never comes back in the XOA interface.
I'm seeing this on xcp1:
[14:04 xcp1 ~]# drbdadm status
xcp-persistent-database role:Secondary
disk:Diskless quorum:no
xcp2 connection:Connecting
xcp3 connection:Connecting
And on xcp2 and xcp3:
[14:10 xcp2 ~]# drbdadm status
# No currently configured DRBD found.
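In case anyone else lands in this state: as far as I understand it, the DRBD resource files are generated by the LINSTOR satellite, so "No currently configured DRBD found" suggests the satellite never applied them. My (unverified) first checks would be:

# Is the satellite service up on each host?
systemctl status linstor-satellite
# From a host that can reach the controller:
linstor node list
linstor resource list
# Re-apply whatever resource configs were generated
drbdadm adjust all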
Seems like I hosed this thing up really good. I assume this broke because XOSTOR isn't technically a shared disk.
[14:15 xcp1 /]# xe sr-list
The server could not join the liveset because the HA daemon could not access the heartbeat disk.
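For anyone hitting the same liveset message: my understanding (please correct me if this is wrong) is that the usual escape hatch is to force HA off on the stuck host, then disable it pool-wide once things are reachable again:

xe host-emergency-ha-disable force=true   # on the host that can't join the liveset
xe pool-ha-disable                        # at the pool level, once the pool is back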
Is HA + XOSTOR something that should work?
I'm not sure what the expected behavior is, but...
I have xcp1, xcp2, and xcp3 as hosts in my pool, using an XOSTOR storage repository. I had a VM running on xcp2, pulled the power from that host, and left it unplugged for about 5 minutes. The VM remained "running" according to XOA; however, it wasn't.
What is the expected behavior when this happens, and how do you go about recovering from a temporarily failed/powered-off node?
My expectation was that my VM would move to xcp1 (where there is a replica) and start, then outdate xcp2. I have "auto start" enabled under Advanced on the VM.
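From what I've read since, "auto start" in XO only powers the VM on when its host boots; automatic failover needs XAPI HA enabled on the pool plus a restart priority on the VM. If I understand correctly, that would be something like this (UUIDs are placeholders):

xe pool-ha-enable heartbeat-sr-uuids=<sr-uuid>
xe vm-param-set uuid=<vm-uuid> ha-restart-priority=restore

Whether that's supported on an XOSTOR SR is exactly my question above.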
Thanks for the replies. My issues are currently with the GUI, so I don't know if that applies here.
One issue: when creating a new XOSTOR SR, the packages are installed, but the SR creation fails because one of the packages, sm-rawhba, needs updating. You have to apply patches through the GUI and then reboot the node, or run "xe-toolstack-restart" on each node. You can then go back and create a new SR, but only after wiping the disks you originally tried to create the SR on with vgremove and pvremove (see the example below).
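For reference, the wipe I mean looks roughly like this on each affected node; the volume group and device names below are just examples from my setup:

vgremove -f linstor_group
pvremove /dev/sdb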
I'm planning on doing some more testing; please let me know if GUI issues are appropriate to post here.
This thread has grown quite large and has a lot of information in it. Is there an official documentation chapter on XOSTOR available anywhere?