Hello @ronan-a
I will reproduce the case: I will re-destroy one hypervisor and retrigger it.
Thank you @ronan-a and @olivierlambert
If you need me to test some special cases, don't hesitate, we have a pool dedicated to this.
Hello, @ronan-a
I will reinstall my hypervisor this week.
I will reproduce it and then resend you the logs.
Have a good day,
Hello, @DustinB
The https://vates.tech/xostor/ page says:
The maximum size of any single Virtual Disk Image (VDI) will always be limited by the smallest disk in your cluster.
But in this case, maybe it can be stored on the "2TB disks"? Maybe others can answer; I didn't test it.
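To check this on an existing XOSTOR cluster, a minimal sketch (assuming you run it on the host currently holding the LINSTOR controller):

# List the LINSTOR storage pools with their free/total capacity per node;
# per the statement above, the smallest backing disk is what caps the size of a single VDI.
linstor storage-pool list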
This test covers the following scenario:
Impact:
Expected results:
We didn't test filesystems other than XFS for Linux-based operating systems because we only use XFS.
[hdevigne@VM1 ~]$ htop^C
[hdevigne@VM1 ~]$ echo "coucou" > test
-bash: test: Input/output error
[hdevigne@VM1 ~]$ dmesg
-bash: /usr/bin/dmesg: Input/output error
[hdevigne@VM1 ~]$ d^C
[hdevigne@VM1 ~]$ sudo -i
-bash: sudo: command not found
[hdevigne@VM1 ~]$ dm^C
[hdevigne@VM1 ~]$ sudo -i
-bash: sudo: command not found
[hdevigne@VM1 ~]$ dmesg
-bash: /usr/bin/dmesg: Input/output error
[hdevigne@VM1 ~]$ mount
-bash: mount: command not found
[hdevigne@VM1 ~]$ sud o-i
-bash: sud: command not found
[hdevigne@VM1 ~]$ sudo -i
As we predicted, the VM is completely broken.
The Windows VM crashes and reboots in a loop.
The LINSTOR controller was on node 1, so we cannot see the LINSTOR node statuses, but we assume they are "disconnected" and "pending eviction". That doesn't matter much: the disks are read-only and the VMs break after writing, which was the behavior we expected.
Re-plug node 1 and node 2.
Windows boots normally.
The Linux VM stays in a "broken state":
➜ ~ ssh VM1
suConnection closed by UNKNOWN port 65535
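A hedged sketch of what one could try from dom0 to bring such a VM back (the UUID and device are placeholders, and this is not something we verified during this test):

# Force power off the stuck VM, then start it again once the nodes are back
xe vm-shutdown force=true uuid=<vm-uuid>
xe vm-start uuid=<vm-uuid>
# If the guest filesystem does not mount cleanly afterwards, check it from a
# rescue shell, e.g. for XFS (only on an unmounted device):
xfs_repair /dev/xvda1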
We didn't let it run long enough to reach the eviction state of the LINSTOR nodes, but the documentation shows that restoring a LINSTOR node would work (see https://docs.xcp-ng.org/xostor/#what-to-do-when-a-node-is-in-an-evicted-state).
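If I read that page correctly, restoring an evicted node boils down to one command (the node name is a placeholder):

# Bring an evicted node back into the LINSTOR cluster
linstor node restore <node_name>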
We didn't use HA in the cluster at this point; it could have helped a bit in the recovery process. But in a previous experiment, which I didn't document like this one, HA was completely down because it was not able to mount a file. I will probably write another topic on the forum to make those results public.
Having HA would change the criticality of the following note.
Thanks to @olivierlambert, @ronan and the other people on the Discord channel for answering daily questions, which made this kind of test possible. As promised, I am putting my results online.
Thanks for XOSTOR.
Further tests to do: retry with HA.
@olivierlambert our OPNsense resets the TCP states, so the firewall blocks the packets because it has forgotten about the TCP session.
And then a timeout occurred in the middle of the export.
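For anyone who wants to confirm this kind of behavior, a small diagnostic sketch (the interface and address are placeholders, not from my setup): run it on the XO host during an export.

# Watch the export traffic to the remote endpoint; repeated retransmissions
# with no replies usually mean the firewall has dropped its session state
tcpdump -ni eth0 host X.X.X.X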
Hello @olivierlambert
I confirm my issue came from my firewall, so it is not related to XO.
However, it would be great to make the logs clearer, I mean:
Error: read ETIMEDOUT
becomes
Error: read ETIMEDOUT while connecting to X.X.X.X:ABC
That would have made it much quicker to understand my "real and weird" issue.
Best regards,