Hi,
We are currently using an XCP-ng pool with an XOA (Xen Orchestra) instance hosted within the same pool.
Recently, the Master node of the pool crashed. At that moment, High Availability (HA) was still disabled, because we had turned it off a few days earlier for maintenance on the pool.
Although the XOA VM remained operational on a Slave node, we found ourselves in a "blind management" situation: the XOA interface could no longer communicate with the pool because the XAPI entry point (the Master) was down.
To avoid this scenario in the future, I would appreciate your opinion on the feasibility and best practices regarding the following points:
Out-of-band management: Is it recommended to move XOA to a physical server (or a management pool) that is completely independent of the production pool it manages, so that we keep visibility in case of quorum loss?
Cross-Configuration: If we have two separate pools (Pool A and Pool B), is it advisable to host XOA-A on Pool B to manage Pool A, and vice versa?
High Availability (HA) Behavior: Even with HA enabled, will the XOA interface always be unavailable for a period while the system elects a new Master and the XAPI stack restarts?
We are looking to ensure that our management tools remain available and "visible" even in the event of a critical Master failure.
Thank you in advance for your advice and for all the work done on XCP-ng.
Best regards
Olivier