This might be related to a known issue with PCI passthrough of NVMe devices: the kernel tries to allocate more MSI-X vectors than the guest can handle. You can try to increase the number of guest IRQs with the Xen boot parameter extra_guest_irqs. The default is 64 and you can increase it to 128 with:
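A minimal sketch of that change, assuming an XCP-ng host where the xen-cmdline helper is installed at /opt/xensource/libexec/xen-cmdline. The path, the variable names and the guard around the call are assumptions for illustration, not part of the original post:

```shell
#!/bin/sh
# Assumed location of the boot-parameter helper on XCP-ng hosts; adjust if yours differs.
XEN_CMDLINE=/opt/xensource/libexec/xen-cmdline
# Value suggested in the post (the stated default is 64).
NEW_IRQS=128

if [ -x "$XEN_CMDLINE" ]; then
    # Write the parameter into the Xen boot command line.
    "$XEN_CMDLINE" --set-xen "extra_guest_irqs=$NEW_IRQS"
    echo "extra_guest_irqs set to $NEW_IRQS; reboot the host to apply it."
else
    # Do nothing on machines where the helper is missing.
    echo "xen-cmdline not found at $XEN_CMDLINE; nothing changed."
fi
```

As with any Xen boot parameter, the new value only takes effect after the host has been rebooted.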
@KPS The two APIs are completely different. The older one was never intended for public consumption; it was developed for XO's internal needs, even though we are not preventing anyone from using it 🙂
The REST API was created precisely to address the need for a public API, so use it if it covers your needs.
Also, feel free to provide feedback and we'll improve it.
Regarding the limit issue, it was fixed 6 weeks ago, so please make sure you are up to date.
@Jonathon @olivierlambert If it is not being blocked, then it may be a side effect of their efforts to prevent node saturation by moving connections from node to node. What plan are you on? Is it the free one?
If so, this movement will occur more frequently, as they move from node to node roughly every 5-10 minutes, so you will experience this loss of connection (timeout) on a regular basis. With a paid plan it happens much less frequently, and the Cloudflare ZT support team, given detailed logs, may be able to help stabilise connectivity.
Their minimum paid plan with 100% uptime and an SLA is the "Pay-as-you-go" plan. So if you're going to be regularly carrying out actions in XOA over a remote connection that need a long timeout, a paid plan would pay for itself.
@pdonias That would make sense for most use cases. If no tools are detected (whether running or not), don't do anything with IPs.
However, that leads to a different corner case: what if you shut down a machine that you don't intend to bring back up (except in an emergency), and you do want the IPs to be removed? Maybe at that point the responsibility falls on the admin to update Netbox manually.
@kevdog You can force a host to be ejected even if the host is not reachable any longer from the pool master using:
xe host-forget uuid=UUID
If there is an issue because VMs are still thought to be running on that host,
you may need to do a power state reset on those VMs before you can get rid of the host.
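The two steps above could be combined roughly as follows. This is a sketch under assumptions: it is meant to be run on the pool master, the host must be genuinely dead (resetting the power state of a VM that is in fact still running can corrupt its disks), and forget_dead_host is a hypothetical helper name, not an xe command.

```shell
#!/bin/sh
# Sketch only: run on the pool master, and only when the host is truly gone.
# forget_dead_host is a hypothetical helper name, not part of the xe CLI.
forget_dead_host() {
    host_uuid="$1"   # look it up with: xe host-list
    if [ -z "$host_uuid" ] || ! command -v xe >/dev/null 2>&1; then
        echo "usage: forget_dead_host HOST_UUID (needs the xe CLI)" >&2
        return 1
    fi
    # Reset the power state of each VM the pool still believes is running there.
    for vm in $(xe vm-list resident-on="$host_uuid" is-control-domain=false --minimal | tr ',' ' '); do
        xe vm-reset-powerstate uuid="$vm" force=true
    done
    # Then remove the unreachable host from the pool.
    xe host-forget uuid="$host_uuid"
}
```

After sourcing the snippet, call it as forget_dead_host followed by the UUID of the unreachable host.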
See if that works for you.