@R2rho
Faulty gear always sucks. But who would've guessed that two separate systems would produce the same problems? Highly unlikely, but never impossible.
Good luck with the RMA
Well, unfortunately I got nothing... Extremely weird indeed.
Given that the BIOS and everything else is updated to the latest version possible.
The first thing I do with these kinds of symptoms is to disable all power management and/or C-states in the BIOS.
Some combinations of OS and hardware just don't work properly together.
If nothing else, it's an easy, non-intrusive test to do.
Update: I see that your motherboard has an IPMI interface. If the issues happen again after you've disabled power management/C-states, you could use the remote functionality of the IPMI to hopefully get some more info from the sensors and such.
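As a sketch of what I mean, you can query the sensors and the hardware event log over the network with ipmitool (the host and credentials below are placeholders, and the call is skipped if ipmitool isn't installed):

```shell
BMC_HOST=192.0.2.10   # placeholder IPMI address - use your BMC's IP
BMC_USER=ADMIN        # placeholder credentials
BMC_PASS=changeme
if command -v ipmitool >/dev/null 2>&1; then
  # Temperatures, voltages and fan speeds as the BMC sees them
  # (-N/-R bound the timeout/retries so it fails fast if unreachable):
  ipmitool -I lanplus -N 1 -R 1 -H "$BMC_HOST" -U "$BMC_USER" -P "$BMC_PASS" sdr elist
  # System Event Log - hardware faults are recorded here and survive reboots:
  ipmitool -I lanplus -N 1 -R 1 -H "$BMC_HOST" -U "$BMC_USER" -P "$BMC_PASS" sel elist
fi
```

The SEL is the interesting part: if the machine hard-locks, the BMC may still have logged a machine check or thermal event that the OS never saw.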
Looking a tiny bit further, the same discrepancy is present with Disaster Recovery too, being referenced as Full Replication (formerly: Disaster Recovery).
XO and the docs conflict with each other about what the backup function should actually be called. See attached pic for an example.
This is truly a niche situation. But I noticed that when I have VMs without any disks attached, the Rolling Snapshot schedule doesn't remove snapshots in accordance with the schedule's Snapshot retention.
So I'm guessing that the schedule only looks at cleaning up snapshots of disks. But since the snapshots are actually of the entire VM, maybe this should be taken into account as well?
If this is working as intended, then just ignore this post
I finally have some new hardware to play with. And I'm noticing that the Health Check fails, due to the vGPU being busy with the actual host.
INTERNAL_ERROR(xenopsd internal error: Cannot_add(0000:81:00.0, Xenctrlext.Unix_error(4, "16: Device or resource busy")))
My suggestion is that Health Checks should unassign any attached PCIe devices. If it is crucial that they stay attached, then maybe have an opt-in checkbox, either on the VM or next to the Health Check portion of backups?
The address field doesn't trim trailing whitespace. Not a deal breaker, but it did take me a couple of minutes to figure out why my copy/pasted address was giving me errors.
I'm unsure if Threadripper is affected, but the EPYCs have a problem with networking speeds. And I'm quite sure they haven't found the root cause yet.
@Houbsi
what NAS are you using?
I've actually got a PR to help improve the documentation regarding the yarn forever part. Since I never could get that to work, I implemented a systemd variant instead.
Ping @olivierlambert for visibility
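For reference, this is roughly the shape of the unit I mean - a sketch only, where the WorkingDirectory, node path, and entry point are assumptions you'd adjust to match where xo-server actually lives on your box:

```shell
# Write a draft xo-server systemd unit (to /tmp here; install it as root).
# Paths below are placeholders for a sources-based Xen Orchestra install.
cat > /tmp/xo-server.service <<'EOF'
[Unit]
Description=Xen Orchestra server
After=network-online.target
Wants=network-online.target

[Service]
WorkingDirectory=/opt/xen-orchestra/packages/xo-server
ExecStart=/usr/bin/node ./dist/cli.mjs
Restart=on-failure
SyslogIdentifier=xo-server

[Install]
WantedBy=multi-user.target
EOF
# Then, as root:
#   cp /tmp/xo-server.service /etc/systemd/system/
#   systemctl daemon-reload
#   systemctl enable --now xo-server
```

The nice part over forever is that systemd handles restarts, boot-time startup, and journald logging for free.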
@zach
XCP-ng is unfortunately quite slow on individual storage streams. The effect kind of disappears when running lots of VMs, but looking at any single one, it is surprising.
A good read is the write up on Understanding the storage stack in XCP-ng
@ludovic78
In the case of such a power outage, the backup would fail anyway.
And I'm also assuming that you have a separate share/dataset for those specific backups.
Cobble it together with the Health Checks, and it is semi-production at least.
If it is for such a highly critical environment that it wouldn't tolerate more than that, then obviously you should open a support ticket.
@ludovic78
Try setting the Sync setting to Disabled on the target dataset that you're sharing, and see if that makes any difference.
This makes a GIANT difference for me.
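If the share is ZFS-backed (an assumption on my part - "tank/backups" below is a placeholder dataset name), the same thing can be done from the CLI. Fair warning: sync=disabled trades safety for speed, since writes acknowledged before reaching stable storage can be lost on a power failure.

```shell
DATASET=tank/backups   # placeholder - use your actual backup dataset
if command -v zfs >/dev/null 2>&1; then
  zfs get sync "$DATASET"            # show the current setting first
  zfs set sync=disabled "$DATASET"   # stop honoring synchronous write requests
fi
```

`zfs set sync=standard "$DATASET"` puts it back if you want to undo the test.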
@jasonnix
Yeah that folder can fill up quite heavily.
See the following on how to clear the cache with yarn
https://yarnpkg.com/cli/cache/clean
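As a quick sketch (yarn classic CLI - berry's commands differ slightly, see the docs above; the guard just skips it if yarn isn't installed):

```shell
YARN_BIN="$(command -v yarn || true)"   # empty if yarn isn't installed
if [ -n "$YARN_BIN" ]; then
  CACHE_DIR="$(yarn cache dir)"   # print the path to yarn's global cache
  du -sh "$CACHE_DIR"             # see how much disk it currently eats
  yarn cache clean                # empty it; packages re-download on next install
fi
```

Clearing it is safe - the only cost is re-downloading packages on the next install.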
@olivierlambert If I could get a pointer as to which source document we're talking about, then yes, I could whip something up.