Backup fails only on master with VM_NO_SUSPEND_SR
-
I recently changed my backups from Normal to With Memory. I have two hosts in my pool. The master host gives me the VM_NO_SUSPEND_SR error when backing up any VMs on it. The other host successfully performs backups.
I do not have a Suspend SR set on the pool, nor do any of the VMs have one set. If I migrate a VM from the master to my other host, I can back it up successfully. However, attempting to migrate the VM back to the master host gives me the cryptic log message "operation failed".
There are no settings differences that I can find between the two hosts. Hardware-wise, the working host's CPU is one generation newer, but I'm not sure why that would let it suspend a VM.
Based on this post https://xcp-ng.org/forum/topic/5475/backup-failed/5 my understanding is that, because I don't have a Suspend SR configured anywhere, all of my backups should fail. Instead, they only fail on the master host.
-
Check each host to see what the following command returns --
xe host-list params=suspend-image-sr-uuid
-
They both returned the same result.
suspend-image-sr-uuid ( RW) : <not in database>
suspend-image-sr-uuid ( RW) : UUID
The UUID that gets returned shows up as the local storage on the working host when I paste it into the Storage filter box.
-
@CJ Those seem to be different results to me. I'm guessing the host that returns a UUID is the one that successfully performs the backup, correct?
-
@Danp Both hosts return the exact same text from the command. They both show two entries.
Yes, the host that works is the one whose local storage matches the UUID.
-
When it shows "<not in database>", it means you have no suspend SR defined, and that's your problem.
-
@olivierlambert Well, yes. But how is there one suspend-sr defined? I never set any as I wasn't even aware that I needed to and there isn't anything in the XO UI that shows it.
How does one host have a suspend-sr and the other not, if it's defined at the pool level? Does it need to be networked storage or can each host just use its local storage?
-
In the pool view/advanced tab/Suspend SR.
It's mandatory to be able to suspend/restore snapshots with memory.
edit: it's a setting that's pool wide, not host wide. It's not set by default; it's up to you to choose the SR to suspend to.
-
@olivierlambert I understand that a Suspend SR is required. What I'm struggling with is that I have no Suspend SR defined anywhere in XO. The advanced tab for my Pool shows None, as does the advanced tab for all of my VMs.
Running the command @Danp provided shows the two entries, which I'm additionally confused about if the setting is pool wide. Both hosts return a "not in database" result as well as the UUID result.
The reason I asked if it needs to be networked is because I currently use local storage for my VMs. Therefore I assume I would need a Suspend SR for each host, but as the setting is pool wide, I believe I would need to make a network share that they could both access.
-
Okay I understand. It's meant to work poolwide, but if you are using a local SR (which is not really the best practice in a pool with multiple hosts), then you can set up one per host.
You can specify one per host with the following CLI:
xe host-param-set uuid=<host UUID> suspend-image-sr-uuid=<SR UUID>
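For anyone repeating this on several hosts, a minimal sketch of a wrapper might look like the following. The function name set_host_suspend_sr and the DRY_RUN variable are made up for illustration. It assumes that xe host-param-set selects the target host with uuid=, and that you already know the UUID of the local SR you want each host to use.

```shell
# Hypothetical helper (not part of xe or XO): set the suspend SR for one host.
# Usage: set_host_suspend_sr HOST_UUID SR_UUID
# With DRY_RUN=1 the command is only printed, not executed.
set_host_suspend_sr() {
    host_uuid=$1
    sr_uuid=$2

    # Refuse to run with a missing argument rather than send xe a bad command.
    if [ -z "$host_uuid" ] || [ -z "$sr_uuid" ]; then
        echo "usage: set_host_suspend_sr HOST_UUID SR_UUID" >&2
        return 2
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "xe host-param-set uuid=$host_uuid suspend-image-sr-uuid=$sr_uuid"
    else
        xe host-param-set uuid="$host_uuid" suspend-image-sr-uuid="$sr_uuid"
    fi
}
```

Setting DRY_RUN=1 first and checking the printed command for each host is probably a sensible precaution before letting it change anything.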
-
@CJ said in Backup fails only on master with VM_NO_SUSPEND_SR:
Both hosts return a "not in database" result as well as the UUID result.
Maybe this will make it more understandable --
xe host-list params=uuid,suspend-image-sr-uuid
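If the pool has more than a couple of hosts, the output can be filtered down to just the hosts that have no suspend SR. This is only a sketch: the function name hosts_without_suspend_sr is made up, and it assumes xe prints one "name ( MODE): value" line per parameter with each host's uuid line before its suspend-image-sr-uuid line.

```shell
# Hypothetical filter: reads `xe host-list params=uuid,suspend-image-sr-uuid`
# output on stdin and prints the UUID of every host with no suspend SR set.
hosts_without_suspend_sr() {
    awk '
        {
            # Split "name ( MODE)   : value" at the first ": ".
            sep = index($0, ": ")
            if (sep == 0) next              # blank or unrelated line
            key = substr($0, 1, sep - 1)
            val = substr($0, sep + 2)
            gsub(/^[ \t]+|[ \t]+$/, "", key)  # trim padding around the name
            gsub(/^[ \t]+|[ \t]+$/, "", val)  # trim padding around the value
        }
        # Remember the host this record belongs to (assumes uuid comes first).
        key ~ /^uuid /                  { host = val }
        # Report the host when its suspend SR is unset.
        key ~ /^suspend-image-sr-uuid / && val == "<not in database>" { print host }
    '
}
```

It would be used by piping the command above into it, for example `xe host-list params=uuid,suspend-image-sr-uuid | hosts_without_suspend_sr`.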
-
@olivierlambert Any ideas how it would have gotten set in the first place and why it doesn't show in XO? I didn't even know it was a thing previously so I didn't set it.
-
@Danp That still returns the same result from each host. But the included UUIDs map the "not in database" to my master host.
-
@CJ Right... each time you run the command it returns the result for all hosts. By adding the UUID param, you can more easily identify which result goes with which host.
-
@Danp Your initial post seemed to suggest that I should see different results on each host.
Any ideas how the one got set?
-
Yes... I currently have only a single host for testing, and I didn't realize that the result from xe would contain the setting for all hosts.
-
@olivierlambert I set the suspend-sr for the master host and now my backup completes successfully. Can XO be changed to expose this setting for each host?
I do want to clarify two things though in order to make sure I understand things correctly.
1. If I set a Suspend SR in XO for the pool, will this replace the existing values for each host?
2. If my VMs are on local SR but the Suspend SR is a network share, does this mean that the backup process will send the memory to the networked SR during the snapshot and then send the data from the local SR and the networked SR to XO for transfer to the backup remote?
-
It's not meant to be configured per host, except in cases where you need a local SR; that's why it's not visible in XO. But in those specific cases, the CLI does the trick.
If you set it for the pool, I'm not sure it will erase it for each host. I never tried to modify it at both pool and host level myself, so I don't know.
The suspend SR is only used to store the memory, nothing else. So yes, in your case the memory disk will only be sent to the networked SR and that's it. XO will fetch all the disks, but in any case this will go through the hosts themselves and then to XOA.
-
My VM backups have been working since I set the Suspend SR for the other host. However, my metadata backup failed with the following error.
EBUSY: resource busy or locked, unlink '/run/xo-server/mounts/UUID/xo-pool-metadata-backups/UUID/UUID/DATETIME/.nfs000000000000000e00000060'
I restarted the XO VM and was able to successfully complete a metadata backup. Hopefully this isn't a consequence of backing XO up with memory.
-
It's completely unrelated.