Attempting to confirm what's expected vs what's observed....
If retention is the number of backups kept, regardless of the date, then if I had a retention of 2, and ran 5 consecutive backups, only the last two backups should remain on the remote?
Just to make sure I'm understanding the backup side of XO...
Backup retention is how many backups will be kept on the remote, and any backed up data that's older than the retention number should be removed automatically by the backup process?
For example, a "full backup" schedule that runs daily with a retention of 2, should only ever have two backups on the remote?
If XO cleans up behind itself...What exactly is it keying off of to determine what "old" files to delete?
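For illustration, here's roughly what I picture a retention-of-N prune doing (purely a sketch of my mental model, not XO's actual code; the names and data layout are made up). The open question is whether the timestamp used for sorting comes from the file name or from the remote's modification date:

```python
from datetime import datetime

def prune_backups(backups, retention):
    """Keep only the newest `retention` backup sets; return the ones to delete.

    `backups` is a list of (timestamp, files) tuples (illustrative only).
    The question above is whether `timestamp` is parsed from the file name
    (e.g. 20230527T040007Z) or taken from the remote's file date.
    """
    ordered = sorted(backups, key=lambda b: b[0], reverse=True)
    return ordered[retention:]  # everything beyond the newest N

# Example: retention of 2 after 5 daily runs -> the 3 oldest sets get flagged
runs = [(datetime(2023, 5, d, 4, 0, 7), [f"202305{d:02}T040007Z.xva"]) for d in range(27, 32)]
for ts, files in prune_backups(runs, retention=2):
    print("would delete:", files)
```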
I ask because it doesn't look like XOfS is doing any house-keeping on S3 storage (specifically BackBlaze B2).
For example, I started a daily full backup schedule with a retention of 2 on 5-21-23. As of today, all of the backups were still in the bucket. Before the job ran, I manually removed everything from the bucket with file dates up to 20230526*. After the job completed, I checked and still had backups from 20230527* through the expected 20230531* for today. I changed retention to 3, set "delete before backup," and executed again, but I just ended up with another 20230531* backup set. I did notice that the file names are coded with the day of the backup, but that the actual date on the files was within the past two days, even for the older files.
Example:
20230527T040007Z.json (2) * 14.1 KB 05/29/2023 00:04
20230527T040007Z.json.checksum (hidden) 0 bytes 05/29/2023 00:04
20230527T040007Z.xva (2) * 995.5 MB 05/29/2023 00:04
20230527T040007Z.xva.checksum (2) * 36.0 bytes 05/29/2023 00:04
20230528T040008Z.json (2) * 14.1 KB 05/30/2023 00:06
20230528T040008Z.json.checksum (hidden) 0 bytes 05/30/2023 00:06
20230528T040008Z.xva (2) * 1.0 GB 05/30/2023 00:06
20230528T040008Z.xva.checksum (2) * 36.0 bytes 05/30/2023 00:06
This could just be a BackBlaze-specific thing. As you can see, the file names indicate the date/time XO created them, but the BackBlaze file (system?) date is two days later. If XO is looking at the remote filesystem date, that would explain why those older backups are still retained. However, if XO is looking at the file names it creates, then I would expect it to have cleared off the older backups.
This also raises a question: if the retention is set, is it the number of copies, or the number of scheduled cycles? If copies, then presumably manually executing a daily backup a couple of times in a row would clean up the previous two days of backups. If cycles, then presumably a retention of "2" for a daily backup would mean it keeps all backups less than two days old, and a retention of "8" for an hourly backup would clear off any backups older than 8 hours.
The cycles method based on remote filesystem dates makes more sense to me and is what I would suspect XO is doing. In my case with BB, it would appear that something strange is happening on their file system that is throwing the dates off.
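If it helps anyone reproduce this, here's roughly how I'd compare the timestamp embedded in the file name against the object's own date on the B2 side, outside of XO entirely (bucket name, endpoint, and credentials below are placeholders):

```python
# Compare the file-name timestamp against the object's LastModified date on a
# B2 S3-compatible bucket. Endpoint, bucket, and credentials are placeholders.
import re
from datetime import datetime, timezone

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-002.backblazeb2.com",
    aws_access_key_id="KEY_ID",
    aws_secret_access_key="APPLICATION_KEY",
)

stamp = re.compile(r"(\d{8}T\d{6}Z)")
for obj in s3.list_objects_v2(Bucket="my-xo-backups").get("Contents", []):
    m = stamp.search(obj["Key"])
    if not m:
        continue
    named = datetime.strptime(m.group(1), "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
    drift = obj["LastModified"] - named
    print(f"{obj['Key']}: name says {named:%Y-%m-%d %H:%M}, "
          f"object date {obj['LastModified']:%Y-%m-%d %H:%M} (drift {drift})")
```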
@Andrew Thanks for that added detail.
Your success with Wasabi is encouraging. Perhaps planedrop's performance issues with BackBlaze B2 are related to the specific combination of BackBlaze's S3 implementation and XO's.
Things to test:
XO to AWS
XO to Wasabi
XO to BackBlaze
Theoretically, the performance should be the same to all S3 endpoints.
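One way to take XO out of the equation would be a crude upload-throughput test straight from Python to each endpoint (the endpoints, bucket names, and credential handling below are placeholders, not anything XO-specific):

```python
# Rough upload-throughput comparison, independent of XO. Endpoints and bucket
# names are placeholders; credentials come from the usual AWS config/env vars.
import io
import os
import time

import boto3

endpoints = {
    "aws":       (None, "xo-speedtest-aws"),  # None = default AWS endpoint
    "wasabi":    ("https://s3.wasabisys.com", "xo-speedtest-wasabi"),
    "backblaze": ("https://s3.us-west-002.backblazeb2.com", "xo-speedtest-b2"),
}

payload = os.urandom(256 * 1024 * 1024)  # 256 MiB of random data

for name, (endpoint, bucket) in endpoints.items():
    s3 = boto3.client("s3", endpoint_url=endpoint)
    start = time.monotonic()
    s3.upload_fileobj(io.BytesIO(payload), bucket, "speedtest.bin")
    secs = time.monotonic() - start
    print(f"{name}: {len(payload) / secs / 1e6:.1f} MB/s ({secs:.1f}s)")
```

If all three come in around the same number, the bottleneck is more likely on the XO side; if B2 alone is slow, it points at the provider.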
@planedrop Olivier is of the mindset that the VM/server should really just be the OS/application and that large data should live on some other storage. This keeps your VMs light and agile, so backups and migrations of the hosts are "fast." Let the storage subsystem do the heavy lifting. It's a more cloud-centric way of thinking than what we've traditionally done, and keeping the data stores separate from the actual VM/server potentially makes upgrading servers/apps less painful.
Back on topic....I can try S3 to B2 and AWS from another site with more bandwidth and see how that goes and report back.
For Andrew: when you say the "last big XO update," what are you referring to exactly? Which version specifically? I ran an update on the local XOfS instance on-site over the weekend, so presumably that system has the update.
Thanks!!!
@planedrop Mr. Lambert will be by shortly to chastise you for having VMs with 2 TiB VHDs
I don't know that I would consider something with occasional failures to be out of beta either.
I ran a couple of test backups from XO to B2 and saw 30-40 Mbps...but that's the cap of that site's current upstream (we're shopping for better bandwidth, but that's what the site has for now), so it didn't trigger any alarms for me. If that's a speed limitation in the XO implementation, that would be a potential problem as well. Have you tested this against an actual AWS S3 bucket and observed the same speed issue from XO?
I've thought about building a TrueNAS as a backup target as I know it can replicate to B2 fairly well. I just saw the XO S3 integration and thought that might be a usable option without having to add more hardware.
Thanks for your feedback planedrop!
@florent I can get behind a few more options on the encryption side of the S3 remote. I also think that's a really important feature when using any sort of cloud-based storage. My concern is mainly how to deal with the keys and DR. What do I do if the XO instance and the VM farm are destroyed? How do I rebuild the farm and recover the data?
The current S3 remote form says to input a 32-character key...but is that an actual key, or is it a pass-phrase used to generate a key? (The sketch after the caveats below illustrates the two possibilities.) What pieces do I need to back up and safeguard in order to recover the data from the S3 storage? This feature isn't really documented and doesn't seem to be fully fleshed out yet. If there's something I can do to help, I'd be glad to.
Caveats from the S3 remote encryption:
All the files on the remote except encryption.json are encrypted, which means you can only activate encryption or change the key on an empty remote.
You won't be able to get your data back if you lose the encryption key. The encryption key is saved in the XO config backups, which should themselves be secured correctly. Be careful: if you saved the config backup on an encrypted remote, you won't be able to access it without that remote's encryption key.
Size of backup is not updated when using encryption.
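To show the distinction I'm asking about with the 32-character field, here's the generic difference between using the string directly as a key and deriving a key from it (this is not a statement of what XO actually does, just an illustration of the two possibilities):

```python
# Two possible meanings of a "32-character key" field (generic crypto only,
# not a claim about XO's implementation).
import hashlib
import os

user_input = "0123456789abcdef0123456789abcdef"  # a 32-character string

# Possibility 1: the string's bytes ARE the AES-256 key.
raw_key = user_input.encode()  # exactly 32 bytes
assert len(raw_key) == 32

# Possibility 2: the string is a pass-phrase; the real key is derived with a
# KDF plus a salt that has to be stored somewhere (e.g. on the remote).
salt = os.urandom(16)
derived_key = hashlib.scrypt(user_input.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)

# In case 1, safeguarding the string is enough for DR.
# In case 2, the salt and KDF parameters must also survive the disaster.
print(len(raw_key), len(derived_key))
```

Knowing which case applies tells me exactly what needs to be in the off-site DR kit.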
I'm looking for the status of a couple of XOA/XOfS backup features.
The S3 remote is listed as "beta" and apparently has been for a couple of years.
There's a further option on the S3 remote for encryption that's currently (at least in XOfS) listed as an "alpha" feature, but aside from the disclaimers on the XO screen, there's really nothing that discusses the feature in depth.
I've actually tested the S3 backup to Backblaze B2 and it seems to work fine. I have not tried the encryption feature because I want to better understand the implications for recovery, but it doesn't seem to be documented.
Thanks!
Just making sure I'm not missing anything...the latest version of the XCP-ng drivers/agent is an RC from 2019?
When IBM/RedHat "killed" CentOS, the rest of the world took the hint and left. Companies and projects abandoned CentOS in droves because the future of their products was in jeopardy due to the loss of CentOS.
At this point, the damage is done.