XCP-ng
    • Categories
    • Recent
    • Tags
    • Popular
    • Users
    • Groups
    • Register
    • Login

    Too many snapshots

    Scheduled Pinned Locked Moved Backup
    44 Posts 6 Posters 2.2k Views 3 Watching
    Loading More Posts
    • Oldest to Newest
    • Newest to Oldest
    • Most Votes
    Reply
    • Reply as topic
    Log in to reply
    This topic has been deleted. Only users with topic management privileges can see it.
    • M Offline
      McHenry @Andrew
      last edited by

      @Andrew

      Is this it?

      44abaea6-12cc-4310-8aa0-82a66ef69a0e-image.jpeg

      P 1 Reply Last reply Reply Quote 0
      • P Offline
        Pilow @McHenry
        last edited by

        @McHenry for TOO MANY SNAPSHOTS in this case, better look the RETENTION applied to this CR schedule

        how many retention points do you have in the schedule of this job ?

        M 1 Reply Last reply Reply Quote 0
        • M Offline
          McHenry @Pilow
          last edited by

          @Pilow

          Currently set to 15 on one schedule and 1 on the other schedule
          6222e208-fb69-42bd-90af-2c5c97179263-image.jpeg

          P 1 Reply Last reply Reply Quote 0
          • P Offline
            Pilow @McHenry
            last edited by Pilow

            @McHenry could you screenshot the SNAPSHOTS tab of a problematic VM ?

            the new backup & CR code changed the way it is handled as you understood... no more "one VM per replica point", but "one VM with a snapshot per replica point"

            you should see 16 snapshots on the VM ?

            EDIT : my bad, its the first screenshot you provided... indeed 16 snapshots

            I do not remember the snapshots limit per VM (50?) , but you seem to bee far from it

            P M 2 Replies Last reply Reply Quote 0
            • P Offline
              Pilow @Pilow
              last edited by

              the limit enforced by XOA is 30

              https://docs.xen-orchestra.com/backup_troubleshooting?utm_source=chatgpt.com#vdi-chain-protection

              M tjkreidlT 2 Replies Last reply Reply Quote 1
              • M Offline
                McHenry @Pilow
                last edited by

                @Pilow

                Thank you. I did not know this change had been made.

                So now I have the master VM with a snapshot for each backup schedule.
                dc816079-74a3-4c24-8367-d150206d920f-image.jpeg

                And a CR created VM with 16 snapshots.
                d3953d63-e368-49bf-9a5d-a6fb87ad5687-image.jpeg

                db8e1103-04ef-4c0f-aff2-5757bf767cc2-image.jpeg

                I wonder why the health check shows too many snapshots as I am well under 30 per VM
                5ca56285-69cb-4c2e-a648-208feaadaae6-image.jpeg

                P 1 Reply Last reply Reply Quote 0
                • P Offline
                  Pilow @McHenry
                  last edited by

                  @McHenry could be another cap limit on the SR
                  can you screenshot homepage GENERAL tab of HST150 SR ?

                  how many VDIs present ?
                  what type of SR is it ?

                  M 1 Reply Last reply Reply Quote 0
                  • J Offline
                    jr-m4
                    last edited by

                    Throwing my two cents into the hat, as well.

                    I had a similar situation. Snapshots would't be rotated out of retention, regardless of what I set on the retention setting (on a subset of VMs ie not all of the).. The thing that solved it for my, was to recreate the affected schedule.
                    Do note that this will mark the previous backups taken by that technically deleted schedule, as being abandoned/orphaned.

                    But after I did this, snapshots would be properly rotated as expected.

                    M 1 Reply Last reply Reply Quote 0
                    • M Offline
                      McHenry @Pilow
                      last edited by

                      @Pilow

                      396d876a-3faf-43c6-afcc-e7f372a6477d-image.jpeg

                      SR is local storage
                      8160315f-ad51-46dd-b1d9-2b6449af50b8-image.jpeg

                      1 Reply Last reply Reply Quote 0
                      • M Offline
                        McHenry @jr-m4
                        last edited by

                        @jr-m4

                        Recreated, testing now.

                        1 Reply Last reply Reply Quote 0
                        • M Offline
                          McHenry @Pilow
                          last edited by

                          @Pilow
                          Was there an announcement about the change in how these CR backups are done with snapshots now?

                          I'd love to read up on it.

                          P 1 Reply Last reply Reply Quote 0
                          • P Offline
                            Pilow @McHenry
                            last edited by Pilow

                            @McHenry screenshot the GENERAL tab of "Disaster Recovery" SR please
                            just to see how many VDIs it hosts... at least 288

                            it was announced here
                            https://xen-orchestra.com/blog/xen-orchestra-6-3/#💾-backup
                            but it is not explaing in details, I gathered information in another topix in this forum from @florent

                            you also have the Changelog of 6.3.0
                            https://github.com/vatesfr/xen-orchestra/blob/master/CHANGELOG.md#630-2026-03-31

                            points to PR9524
                            [Replication] Reuse the same VM as an incremental replication target (PR #9524)

                            M 1 Reply Last reply Reply Quote 0
                            • M Offline
                              McHenry @Pilow
                              last edited by

                              @Pilow

                              I am rerunning the backups after recreating the schedule to see if the error clears

                              6f0d3051-1254-4d10-989b-c77a832378c9-image.jpeg

                              1 Reply Last reply Reply Quote 0
                              • tjkreidlT Offline
                                tjkreidl Ambassador @Pilow
                                last edited by

                                @Pilow Yes; it's been my understanding that this has been the default for many years now.

                                P 1 Reply Last reply Reply Quote 0
                                • P Offline
                                  Pilow @tjkreidl
                                  last edited by Pilow

                                  @tjkreidl yeah, but he has 16 snapshots.
                                  but the documentation also talks about vdi chain length

                                  but it seems to me impossible to have only 16 snaps and a vdi chain length >30

                                  thats why I wondered perhaps is it a cap limit of snapshots per SR, but I didn't find relevant info about this possibility

                                  you know a lot about Xen, ever heard of this type of per SR limit of snapshots ?
                                  Only info I found is that more than 100/150 VDIs in production per SR can degrade performance

                                  tjkreidlT 1 Reply Last reply Reply Quote 0
                                  • tjkreidlT Offline
                                    tjkreidl Ambassador @Pilow
                                    last edited by

                                    @Pilow Yes, you are correct that the chain length is also limited. You might try to manually delete some of the snapshots and though the limit is supposed to be 30, perhaps there are other factors involved? Does that VM have a particularly large amount of storage and a lot of changes between snapshots? Are any other of your VMs experiencing similar issues? Your SR appears to be mostly empty, correct? Are there any related errors showing up in /var/log/SMlog ?

                                    M 1 Reply Last reply Reply Quote 0
                                    • M Offline
                                      McHenry @tjkreidl
                                      last edited by

                                      @tjkreidl

                                      I wish to maintain 16 restore points using CR, being an hourly restore point over the last two days (8 per day)
                                      I perform a full backup nightly to reset the chain.
                                      08c8a44a-4c6a-4509-9b44-cbe28fd6c4be-image.jpeg

                                      It appears that each CR creates a new snapshot and the old snapshot is removed when a new one is crated
                                      The documentation states this error is shown then there are more than 3 snapshots on a VM
                                      https://docs.xen-orchestra.com/manage_infrastructure#too-many-snapshots

                                      Is this a problematic backup strategy?

                                      tjkreidlT 1 Reply Last reply Reply Quote 0
                                      • tjkreidlT Offline
                                        tjkreidl Ambassador @McHenry
                                        last edited by

                                        @McHenry Are you sure with that frequent running backups that each backup completes successfully before the next one starts? How long does the full backup typically take (less than 7 hours?) as well as the incrementals (under 1 hour?)? Again, I'd suggest looking in /var/log/SMlog for any error conditions that might help identify an issue. Also, how fragmented is your storage, as that can slow things down quite a bit, as can the lack of adequate CPU power as well as memory (run the top or xentop utility to view the load during backups).

                                        P 1 Reply Last reply Reply Quote 0
                                        • P Offline
                                          Pilow @tjkreidl
                                          last edited by Pilow

                                          @tjkreidl @mchenry haaaa I remember how & when I was able to provoke this error
                                          I was trying to purge 12 "replica VM" with the new CR method by forcing CR manually to get 1 VM with 12 replicas

                                          so I ended up clicking START on the CR job as soon as the CR finished, and got this same error.
                                          this was because GC didn't finish the previous job. Just had to wait 2 min for GC to reduce the chain length and I could go manual again on the CR

                                          so I guess @tjkreidl is right, and the error message is misleading
                                          your CR probably finish before the one hour interval BUT Garbage Collector do not

                                          you have two options

                                          • space up your CR jobs to give GC some time to finish
                                          • find why GC is taking too much time (could be SR performance, nerver ending GC because of high I/O on the VM, ...)
                                          P tjkreidlT 2 Replies Last reply Reply Quote 0
                                          • P Offline
                                            Pilow @Pilow
                                            last edited by

                                            @florent @bastien-nollet could it be possible to monitor GC job to pause the job instead of failing with misleading error message ?

                                            instead of TOO MANY SNAPSHOTS juste pause with WAITING PREVIOUS GARBAGE COLLECTOR TO FINISH and resume ASAP ?

                                            this would force the admin of backup to re think his CR RPO/RTO strategy but not fail jobs

                                            1 Reply Last reply Reply Quote 0

                                            Hello! It looks like you're interested in this conversation, but you don't have an account yet.

                                            Getting fed up of having to scroll through the same posts each visit? When you register for an account, you'll always come back to exactly where you were before, and choose to be notified of new replies (either via email, or push notification). You'll also be able to save bookmarks and upvote posts to show your appreciation to other community members.

                                            With your input, this post could be even better 💗

                                            Register Login
                                            • First post
                                              Last post