XCP-ng

    Replication is leaving VDIs attached to Control Domain, again

    • olivierlambert Vates 🪐 Co-Founder CEO

      Pinging @florent

      • florent Vates 🪐 XO Team @Andrew

        @Andrew Is it always the same VM/disk?

        • Andrew Top contributor @florent

          @florent Different random ones.

          • Andrew Top contributor @florent

            @florent With CR running and NBD connections set to 2, I see both exports and one import (per disk). The import is never the one that gets stuck, and only one of the two exports does (when it happens at all).

            I have updated XCP 8.3 to the January 2026 patch and XO to current master, and will keep an eye on it.

            • florent Vates 🪐 XO Team @Andrew

              @Andrew And is the CR completing correctly?

              • Andrew Top contributor @florent

                @florent Yes.

                • Andrew Top contributor @florent

                  @florent Delta backup is also leaving old snapshots on some VMs. Each VM should have only one (current) snapshot for the nightly backup. This is an issue on 1 of 100 VMs.

                  XCP (Jan 2026 update) and XO (91c5d) are current.

                  30de9803-0b8d-488d-b160-57c623f66e78-image.png

                  • Andrew Top contributor @florent

                    @florent I rebuilt my XCP hosting environment (everything is faster and bigger, stuffed into one rack)... and this issue is now worse.

                    The main changes in the new setup are 2x40Gb networking, a faster NVMe NAS over NFS, faster pool servers, more memory, and a much faster CR destination machine with ZFS.

                    Running XCP 8.3 (March 2026 updates) and XO (master a2e33).

                    With NBD connections set to 2, replication leaves many VDIs attached to the control domain every day. Changing it to 1 seems to resolve the issue (no more VDIs stuck on the control domain).

                    • poddingue Vates 🪐 @Andrew

                      Thanks for the detailed report and the NBD=2 vs NBD=1 correlation, Andrew, that's a genuinely useful clue. 👍
                      From what I understand, a VDI staying attached to dom0 like this tends to point at the storage side on the XCP-ng host rather than XO's replication logic itself, so it's probably worth a look from @Team-Storage. 👀
                      To give them something concrete to chase, would you be able to share your SR type/storage backend, plus the SMlog (and kern.log) from around the time a VDI gets left stuck?
                      With those details, the storage folks would have a much better starting point.
                      Thanks again for staying on top of it.
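
For pulling the log window around a stuck VDI, here is a minimal sketch in Python. It assumes the lines in /var/log/SMlog and /var/log/kern.log start with a classic syslog timestamp such as "Mar 14 02:31:07" (no year, English month names); that format is an assumption and may differ on a given host. The function name `lines_near` and its parameters are hypothetical and not part of XCP-ng or XO.

```python
from datetime import datetime, timedelta

# Assumption: every log entry starts with a classic syslog timestamp,
# e.g. "Mar 14 02:31:07" or "Mar  4 02:31:07" (always 15 characters).
# The year is not stored in the line, so it is taken from `when`.
TS_FORMAT = "%Y %b %d %H:%M:%S"
TS_LENGTH = 15


def lines_near(lines, when, minutes=5):
    """Yield the log lines stamped within `minutes` of `when`.

    `lines` is any iterable of strings (an open file works),
    `when` is a datetime for the moment the VDI was left attached.
    Lines without a parsable timestamp are skipped.
    """
    margin = timedelta(minutes=minutes)
    for line in lines:
        try:
            stamp = datetime.strptime(f"{when.year} {line[:TS_LENGTH]}", TS_FORMAT)
        except ValueError:
            continue  # continuation line or unexpected format
        if abs(stamp - when) <= margin:
            yield line.rstrip("\n")
```

It could be used by opening /var/log/SMlog on the host and passing the file object together with the approximate time the replication job finished, for example `lines_near(open("/var/log/SMlog", errors="replace"), datetime(2026, 3, 14, 2, 31))`; the date shown is only a placeholder. The same call works for kern.log if it uses the same timestamp format.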

                      • Andrew Top contributor @poddingue

                        @poddingue Since setting NBD=1, I have not seen the problem.

                        The SR is NFS over dual 40G Ethernet, served by a TrueNAS SCALE 25.10 server using all-NVMe SSDs, so storage performance is as good as I can make it.

                        I'll have to enable NBD=2 again to see if it still happens and whether I can find the relevant part of the logs. As this is a random problem, I can't recreate it in a normal test environment.
