XCP-ng

    Orphan VDI snapshot after CR backup

    • Andrew (Top contributor)

      With XO (from source), I'm using Continuous Replication every hour. After some backups (not every time), an orphaned VDI snapshot is left (not the same VDI). The backups are successful but a detached snapshot is left until I remove it. These snapshots don't show up on the VM disks but show up on the Dashboard Health report (not the detached backup report).

      For example, yesterday it left six snapshots at different times, of different VDIs. Today, it left none. The VMs are running on different hosts (in the same pool). Everything is up to date (XCP-ng 8.2.1, XO from source commit df07d). It's been going on for a while and I just delete them occasionally (without harm). Storage is thin-provisioned on NFS, so it's not eating much space or taking much time as long as I take care of them.

      • Danp (Pro Support Team)

        Hi Andrew,

        Anything unusual in the logs?

        Dan

        • Andrew (Top contributor) @Danp

          @Danp Nope... XO left another one this morning but there is nothing in the logs at that time (no errors, no warnings, no messages). Backup worked successfully but left a new orphaned snapshot.

          There are other warnings at other times about other stuff, but it seems to be fine...

          • olivierlambert (Vates 🪐 Co-Founder CEO)

            The xo-server output would be really interesting. In theory, XO is REALLY really really careful and keeps trying to remove a disk for 20 minutes when XAPI refuses to do so (and you should have a trace of that in the logs, I mean the console output of xo-server).

            My gut feeling is that XAPI is saying it's OK when it isn't, and knowing why might help us find a storage race condition somewhere.
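
            To make the idea concrete, the retry behaviour is roughly like this (just a sketch, not the actual xo-server code; destroyVdiWithRetry, the bare XAPI call and the timings are my own simplification):

            async function destroyVdiWithRetry(
              xapi: { call: (method: string, ...args: unknown[]) => Promise<unknown> },
              vdiRef: string,
              timeoutMs = 20 * 60 * 1000, // keep retrying for ~20 minutes
              retryDelayMs = 30_000
            ): Promise<void> {
              const deadline = Date.now() + timeoutMs
              for (;;) {
                try {
                  await xapi.call('VDI.destroy', vdiRef) // plain XAPI call
                  return
                } catch (error) {
                  if (Date.now() + retryDelayMs > deadline) {
                    // give up: this is the error that should end up traced in the xo-server output
                    throw error
                  }
                  await new Promise(resolve => setTimeout(resolve, retryDelayMs))
                }
              }
            }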

            • Andrew (Top contributor) @olivierlambert

              @olivierlambert It's good that it is really really really careful. The rule is: Primum non nocere (First, do no harm). The backup job completes without logging an error about failing to remove the snapshot.

              I'll have to increase the XO logging and see if there is more output about it.

              Which XAPI log file should I look at?

              • Andrew (Top contributor) @olivierlambert

                @olivierlambert Or is it actually a coalesce problem? The VM/VDI is not listed under "VDIs to coalesce" after the backup finishes.

                • olivierlambert (Vates 🪐 Co-Founder CEO)

                  It's hard to know exactly: is it something we can see on XO's side or not? I can't tell. Maybe SMlog has more info from the time the VM snapshot is removed.

                  • Andrew (Top contributor) @olivierlambert

                    @olivierlambert It's still an ongoing issue (XO community commit f1ab6).

                    Here is the error XO logs when it fails to remove the old snapshot:

                    Sep 21 16:00:59 xo1 xo-server[613294]: 2022-09-21T20:00:59.229Z xo:xapi:vm WARN VM_destroy: failed to destroy VDI {
                    Sep 21 16:00:59 xo1 xo-server[613294]:   error: XapiError: HANDLE_INVALID(VBD, OpaqueRef:6b28b472-e82e-4117-a0c0-b61ee894e3b5)
                    Sep 21 16:00:59 xo1 xo-server[613294]:       at XapiError.wrap (/opt/xo/xo-builds/xen-orchestra-202209211219/packages/xen-api/dist/_XapiError.js:26:12)
                    Sep 21 16:00:59 xo1 xo-server[613294]:       at /opt/xo/xo-builds/xen-orchestra-202209211219/packages/xen-api/dist/transports/json-rpc.js:46:30
                    Sep 21 16:00:59 xo1 xo-server[613294]:       at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
                    Sep 21 16:00:59 xo1 xo-server[613294]:     code: 'HANDLE_INVALID',
                    Sep 21 16:00:59 xo1 xo-server[613294]:     params: [ 'VBD', 'OpaqueRef:6b28b472-e82e-4117-a0c0-b61ee894e3b5' ],
                    Sep 21 16:00:59 xo1 xo-server[613294]:     call: { method: 'VBD.get_VM', params: [Array] },
                    Sep 21 16:00:59 xo1 xo-server[613294]:     url: undefined,
                    Sep 21 16:00:59 xo1 xo-server[613294]:     task: undefined
                    Sep 21 16:00:59 xo1 xo-server[613294]:   },
                    Sep 21 16:00:59 xo1 xo-server[613294]:   vdiRef: 'OpaqueRef:56e6071e-eb67-4e02-b6d1-b814ea43eeeb',
                    Sep 21 16:00:59 xo1 xo-server[613294]:   vmRef: 'OpaqueRef:31957bf1-2f2b-474d-a496-e2a2460f533f'
                    Sep 21 16:00:59 xo1 xo-server[613294]: }
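
                    If I read that right, the cleanup trips over a VBD reference that no longer exists (HANDLE_INVALID on VBD.get_VM) and gives up before destroying the snapshot VDI. Just to illustrate the failure mode (this is not XO's code, only my own sketch with plain XAPI calls; destroySnapshotVdi and the Xapi type are made up, and a stale VBD ref is simply treated as already detached):

                    type Xapi = { call: (method: string, ...args: unknown[]) => Promise<unknown> }

                    async function destroySnapshotVdi(xapi: Xapi, vdiRef: string): Promise<void> {
                      // detach any VBDs still referencing the snapshot VDI
                      const vbdRefs = (await xapi.call('VDI.get_VBDs', vdiRef)) as string[]
                      for (const vbdRef of vbdRefs) {
                        try {
                          await xapi.call('VBD.destroy', vbdRef)
                        } catch (error) {
                          // HANDLE_INVALID just means the VBD is already gone, so skip it
                          if ((error as any)?.code === 'HANDLE_INVALID') continue
                          throw error
                        }
                      }
                      await xapi.call('VDI.destroy', vdiRef)
                    }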
                    
                    • olivierlambert (Vates 🪐 Co-Founder CEO)

                      We got an exception from XAPI, but let's see if it's "because" of XO. Pinging @julien-f

                      • Andrew (Top contributor) @olivierlambert

                        @olivierlambert This issue still continues... using the current XO from source and current XCP-ng 8.2.1.

                        • olivierlambert (Vates 🪐 Co-Founder CEO)

                          I don't know why XAPI refuses to destroy the VDI… I don't think it's an XO issue.

                          • Andrew (Top contributor) @olivierlambert

                            @olivierlambert @julien-f Enabling "Use NBD protocol to transfer disk if available" (and actually using NBD) for the job in XO from source (commit 3abbc) seems to resolve this issue. If I disable NBD, the random problem comes back within about a day. With NBD enabled I have not seen the problem for weeks.

                            • olivierlambert (Vates 🪐 Co-Founder CEO)

                              Good news then 🙂

                              • olivierlambert moved this topic from Xen Orchestra