XCP-ng

    Posts

    • RE: Backups don't time out

      @olivierlambert minor update... it's now been 48+ hours and the task is still in the same state; it has not timed out yet. 🙂

      posted in Backup
      jshiells
    • RE: Backups don't time out

      @olivierlambert the answer seems to be "no timeout": as of now the task has been stuck like this for 30+ hours. The only way I can make it go away is to restart the XO Proxy, which I can't do right now because another task is running a huge delta mirror sync.

      posted in Backup
      jshiells
    • RE: Backups don't time out

      @DustinB that's not the point of the post.

      The point of the post is that once the HTTP connection has timed out, the TASK never actually times out on the proxy.

      I am aware it says there are space issues on the SR, and it's unfortunate that error is in the screenshot, but it's a distraction, not the actual problem.
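
      To make the distinction concrete, here is a minimal sketch of a task-level deadline that is enforced independently of any HTTP connection timeout. It is not how XO or the proxy implements timeouts; `run_with_deadline` and `stuck_transfer` are hypothetical names, and the "vdi-lock" entry is only a stand-in for whatever resources a real task would release.

```python
import asyncio


async def run_with_deadline(job, deadline_s):
    """Run `job` (a coroutine) and cancel it once `deadline_s` seconds pass.

    Hypothetical helper: it only illustrates a deadline on the task itself,
    separate from any timeout on the connection that started it.
    """
    try:
        return await asyncio.wait_for(job, timeout=deadline_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled `job` here, so the job's
        # `finally` block (its cleanup) has run.
        return "timed out"


async def stuck_transfer(released):
    """Stand-in for a backup transfer that never finishes on its own."""
    try:
        await asyncio.sleep(3600)
    finally:
        # Placeholder for releasing resources such as locked VDIs.
        released.append("vdi-lock")


released = []
result = asyncio.run(run_with_deadline(stuck_transfer(released), 0.05))
print(result, released)
```

      The behaviour described in this thread is the opposite: the deadline passes, but the task is never cancelled and its cleanup never runs.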

      posted in Backup
      jshiells
    • Backups don't time out

      XOA 5.110.1
      XOA Proxy involved is version 0.29.30

      We have a backup task that is refusing to time out. The only way I can get it to end is by restarting the proxy.

      This has been an issue since the new backup process was introduced.

      The timeout for the task is set to 8 hours, but XOA and the XOA proxy seem to ignore this value: the task just sits there and keeps the VDIs locked.

      fe56da05-bda2-409a-ae8c-150254f1234b-image.png

      Don't get hyper-focused on this next screenshot; it has nothing to do with the actual issue. It just shows that even though the process is past its timeout and still running, the VDIs are still locked by the host.
      6776cda2-2b24-44b8-a8d7-0caf49c7be87-image.png

      Thanks

      posted in Backup
      jshiells
    • RE: Backup and Backup copy with XO Proxies. Bug or by design ?

      @Pilow not easily. As a telco/ISP we run multiple massive 40 to 100 Gb/s links across the country, so site-to-site data transfer is not an issue, which lets us have privately routed IP networks spanning anywhere. That is much harder to do for SMB deployments.
      It may also involve adding more VNICs and IPs to XOA and XOA-PROXY.

      SiteA:
      XOA
      proxy-A
      Storage-A-SR
      Backup-A-SR

      priv network routing in both directions to:

      SiteB:
      proxy-B
      Storage-B-SR
      Backup-B-SR

      so XOA, Proxy-A, Proxy-B ALL have access to:
      Storage-A-SR
      Backup-A-SR
      Storage-B-SR
      Backup-B-SR
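
      As a rough illustration of the rule above (every component can reach every storage target), the layout can be written down as data and checked. This is a hypothetical model for sanity-checking a design on paper; it does not query XOA or the proxies, and `missing_access` is a made-up helper name.

```python
# Hypothetical model of the two-site layout described in this post.
TARGETS = {"Storage-A-SR", "Backup-A-SR", "Storage-B-SR", "Backup-B-SR"}

# What each component can reach over the privately routed network.
ACCESS = {
    "XOA": set(TARGETS),
    "proxy-A": set(TARGETS),
    "proxy-B": set(TARGETS),
}


def missing_access(access, targets):
    """Return {component: [unreachable targets]} for incomplete components."""
    gaps = {}
    for name, reachable in access.items():
        unreachable = sorted(targets - reachable)
        if unreachable:
            gaps[name] = unreachable
    return gaps


print(missing_access(ACCESS, TARGETS))  # {} means full access everywhere

# Example of a layout that would not satisfy the rule:
broken = dict(ACCESS, **{"proxy-B": {"Storage-B-SR", "Backup-B-SR"}})
print(missing_access(broken, TARGETS))
```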

      Sorry, this is probably not helpful for most people 😞
      1250fe45-cf70-4fb9-9424-9f0cb0aafbc6-image.png

      posted in Backup
      jshiells
    • RE: Backup and Backup copy with XO Proxies. Bug or by design ?

      @Pilow that's what we had to do.

      We had to allow XOA AND ALL proxies access to ALL backup remotes.

      posted in Backup
      jshiells
    • RE: 10gb backup only managing about 80Mb

      @Pilow I think what you are seeing is the result of tapdisk's single-threaded nature (among other things related to the master's Dom0). I would suggest changing concurrency to 4 or 8 and seeing whether the speed AT THE PORT is higher during backups. You may still only see <80 MiB/s PER VM being backed up, maybe less, but you may end up with 4 or 8 VMs backing up at 40 to 60 MiB/s each == high total bandwidth.

      Also note: your 61.63 MiB/s (capital B, for mebibytes per second) is about 517 Mb/s (megabits per second) of network speed.
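
      The arithmetic behind both points can be sketched as follows. The conversion is exact (1 MiB = 1024 × 1024 bytes, 1 Mb = 1,000,000 bits); the aggregate figure assumes per-VM throughput scales linearly with concurrency, which is an assumption and not something measured here. The function names are illustrative.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def mib_per_s_to_mbit_per_s(mib_per_s):
    """Convert mebibytes per second to megabits per second."""
    return mib_per_s * MIB * 8 / 1_000_000


def aggregate_mib_per_s(per_vm_mib_per_s, concurrency):
    """Total throughput if every concurrent VM sustains the same rate.

    Assumption: linear scaling, i.e. no shared bottleneck is hit.
    """
    return per_vm_mib_per_s * concurrency


print(round(mib_per_s_to_mbit_per_s(61.63), 2))  # 516.99
print(aggregate_mib_per_s(40, 4), aggregate_mib_per_s(60, 8))  # 160 480
```

      So a single stream at 61.63 MiB/s uses roughly 5% of a 10 Gb/s link, while 8 streams at 60 MiB/s would be about 4 Gb/s under the same assumption.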

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      @olivierlambert the 3 TrueNAS servers are very bored 🙂 We would be lucky to be hitting 10% of their abilities, even during the backups.

      I think I have stumbled across something that may be a bug or the cause:

      Let's say we have 5 servers, and the VM getting this error during backups is running on server3. When this happens, we power it down on server3 and clear the stuck tapdisk connections, and a log entry shows up saying a snapshot link on server5 was also cleared...

      example: "Cleared RW for 7d213264-1f89-488f-b273-eb91c186eaab on host 14c98969-ae12-4d7f-9c42-d7656aae01e5 on server5"
      ... but the VM was running on server3.....

      So HERE is what I think is happening (just a theory): I think the load balancer plugin is moving VMs around during the backup process. Could that be possible? Based on what I am seeing, I feel this could be the cause.

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      This problem is back with a vengeance. After almost a year of no issues, we upgraded to version 109.2 of XOA and boom... the problem is back.

      XCP-NG servers are on 8.2.1 with the May 6th update.

      We get the following error in XOA:

      Error: SR_BACKEND_FAILURE_82(, Failed to snapshot VDI [opterr=failed to pause VDI 7d213264-1f89-488f-b273-eb91c186eaab], )

      We can't make snapshots of the affected VM anymore...
      With manual snapshots we get the same error:
      SR_BACKEND_FAILURE_82(, Failed to snapshot VDI [opterr=failed to pause VDI c51f22e2-ba5c-413b-8afc-9208a2b944ef],
      but there are NO issues with the SR.
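
      When many backups fail at once, it can help to pull the affected VDI UUIDs out of the error text so they can be cross-checked against stuck tapdisk connections. A minimal sketch, assuming the messages have the same shape as the two quoted above; `failed_vdis` is a hypothetical helper, not part of XO or XAPI.

```python
import re

# Matches the error shape quoted in this post and captures the VDI UUID.
VDI_PAUSE_RE = re.compile(
    r"SR_BACKEND_FAILURE_(?P<code>\d+)\(.*?"
    r"failed to pause VDI (?P<vdi>[0-9a-f-]{36})\]"
)


def failed_vdis(lines):
    """Return the unique VDI UUIDs named in 'failed to pause VDI' errors."""
    found = []
    for line in lines:
        match = VDI_PAUSE_RE.search(line)
        if match and match.group("vdi") not in found:
            found.append(match.group("vdi"))
    return found


errors = [
    "Error: SR_BACKEND_FAILURE_82(, Failed to snapshot VDI "
    "[opterr=failed to pause VDI 7d213264-1f89-488f-b273-eb91c186eaab], )",
    "SR_BACKEND_FAILURE_82(, Failed to snapshot VDI "
    "[opterr=failed to pause VDI c51f22e2-ba5c-413b-8afc-9208a2b944ef],",
]
print(failed_vdis(errors))
```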

      failed 17 backups last night,
      failed 3 backups tonight.
      across 3 different storage arrays

      We had to shut down and fix all of those VMs.
      The tapdisk connections to the disks have to be killed manually.

      We will double-check all our interfaces again, but I suspect it will be the same as last time: no errors on the networking or storage side.

      Something has changed or been reverted.

      posted in Backup
      jshiells
    • RE: Netdata-ui updated and VM's not listed anymore - ISSUE

      I got this figured out...

      The xcp-ng-updated packages for netdata-ui DO NOT work with xcp-ng, as per my original post.
      netdata is getting updated by normal xcp-ng updates.
      8109f4a0-820d-4733-a7a2-507b5cc45b47-image.png

      This new version does not work; VMs are no longer listed.

      Steps to remove it and put the original version back:

      yum remove netdata-ui -y
      yum remove netdata -y
      # you have to do this as well or the old version won't install:
      # yum leaves files in /usr/share/netdata/ that block installing 1.19
      rpm -e netdata-data-1.44.3-1.2.xcpng8.2.noarch
      rpm -e netdata-conf-1.44.3-1.2.xcpng8.2.noarch
      yum install netdata-ui-1.19.0
      # run the next line or manually type it in, you do you
      echo "exclude=netdata*" >> /etc/yum.conf
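
      One caveat with the last step: `echo ... >> /etc/yum.conf` appends a new line every time it is run. A minimal sketch of an idempotent alternative is below. `ensure_line` is a hypothetical helper, and the demo runs against a temporary file rather than the real /etc/yum.conf.

```python
import tempfile
from pathlib import Path


def ensure_line(path, line):
    """Append `line` to the file unless an identical line already exists.

    Returns True if the file was changed, False if the line was present.
    """
    p = Path(path)
    text = p.read_text() if p.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with p.open("a") as fh:
        fh.write(prefix + line + "\n")
    return True


# Demo on a throwaway file standing in for /etc/yum.conf.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "yum.conf"
    conf.write_text("[main]\nkeepcache=0")
    first = ensure_line(conf, "exclude=netdata*")
    second = ensure_line(conf, "exclude=netdata*")
    final_text = conf.read_text()

print(first, second)  # True False
```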

      After this, we have version 1.19 installed again, which actually works with Xen on xcp-ng.

      posted in Advanced features
      jshiells
    • RE: XO - Files Restore

      I can confirm this as well

      Several new Ubuntu 24 VMs with LVM are having this problem. I thought I was going crazy.

      Existing Debian 10, 11 and 12 VMs using LVM are not affected.
      VMs not using LVM are not affected.

      posted in Backup
      jshiells
    • Can't save backup task after removing proxy - BUG

      Current version: 5.106.4
      Mirror Incremental backup task

      bug: proxy field cannot be changed back to empty if previously saved with a proxy

      I cannot change task from using a proxy to NOT using a proxy.

      changing config from :
      Proxy
      source via proxy
      target via proxy
      1f644a73-8a1c-428e-b921-9ded6a2f592f-image.png

      TO:

      NO proxy
      source
      target

      72b7c14b-d3d2-468c-bfff-372cb5879903-image.png

      result:
      d097fdce-af65-4f6a-b01a-f9ef42c322bd-image.png

      fdcc6a38-2e04-4fb0-b6a4-d74f02a8dba9-image.png
      The error suggests you can't change the proxy value from filled in to empty; however, you can create new tasks with this field empty.

      The only way I can get around this is to recreate the backup task from scratch.

      all proxies and XOA can talk to the remotes in play
      3ddf6763-2419-43e0-ba51-5e535b9395b1-image.png

      Thanks

      posted in Backup
      jshiells
    • Netdata-ui updated and VM's not listed anymore - ISSUE

      hello

      running xcp-ng version 8.2.1 build date 2025-05-07
      restart of node has been done

      netdata-ui
      version 1.19
      release 6.xcpng8.2
      from xcp-ng-updates repo

      netdata
      version 1.44.3
      release 6.xcpng8.2
      from xcp-ng-updates repo

      confirmed following also installed
      xen-dom0-libs-devel
      yajl-devel
      xen-devel

      i have removed and reinstalled netdata-ui as well

      This was working fine before, but sometime in the last few months (maybe when netdata updated to the versions above and the interface changed) VMs stopped being visible in netdata.

      c0b11d3e-0a0f-41dc-b122-4f1e769a92fe-image.png
      There used to be a giant list of VMs here.

      This is happening on all my xcp-ng servers, not just one. They are all at the xcp-ng and netdata versions mentioned above.

      Is anyone else experiencing this? Any suggestions on what's going on and why the VMs are not showing up in netdata anymore on any of my hosts?

      posted in Advanced features
      jshiells
    • maybe a bug when restarting mirroring delta backups that have failed

      Hi,

      Last night a couple of VMs in a MIRRORING of delta backups task failed. 3 VMs did not mirror, for reasons that do not matter for this bug.

      When we corrected the issue that caused the mirroring to fail, we went into XOA and pressed the button to restart the task for the failed VMs only... however, XOA decided to redo the entire mirror task and sync EVERY VM on the source backup location.

      a0130ea1-1993-47f2-bafc-887fc7a0dc42-image.png
      Clicking this to restart just those 3 VMs

      caused this to happen... it re-synced ALL of them, not just the 3 that failed.
      c7e709b3-cf77-4e7e-93bb-5d155f4c00d3-image.png

      XOA version: Current version: 5.103.1

      i am assuming this is not working correctly?

      posted in Backup
      jshiells
    • RE: Question on backup sequence

      I would like to ask a followup question to confirm.

      so IF using sequences, we should disable the backup tasks on the overview tab?

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      I have what is hopefully a final update on this issue.

      We upgraded to xoa version .99 a few weeks ago and the problem has now gone away. We suspect that some changes made to timeouts in xoa have resolved this, along with a few other related problems.

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      @olivierlambert

      Just an update on this:

      We made sure all our server times were synced. The issue happened again on the next run.

      Just for shits and giggles we restarted the toolstack on all the hosts yesterday and the problem went away. No issues with the backup last night. Maybe it's just a coincidence; we are continuing to monitor.

      We also noticed that even though this CHOP error is coming up, snapshots are getting created.

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      @olivierlambert

      After much digging I traced the formal exceptions below from
      XEN. For most of them, the "chop" error is buried in the bowels of XAPI.

      ie

                      Oct 18 21:46:29 fen-xcp-01 SMGC: [28249]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
               Oct 18 21:46:29 fen-xcp-01 SMGC: [28249]     raise
      Failure(result['ErrorDescription'])
      

      It's worth noting that, due to the time involved in this, the only things
      considered in play were nfs-01/02 and, to a lesser degree, 03; the one XCP
      pool (fen-xcp); and, to an even lesser degree, XOA itself.

      I could not find any obvious issues with storage, neither hardware nor
      data. Scrubs were fine, no cpu/hardware errors.

      I could not find any obvious issues with xcp hosts, neither hardware nor
      data. No cpu/hardware errors.

      The only real change made was to correct the clock on nfs-01. I don't
      see how that could affect this since most if not all locking is done
      with flock.

      There is a valid argument to be made that Xen technically was responding
      to an issue, though not entirely clear what nor how. Most of the other
      errors / wtfbbq states are either directly related (in call path), or
      indirectly (xen wanted a thing, didn't get it). Those are some deep
      rabbit holes.

      There is more pre/post context to these; I tried to include what I thought
      made them a bit easier to understand.


      ./SMlog.4.gz:Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]          *  E X C E
      P T I O N  *
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729] leaf-coalesce: EXCEPTION <class
      'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1774, in coalesceLeaf
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     self._coalesceLeaf(vdi)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2048, in _coalesceLeaf
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     if not
      self._snapshotCoalesce(vdi):
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2153, in _snapshotCoalesce
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     self._coalesce(tempSnap)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1962, in _coalesce
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     self.deleteVDI(vdi)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2469, in deleteVDI
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     self._checkSlaves(vdi)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2482, in _checkSlaves
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     self._checkSlave(hostRef, vdi)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2491, in _checkSlave
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     text  =
      _host.call_plugin(*call)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 264, in __call__
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     return
      self.__send(self.__name, args)
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 160, in xenapi_request
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     result =
      _parse_result(getattr(self, methodname)(*full_params))
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
              Oct 19 00:19:36 fen-xcp-01 SMGC: [3729]     raise
      Failure(result['ErrorDescription'])
      
      ./SMlog.4.gz:Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]          *  E X C E
      P T I O N  *
      
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729] leaf-coalesce: EXCEPTION <class
      'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1774, in coalesceLeaf
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     self._coalesceLeaf(vdi)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2048, in _coalesceLeaf
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     if not
      self._snapshotCoalesce(vdi):
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2153, in _snapshotCoalesce
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     self._coalesce(tempSnap)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1962, in _coalesce
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     self.deleteVDI(vdi)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2469, in deleteVDI
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     self._checkSlaves(vdi)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2482, in _checkSlaves
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     self._checkSlave(hostRef, vdi)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2491, in _checkSlave
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     text  = _host.call_plugin(*call)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 264, in __call__
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     return
      self.__send(self.__name, args)
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 160, in xenapi_request
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     result =
      _parse_result(getattr(self, methodname)(*full_params))
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
      Oct 19 00:22:04 fen-xcp-01 SMGC: [3729]     raise
      Failure(result['ErrorDescription'])
      
      ./SMlog.4.gz:Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]          *  E X C E
      P T I O N  *
      
              Oct 19 00:22:11 fen-xcp-01 SMGC: [3729] gc: EXCEPTION <class
      'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 3388, in gc
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     _gc(None, srUuid,
      dryRun)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 3273, in _gc
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     _gcLoop(sr, dryRun)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 3214, in _gcLoop
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]
      sr.garbageCollect(dryRun)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1794, in garbageCollect
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]
      self.deleteVDIs(vdiList)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2374, in deleteVDIs
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     SR.deleteVDIs(self,
      vdiList)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 1808, in deleteVDIs
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     self.deleteVDI(vdi)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2469, in deleteVDI
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     self._checkSlaves(vdi)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2482, in _checkSlaves
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]
      self._checkSlave(hostRef, vdi)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/opt/xensource/sm/cleanup.py", line 2491, in _checkSlave
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     text  =
      _host.call_plugin(*call)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 264, in __call__
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     return
      self.__send(self.__name, args)
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 160, in xenapi_request
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     result =
      _parse_result(getattr(self, methodname)(*full_params))
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]     raise
      Failure(result['ErrorDescription'])
               Oct 19 00:22:11 fen-xcp-01 SMGC: [3729]
      
      
      
      ./SMlog.5.gz:Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]          *  E X C
      E P T I O N  *
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] GC process exiting, no work left
              Oct 18 21:51:23 fen-xcp-01 SM: [30714] lock: released
      /var/lock/sm/0cff5362-5c89-2241-2207-a1d736d9ef5e/gc_active
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] In cleanup
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] SR 0cff ('fen-nfs-03 - DR
      (Diaster Recovery Storage ZFS/NFS)') (608 VDIs in 524 VHD trees): no changes
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]
      *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]          ***********************
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]          *  E X C E P T I O N  *
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]          ***********************
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] gc: EXCEPTION <class
      'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 3388, in gc
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     _gc(None, srUuid, dryRun)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 3273, in _gc
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     _gcLoop(sr, dryRun)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 3214, in _gcLoop
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     sr.garbageCollect(dryRun)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 1794, in garbageCollect
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     self.deleteVDIs(vdiList)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 2374, in deleteVDIs
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     SR.deleteVDIs(self, vdiList)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 1808, in deleteVDIs
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     self.deleteVDI(vdi)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 2469, in deleteVDI
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     self._checkSlaves(vdi)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 2482, in _checkSlaves
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     self._checkSlave(hostRef, vdi)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/opt/xensource/sm/cleanup.py", line 2491, in _checkSlave
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     text  =
      _host.call_plugin(*call)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 264, in __call__
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     return
      self.__send(self.__name, args)
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 160, in xenapi_request
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     result =
      _parse_result(getattr(self, methodname)(*full_params))
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]   File
      "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]     raise
      Failure(result['ErrorDescription'])
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]
      *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] * * * * * SR
      0cff5362-5c89-2241-2207-a1d736d9ef5e: ERROR
              Oct 18 21:51:23 fen-xcp-01 SMGC: [30714]
              Oct 18 21:51:28 fen-xcp-01 SM: [26746] lock: opening lock file
      /var/lock/sm/894e5d0d-c100-be00-4fc4-b0c6db478a26/sr
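
      For anyone who wants to repeat this kind of digging, here is a minimal sketch that counts SMGC exception lines per phase (for example `leaf-coalesce` or `gc`) across rotated SMlog files. It assumes each log entry is on one line in the real files (the wrapping above comes from pasting) and the demo uses a throwaway sample file; `count_exceptions` is a hypothetical helper.

```python
import gzip
import os
import re
import tempfile
from collections import Counter

# Matches e.g. "SMGC: [3729] leaf-coalesce: EXCEPTION <class ..."
EXC_RE = re.compile(r"SMGC: \[(?P<pid>\d+)\] (?P<where>[\w-]+): EXCEPTION")


def count_exceptions(paths):
    """Count SMGC 'EXCEPTION' lines per phase in plain or .gz log files."""
    counts = Counter()
    for path in paths:
        opener = gzip.open if str(path).endswith(".gz") else open
        with opener(path, "rt", errors="replace") as fh:
            for line in fh:
                match = EXC_RE.search(line)
                if match:
                    counts[match.group("where")] += 1
    return counts


# Demo with sample lines modelled on the excerpts in this post.
sample = (
    "Oct 19 00:19:36 fen-xcp-01 SMGC: [3729] leaf-coalesce: EXCEPTION "
    "<class 'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']\n"
    "Oct 19 00:22:11 fen-xcp-01 SMGC: [3729] gc: EXCEPTION "
    "<class 'XenAPI.Failure'>, ['INTERNAL_ERROR', 'Invalid argument: chop']\n"
    "Oct 18 21:51:23 fen-xcp-01 SMGC: [30714] GC process exiting, no work left\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "SMlog.4.gz")
    with gzip.open(sample_path, "wt") as fh:
        fh.write(sample)
    counts = count_exceptions([sample_path])

print(dict(counts))
```

      On a host, the list of paths would be the rotated SMlog files; their exact location is not stated in this post, so it is left to the reader.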
      
      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      @tuxen no, sorry. Great idea, but we are not seeing any errors like that in kern.log. When this problem happens, it is across several Xen hosts all at the same time. It would be wild if all of the Xen hosts were having hardware issues during the small window of time this problem happened in. If it were one Xen server then I would look at hardware, but it's all of them, leading me to believe it's XOA, a BUG in xcp-ng, or a storage problem (even though we have seen no errors or monitoring blips at all on the TrueNAS server).

      posted in Backup
      jshiells
    • RE: MAP_DUPLICATE_KEY error in XOA backup - VM's wont START now!

      @olivierlambert digging more into this today, we did find this error in xensource.log related to that "CHOP" message

      xensource.log.11.gz:Oct 20 10:22:08 xxx-xxx-01 xapi: [error||115 pool_db_backup_thread|Pool DB sync D:d79f115776bd|pool_db_sync] Failed to synchronise DB with host OpaqueRef:a87f2682-dd77-4a2d-aa1a-b831b1d5107f: Server_error(INTERNAL_ERROR, [ Invalid argument: chop ])

      xensource.log.22.gz:Oct 19 06:06:03 fen-xcp-01 xapi: [error||27967996 :::80||backtrace] host.get_servertime D:61ad83a0cd72 failed with exception (Invalid_argument chop)

      posted in Backup
      jshiells