XCP-ng

    Posts by geoffbland

    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      The outdated flag is removed automatically after a short delay if there is no issue with the network.
      See: https://linbit.com/drbd-user-guide/drbd-guide-9_0-en/#s-outdate
      Do you still have this flag? 🙂

      Sorry about the long delay in this response - unfortunately I have been busy with work and so have not been able to spend much time looking at this. But two weeks later the Outdated volume is still present. As far as I can tell there was no issue with the network.
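
      (In case it is useful, a minimal sketch of how the flag can be checked from dom0 on the node that reports Outdated - the resource name is a placeholder for the affected xcp-volume:)

      drbdsetup status <xcp-volume-name>                # per-node disk states (UpToDate/Outdated/...)
      linstor resource list | grep <xcp-volume-name>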

      I wiped the install again and could get DRBD into the same state by creating a few VMs, each with several disks, and then deleting the VMs - eventually the issue occurs again.

      I have since wiped again and done a fresh XCP-ng install - this time with a dedicated network (separate NICs and private switch) for data, and I'll see how that goes.

      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      You can forget the VDI to remove the VM

      I couldn't forget it as the VM needs to be started to forget it and the VM is stuck in a "paused" state.

      I was eventually able to get the VM in a stopped state by force rebooting all the hosts in the pool. Once the VM was stopped by this I was then able to delete the VM and all XOSTOR disks were then also removed.
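
      (A less drastic sketch that may be worth trying before rebooting every host - the UUID is a placeholder, and vm-reset-powerstate only corrects the recorded power state, so first check with "xl list" in dom0 that no domain for the VM is actually still running on any host:)

      xe vm-shutdown uuid=<vm-uuid> force=true
      xe vm-reset-powerstate uuid=<vm-uuid> force=true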

      Do you always have this issue when you create new VMs?
      Yes, I got this error any time I tried to create a new VM on the XOSTOR SR. However, after rebooting all the hosts in the pool I am able to create VMs again.

      I will continue with more testing as and when I get time. Currently I have a VM up and running and seemingly healthy, yet linstor reports the volume as Outdated. What would cause this and how do I fix it?

      ┊ XCPNG30 ┊ xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf ┊ DfltDisklessStorPool ┊     0 ┊    1002 ┊ /dev/drbd1002 ┊           ┊ InUse  ┊ Diskless ┊
      ┊ XCPNG31 ┊ xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf ┊ xcp-sr-linstor_group ┊     0 ┊    1002 ┊ /dev/drbd1002 ┊ 20.05 GiB ┊ Unused ┊ UpToDate ┊
      ┊ XCPNG32 ┊ xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf ┊ xcp-sr-linstor_group ┊     0 ┊    1002 ┊ /dev/drbd1002 ┊ 20.05 GiB ┊ Unused ┊ Outdated ┊
      
      ┊ XCPNG30 ┊ COMBINED ┊ 192.168.1.30:3366 (PLAIN) ┊ Online ┊
      ┊ XCPNG31 ┊ COMBINED ┊ 192.168.1.31:3366 (PLAIN) ┊ Online ┊
      ┊ XCPNG32 ┊ COMBINED ┊ 192.168.1.32:3366 (PLAIN) ┊ Online ┊
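
      (A sketch of what could be tried on XCPNG32, assuming drbdadm can see the LINSTOR-generated resource file for this volume - reconnecting the Outdated node normally triggers a resync that clears the flag:)

      drbdsetup status xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf
      drbdadm disconnect xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf
      drbdadm connect xcp-volume-9163fab8-a449-439d-a599-05b8b2fa27bf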
      
      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a In my latest test I created a new VM with multiple disks on XOSTOR. This worked OK and I was able to run and access all the disks.

      However I then tried to remove this VM. After a long period of nothing happening (other than the spinning icon on the remove button) I get an "operation timed out" error and the VM is now shown as paused again.

      vm.delete
      {
        "id": "90613dbb-bd40-8082-c227-a318cbdbd01d"
      }
      {
        "call": {
          "method": "VM.hard_shutdown",
          "params": [
            "OpaqueRef:8aa8abb0-d204-43fd-897f-04425b790e68"
          ]
        },
        "message": "operation timed out",
        "name": "TimeoutError",
        "stack": "TimeoutError: operation timed out
          at Promise.call (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/timeout.js:11:16)
          at Xapi.apply (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/index.js:693:37)
          at Xapi._call (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/limit-concurrency-decorator/src/index.js:85:24)
          at /opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/index.js:771:21
          at loopResolver (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:83:46)
          at Promise._execute (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/debuggability.js:384:9)
          at Promise._resolveFromExecutor (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:518:18)
          at new Promise (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:103:10)
          at loop (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:85:22)
          at retry (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:87:10)
          at Xapi._sessionCall (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/index.js:762:20)
          at Xapi.call (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/index.js:273:14)
          at loopResolver (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:83:46)
          at Promise._execute (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/debuggability.js:384:9)
          at Promise._resolveFromExecutor (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:518:18)
          at new Promise (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:103:10)
          at loop (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:85:22)
          at Xapi.retry (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:87:10)
          at Xapi.call (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/promise-toolbox/retry.js:103:18)
          at Xapi.destroy (/opt/xo/xo-builds/xen-orchestra-202206111352/@xen-orchestra/xapi/vm.js:361:18)
          at Api.callApiMethod (file:///opt/xo/xo-builds/xen-orchestra-202206111352/packages/xo-server/src/xo-mixins/api.mjs:310:20)"
      }
      

      If I try to delete again, the same thing happens.

      All the volumes used by this VM still exist on linstor and linstor shows no errors.

      Now when I try to create any new VM, that also fails with the following error:

      vm.create
      {
        "clone": true,
        "existingDisks": {},
        "installation": {
          "method": "cdrom",
          "repository": "16ead07f-2f23-438f-9010-6f1e6c847e2c"
        },
        "name_label": "testx",
        "template": "d276dc0c-3870-2b7e-64c2-b612bb856227-2cf37285-57bc-4633-a24f-0c6c825dda66",
        "VDIs": [
          {
            "bootable": true,
            "device": "0",
            "size": 23622320128,
            "type": "system",
            "SR": "141d63f6-d3ed-4a2f-588a-1835f0cea588",
            "name_description": "testx_vdi",
            "name_label": "testx_xostor_vdi"
          }
        ],
        "VIFs": [
          {
            "network": "965db545-28a2-5daf-1c90-0ae9a7882bc1",
            "allowedIpv4Addresses": [],
            "allowedIpv6Addresses": []
          }
        ],
        "CPUs": "4",
        "cpusMax": 4,
        "cpuWeight": null,
        "cpuCap": null,
        "name_description": "testx",
        "memory": 4294967296,
        "bootAfterCreate": true,
        "copyHostBiosStrings": false,
        "secureBoot": false,
        "share": false,
        "coreOs": false,
        "tags": [],
        "hvmBootFirmware": "bios"
      }
      {
        "code": "SR_BACKEND_FAILURE_78",
        "params": [
          "",
          "VDI Creation failed [opterr=error Invalid path, current=/dev/drbd1031, expected=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0 (realpath=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0)]",
          ""
        ],
        "call": {
          "method": "VDI.create",
          "params": [
            {
              "name_description": "testx_vdi",
              "name_label": "testx_xostor_vdi",
              "other_config": {},
              "read_only": false,
              "sharable": false,
              "SR": "OpaqueRef:7709e595-7889-4cf1-8980-c04bd145d296",
              "type": "user",
              "virtual_size": 23622320128
            }
          ]
        },
        "message": "SR_BACKEND_FAILURE_78(, VDI Creation failed [opterr=error Invalid path, current=/dev/drbd1031, expected=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0 (realpath=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0)], )",
        "name": "XapiError",
        "stack": "XapiError: SR_BACKEND_FAILURE_78(, VDI Creation failed [opterr=error Invalid path, current=/dev/drbd1031, expected=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0 (realpath=/dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0)], )
          at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/_XapiError.js:16:12)
          at /opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/transports/json-rpc.js:37:27
          at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
          at cb (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:355:42)
          at tryCatcher (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:16:23)
          at Promise._settlePromiseFromHandler (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:547:31)
          at Promise._settlePromise (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:604:18)
          at Promise._settlePromise0 (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:649:10)
          at Promise._settlePromises (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:729:18)
          at _drainQueueStep (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:93:12)
          at _drainQueue (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:86:9)
          at Async._drainQueues (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:102:5)
          at Immediate.Async.drainQueues [as _onImmediate] (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:15:14)
          at processImmediate (internal/timers.js:464:21)
          at process.callbackTrampoline (internal/async_hooks.js:130:17)"
      }
      

      Note /dev/drbd1031 does not exist in /dev/drbd or as a volume.
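
      (A quick sketch for checking this kind of path mismatch - the /dev/drbd/by-res symlinks are created by the DRBD udev rules and should resolve to the /dev/drbdXXXX minor that LINSTOR lists for the volume named in the error:)

      ls -l /dev/drbd/by-res/xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49/0
      linstor volume list | grep xcp-volume-cc55faf8-84a0-431c-a2dc-a618d70e2c49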

      How do I remove the test VM? And how do I fix the issue with creating new VMs?

      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a The volumes are reachable on all nodes:

      [16:13 XCPNG30 ~]# linstor --controllers=192.168.1.30,192.168.1.31,192.168.1.32 resource list | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG30 | 7001 | InUse  | Ok    |   UpToDate | 2022-07-15 20:03:53 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG31 | 7001 | Unused | Ok    |   UpToDate | 2022-07-15 20:03:59 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG32 | 7001 | Unused | Ok    |   Diskless | 2022-07-15 20:03:51 |
      
      [16:12 XCPNG31 ~]# linstor --controllers=192.168.1.30,192.168.1.31,192.168.1.32 resource list | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG30 | 7001 | InUse  | Ok    |   UpToDate | 2022-07-15 20:03:53 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG31 | 7001 | Unused | Ok    |   UpToDate | 2022-07-15 20:03:59 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG32 | 7001 | Unused | Ok    |   Diskless | 2022-07-15 20:03:51 |
      
      [16:14 XCPNG32 ~]# linstor --controllers=192.168.1.30,192.168.1.31,192.168.1.32 resource list | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG30 | 7001 | InUse  | Ok    |   UpToDate | 2022-07-15 20:03:53 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG31 | 7001 | Unused | Ok    |   UpToDate | 2022-07-15 20:03:59 |
      | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | XCPNG32 | 7001 | Unused | Ok    |   Diskless | 2022-07-15 20:03:51 |
      

      Volumes appear to be OK on 2 hosts and not present on the third - although with replication set to 2 I think that is expected?

      [16:14 XCPNG30 ~]# lvs | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
        xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f_00000 linstor_group                                      -wi-ao---- <40.10g                   
      
      [16:15 XCPNG31 ~]# lvs | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
        xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f_00000 linstor_group                                      -wi-ao---- <40.10g                                    
        
      [16:19 XCPNG32 ~]# lvs | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
      No lines
      

      Although when running the lvs command on each host I am getting a lot of warnings about DRBD volumes - these seem to be volumes that were previously deleted but not cleaned up fully:

        /dev/drbd1024: open failed: Wrong medium type
        /dev/drbd1026: open failed: Wrong medium type
        /dev/drbd1028: open failed: Wrong medium type
        /dev/drbd1000: open failed: Wrong medium type
        /dev/drbd1002: open failed: Wrong medium type
        /dev/drbd1012: open failed: Wrong medium type
        /dev/drbd1014: open failed: Wrong medium type
        /dev/drbd1016: open failed: Wrong medium type
        /dev/drbd1018: open failed: Wrong medium type
        /dev/drbd1020: open failed: Wrong medium type
        /dev/drbd1022: open failed: Wrong medium type
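
      (Whatever their origin, the "open failed: Wrong medium type" messages come from LVM trying to scan the /dev/drbd* devices themselves. A hedged sketch of how they can usually be silenced, by excluding DRBD devices from LVM scanning in /etc/lvm/lvm.conf - check for an existing filter before changing anything:)

      devices {
          global_filter = [ "r|/dev/drbd.*|" ]
      }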
      
      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a

      As promised I have done some more "organised" testing, with a brand new cluster set up to test XOSTOR. Early simple tests seemed to be OK and I can create, restart, snapshot, move and delete VMs with no problem.

      But then after putting a VM under load for an hour and then restarting it I am seeing the same weird behaviour I saw previously with XOSTOR.

      Firstly, the VM took far longer than expected to shut down. Then trying to restart the VM fails.

      Let me know if you want me to provide logs or any specific testing.

      vm.start
      {
        "id": "ade699f2-42f0-8629-35ea-6fcc69de99d7",
        "bypassMacAddressesCheck": false,
        "force": false
      }
      {
        "code": "SR_BACKEND_FAILURE_1200",
        "params": [
          "",
          "No such Tapdisk(minor=12)",
          ""
        ],
        "call": {
          "method": "VM.start",
          "params": [
            "OpaqueRef:a60d0553-a2f2-41e6-9df4-fad745fbacc8",
            false,
            false
          ]
        },
        "message": "SR_BACKEND_FAILURE_1200(, No such Tapdisk(minor=12), )",
        "name": "XapiError",
        "stack": "XapiError: SR_BACKEND_FAILURE_1200(, No such Tapdisk(minor=12), )
          at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/_XapiError.js:16:12)
          at /opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/transports/json-rpc.js:37:27
          at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
          at cb (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:355:42)
          at tryCatcher (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:16:23)
          at Promise._settlePromiseFromHandler (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:547:31)
          at Promise._settlePromise (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:604:18)
          at Promise._settlePromise0 (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:649:10)
          at Promise._settlePromises (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:729:18)
          at _drainQueueStep (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:93:12)
          at _drainQueue (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:86:9)
          at Async._drainQueues (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:102:5)
          at Immediate.Async.drainQueues [as _onImmediate] (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:15:14)
          at processImmediate (internal/timers.js:464:21)
          at process.callbackTrampoline (internal/async_hooks.js:130:17)"
      }
      

      The LINSTOR volume for this VM appears to be OK (by the way, I hope eventually we have an easy way to match up VMs to LINSTOR volumes).
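
      (In the meantime, a rough way to map a running VM's disk to its LINSTOR volume, assuming the tapdisk backing path uses the /dev/drbd/by-res/xcp-volume-... device name, as the LINSTOR SM driver errors elsewhere in this thread suggest:)

      tap-ctl list                 # pid, minor and backing path of each running tapdisk
      ls -l /dev/drbd/by-res/      # maps xcp-volume-* resource names to DRBD minor devices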

      [16:53 XCPNG31 ~]# linstor volume list | grep xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f
      | XCPNG30 | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | xcp-sr-linstor_group |     0 |    1001 | /dev/drbd1001 | 40.10 GiB | InUse  |   UpToDate |
      | XCPNG31 | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | xcp-sr-linstor_group |     0 |    1001 | /dev/drbd1001 | 40.10 GiB | Unused |   UpToDate |
      | XCPNG32 | xcp-volume-00b34ae3-2ad3-44ea-aa13-d5de1fbf756f | DfltDisklessStorPool |     0 |    1001 | /dev/drbd1001 |           | Unused |   Diskless |
      
      [16:59 XCPNG31 ~]# linstor node list
      ╭─────────────────────────────────────────────────────────╮
      ┊ Node    ┊ NodeType ┊ Addresses                 ┊ State  ┊
      ╞═════════════════════════════════════════════════════════╡
      ┊ XCPNG30 ┊ COMBINED ┊ 192.168.1.30:3366 (PLAIN) ┊ Online ┊
      ┊ XCPNG31 ┊ COMBINED ┊ 192.168.1.31:3366 (PLAIN) ┊ Online ┊
      ┊ XCPNG32 ┊ COMBINED ┊ 192.168.1.32:3366 (PLAIN) ┊ Online ┊
      ╰─────────────────────────────────────────────────────────╯
      [17:13 XCPNG31 ~]# linstor storage-pool list
      ╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
      ┊ StoragePool          ┊ Node    ┊ Driver   ┊ PoolName      ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName ┊
      ╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
      ┊ DfltDisklessStorPool ┊ XCPNG30 ┊ DISKLESS ┊               ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
      ┊ DfltDisklessStorPool ┊ XCPNG31 ┊ DISKLESS ┊               ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
      ┊ DfltDisklessStorPool ┊ XCPNG32 ┊ DISKLESS ┊               ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
      ┊ xcp-sr-linstor_group ┊ XCPNG30 ┊ LVM      ┊ linstor_group ┊     3.50 TiB ┊      3.64 TiB ┊ False        ┊ Ok    ┊            ┊
      ┊ xcp-sr-linstor_group ┊ XCPNG31 ┊ LVM      ┊ linstor_group ┊     3.49 TiB ┊      3.64 TiB ┊ False        ┊ Ok    ┊            ┊
      ┊ xcp-sr-linstor_group ┊ XCPNG32 ┊ LVM      ┊ linstor_group ┊     3.50 TiB ┊      3.64 TiB ┊ False        ┊ Ok    ┊            ┊
      ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
      
      

      I am seeing the following errors in the SMLog (bb027c9a-5655-4f93-9090-e76b34b2c90d is the disk for this VM)

      Jul 23 16:48:30 XCPNG30 SM: [22961] lock: opening lock file /var/lock/sm/bb027c9a-5655-4f93-9090-e76b34b2c90d/vdi
      Jul 23 16:48:30 XCPNG30 SM: [22961] blktap2.deactivate
      Jul 23 16:48:30 XCPNG30 SM: [22961] lock: acquired /var/lock/sm/bb027c9a-5655-4f93-9090-e76b34b2c90d/vdi
      Jul 23 16:48:30 XCPNG30 SM: [22961] ['/usr/sbin/tap-ctl', 'close', '-p', '19527', '-m', '2', '-t', '30']
      Jul 23 16:49:00 XCPNG30 SM: [22961]  = 5
      Jul 23 16:49:00 XCPNG30 SM: [22961] ***** BLKTAP2:<function _deactivate_locked at 0x7f6fb33208c0>: EXCEPTION <class 'blktap2.CommandFailure'>, ['/usr/sbin/tap-ctl', 'close', '-p', '19527', '-m', '2', '-t', '30'] failed: status=5, pid=22983, errmsg=Input/output error
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 85, in wrapper
      Jul 23 16:49:00 XCPNG30 SM: [22961]     ret = op(self, *args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1744, in _deactivate_locked
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._deactivate(sr_uuid, vdi_uuid, caching_params)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1785, in _deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._tap_deactivate(minor)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1368, in _tap_deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapdisk.shutdown()
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 880, in shutdown
      Jul 23 16:49:00 XCPNG30 SM: [22961]     TapCtl.close(self.pid, self.minor, force)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 433, in close
      Jul 23 16:49:00 XCPNG30 SM: [22961]     cls._pread(args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 296, in _pread
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapctl._wait(quiet)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 285, in _wait
      Jul 23 16:49:00 XCPNG30 SM: [22961]     raise self.CommandFailure(self.cmd, **info)
      Jul 23 16:49:00 XCPNG30 SM: [22961]
      Jul 23 16:49:00 XCPNG30 SM: [22961] lock: released /var/lock/sm/bb027c9a-5655-4f93-9090-e76b34b2c90d/vdi
      Jul 23 16:49:00 XCPNG30 SM: [22961] call-plugin on 7aaaf4a5-0e43-442e-a9b1-38620c87fd69 (linstor-manager:lockVdi with {'groupName': 'linstor_group', 'srUuid': '141d63f6-d3ed-4a2f-588a-1835f0cea588', 'vdiUuid': 'bb027c9a-5655-4f93-9090-e76b34b2c90d', 'locked': 'False'}) returned: True
      Jul 23 16:49:00 XCPNG30 SM: [22961] ***** generic exception: vdi_deactivate: EXCEPTION <class 'blktap2.CommandFailure'>, ['/usr/sbin/tap-ctl', 'close', '-p', '19527', '-m', '2', '-t', '30'] failed: status=5, pid=22983, errmsg=Input/output error
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jul 23 16:49:00 XCPNG30 SM: [22961]     return self._run_locked(sr)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jul 23 16:49:00 XCPNG30 SM: [22961]     rv = self._run(sr, target)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 274, in _run
      Jul 23 16:49:00 XCPNG30 SM: [22961]     caching_params)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1729, in deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     if self._deactivate_locked(sr_uuid, vdi_uuid, caching_params):
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 85, in wrapper
      Jul 23 16:49:00 XCPNG30 SM: [22961]     ret = op(self, *args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1744, in _deactivate_locked
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._deactivate(sr_uuid, vdi_uuid, caching_params)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1785, in _deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._tap_deactivate(minor)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1368, in _tap_deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapdisk.shutdown()
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 880, in shutdown
      Jul 23 16:49:00 XCPNG30 SM: [22961]     TapCtl.close(self.pid, self.minor, force)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 433, in close
      Jul 23 16:49:00 XCPNG30 SM: [22961]     cls._pread(args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 296, in _pread
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapctl._wait(quiet)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 285, in _wait
      Jul 23 16:49:00 XCPNG30 SM: [22961]     raise self.CommandFailure(self.cmd, **info)
      Jul 23 16:49:00 XCPNG30 SM: [22961]
      Jul 23 16:49:00 XCPNG30 SM: [22961] ***** LINSTOR resources on XCP-ng: EXCEPTION <class 'blktap2.CommandFailure'>, ['/usr/sbin/tap-ctl', 'close', '-p', '19527', '-m', '2', '-t', '30'] failed: status=5, pid=22983, errmsg=Input/output error
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 378, in run
      Jul 23 16:49:00 XCPNG30 SM: [22961]     ret = cmd.run(sr)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jul 23 16:49:00 XCPNG30 SM: [22961]     return self._run_locked(sr)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jul 23 16:49:00 XCPNG30 SM: [22961]     rv = self._run(sr, target)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/SRCommand.py", line 274, in _run
      Jul 23 16:49:00 XCPNG30 SM: [22961]     caching_params)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1729, in deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     if self._deactivate_locked(sr_uuid, vdi_uuid, caching_params):
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 85, in wrapper
      Jul 23 16:49:00 XCPNG30 SM: [22961]     ret = op(self, *args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1744, in _deactivate_locked
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._deactivate(sr_uuid, vdi_uuid, caching_params)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1785, in _deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     self._tap_deactivate(minor)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 1368, in _tap_deactivate
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapdisk.shutdown()
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 880, in shutdown
      Jul 23 16:49:00 XCPNG30 SM: [22961]     TapCtl.close(self.pid, self.minor, force)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 433, in close
      Jul 23 16:49:00 XCPNG30 SM: [22961]     cls._pread(args)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 296, in _pread
      Jul 23 16:49:00 XCPNG30 SM: [22961]     tapctl._wait(quiet)
      Jul 23 16:49:00 XCPNG30 SM: [22961]   File "/opt/xensource/sm/blktap2.py", line 285, in _wait
      Jul 23 16:49:00 XCPNG30 SM: [22961]     raise self.CommandFailure(self.cmd, **info)
      Jul 23 16:49:00 XCPNG30 SM: [22961]
      

      If I try to start this VM after the above issue I get this error:

      vm.start
      {
        "id": "ade699f2-42f0-8629-35ea-6fcc69de99d7",
        "bypassMacAddressesCheck": false,
        "force": false
      }
      {
        "code": "FAILED_TO_START_EMULATOR",
        "params": [
          "OpaqueRef:a60d0553-a2f2-41e6-9df4-fad745fbacc8",
          "domid 12",
          "QMP failure at File \"xc/device.ml\", line 3366, characters 71-78"
        ],
        "call": {
          "method": "VM.start",
          "params": [
            "OpaqueRef:a60d0553-a2f2-41e6-9df4-fad745fbacc8",
            false,
            false
          ]
        },
        "message": "FAILED_TO_START_EMULATOR(OpaqueRef:a60d0553-a2f2-41e6-9df4-fad745fbacc8, domid 12, QMP failure at File \"xc/device.ml\", line 3366, characters 71-78)",
        "name": "XapiError",
        "stack": "XapiError: FAILED_TO_START_EMULATOR(OpaqueRef:a60d0553-a2f2-41e6-9df4-fad745fbacc8, domid 12, QMP failure at File \"xc/device.ml\", line 3366, characters 71-78)
          at Function.wrap (/opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/_XapiError.js:16:12)
          at /opt/xo/xo-builds/xen-orchestra-202206111352/packages/xen-api/src/transports/json-rpc.js:37:27
          at AsyncResource.runInAsyncScope (async_hooks.js:197:9)
          at cb (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:355:42)
          at tryCatcher (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/util.js:16:23)
          at Promise._settlePromiseFromHandler (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:547:31)
          at Promise._settlePromise (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:604:18)
          at Promise._settlePromise0 (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:649:10)
          at Promise._settlePromises (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/promise.js:729:18)
          at _drainQueueStep (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:93:12)
          at _drainQueue (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:86:9)
          at Async._drainQueues (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:102:5)
          at Immediate.Async.drainQueues [as _onImmediate] (/opt/xo/xo-builds/xen-orchestra-202206111352/node_modules/bluebird/js/release/async.js:15:14)
          at processImmediate (internal/timers.js:464:21)
          at process.callbackTrampoline (internal/async_hooks.js:130:17)"
      }
      

      This VM is set up the same way as several other VMs I have running on another cluster but using SRs on NFS mounts.

      This is with the latest XCP-ng version, all servers patched and up to date.

      yum update
      ...
      No packages marked for update
      
      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a said in XOSTOR hyperconvergence preview:

      Yes, you can. Please take a look to: https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/#s-managing_network_interface_cards
      To get the storage pool name, execute this command in your pool:

      OK - so to set up this XOSTOR SR I would first

      xe sr-create type=linstor name-label=XOSTOR01 host-uuid=xxx device-config:hosts=<XCPNG Host Names> ...etc...

      to create the Linstor storage-pool. Then

      linstor storage-pool list

      to get the name of the pool. Then for each node found in device-config:hosts=<XCPNG Host Names> run the following command:

      linstor storage-pool set-property <host/node_name> <pool_name> PrefNic <nic_name>

      where nic_name is the name of the Linstor interface created for the specific NIC.
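
      As a concrete sketch with the node and storage-pool names used elsewhere in this thread (the interface name and address are only example values for a dedicated storage NIC):

      linstor node interface create XCPNG30 storage_nic 192.168.2.30
      linstor storage-pool set-property XCPNG30 xcp-sr-linstor_group PrefNic storage_nic
      # ...and the same for XCPNG31 and XCPNG32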

      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      Doing some more testing on XOSTOR and starting from scratch again. A brand-new XCP-ng installation made onto a new 3-server pool, each server with a blank 4TB disk.
      All servers have XCP-ng patched up to date.
      Then I installed XOSTOR onto each of these servers; LINSTOR installed OK and I have the linstor_group volume group on each.

      [19:22 XCPNG30 ~]# vgs
        VG                                                 #PV #LV #SN Attr   VSize    VFree
        VG_XenStorage-a776b6b1-9a96-e179-ea12-f2419ae512b6   1   1   0 wz--n- <405.62g 405.61g
        linstor_group                                        1   0   0 wz--n-   <3.64t  <3.64t
      [19:22 XCPNG30 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
      xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
      sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      
      [19:21 XCPNG31 ~]# vgs
        VG                                                 #PV #LV #SN Attr   VSize    VFree
        VG_XenStorage-f75785ef-df30-b54c-2af4-84d19c966453   1   1   0 wz--n- <405.62g 405.61g
        linstor_group                                        1   0   0 wz--n-   <3.64t  <3.64t
      [19:21 XCPNG31 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
      xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
      sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      
      [19:23 XCPNG32 ~]# vgs
        VG                                                 #PV #LV #SN Attr   VSize    VFree
        VG_XenStorage-abaf8356-fc58-9124-a23b-c29e7e67c983   1   1   0 wz--n- <405.62g 405.61g
        linstor_group                                        1   0   0 wz--n-   <3.64t  <3.64t
      [19:23 XCPNG32 ~]# rpm -qa | grep -E "^(sm|xha)-.*linstor.*"
      xha-10.1.0-2.2.0.linstor.2.xcpng8.2.x86_64
      sm-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      sm-rawhba-2.30.7-1.3.0.linstor.1.xcpng8.2.x86_64
      
      [19:26 XCPNG31 ~]# xe host-list
      uuid ( RO)                : 7c3f2fae-0456-4155-a9ad-43790fcb4155
                name-label ( RW): XCPNG32
          name-description ( RW): Default install
      
      uuid ( RO)                : 2e48b46a-c420-4957-9233-3e029ea39305
                name-label ( RW): XCPNG30
          name-description ( RW): Default install
      
      uuid ( RO)                : 7aaaf4a5-0e43-442e-a9b1-38620c87fd69
                name-label ( RW): XCPNG31
          name-description ( RW): Default install
      

      But I am not able to create the SR.

      xe sr-create type=linstor name-label=XOSTOR01 host-uuid=7aaaf4a5-0e43-442e-a9b1-38620c87fd69 device-config:hosts=xcpng30,xcpng31,xcpng32 device-config:group-name=linstor_group device-config:redundancy=2 shared=true device-config:provisioning=thick
      

      This gives the following error:

      Error code: SR_BACKEND_FAILURE_5006
      Error parameters: , LINSTOR SR creation error [opterr=Not enough online hosts],
      

      Here's the error in the SMLog

      Jul 15 19:29:22 XCPNG31 SM: [9747] sr_create {'sr_uuid': '14aa2b8b-430f-34e5-fb74-c37667cb18ec', 'subtask_of': 'DummyRef:|d39839f1-ee3a-4bfe-8a41-7a077f4f2640|SR.create', 'args': ['0'], 'host_ref': 'OpaqueRef:196f738d-24fa-4598-8e96-4a13390abc87', 'session_ref': 'OpaqueRef:e806b347-1e5f-4644-842f-26a7b06b2561', 'device_config': {'group-name': 'linstor_group', 'redundancy': '2', 'hosts': 'xcpng30,xcpng31,xcpng32', 'SRmaster': 'true', 'provisioning': 'thick'}, 'command': 'sr_create', 'sr_ref': 'OpaqueRef:7ded7feb-729f-47c3-9893-1b62db0b7e17'}
      Jul 15 19:29:22 XCPNG31 SM: [9747] LinstorSR.create for 14aa2b8b-430f-34e5-fb74-c37667cb18ec
      Jul 15 19:29:22 XCPNG31 SM: [9747] Raising exception [5006, LINSTOR SR creation error [opterr=Not enough online hosts]]
      Jul 15 19:29:22 XCPNG31 SM: [9747] lock: released /var/lock/sm/14aa2b8b-430f-34e5-fb74-c37667cb18ec/sr
      Jul 15 19:29:22 XCPNG31 SM: [9747] ***** generic exception: sr_create: EXCEPTION <class 'SR.SROSError'>, LINSTOR SR creation error [opterr=Not enough online hosts]
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return self._run_locked(sr)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jul 15 19:29:22 XCPNG31 SM: [9747]     rv = self._run(sr, target)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 612, in wrap
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return load(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 597, in load
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return wrapped_method(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 443, in wrapped_method
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return method(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 688, in create
      Jul 15 19:29:22 XCPNG31 SM: [9747]     opterr='Not enough online hosts'
      Jul 15 19:29:22 XCPNG31 SM: [9747]
      Jul 15 19:29:22 XCPNG31 SM: [9747] ***** LINSTOR resources on XCP-ng: EXCEPTION <class 'SR.SROSError'>, LINSTOR SR creation error [opterr=Not enough online hosts]
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 378, in run
      Jul 15 19:29:22 XCPNG31 SM: [9747]     ret = cmd.run(sr)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return self._run_locked(sr)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jul 15 19:29:22 XCPNG31 SM: [9747]     rv = self._run(sr, target)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/SRCommand.py", line 323, in _run
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return sr.create(self.params['sr_uuid'], long(self.params['args'][0]))
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 612, in wrap
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return load(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 597, in load
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return wrapped_method(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 443, in wrapped_method
      Jul 15 19:29:22 XCPNG31 SM: [9747]     return method(self, *args, **kwargs)
      Jul 15 19:29:22 XCPNG31 SM: [9747]   File "/opt/xensource/sm/LinstorSR", line 688, in create
      Jul 15 19:29:22 XCPNG31 SM: [9747]     opterr='Not enough online hosts'
      

      I have found the issue - the device-config:hosts list is case-sensitive: if the hosts are given in lower-case, the above error occurs. Specifying the hosts in upper-case works.

      Also using a fully-qualified name for the host fails - regardless of the case used.
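
      For reference, this should be the working form of the command (host names in upper-case, matching the name-labels from xe host-list above):

      xe sr-create type=linstor name-label=XOSTOR01 host-uuid=7aaaf4a5-0e43-442e-a9b1-38620c87fd69 device-config:hosts=XCPNG30,XCPNG31,XCPNG32 device-config:group-name=linstor_group device-config:redundancy=2 shared=true device-config:provisioning=thick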

      posted in XOSTOR
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      Is it possible to use a separate network for XOSTOR/LINSTOR disk replication, distinct from the "main" network used for the XCP-ng servers?

      If so when the SR is created with this command:

      xe sr-create type=linstor name-label=XOSTOR host-uuid=bc3cd3af-3f09-48cf-ae55-515ba21930f5 device-config:hosts=host-a,host-b,host-c,host-d device-config:group-name=linstor_group/thin_device device-config:redundancy=4 shared=true device-config:provisioning=thin
      

      Do the device-config:hosts need to be XCP-ng host names - or can IP addresses on the "data-replication" network be provided here?

      For example, my XCP-ng servers have dual NICs; I could put the second NIC on a private network/switch with a different subnet from the "main" network and use it solely for XOSTOR/LINSTOR disk replication. Is this possible?

      posted in XOSTOR
      geoffbland
    • RE: XO 5.72 Storage Maintenance

      Could you install your XO VM on a completely different host/hypervisor - for example, on Hyper-V on your PC?

      I have one XO VM running on my XCP-ng pool and another XO VM running on an UNRAID server - just for these kinds of situations.

      posted in Xen Orchestra
      geoffbland
    • RE: XCP / XO and Truenas Scale or Core

      @mauzilla said in XCP / XO and Truenas Scale or Core:

      I opted to install Scale today but as murphy would have it, I did not know of the caveat of raid controller support only to find this just as I was about to setup the NFS mount. The servers I am using is Dell R720DX with H710 raid controller, and my understanding is that it's a no-go due ZFS and performance impact / possibility of data loss when using a raid controller. It's a bit of a logistical nightmare as I also don't feel comfortable in flashing vendor hardware with non-vendor firmware (just doesn't feel right for something meant to be in production) and I am 800km away from the cabinet so i've opted to take a chance until I find a suitable HBA card

      FWIW I too have an R720 with an H710P RAID card. I was able to follow this guide https://fohdeesha.com/docs/perc.html to flash the card into "IT mode" and access the disks as HBA. It worked flawlessly and was fairly easy and quick to do. You can flash it back to RAID if needed.

      My tests showed TrueNAS Scale worked well with the H710P in IT mode, seeing all the individual disks with no issue.

      Eventually I decided to drop back to TrueNAS Core due to issues unrelated to the controller or disks.

      posted in Xen Orchestra
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      All the access rights to the ISOs looked OK on the remote server, but just to check I wiped the share, recreated it and copied back the ISOs. I then recreated the SR in XCP-ng and now it works. It had all been working fine and I had not changed anything on the share - it just stopped working. So please consider this fixed now - it looks like some weird issue with the mount and not a problem with XCP-ng - sorry for wasting your time looking at this.

      posted in Xen Orchestra
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      @olivierlambert said in Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.:

      Try to create a file in there instead of just listing 😉

      I have access....

      [18:13 XCPNG01 ec87c10e-1499-c1c5-cf3f-c234062bb459]# pwd
      /var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459
      [18:13 XCPNG01 ec87c10e-1499-c1c5-cf3f-c234062bb459]# touch new_file
      [18:14 XCPNG01 ec87c10e-1499-c1c5-cf3f-c234062bb459]# ll new_file
      -rw-r----- 1 nfsnobody nfsnobody 0 Jun 13 18:14 new_file
      

      Why is R/W access needed on the ISO SR?
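
      (One thing that might be worth ruling out: the nfsnobody ownership above suggests the export is root-squashing, and a squashed root cannot read an ISO that is -rwxrwx--- root:users as shown in my other post in this thread. A hedged sketch of the two obvious workarounds - the export line is only an example:)

      # on the NFS server, export without root squashing, e.g. in /etc/exports:
      #   /mnt/user/isos  <xcp-ng-hosts>(rw,sync,no_root_squash)
      # or make the ISO readable by the squashed user:
      chmod o+r /var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459/ubuntu-22.04-live-server-amd64.iso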

      posted in Xen Orchestra
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @yrrips said in XOSTOR hyperconvergence preview:

      when it came to enabling HA I ran into trouble. Is XOSTOR not a valid shared storage target to enable HA on?

      I'm also really hoping that XOSTOR works well, and I also feel like this is something XCP-ng really needs. I've tried other distributed storage solutions, notably GlusterFS, but never found anything that really works 100% when outages occur.

      Note I plan to use XOSTOR just for VM disks and any data they use - not for the HA share and not for backups. My logic is that:

      • backups should not be anywhere near XCP-ng and should be isolated in software and hardware - and hopefully location-wise too.
      • the HA share needs to survive an issue where XCP-ng HA fails; if XCP-ng quorum is not working then XOSTOR (LINSTOR) quorum may be affected in a similar way - so I keep my HA share on a reliable NFS share. Note that in testing I found that if the HA share is not available for a while, XCP-ng stays running OK (just don't make any configuration changes until the HA share is back).
      posted in XOSTOR
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      @ronan-a said in Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.:

      There is this error during the tapdisk open call: Permission denied (errno 13).
      Are you sure you can access correctly to the data of your SR?

      I'm pretty sure I can...

      [11:42 XCPNG01 ~]# whoami
      root
      [11:42 XCPNG01 ~]# ll /var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459/ubuntu-22.04-live-server-amd64.iso
      -rwxrwx--- 1 root users 1466714112 Apr 21 19:20 /var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459/ubuntu-22.04-live-server-amd64.iso
      
      posted in Xen Orchestra
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      @olivierlambert Is it possible to roll back to a "standard" version of XCP-ng on these servers? I did a quick search to see if there was a way to roll back to a given XCP-ng version but did not find anything specific.

      posted in Xen Orchestra
      geoffbland
    • RE: XOSTOR hyperconvergence preview

      @ronan-a Could you please take a look at this issue I raised elsewhere on the forums.

      I am currently unable to create new VMs, getting a "No such Tapdisk" error. Checking down the stack trace, it seems to be coming from a get() call in /opt/xensource/sm/blktap2.py, and this code seems to have been changed in a XOSTOR release from 24th May.

      posted in XOSTOR
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      Checking the code of blktap2.py on GitHub against what I have on my system, I can see that they are subtly different.

      I note the last change to blktap2.py on GitHub was 6th April 2020. I installed XCP-ng in November last year, but I had recently been testing with the new XOSTOR. Checking the date of my local blktap2.py, I find it is dated 24th May 2022, so possibly this is a bug with the latest XOSTOR release?
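
      (A rough way to compare the installed file with upstream - the repository path and branch here are assumptions, since the XCP-ng sm package may be built from a different branch or fork:)

      rpm -qf /opt/xensource/sm/blktap2.py        # which sm package owns the installed file
      curl -s https://raw.githubusercontent.com/xapi-project/sm/master/drivers/blktap2.py -o /tmp/blktap2.py.upstream
      diff -u /tmp/blktap2.py.upstream /opt/xensource/sm/blktap2.py | less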

      posted in Xen Orchestra
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      @Danp said in Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.:

      Have you checked SMlog?

      Creating a new VM now and checking the SMLog, I see a similar error about the tapdisk (the VM was created on the master host of the pool): ***** generic exception: vdi_attach: EXCEPTION <class 'blktap2.TapdiskNotRunning'>, No such Tapdisk(minor=5)

      Full log:

      Jun  8 22:39:00 XCPNG02 SM: [11278] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11278] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11278] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:00 XCPNG02 SM: [11278]   pread SUCCESS
      Jun  8 22:39:00 XCPNG02 SM: [11278] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11278] vdi_epoch_begin {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|667b7d40-d2f2-46a6-aedd-b7faf50894f8|VDI.epoch_begin', 'vdi_ref': 'OpaqueRef:e7f286ca-bfe4-4cfe-8ae6-20a5c589ae2e', 'vdi_on_boot': 'persist', 'args': [], 'o_direct': False, 'vdi_location': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:4e700825-f840-4e11-9c8b-eb08ade578a5', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_epoch_begin', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f'}
      Jun  8 22:39:00 XCPNG02 SM: [11305] vdi_epoch_begin {'sr_uuid': 'ec87c10e-1499-c1c5-cf3f-c234062bb459', 'subtask_of': 'DummyRef:|362ae06a-b4bc-412a-babf-762ff2cda1f0|VDI.epoch_begin', 'vdi_ref': 'OpaqueRef:e2db345c-54ef-47c0-ad1f-6f9ea27f6e4a', 'vdi_on_boot': 'persist', 'args': [], 'vdi_location': 'ubuntu-22.04-live-server-amd64.iso', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:f1407219-7293-44b2-9e3b-a1a03cf30416', 'device_config': {'SRmaster': 'true', 'location': 'UNRAID01.NEWT.newtcomputing.com:/mnt/user/isos'}, 'command': 'vdi_epoch_begin', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:9c3de093-8c48-4591-ac5d-dae913519937', 'vdi_uuid': '56e01d87-0eb5-4f03-b916-b74484360738'}
      Jun  8 22:39:00 XCPNG02 SM: [11322] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11322] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11322] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:00 XCPNG02 SM: [11322]   pread SUCCESS
      Jun  8 22:39:00 XCPNG02 SM: [11322] vdi_attach {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|cc3f614f-68af-4c5b-a9fa-2dcc4ccdbab1|VDI.attach2', 'vdi_ref': 'OpaqueRef:e7f286ca-bfe4-4cfe-8ae6-20a5c589ae2e', 'vdi_on_boot': 'persist', 'args': ['true'], 'o_direct': False, 'vdi_location': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:cfd26e8a-b81b-4099-ab0e-a60dd825cc1c', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_attach', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f'}
      Jun  8 22:39:00 XCPNG02 SM: [11322] lock: opening lock file /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:00 XCPNG02 SM: [11322] <__main__.NFSFileVDI object at 0x7fea6672d150>
      Jun  8 22:39:00 XCPNG02 SM: [11322] result: {'params_nbd': 'nbd:unix:/run/blktap-control/nbd/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'o_direct_reason': 'NO_RO_IMAGE', 'params': '/dev/sm/backend/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'o_direct': True, 'xenstore_data': {'scsi/0x12/0x80': 'AIAAEjFkNTY1NGM1LTBmZTQtNGEgIA==', 'scsi/0x12/0x83': 'AIMAMQIBAC1YRU5TUkMgIDFkNTY1NGM1LTBmZTQtNGEwNy1iNWZiLTI5YjU4OTIyODcwZiA=', 'vdi-uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'mem-pool': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed'}}
      Jun  8 22:39:00 XCPNG02 SM: [11322] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:00 XCPNG02 SM: [11353]   pread SUCCESS
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:00 XCPNG02 SM: [11353] vdi_activate {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|07b91a8d-5c31-4d6e-aca7-b8e3679dbc69|VDI.activate', 'vdi_ref': 'OpaqueRef:e7f286ca-bfe4-4cfe-8ae6-20a5c589ae2e', 'vdi_on_boot': 'persist', 'args': ['true'], 'o_direct': False, 'vdi_location': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:91fe63d1-9495-4ec6-bf3a-27a745a0561d', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_activate', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f'}
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: opening lock file /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:00 XCPNG02 SM: [11353] blktap2.activate
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: acquired /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:00 XCPNG02 SM: [11353] Adding tag to: 1d5654c5-0fe4-4a07-b5fb-29b58922870f
      Jun  8 22:39:00 XCPNG02 SM: [11353] Activate lock succeeded
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:00 XCPNG02 SM: [11353]   pread SUCCESS
      Jun  8 22:39:00 XCPNG02 SM: [11353] PhyLink(/dev/sm/phy/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f) -> /var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd
      Jun  8 22:39:00 XCPNG02 SM: [11353] <NFSSR.NFSFileVDI object at 0x7ff0bb6b9610>
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/tap-ctl', 'allocate']
      Jun  8 22:39:00 XCPNG02 SM: [11353]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/tap-ctl', 'spawn']
      Jun  8 22:39:00 XCPNG02 SM: [11353]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/tap-ctl', 'attach', '-p', '11418', '-m', '4']
      Jun  8 22:39:00 XCPNG02 SM: [11353]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/tap-ctl', 'open', '-p', '11418', '-m', '4', '-a', 'vhd:/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd', '-t', '40']
      Jun  8 22:39:00 XCPNG02 SM: [11353]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11353] ['/usr/sbin/tap-ctl', 'open', '-p', '11418', '-m', '4', '-a', 'vhd:/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd', '-t', '40']
      Jun  8 22:39:00 XCPNG02 SM: [11353]  = 114
      Jun  8 22:39:00 XCPNG02 SM: [11353] Set scheduler to [noop] on [/sys/dev/block/254:4]
      Jun  8 22:39:00 XCPNG02 SM: [11353] tap.activate: Launched Tapdisk(vhd:/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd, pid=11418, minor=4, state=R)
      Jun  8 22:39:00 XCPNG02 SM: [11353] DeviceNode(/dev/sm/backend/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f) -> /dev/xen/blktap-2/tapdev4
      Jun  8 22:39:00 XCPNG02 SM: [11353] NBDLink(/run/blktap-control/nbd/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f) -> /run/blktap-control/nbd11418.4
      Jun  8 22:39:00 XCPNG02 SM: [11353] lock: released /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:00 XCPNG02 SM: [11473] vdi_attach {'sr_uuid': 'ec87c10e-1499-c1c5-cf3f-c234062bb459', 'subtask_of': 'DummyRef:|f3f51060-a443-4a91-8eb9-e9b9e1b22a51|VDI.attach2', 'vdi_ref': 'OpaqueRef:e2db345c-54ef-47c0-ad1f-6f9ea27f6e4a', 'vdi_on_boot': 'persist', 'args': ['false'], 'vdi_location': 'ubuntu-22.04-live-server-amd64.iso', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:987a7a78-b617-4d8a-8d3d-68e565aca8aa', 'device_config': {'SRmaster': 'true', 'location': 'UNRAID01.NEWT.newtcomputing.com:/mnt/user/isos'}, 'command': 'vdi_attach', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:9c3de093-8c48-4591-ac5d-dae913519937', 'vdi_uuid': '56e01d87-0eb5-4f03-b916-b74484360738'}
      Jun  8 22:39:00 XCPNG02 SM: [11473] lock: opening lock file /var/lock/sm/56e01d87-0eb5-4f03-b916-b74484360738/vdi
      Jun  8 22:39:00 XCPNG02 SM: [11473] Attach & activate
      Jun  8 22:39:00 XCPNG02 SM: [11473] PhyLink(/dev/sm/phy/ec87c10e-1499-c1c5-cf3f-c234062bb459/56e01d87-0eb5-4f03-b916-b74484360738) -> /var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459/ubuntu-22.04-live-server-amd64.iso
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'allocate']
      Jun  8 22:39:00 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'spawn']
      Jun  8 22:39:00 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'attach', '-p', '11505', '-m', '5']
      Jun  8 22:39:00 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'open', '-p', '11505', '-m', '5', '-a', 'aio:/var/run/sr-mount/ec87c10e-1499-c1c5-cf3f-c234062bb459/ubuntu-22.04-live-server-amd64.iso', '-R']
      Jun  8 22:39:00 XCPNG02 SM: [11473]  = 13
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'close', '-p', '11505', '-m', '5', '-t', '30']
      Jun  8 22:39:00 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:00 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'detach', '-p', '11505', '-m', '5']
      Jun  8 22:39:01 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:01 XCPNG02 SM: [11473] ['/usr/sbin/tap-ctl', 'free', '-m', '5']
      Jun  8 22:39:01 XCPNG02 SM: [11473]  = 0
      Jun  8 22:39:01 XCPNG02 SM: [11473] ***** generic exception: vdi_attach: EXCEPTION <class 'blktap2.TapdiskNotRunning'>, No such Tapdisk(minor=5)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return self._run_locked(sr)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jun  8 22:39:01 XCPNG02 SM: [11473]     rv = self._run(sr, target)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 247, in _run
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return target.attach(self.params['sr_uuid'], self.vdi_uuid, writable, caching_params = caching_params)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1557, in attach
      Jun  8 22:39:01 XCPNG02 SM: [11473]     {"rdonly": not writable})
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1710, in _activate
      Jun  8 22:39:01 XCPNG02 SM: [11473]     self._get_pool_config(sr_uuid).get("mem-pool-size"))
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1346, in _tap_activate
      Jun  8 22:39:01 XCPNG02 SM: [11473]     options)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 838, in launch_on_tap
      Jun  8 22:39:01 XCPNG02 SM: [11473]     tapdisk = cls.__from_blktap(blktap)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 749, in __from_blktap
      Jun  8 22:39:01 XCPNG02 SM: [11473]     tapdisk = cls.from_minor(minor=blktap.minor)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 745, in from_minor
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return cls.get(minor=minor)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 735, in get
      Jun  8 22:39:01 XCPNG02 SM: [11473]     raise TapdiskNotRunning(**attrs)
      Jun  8 22:39:01 XCPNG02 SM: [11473]
      Jun  8 22:39:01 XCPNG02 SM: [11473] ***** ISO: EXCEPTION <class 'blktap2.TapdiskNotRunning'>, No such Tapdisk(minor=5)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 378, in run
      Jun  8 22:39:01 XCPNG02 SM: [11473]     ret = cmd.run(sr)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return self._run_locked(sr)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 159, in _run_locked
      Jun  8 22:39:01 XCPNG02 SM: [11473]     rv = self._run(sr, target)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/SRCommand.py", line 247, in _run
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return target.attach(self.params['sr_uuid'], self.vdi_uuid, writable, caching_params = caching_params)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1557, in attach
      Jun  8 22:39:01 XCPNG02 SM: [11473]     {"rdonly": not writable})
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1710, in _activate
      Jun  8 22:39:01 XCPNG02 SM: [11473]     self._get_pool_config(sr_uuid).get("mem-pool-size"))
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 1346, in _tap_activate
      Jun  8 22:39:01 XCPNG02 SM: [11473]     options)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 838, in launch_on_tap
      Jun  8 22:39:01 XCPNG02 SM: [11473]     tapdisk = cls.__from_blktap(blktap)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 749, in __from_blktap
      Jun  8 22:39:01 XCPNG02 SM: [11473]     tapdisk = cls.from_minor(minor=blktap.minor)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 745, in from_minor
      Jun  8 22:39:01 XCPNG02 SM: [11473]     return cls.get(minor=minor)
      Jun  8 22:39:01 XCPNG02 SM: [11473]   File "/opt/xensource/sm/blktap2.py", line 735, in get
      Jun  8 22:39:01 XCPNG02 SM: [11473]     raise TapdiskNotRunning(**attrs)
      Jun  8 22:39:01 XCPNG02 SM: [11473]
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:01 XCPNG02 SM: [11551] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:01 XCPNG02 SM: [11551]   pread SUCCESS
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:01 XCPNG02 SM: [11551] vdi_deactivate {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|f500f41d-7c07-44fb-9845-732f9c3f0f6a|VDI.deactivate', 'vdi_ref': 'OpaqueRef:e7f286ca-bfe4-4cfe-8ae6-20a5c589ae2e', 'vdi_on_boot': 'persist', 'args': [], 'o_direct': False, 'vdi_location': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:7d3510c2-c1fc-415b-b154-64da3af76a16', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_deactivate', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f'}
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: opening lock file /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:01 XCPNG02 SM: [11551] blktap2.deactivate
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: acquired /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:01 XCPNG02 SM: [11551] ['/usr/sbin/tap-ctl', 'close', '-p', '11418', '-m', '4', '-t', '30']
      Jun  8 22:39:01 XCPNG02 SM: [11551]  = 0
      Jun  8 22:39:01 XCPNG02 SM: [11551] ['/usr/sbin/tap-ctl', 'detach', '-p', '11418', '-m', '4']
      Jun  8 22:39:01 XCPNG02 SM: [11551]  = 0
      Jun  8 22:39:01 XCPNG02 SM: [11551] ['/usr/sbin/tap-ctl', 'free', '-m', '4']
      Jun  8 22:39:01 XCPNG02 SM: [11551]  = 0
      Jun  8 22:39:01 XCPNG02 SM: [11551] tap.deactivate: Shut down Tapdisk(vhd:/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd, pid=11418, minor=4, state=R)
      Jun  8 22:39:01 XCPNG02 SM: [11551] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:01 XCPNG02 SM: [11551]   pread SUCCESS
      Jun  8 22:39:01 XCPNG02 SM: [11551] Removed host key host_OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190 for 1d5654c5-0fe4-4a07-b5fb-29b58922870f
      Jun  8 22:39:01 XCPNG02 SM: [11551] lock: released /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:01 XCPNG02 SM: [11615] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:01 XCPNG02 SM: [11615] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:39:01 XCPNG02 SM: [11615] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/1d5654c5-0fe4-4a07-b5fb-29b58922870f.vhd']
      Jun  8 22:39:01 XCPNG02 SM: [11615]   pread SUCCESS
      Jun  8 22:39:01 XCPNG02 SM: [11615] vdi_detach {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|430cf35c-c0d7-43e4-bc89-b014d3bf7754|VDI.detach', 'vdi_ref': 'OpaqueRef:e7f286ca-bfe4-4cfe-8ae6-20a5c589ae2e', 'vdi_on_boot': 'persist', 'args': [], 'o_direct': False, 'vdi_location': '1d5654c5-0fe4-4a07-b5fb-29b58922870f', 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:f17eba69-d8c9-4dcf-ab26-6c8c7dba380c', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_detach', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_uuid': '1d5654c5-0fe4-4a07-b5fb-29b58922870f'}
      Jun  8 22:39:01 XCPNG02 SM: [11615] lock: opening lock file /var/lock/sm/1d5654c5-0fe4-4a07-b5fb-29b58922870f/vdi
      Jun  8 22:39:01 XCPNG02 SM: [11615] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:40:44 XCPNG02 SM: [12548] ['uuidgen', '-r']
      Jun  8 22:40:44 XCPNG02 SM: [12548]   pread SUCCESS
      Jun  8 22:40:44 XCPNG02 SM: [12548] lock: opening lock file /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:40:44 XCPNG02 SM: [12548] lock: acquired /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      Jun  8 22:40:44 XCPNG02 SM: [12548] vdi_create {'sr_uuid': 'bc2687ec-0cdf-03ed-7f90-e58edad07fed', 'subtask_of': 'DummyRef:|96f661f1-77c5-4d4d-ab2b-243d0de50732|VDI.create', 'vdi_type': 'user', 'args': ['30064771072', 'test08jun_vdi', 'test08jun', '', 'false', '19700101T00:00:00Z', '', 'false'], 'o_direct': False, 'host_ref': 'OpaqueRef:a1e9a8f3-0a79-4824-b29f-d81b3246d190', 'session_ref': 'OpaqueRef:7b64b6b4-4d66-4746-b75e-f13c086da122', 'device_config': {'SRmaster': 'true', 'serverpath': '/mnt/Pool01/Remote_VM_Images/xcpng_vm_images', 'server': 'TNC01.NEWT.newtcomputing.com'}, 'command': 'vdi_create', 'sr_ref': 'OpaqueRef:b1f916b2-51fb-4726-bcbf-49ab05b0cf50', 'vdi_sm_config': {}}
      Jun  8 22:40:44 XCPNG02 SM: [12548] ['/usr/sbin/td-util', 'create', 'vhd', '28672', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/604b51ed-53ea-45cf-a110-f4a51ec6101a.vhd']
      Jun  8 22:40:44 XCPNG02 SM: [12548]   pread SUCCESS
      Jun  8 22:40:44 XCPNG02 SM: [12548] ['/usr/sbin/td-util', 'query', 'vhd', '-v', '/var/run/sr-mount/bc2687ec-0cdf-03ed-7f90-e58edad07fed/604b51ed-53ea-45cf-a110-f4a51ec6101a.vhd']
      Jun  8 22:40:44 XCPNG02 SM: [12548]   pread SUCCESS
      Jun  8 22:40:44 XCPNG02 SM: [12548] lock: released /var/lock/sm/bc2687ec-0cdf-03ed-7f90-e58edad07fed/sr
      
      
      posted in Xen Orchestra
      G
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      I rebooted the server on which those I/O errors were occurring, and that seems to have stopped them. Yet the
      SR_BACKEND_FAILURE_1200, No such Tapdisk error is still occurring and I still cannot create new VMs.

      Any ideas anyone?
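
      In case it helps anyone diagnose this, below is a rough sketch of the checks I have been running from dom0 to see whether a tapdisk really exists for the minor that xapi complains about, and to pull the matching SM-side trace (this assumes the default /var/log/SMlog location and that tap-ctl is available on the host):

      # List every tapdisk currently running on this host (pid, minor, state, backing image)
      tap-ctl list
      # Pull the SM-side entries around the failing attach
      grep -n "No such Tapdisk" /var/log/SMlog | tail -n 20
      grep -n "vdi_attach" /var/log/SMlog | tail -n 20

      If minor 5 is simply missing from the tap-ctl output, that would at least be consistent with the TapdiskNotRunning exception in the SM log above.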

      posted in Xen Orchestra
      G
      geoffbland
    • RE: Create VM Error SR_BACKEND_FAILURE_1200, No such Tapdisk.

      I am seeing the same failure in the xensource log on the pool master and also on one of the other hosts - I assume the host chosen to run the VM.

      tail -1000f /var/log/xensource.log
      ...
      Jun  8 17:57:41 XCPNG02 xapi: [debug||103204 /var/lib/xcp/xapi||dummytaskhelper] task dispatch:pool.get_all D:690db7cf5c31 created by task D:8c6d5334b10b
      Jun  8 17:57:41 XCPNG02 xapi: [debug||103148 HTTPS 192.168.1.190->:::80|VM.start R:30a9d8c81d65|xmlrpc_client] stunnel pid: 8324 (cached = true) returned stunnel to cache
      Jun  8 17:57:41 XCPNG02 xapi: [ info||103148 HTTPS 192.168.1.190->:::80|VM.start R:30a9d8c81d65|xapi_session] Session.destroy trackid=c395bd1ebf2259112354d8f71d05fef1
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] VM.start R:30a9d8c81d65 failed with exception Server_error(SR_BACKEND_FAILURE_1200, [ ; No such Tapdisk(minor=5);  ])
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] Raised Server_error(SR_BACKEND_FAILURE_1200, [ ; No such Tapdisk(minor=5);  ])
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 1/19 xapi Raised at file ocaml/xapi-client/client.ml, line 7
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 2/19 xapi Called from file ocaml/xapi-client/client.ml, line 19
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 3/19 xapi Called from file ocaml/xapi-client/client.ml, line 6044
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 4/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 5/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 6/19 xapi Called from file ocaml/xapi/message_forwarding.ml, line 131
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 7/19 xapi Called from file ocaml/xapi/message_forwarding.ml, line 1159
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 8/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 9/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 10/19 xapi Called from file ocaml/xapi/message_forwarding.ml, line 1491
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 11/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 12/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 13/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 14/19 xapi Called from file ocaml/xapi/rbac.ml, line 231
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 15/19 xapi Called from file ocaml/xapi/server_helpers.ml, line 103
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 16/19 xapi Called from file ocaml/xapi/server_helpers.ml, line 121
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 17/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 24
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 18/19 xapi Called from file lib/xapi-stdext-pervasives/pervasiveext.ml, line 35
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace] 19/19 xapi Called from file lib/backtrace.ml, line 177
      Jun  8 17:57:41 XCPNG02 xapi: [error||103148 :::80||backtrace]
      Jun  8 17:57:41 XCPNG02 xapi: [debug||102859 :::80||dummytaskhelper] task dispatch:event.from D:bd097f7c7599 created by task D:bd1ca8d956f6
      ...
      

      I am also seeing these errors being reported in the daemon log file.

      grep -i error  /var/log/daemon.log
      ...
      Jun  8 17:58:53 XCPNG02 forkexecd: [error||0 ||forkexecd] 31201 (/opt/xensource/libexec/block_device_io -device /dev/sm/backend/f3d786ad-b524-...) exited with signal: SIGKILL
      Jun  8 18:02:22 XCPNG02 tapdisk[2282]: ERROR: errno -13 at __tapdisk_vbd_complete_td_request: req tap-1.0: write 0x0008 secs @ 0x0001e000 - Permission denied
      Jun  8 18:04:23 XCPNG02 tapdisk[2282]: ERROR: errno -13 at __tapdisk_vbd_request_timeout: req tap-1.0 timed out, retried 120 times
      Jun  8 18:04:23 XCPNG02 tapdisk[2282]: ERROR: errno -13 at __tapdisk_vbd_request_timeout: req tap-1.0 timed out, retried 120 times
      ...
      
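
      The errno -13 (Permission denied) on those tapdisk writes makes me wonder whether the NFS export or mount behind that SR has gone read-only or lost write access. This is a rough sketch of what I am checking, using the TNC01.NEWT.newtcomputing.com server from the SM log above (adjust for whichever SR the failing tapdisk belongs to):

      # Confirm what the NFS server is exporting and to which clients
      showmount -e TNC01.NEWT.newtcomputing.com
      # Check how the SR is actually mounted in dom0 (look for ro vs rw in the options)
      mount | grep sr-mount
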

      On the master host of the pool I am seeing these errors being reported constantly in the kernel log. This doesn't look good.

      tail -f /var/log/kern.log
      ...
      Jun  8 18:04:23 XCPNG02 kernel: [2884638.906148] print_req_error: I/O error, dev tdb, sector 122880
      Jun  8 18:04:23 XCPNG02 kernel: [2884638.906160] Buffer I/O error on dev tdb, logical block 15360, lost async page write
      ...
      

      What is the /dev/tdb disk used for and can I recover it somehow?
      Would this be causing the problems I am seeing?
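
      For what it's worth, my assumption (which may be wrong) is that the td devices in kern.log are the blktap tapdevs backing VDIs rather than a physical disk, so I have been trying to map /dev/tdb back to the tapdisk and VHD it serves with something like the sketch below; tap-ctl list and the /dev/xen/blktap-2 nodes appear in the SM log above, but the tdb-to-minor mapping is my guess:

      # Tapdev block devices and their major:minor numbers as seen by the kernel
      lsblk | grep td
      ls -l /dev/xen/blktap-2/
      # Match the minor number against the running tapdisks to find the backing VHD/VDI
      tap-ctl list

      If that mapping holds, the I/O errors on tdb would point at one specific VDI rather than at a dom0 disk.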

      posted in Xen Orchestra
      G
      geoffbland