    Testing ZFS with XCP-ng

    • nraynaud
      nraynaud XCP-ng Team 🚀 last edited by olivierlambert

      Hi all,

      I spent some time trying to get ZFS to work.

      The main issue is that ZFS doesn't support O_DIRECT and that blktap only opens .vhd files with O_DIRECT.

      I patched blktap to remove O_DIRECT, replacing it with O_DSYNC, and tried running ZFS on a file SR. Here are the changes: https://gist.github.com/nraynaud/b32bd612f5856d1efc1418233d3aeb0f
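
      If you want to see the underlying problem for yourself, you can ask dd for an O_DIRECT write on a ZFS mount (paths below are placeholders); on the ZFS-on-Linux releases available at the time the open() fails with EINVAL, while the same command on an ext3/ext4 path works:

      # /mnt/zfs is assumed to be a ZFS mountpoint, /var/tmp a regular ext3/ext4 path
      dd if=/dev/zero of=/mnt/zfs/direct-test bs=4k count=1 oflag=direct    # fails with "Invalid argument"
      dd if=/dev/zero of=/var/tmp/direct-test bs=4k count=1 oflag=direct    # works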

      The steps are the following:

      • remove the current blktap from your host
      • install the patched blktap
      • install the ZFS support RPMs
      • create and mount the ZFS volume
      • create the file SR pointing to the mounted directory

      Here is a ZFS test script:

      # host under test (SSH destination)
      XCP_HOST_UNDER_TEST=root@yourmachine

      # copy the ZFS RPMs and the patched blktap to the host
      rsync -r zfs blktap-3.5.0-1.12test.x86_64.rpm ${XCP_HOST_UNDER_TEST}:

      # swap the stock blktap for the patched build and install the ZFS RPMs
      ssh ${XCP_HOST_UNDER_TEST} yum remove -y blktap
      ssh ${XCP_HOST_UNDER_TEST} yum install -y blktap-3.5.0-1.12test.x86_64.rpm
      ssh ${XCP_HOST_UNDER_TEST} yum install -y zfs/*.rpm

      # load the ZFS kernel module
      ssh ${XCP_HOST_UNDER_TEST} depmod -a
      ssh ${XCP_HOST_UNDER_TEST} modprobe zfs

      # create the pool mounted at /mnt/zfs and tune it (see "ZFS options" below)
      ssh ${XCP_HOST_UNDER_TEST} zpool create -f -m /mnt/zfs tank /dev/mydev
      ssh ${XCP_HOST_UNDER_TEST} zfs set sync=disabled tank
      ssh ${XCP_HOST_UNDER_TEST} zfs set compression=lz4 tank
      ssh ${XCP_HOST_UNDER_TEST} zfs list

      # create the file SR on the mounted pool (XCP_HOST_UNDER_TEST_UUID is that host's UUID)
      SR_ZFS_UUID=`ssh ${XCP_HOST_UNDER_TEST} "xe sr-create host-uuid=${XCP_HOST_UNDER_TEST_UUID} name-label=test-zfs-sr type=file other-config:o_direct=false device-config:location=/mnt/zfs/test-zfs-sr"`
      

      In detail

      ZFS options

      Obviously, you can create the ZFS volume as you need (mirror, RAID-Z, doesn't matter, it's entirely up to you!)

      Regarding options: due to the O_DIRECT removal, the default sync behaviour seems to be as if you had sync=always. To get real ZFS performance, it's better to use sync=disabled.

      Compression (using lz4) also offers a nice boost. Finally, if you are a bit kamikaze, you can try dedup, but frankly, it's not worth the risk.

      Note: please extend your dom0 memory to get better performance in general! 4 GiB seems to be a minimum; remember that your host will now act as a cache for your storage.
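
      For reference, on XCP-ng/XenServer hosts the dom0 memory is usually raised with the xen-cmdline helper, followed by a reboot (the 4 GiB below is just an example, adjust to your RAM):

      /opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=4096M,max:4096M
      reboot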

      SR creation

      We added a specific flag to allow SR usage without O_DIRECT:

      xe sr-create host-uuid=<YOUR_HOST_UUID> name-label=ZFS-SR type=file other-config:o_direct=false device-config:location=/tank
      

      You can see the other-config:o_direct=false flag is required.
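
      If you want to double-check the flag on an existing SR, the other-config map can be queried like this (the SR UUID is whatever xe sr-create printed):

      xe sr-param-get uuid=<YOUR_SR_UUID> param-name=other-config param-key=o_direct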

      Download

      The files are here: https://cloud.vates.fr/index.php/s/zDSLcsJ4l5tYHwL?path=%2F

      I would be interested in someone else reproducing my work, fiddling with ZFS, and telling us if it's worth pursuing.

      Thanks,

      Nico.

      • borzel
        borzel XCP-ng Center Team last edited by borzel

        💻 I did a test 🙂

        Environment: XCP-ng 7.4 with latest yum updates

        ZFS works as expected! (Note: ashift=12 --> I have 4K disks; see the pool-creation sketch below.)
        0_1529342989011_b86f601e-ee4a-49ee-a010-9aa67e017828-grafik.png
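
        For anyone reproducing this on 4K-sector drives, the pool creation looks roughly like this (pool name and device are placeholders; -o ashift=12 forces 4K alignment):

        # create the pool with 4K alignment, then apply the tuning from the first post
        zpool create -f -o ashift=12 -m /mnt/zfs test /dev/sdX
        zfs set sync=disabled test
        zfs set compression=lz4 test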

        Test 1 - File SR on top of a ZFS folder

        • created a new dataset with zfs create test/sr, which is mounted at /mnt/zfs/sr
        • created a new file SR with xe sr-create host-uuid=<host-uuid> type=file shared=false name-label=<sr-label> device-config:location=/mnt/zfs/sr

        Result: could not boot my ubuntu-vm

        Test 2 - EXT SR on top of a ZFS block device

        • created a new block device (zvol) with zfs create -V 100G test/ext-sr
        • created a new EXT SR with xe sr-create content-type=user host-uuid=<host-uuid> shared=false type=ext device-config:device=/dev/zvol/test/ext-sr name-label=<sr-label>

        Result: my ubuntu-vm does not boot here either

        Test 3 - Boot ubuntu-vm on the "old" LVM SR

        Result: could not boot my ubuntu-vm here either

        Hints

        • a local copy of the VM completed without errors

        • in all three cases the VM got stuck on the following boot screen:
          0_1529344323371_b8fa7113-ccca-4be3-8534-eff9df5bc6c4-grafik.png

        • turning off the stuck VM takes a long time

        Test 4

        (after carefully reading your script and using other-config:o_direct=false)

        Result: the VM is booting 🙂 🙂 🙂

        • olivierlambert
          olivierlambert Vates Co-Founder & CEO last edited by

          Feel free to do some tests (with fio for example: bs=4k to test IOPS and bs=4M to test max throughput)
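
          For example, something along these lines (file path, size and runtime are placeholders; keep --direct=0 since O_DIRECT is not available on ZFS here):

          # random 4k reads to measure IOPS
          fio --name=iops-test --filename=/mnt/zfs/fio-test --size=4G --bs=4k --rw=randread \
              --ioengine=libaio --iodepth=32 --direct=0 --runtime=60 --time_based

          # sequential 4M reads to measure maximum throughput
          fio --name=tput-test --filename=/mnt/zfs/fio-test --size=4G --bs=4M --rw=read \
              --ioengine=libaio --iodepth=8 --direct=0 --runtime=60 --time_based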

          • borzel
            borzel XCP-ng Center Team last edited by

            Right now I'm watching the soccer games in Russia, so just a little test 😉

            dd inside vm

            root@ubuntu:~# dd if=/root/test.img of=/dev/null bs=4k
            812740+0 records in
            812740+0 records out
            3328983040 bytes (3.3 GB, 3.1 GiB) copied, 17.4961 s, 190 MB/s
            

            dd on host

            [root@xen sr3]# dd if=./83377032-f3d5-411f-b049-2e76048fd4a2.vhd of=/dev/null bs=4k 
            2307479+1 records in
            2307479+1 records out
            9451434496 bytes (9.5 GB) copied, 38.5861 s, 245 MB/s
            

            A side note: my disks are happy now with ZFS, I hope you can see it 😉

            0_1529348589739_9603ad14-8a18-49a5-9b20-7e203acb35f4-grafik.png

            • R
              r1 XCP-ng Team 🚀 last edited by

              @nraynaud Nice Work! Will report on weekend with my tests.

              @olivierlambert I love fio. bonnie++ is also worth a shot.

              @borzel you have a floppy drive!!!

              • nraynaud
                nraynaud XCP-ng Team 🚀 last edited by

                thanks @r1

                I started a PR here: https://github.com/xapi-project/blktap/pull/253. It's not exactly the same code as the RPM above: in the RPM, O_DIRECT has been removed altogether, so that people can compare ext3 without O_DIRECT against ZFS; in the PR, I only drop O_DIRECT when an open() with it has failed.

                I am not sure it will be accepted. I have no clue what their policy is, and no idea what their QA process is; we are really at the draft stage.

                Nico.

                (The PR "make O_DIRECT optional", xapi-project/blktap #253, has since been closed.)

                • R
                  r1 XCP-ng Team 🚀 last edited by

                  @nraynaud Maybe an extra config parameter at sr-create time would be the right way; handling a failed open may not seem as appealing. My 2c.

                  • borzel
                    borzel XCP-ng Center Team last edited by borzel

                    Status update:

                    I installed XCP-ng 7.4 on a "Storage Server" with 16 x 2TB HDDs, gave DOM0 8GB of RAM and created a zpool (sync off, compression lz4) with a file SR on top.

                    In combination with a local XenOrchestra VM (and configured continuous replication jobs) we now have a self-contained "All-in-One Replication Server" with no dependency on hardware RAID controllers 🙂

                    In case of an emergency in my production storage/virtualisation environment I can boot every VM I need with just this one server. Or power up my DHCP VM if the other systems are down for service on Christmas/New Year's Eve 😉

                    I'm more than happy with that 🙂

                    Side note: this is of course not ready for "production" and doesn't replace our real backup... just in case someone asks 😉

                    • olivierlambert
                      olivierlambert Vates Co-Founder & CEO last edited by olivierlambert

                      How's the performance?

                      • borzel
                        borzel XCP-ng Center Team last edited by borzel

                        Currently hundreds of gigabytes are being transferred 🙂

                        I switched XenOrchestra from the VM to our hardware backup server, so we can use the 10 Gbit cards. (The VM ran iperf at about 1.6 Gbit/s, the hardware at around 6 Gbit/s.)

                        The data flow is:
                        NFS storage (ZFS) -> XenServer -> XO -> XCP-ng (ZFS), all connected over 10 Gbit Ethernet.
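
                        For reference, an iperf measurement of this kind is just the usual client/server pair (hostnames are placeholders):

                        iperf -s                   # on the XCP-ng AiO (server side)
                        iperf -c xcp-aio -t 30     # on the backup server (client side)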

                        I did not measure exact numbers; I was just impressed and happy that it runs 🙂

                        I will try to present my data and findings in more detail later...

                        Setup

                        0_1530657646909_f6ca6db4-cd16-44bf-9db4-8dacfe3be0df-grafik.png

                        Settings

                        • All servers are connected via 10 Gbit Ethernet cards (Emulex OneConnect)
                        • Xen1 and Xen2 are a pool, Xen1 is master
                        • XenOrchestra connects via http:// (no SSL, to improve speed)
                        • DOM0 has 8GB RAM on the XCP-ng AiO (so ZFS uses 4GB)

                        Numbers

                        • iperf between the backup server and the XCP-ng AiO: ~6 Gbit/s
                        • Transfer speed of the continuous replication job from the Xen pool to the XCP-ng AiO (output of zpool iostat with a 30 second sample interval on the XCP-ng AiO)

                        0_1530658170537_7044bd63-39c3-4f93-aa3f-e97a596b6f26-grafik.png

                        The speed varies a bit because of lz4 compression; current compression ratio: 1.64x

                        Here's a quick look at the graphs (on the XCP-ng AiO)

                        0_1530658863169_485d9208-50d8-4051-9853-762e8e02c201-grafik.png

                        and a quick look at top (on XCP-ng AiO)

                        0_1530659073283_c3c22e43-e039-4ddf-8d7c-f4700d37f9ff-grafik.png

                        and arcstat.py (sample rate 1sec) (on XCP-ng AiO)

                        0_1530659347665_97dd73d9-2d83-4c6b-a57b-d8d5528e4b74-grafik.png

                        and arc_summary.py (on XCP-ng AiO)

                        0_1530659545510_0c723c27-1666-4943-992c-c203c79178a5-grafik.png
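
                        For anyone wanting to reproduce these views, the tools shown above ship with ZFS on Linux and can be run roughly like this (pool name is a placeholder):

                        zpool iostat -v tank 30      # per-vdev I/O, 30 second samples
                        arcstat.py 1                 # ARC hit/miss statistics, 1 second samples
                        arc_summary.py               # one-shot ARC summary
                        zfs get compressratio tank   # current compression ratio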

                        @olivierlambert I hope you can see something with these numbers 💻

                        • borzel
                          borzel XCP-ng Center Team last edited by

                          To avoid the problem of XenOrchestra running as a VM (and not using the full 10 Gbit of the host), I would install it directly inside DOM0. It could be preinstalled in XCP-ng as a web GUI 🙂

                          Maybe I'll try this later on my AiO server...

                          • olivierlambert
                            olivierlambert Vates Co-Founder & CEO last edited by

                            Not a good idea to embed XO inside Dom0… (also, it will cause issues if you have multiple XOs; it's meant to run in one place only).

                            That's why the virtual appliance is better 🙂

                            • borzel
                              borzel XCP-ng Center Team @olivierlambert last edited by borzel

                              @olivierlambert My biggest question: why is this not a good idea?

                              How do big datacenters solve the performance issue? Just install on hardware? With the virtual appliance you are limited to GBit speed.

                              I run more than one XO: one for backups and one for user access. I don't want users on my backup server.

                              Maybe this is something for a separate thread...

                              • olivierlambert
                                olivierlambert Vates Co-Founder & CEO last edited by

                                VM speed is not limited to GBit speed. XO proxies will deal with big infrastructures to avoid bottlenecks.

                                • borzel
                                  borzel XCP-ng Center Team @olivierlambert last edited by

                                  @olivierlambert ah, ok 🙂

                                  • borzel
                                    borzel XCP-ng Center Team last edited by borzel

                                     I did a little more testing with ZFS and tried to move a disk from localpool to srpool.

                                    Setup

                                     • XCP-ng on a single 500GB disk, ZFS pool created on a separate ~420GB partition of that disk -> localpool -> SR named localzfs
                                     • ZFS pool on 4 x 500GB disks (RAID10) -> srpool -> SR named sr1 (see the pool-creation sketch below)
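
                                     A sketch of how such a layout can be created (device and partition names are placeholders; a RAID10-style pool in ZFS is two striped mirror vdevs):

                                     # srpool: 4 x 500GB as two striped mirrors (RAID10-style), mounted at /mnt/srpool
                                     zpool create -f -o ashift=12 -m /mnt/srpool srpool mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde

                                     # localpool: pool on the spare ~420GB partition of the boot disk (default mountpoint /localpool)
                                     zpool create -f localpool /dev/sda4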

                                    Error

                                    0_1530972133820_4eefffb1-2464-4985-a46d-da41cc6403b3-grafik.png

                                    Output of /var/log/SMLog

                                     Jul  7 15:44:07 xen SM: [28282] ['uuidgen', '-r']
                                     Jul  7 15:44:07 xen SM: [28282]   pread SUCCESS
                                     Jul  7 15:44:07 xen SM: [28282] lock: opening lock file /var/lock/sm/4c47f4b0-2504-fe56-085c-1ffe2269ddea/sr
                                     Jul  7 15:44:07 xen SM: [28282] lock: acquired /var/lock/sm/4c47f4b0-2504-fe56-085c-1ffe2269ddea/sr
                                     Jul  7 15:44:07 xen SM: [28282] vdi_create {'sr_uuid': '4c47f4b0-2504-fe56-085c-1ffe2269ddea', 'subtask_of':      'DummyRef:|62abb588-91e6-46d7-8c89-a1be48c843bd|VDI.create', 'vdi_type': 'system', 'args': ['10737418240', 'xo-os', '', '', 'false', '19700101T00:00:00Z', '', 'false'], 'o_direct': False, 'host_ref': 'OpaqueRef:ada301d3-f232-3421-527a-d511fd63f8c6', 'session_ref': 'OpaqueRef:02edf603-3cdf-444a-af75-1fb0a8d71216', 'device_config': {'SRmaster': 'true', 'location': '/mnt/srpool/sr1'}, 'command': 'vdi_create', 'sr_ref': 'OpaqueRef:0c8c4688-d402-4329-a99b-6273401246ec', 'vdi_sm_config': {}}
                                     Jul  7 15:44:07 xen SM: [28282] ['/usr/sbin/td-util', 'create', 'vhd', '10240', '/var/run/sr-mount/4c47f4b0-2504-fe56-085c-1ffe2269ddea/11a040ba-53fd-4119-80b3-7ea5a7e134c6.vhd']
                                     Jul  7 15:44:07 xen SM: [28282]   pread SUCCESS
                                      Jul  7 15:44:07 xen SM: [28282] ['/usr/sbin/td-util', 'query', 'vhd', '-v', '/var/run/sr-mount/4c47f4b0-2504-fe56-085c-1ffe2269ddea/11a040ba-53fd-4119-80b3-7ea5a7e134c6.vhd']
                                     Jul  7 15:44:07 xen SM: [28282]   pread SUCCESS
                                     Jul  7 15:44:07 xen SM: [28282] lock: released /var/lock/sm/4c47f4b0-2504-fe56-085c-1ffe2269ddea/sr
                                     Jul  7 15:44:07 xen SM: [28282] lock: closed /var/lock/sm/4c47f4b0-2504-fe56-085c-1ffe2269ddea/sr
                                     Jul  7 15:44:07 xen SM: [28311] lock: opening lock file /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:07 xen SM: [28311] lock: acquired /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:07 xen SM: [28311] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/319aebee-24f0-8232-0d9e-ad42a75d8154/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0.vhd']
                                     Jul  7 15:44:07 xen SM: [28311]   pread SUCCESS
                                     Jul  7 15:44:07 xen SM: [28311] vdi_attach {'sr_uuid': '319aebee-24f0-8232-0d9e-ad42a75d8154', 'subtask_of': 'DummyRef:|b0cce7fa-75bb-4677-981a-4383e5fe1487|VDI.attach', 'vdi_ref': 'OpaqueRef:9daa47a5-1cae-40c2-b62e-88a767f38877', 'vdi_on_boot': 'persist', 'args': ['false'], 'o_direct': False, 'vdi_location': 'ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0', 'host_ref': 'OpaqueRef:ada301d3-f232-3421-527a-d511fd63f8c6', 'session_ref': 'OpaqueRef:a27b0e88-cebe-41a7-bc0c-ae0eb46dd011', 'device_config': {'SRmaster': 'true', 'location': '/localpool/sr'}, 'command': 'vdi_attach', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:e4d0225c-1737-4496-8701-86a7f7ac18c1', 'vdi_uuid': 'ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0'}
                                     Jul  7 15:44:07 xen SM: [28311] lock: opening lock file /var/lock/sm/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0/vdi
                                     Jul  7 15:44:07 xen SM: [28311] result: {'o_direct_reason': 'SR_NOT_SUPPORTED', 'params': '/dev/sm/backend/319aebee-24f0-8232-0d9e-ad42a75d8154/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0', 'o_direct': True, 'xenstore_data': {'scsi/0x12/0x80': 'AIAAEmNhODhmOGRjLTRhYjYtNGIgIA==', 'scsi/0x12/0x83': 'AIMAMQIBAC1YRU5TUkMgIGNhODhmOGRjLTRhYjYtNGJlNy05MDg0LTFiNDFjYmM4YzFjMCA=', 'vdi-uuid': 'ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0', 'mem-pool': '319aebee-24f0-8232-0d9e-ad42a75d8154'}}
                                     Jul  7 15:44:07 xen SM: [28311] lock: closed /var/lock/sm/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0/vdi
                                     Jul  7 15:44:07 xen SM: [28311] lock: released /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:07 xen SM: [28311] lock: closed /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:08 xen SM: [28337] lock: opening lock file /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:08 xen SM: [28337] lock: acquired /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:08 xen SM: [28337] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/319aebee-24f0-8232-0d9e-ad42a75d8154/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0.vhd']
                                     Jul  7 15:44:08 xen SM: [28337]   pread SUCCESS
                                     Jul  7 15:44:08 xen SM: [28337] lock: released /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:08 xen SM: [28337] vdi_activate {'sr_uuid': '319aebee-24f0-8232-0d9e-ad42a75d8154', 'subtask_of': 'DummyRef:|1f6f53c2-0d1e-462c-aeea-5d109e55d44f|VDI.activate', 'vdi_ref': 'OpaqueRef:9daa47a5-1cae-40c2-b62e-88a767f38877', 'vdi_on_boot': 'persist', 'args': ['false'], 'o_direct': False, 'vdi_location': 'ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0', 'host_ref': 'OpaqueRef:ada301d3-f232-3421-527a-d511fd63f8c6', 'session_ref': 'OpaqueRef:392bd2b6-2a84-48ca-96f1-fe0bdaa96f20', 'device_config': {'SRmaster': 'true', 'location': '/localpool/sr'}, 'command': 'vdi_activate', 'vdi_allow_caching': 'false', 'sr_ref': 'OpaqueRef:e4d0225c-1737-4496-8701-86a7f7ac18c1', 'vdi_uuid': 'ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0'}
                                     Jul  7 15:44:08 xen SM: [28337] lock: opening lock file /var/lock/sm/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0/vdi
                                     Jul  7 15:44:08 xen SM: [28337] blktap2.activate
                                     Jul  7 15:44:08 xen SM: [28337] lock: acquired /var/lock/sm/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0/vdi
                                     Jul  7 15:44:08 xen SM: [28337] Adding tag to: ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0
                                     Jul  7 15:44:08 xen SM: [28337] Activate lock succeeded
                                     Jul  7 15:44:08 xen SM: [28337] lock: opening lock file /var/lock/sm/319aebee-24f0-8232-0d9e-ad42a75d8154/sr
                                     Jul  7 15:44:08 xen SM: [28337] ['/usr/sbin/td-util', 'query', 'vhd', '-vpfb', '/var/run/sr-mount/319aebee-24f0-8232-0d9e-ad42a75d8154/ca88f8dc-4ab6-4be7-9084-1b41cbc8c1c0.vhd']
                                     Jul  7 15:44:08 xen SM: [28337]   pread SUCCESS
                                    
                                    • borzel
                                       borzel XCP-ng Center Team last edited by

                                      @borzel said in Testing ZFS with XCP-ng:

                                      Jul 7 15:44:07 xen SM: [28311] result: {'o_direct_reason': 'SR_NOT_SUPPORTED',

                                       I found it after posting this and looking into it... is this a known thing?

                                      • nraynaud
                                         nraynaud XCP-ng Team 🚀 last edited by

                                        that's weird: https://github.com/xapi-project/sm/blob/master/drivers/blktap2.py#L1002

                                         I don't even know how my tests worked before.

                                        • olivierlambert
                                           olivierlambert Vates Co-Founder & CEO last edited by

                                          Maybe because it wasn't triggered without a VDI live migration?

                                          • borzel
                                             borzel XCP-ng Center Team last edited by borzel

                                             I did no live migration... just a local copy of a VM
                                             ... ah, I should read before I answer ...
