
    Testing ZFS with XCP-ng

    Development
    80 Posts 10 Posters 58.9k Views 3 Watching
    • borzelB Offline
      borzel XCP-ng Center Team
      last edited by borzel

      I did no live migration ... just a local copy of a VM
      ... ah, I should read before I answer ...

      • borzelB Offline
        borzel XCP-ng Center Team
        last edited by borzel

        @nraynaud Is there any info on how to build blktap2 myself to test in my homelab?
        I found it here: https://xcp-ng.org/forum/topic/122/how-to-build-blktap-from-sources/3

        • borzelB Offline
          borzel XCP-ng Center Team
          last edited by borzel

          I tested the updated version of blktap, built as described in https://xcp-ng.org/forum/post/1677, but copying a VM within the SR fails 😞

          Async.VM.copy R:f286a572f8aa|xapi] Error in safe_clone_disks: Server_error(VDI_COPY_FAILED, [ End_of_file ])
          
          • borzelB Offline
            borzel XCP-ng Center Team
            last edited by borzel

            the function safe_clone_disks is located in https://github.com/xapi-project/xen-api/blob/72a9a2d6826e9e39d30fab0d6420de6a0dcc0dc5/ocaml/xapi/xapi_vm_clone.ml#L139

            it calls clone_single_vdi in https://github.com/xapi-project/xen-api/blob/72a9a2d6826e9e39d30fab0d6420de6a0dcc0dc5/ocaml/xapi/xapi_vm_clone.ml#L116

            which calls Client.Async.VDI.copy from a XAPI OCaml module named Client, which I cannot find 😞

            • borzelB Offline
              borzel XCP-ng Center Team
              last edited by borzel

              found something in /var/log/SMlog

              Jul 21 19:52:40 xen SM: [17060] result: {'o_direct_reason': 'SR_NOT_SUPPORTED', 'params': '/dev/sm/backend
              /4c8bc619-98bd-9342-85fe-1ea4782c0cf2/8e901c80-0ead-4764-b3d9-b04b4376e4c4',
              'o_direct': True, 'xenstore_data': {'scsi/0x12/0x80': 'AIAAEjhlOTAxYzgwLTBlYWQtNDcgIA==', 'scsi/0x12/0x83': 
              'AIMAMQIBAC1YRU5TUkMgIDhlOTAxYzgwLTBlYWQtNDc2NC1iM2Q5LWIwNGI0Mzc2ZTRjNCA=', 'vdi-uuid': 
              '8e901c80-0ead-4764-b3d9-b04b4376e4c4', 'mem-pool': '4c8bc619-98bd-9342-85fe-1ea4782c0cf2'}}
              

              Note the 'o_direct': True in this log entry, even though I created the SR with other-config:o_direct=false
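
To check this on other hosts, the `result:` part of such an SMlog entry can be pulled out and inspected. This is a minimal sketch, assuming the entry sits on a single line in the real log file and that the part after `result: ` is a Python dict literal, as it appears to be above. The function name `parse_sm_result` is made up for illustration, and the sample is an abbreviated copy of the entry above.

```python
import ast


def parse_sm_result(line):
    """Return the dict that follows 'result: ' in an SMlog line, or None."""
    marker = "result: "
    idx = line.find(marker)
    if idx == -1:
        return None
    # literal_eval only accepts Python literals, so it is safe on log text.
    return ast.literal_eval(line[idx + len(marker):].strip())


# Abbreviated copy of the entry above ('params' and 'xenstore_data' left out).
sample = (
    "Jul 21 19:52:40 xen SM: [17060] result: "
    "{'o_direct_reason': 'SR_NOT_SUPPORTED', 'o_direct': True, "
    "'vdi-uuid': '8e901c80-0ead-4764-b3d9-b04b4376e4c4'}"
)

result = parse_sm_result(sample)
print(result["o_direct"], result["o_direct_reason"])
```

The `o_direct_reason` field is the interesting one here: it names why O_DIRECT was chosen despite the SR setting.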

              • stormiS Offline
                stormi Vates 🪐 XCP-ng Team @borzel
                last edited by

                @borzel I'll let @nraynaud have a say, but first, to be sure, you built it with

                xcp-build --build-local . --define 'xcp_ng_section extras'
                

                and the resulting RPM contains .extras in its name?

                • borzelB Offline
                  borzel XCP-ng Center Team
                  last edited by borzel

                  yes, I did 🙂 Checked it multiple times. Will check it again today to be 100% sure.

                  I also saw the right log output from @nraynaud's new function, which probes the open with O_DIRECT and retries with O_DSYNC.
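
For readers who have not looked at the patch, the probe-and-retry idea can be sketched as follows. This is not the actual blktap code (which is C); it is a small Python illustration that assumes the filesystem rejects O_DIRECT with EINVAL, and the function name `open_direct_or_dsync` is invented for the example.

```python
import errno
import os
import tempfile


def open_direct_or_dsync(path, flags=os.O_RDWR):
    """Try O_DIRECT first; if the filesystem answers EINVAL, retry with O_DSYNC.

    Returns (fd, mode) where mode is "O_DIRECT" or "O_DSYNC".
    """
    o_direct = getattr(os, "O_DIRECT", None)  # the flag only exists on some platforms
    if o_direct is not None:
        try:
            return os.open(path, flags | o_direct), "O_DIRECT"
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise  # any other error is a real failure
    return os.open(path, flags | os.O_DSYNC), "O_DSYNC"


# Which mode is printed depends on the filesystem backing the temp directory.
with tempfile.NamedTemporaryFile() as tmp:
    fd, mode = open_direct_or_dsync(tmp.name)
    os.close(fd)
    print(mode)
```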

                  I have the feeling that xapi or blktap copies files in some situations without going through its normal path. Maybe a special "file copy" without an open command?

                  Can I attach a debugger to blktap while it is running? This would also be needed for deep support of the whole server.

                  • stormiS Offline
                    stormi Vates 🪐 XCP-ng Team
                    last edited by

                    You can probably attach gdb to a running blktap, yes. Install the blktap-debuginfo package that was produced when you built blktap to make debug symbols available.

                    • nraynaudN Offline
                      nraynaud XCP-ng Team
                      last edited by

                      @borzel there is something fishy around here: https://github.com/xapi-project/sm/blob/master/drivers/blktap2.py#L994 and I am still unsure what to do.

                      I have not removed O_DIRECT everywhere in blktap. I was expecting the o_direct flag in the Python code to take care of the remainder, but I guess I was wrong. We might have to patch the Python code.

                      • borzelB Offline
                        borzel XCP-ng Center Team
                        last edited by

                        Maybe we can create a new SR type "zfs" and copy the file SR implementation from https://github.com/xapi-project/sm/blob/master/drivers/FileSR.py
                        This would later allow us to build a deeper integration of ZFS.

                        • borzelB Offline
                          borzel XCP-ng Center Team
                          last edited by borzel

                          Success! I changed in /opt/xensource/sm/blktap2.py

                           elif not ((self.target.vdi.sr.handles("nfs") or self.target.vdi.sr.handles("ext") or self.target.vdi.sr.handles("smb"))):
                          

                          to

                           elif not ((self.target.vdi.sr.handles("file")) or (self.target.vdi.sr.handles("nfs") or self.target.vdi.sr.handles("ext") or self.target.vdi.sr.handles("smb"))):
                          

                          Then I deleted /opt/xensource/sm/blktap2.pyc and /opt/xensource/sm/blktap2.pyo so that the edited *.py file is used.
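
The chained `handles(...)` calls in that condition amount to "does the SR handle any of these types". A minimal sketch of the same check follows; `FakeSR` and `sr_handles_any` are hypothetical names used only for illustration and are not part of the sm code.

```python
# SR types listed in the edited condition; "file" is the one added above.
CHECKED_SR_TYPES = ("file", "nfs", "ext", "smb")


def sr_handles_any(sr, types=CHECKED_SR_TYPES):
    """True if the SR reports that it handles at least one of the given types."""
    return any(sr.handles(t) for t in types)


class FakeSR:
    """Hypothetical stand-in for the sm SR object, for illustration only."""

    def __init__(self, sr_type):
        self.sr_type = sr_type

    def handles(self, sr_type):
        return self.sr_type == sr_type


file_sr = FakeSR("file")
print(sr_handles_any(file_sr))                               # with the edit
print(sr_handles_any(file_sr, types=("nfs", "ext", "smb")))  # original list
```

With the original list a type=file SR does not match, which fits the SR_NOT_SUPPORTED reason seen in SMlog earlier.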

                          Now I can copy from other SR types to my ZFS SR.

                          But copying within my ZFS SR does not work...

                          • borzelB Offline
                            borzel XCP-ng Center Team
                            last edited by

                            some info about the o_direct flag: https://xenserver.org/blog/entry/read-caching.html

                            • borzelB Offline
                              borzel XCP-ng Center Team
                              last edited by borzel

                              After some reading, I think we cannot get it fully working in a short time. Some of the tools might not work without o_direct support:

                              If the underlying file-system does not support O_DIRECT, utilities (e.g, vhd-util) may fail with error code 22 (EINVAL). Similarly, Xen may fail with a message as follows:
                              TapdiskException: ('create', '-avhd:/home/cklein/vms/vm-rubis-0/root.vhd') failed (5632 )

                              https://github.com/xapi-project/blktap/blob/master/README


                              Today there is just one fully working local ZFS configuration:

                              • create your ZFS-Pool: zpool create ...
                              • create a ZVOL (i.e. a block device): zfs create -V 50G pool/my_local_sr
                              • create an EXT3 based SR: xe sr-create host-uuid=<UUID_of_your_host> type=ext shared=false name-label=<Name_of_my_SR> device-config:device=/dev/zvol/<pool-name>/<zvol-name>

                              It's not optimal, but it works.
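
The ZVOL and SR steps above can be written down as a small helper that only builds the command lines and runs nothing. This is a sketch: the function name `zvol_sr_commands` is invented, the pool is assumed to exist already (the `zpool create` arguments are left out because they depend on your disks), and the placeholders are the ones from the list above.

```python
import shlex


def zvol_sr_commands(pool, zvol, size, host_uuid, sr_name):
    """Return the ZVOL and SR creation commands as argv lists (nothing is run)."""
    dataset = "{}/{}".format(pool, zvol)
    return [
        ["zfs", "create", "-V", size, dataset],
        [
            "xe", "sr-create",
            "host-uuid={}".format(host_uuid),
            "type=ext",
            "shared=false",
            "name-label={}".format(sr_name),
            "device-config:device=/dev/zvol/{}".format(dataset),
        ],
    ]


commands = zvol_sr_commands(
    "pool", "my_local_sr", "50G", "<UUID_of_your_host>", "<Name_of_my_SR>"
)
for argv in commands:
    # shlex.quote keeps the placeholders safe to paste into a shell.
    print(" ".join(shlex.quote(arg) for arg in argv))
```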

                              • nraynaudN Offline
                                nraynaud XCP-ng Team
                                last edited by

                                I am working on the issue right now and trying to pin down the exact problem.
                                There are 2 cases:

                                • xe vdi-copy from another SR to a ZFS SR doesn't work
                                • xe vdi-copy on the same SR doesn't work.

                                Is that a complete assessment of the issues you found?

                                A cursory test seems to show that ssh ${XCP_HOST_UNDER_TEST} sed -i.bak 's/# unbuffered = true/unbuffered = false/' /etc/sparse_dd.conf solves the issue of intra-ZFS copies, but I am still confirming that I have not done anything else on my test box.
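
For anyone who prefers to see what that sed expression does before running it on a host, here is the same substitution as a small Python sketch working on text only. The function name is invented and the sample fragment is an assumption derived from the sed pattern; the real /etc/sparse_dd.conf may contain more lines.

```python
OLD = "# unbuffered = true"
NEW = "unbuffered = false"


def set_unbuffered_false(conf_text):
    """Replace the first match on each line, like sed's s/OLD/NEW/ without the g flag."""
    return "\n".join(line.replace(OLD, NEW, 1) for line in conf_text.split("\n"))


# Assumed fragment of the config file, taken from the sed pattern above.
sample = "# unbuffered = true\n"
patched = set_unbuffered_false(sample)
print(patched, end="")
```

Lines that do not contain the commented-out default are left untouched, so running it twice changes nothing further.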

                                Thanks,
                                Nicolas.

                                • borzelB Offline
                                  borzel XCP-ng Center Team
                                  last edited by borzel

                                  @nraynaud said in Testing ZFS with XCP-ng:

                                  xe vdi-copy from another SR to a ZFS SR doesn't work
                                  xe vdi-copy on the same SR doesn't work.

                                  Yes 🙂

                                  In addition to that, copying from the ZFS SR to another SR is not working.

                                  • nraynaudN Offline
                                    nraynaud XCP-ng Team
                                    last edited by

                                    ok, I really think changing /etc/sparse_dd.conf is the right path.

                                    #!/usr/bin/env bash
                                    
                                    # HOW TO create the passthrough: xe sr-create name-label="sda passthrough" name-description="Block devices" type=udev content-type=disk device-config:location=/dev/sda host-uuid=77b3f6ad-020b-4e48-b090-74b2a26c4f69
                                    
                                    set -ex
                                    
                                    MASTER_HOST=root@192.168.100.1
                                    PASSTHROUGH_VDI=a74d267e-bb14-4732-bd80-b9c445199e8a
                                    
                                    SNAPSHOT_UUID=19d3758e-eb21-f237-b8f7-6e2f638cc8e0
                                    VM_HOST_UNDER_TEST_UUID=13ec74c2-9b57-a327-962f-1ebd9702eec4
                                    XCP_HOST_UNDER_TEST_UUID=05c61e28-11cf-4131-b645-a0be7637c044
                                    XCP_HOST_UNDER_TEST_IP=192.168.100.151
                                    XCP_HOST_UNDER_TEST=root@${XCP_HOST_UNDER_TEST_IP}
                                    
                                    INCEPTION_VM_UUID=a7e37541-fb9a-4392-6b54-60cf7ce3d08a
                                    INCEPTION_VM_IP=192.168.100.32
                                    INCEPTION_VM=root@${INCEPTION_VM_IP}
                                    
                                    ssh ${MASTER_HOST} xe snapshot-revert snapshot-uuid=${SNAPSHOT_UUID}
                                    NEW_VBD=`ssh ${MASTER_HOST} xe vbd-create device=1 type=Disk mode=RW vm-uuid=${VM_HOST_UNDER_TEST_UUID} vdi-uuid=${PASSTHROUGH_VDI}`
                                    ssh ${MASTER_HOST} xe vm-start vm=${VM_HOST_UNDER_TEST_UUID}
                                    until ping -c1 ${XCP_HOST_UNDER_TEST_IP} &>/dev/null; do :; done
                                    sleep 20
                                    
                                    # try EXT3
                                    ssh ${XCP_HOST_UNDER_TEST} 'mkfs.ext3 /dev/sdb2 && echo /dev/sdb2 /mnt/ext3 ext3 >>/etc/fstab && mkdir -p /mnt/ext3 && mount /dev/sdb2 && df'
                                    SR_EXT3_UUID=`ssh ${XCP_HOST_UNDER_TEST} "xe sr-create host-uuid=${XCP_HOST_UNDER_TEST_UUID} name-label=test-ext3-sr type=file other-config:o_direct=false device-config:location=/mnt/ext3/test-ext3-sr"`
                                    TEST_EXT3_VDI=`ssh ${XCP_HOST_UNDER_TEST} xe vdi-create sr-uuid=${SR_EXT3_UUID} name-label=test-ext3-vdi virtual-size=214748364800`
                                    TEST_VBD=`ssh ${XCP_HOST_UNDER_TEST} xe vbd-create device=1 type=Disk mode=RW vm-uuid=${INCEPTION_VM_UUID} vdi-uuid=${TEST_EXT3_VDI}`
                                    
                                    
                                    ssh ${XCP_HOST_UNDER_TEST} reboot || true
                                    sleep 20
                                    until ping -c1 ${XCP_HOST_UNDER_TEST_IP} &>/dev/null; do :; done
                                    sleep 20
                                    
                                    ssh ${XCP_HOST_UNDER_TEST} xe vm-start vm=${INCEPTION_VM_UUID} on=${XCP_HOST_UNDER_TEST_UUID}
                                    sleep 2
                                    until ping -c1 ${INCEPTION_VM_IP} &>/dev/null; do :; done
                                    sleep 20
                                    ssh ${INCEPTION_VM} echo FROM BENCH
                                    ssh ${INCEPTION_VM} 'apk add gcc zlib-dev libaio libaio-dev make linux-headers git binutils musl-dev; git clone https://github.com/axboe/fio fio; cd fio; ./configure && make && make install'
                                    ssh ${INCEPTION_VM} 'mkfs.ext3 /dev/xvdb && mount /dev/xvdb /mnt;df'
                                    ssh ${INCEPTION_VM} 'cd /mnt;sync;/usr/local/bin/fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=write --bs=4k --direct=1 --size=512M --numjobs=2 --runtime=30 --group_reporting' > ext3_write_result
                                    ssh ${INCEPTION_VM} 'cd /mnt;sync;/usr/local/bin/fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=write --bs=4k --direct=1 --size=512M --numjobs=2 --runtime=30 --group_reporting' >> ext3_write_result
                                    ssh ${INCEPTION_VM} 'cd /mnt;sync;/usr/local/bin/fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=read --bs=4k --direct=1 --size=512M --numjobs=2 --runtime=30 --group_reporting' > ext3_read_result
                                    ssh ${INCEPTION_VM} 'cd /mnt;sync;/usr/local/bin/fio --name=randwrite --ioengine=libaio --iodepth=1 --rw=read --bs=4k --direct=1 --size=512M --numjobs=2 --runtime=30 --group_reporting' >> ext3_read_result
                                    ssh ${XCP_HOST_UNDER_TEST} xe vm-shutdown uuid=${INCEPTION_VM_UUID}
                                    
                                    # try ZFS
                                    # install binaries that don't use O_DIRECT
                                    rsync -r zfs ${XCP_HOST_UNDER_TEST}:
                                    scp /Users/nraynaud/dev/xenserver-build-env/blktap-3.5.0-1.12test.x86_64.rpm ${XCP_HOST_UNDER_TEST}:
                                    ssh ${XCP_HOST_UNDER_TEST} yum remove -y blktap
                                    ssh ${XCP_HOST_UNDER_TEST} yum install -y blktap-3.5.0-1.12test.x86_64.rpm
                                    ssh ${XCP_HOST_UNDER_TEST} yum install -y zfs/*.rpm
                                    ssh ${XCP_HOST_UNDER_TEST} depmod -a
                                    ssh ${XCP_HOST_UNDER_TEST} modprobe zfs
                                    ssh ${XCP_HOST_UNDER_TEST} zpool create -f -m /mnt/zfs tank /dev/sdb1
                                    ssh ${XCP_HOST_UNDER_TEST} zfs set sync=disabled tank
                                    ssh ${XCP_HOST_UNDER_TEST} zfs set compression=lz4 tank
                                    ssh ${XCP_HOST_UNDER_TEST} zfs list
                                    
                                    
                                    SR_ZFS_UUID=`ssh ${XCP_HOST_UNDER_TEST} "xe sr-create host-uuid=${XCP_HOST_UNDER_TEST_UUID} name-label=test-zfs-sr type=file other-config:o_direct=false device-config:location=/mnt/zfs/test-zfs-sr"`
                                    TEST_ZFS_VDI=`ssh ${XCP_HOST_UNDER_TEST} xe vdi-create sr-uuid=${SR_ZFS_UUID} name-label=test-zfs-vdi virtual-size=214748364800`
                                    # this line avoids O_DIRECT in reads
                                    ssh ${XCP_HOST_UNDER_TEST} "sed -i.bak 's/# unbuffered = true/unbuffered = false/' /etc/sparse_dd.conf"
                                    # try various clone situations
                                    ssh ${XCP_HOST_UNDER_TEST} xe vdi-copy uuid=${TEST_ZFS_VDI} sr-uuid=${SR_ZFS_UUID}
                                    ssh ${XCP_HOST_UNDER_TEST} xe vdi-copy uuid=${TEST_ZFS_VDI} sr-uuid=${SR_EXT3_UUID}
                                    ssh ${XCP_HOST_UNDER_TEST} xe vdi-copy uuid=${TEST_EXT3_VDI} sr-uuid=${SR_ZFS_UUID}
                                    

                                    This script runs to the end without error.
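
The three clone situations at the end of the script can also be generated from a list of (source, destination) pairs, which makes it easy to add more SR types later. This is only a sketch that builds the `xe vdi-copy` command lines with the same `uuid=` and `sr-uuid=` parameters as the script; the function name and the placeholder UUID strings are made up for illustration.

```python
# (source VDI, destination SR) pairs in the order used by the script.
COPY_PAIRS = [("zfs", "zfs"), ("zfs", "ext3"), ("ext3", "zfs")]


def vdi_copy_matrix(vdis, srs, pairs=COPY_PAIRS):
    """Build one `xe vdi-copy` argv per pair (nothing is executed)."""
    return [
        ["xe", "vdi-copy", "uuid={}".format(vdis[src]), "sr-uuid={}".format(srs[dst])]
        for src, dst in pairs
    ]


# Placeholder values; on a real host these would be the UUIDs created earlier.
vdis = {"zfs": "TEST_ZFS_VDI", "ext3": "TEST_EXT3_VDI"}
srs = {"zfs": "SR_ZFS_UUID", "ext3": "SR_EXT3_UUID"}

matrix = vdi_copy_matrix(vdis, srs)
for argv in matrix:
    print(" ".join(argv))
```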

                                    • nraynaudN Offline
                                      nraynaud XCP-ng Team
                                      last edited by

                                      If other people can reproduce my results, I propose to directly change the parameter in the XCP-ng distribution RPM.

                                      • olivierlambertO Offline
                                        olivierlambert Vates 🪐 Co-Founder CEO
                                        last edited by

                                        Your package is experimental, so feel free to add the modification inside it 🙂

                                        • borzelB Offline
                                          borzel XCP-ng Center Team @nraynaud
                                          last edited by borzel

                                          @nraynaud said in Testing ZFS with XCP-ng:

                                          If other people can reproduce my results

                                          with the change in /etc/sparse_dd.conf I can copy my VMs from:

                                          • EXT3 -> ZFS
                                          • ZFS -> ZFS
                                          • ZFS -> EXT3

                                          Yeha! 🙂

                                          Thanks for your work!

                                          By the way, my XCP-ng replication host at work is working just fine with ZFS-SR. All stable like ZFS should be.

                                          • olivierlambertO Offline
                                            olivierlambert Vates 🪐 Co-Founder CEO
                                            last edited by

                                            Yay!! Thanks for testing 🙂

