    Need Help Understanding the VM Suspend Process

      olivierlambert Vates 🪐 Co-Founder CEO

      Hi,

      Suspend means all of the VM's memory (RAM) has to be written to the Suspend SR. More RAM == more time to suspend.
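
As a rough illustration of why more RAM means a longer suspend, here is a back-of-the-envelope estimate in Python. It assumes the whole RAM image is written at one steady rate and ignores protocol overhead and storage latency; the function name and the example figures (16 GiB, 1000 Mb/s) are made up for illustration.

```python
def estimate_suspend_seconds(ram_gib: float, throughput_mbit_s: float) -> float:
    """Rough time to write `ram_gib` GiB of RAM at `throughput_mbit_s` megabits/s."""
    ram_bits = ram_gib * 1024**3 * 8               # GiB -> bits
    return ram_bits / (throughput_mbit_s * 1_000_000)

# Hypothetical example: a 16 GiB VM over a link sustaining 1000 Mb/s
print(round(estimate_suspend_seconds(16, 1000)))   # about 137 seconds
```

Doubling the RAM doubles the estimate, and anything that lowers the effective write rate to the SR stretches it further.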

        kagbasi-wgsdac @olivierlambert

        @olivierlambert Aaah, I've always wondered what the Suspend SR on the Advanced tab of the Pool meant... now I know.

        So the NIC capacity/pipe between the hosts and the SR really does matter here.

          kagbasi-wgsdac

          Sharing this so others might benefit from what I'm learning.

          So, I looked at the network performance on TrueNAS during the Smart Reboot of the second XCP-ng host (screenshot below). What I saw seems to suggest that I'm getting near wire speed during READ operations. However, WRITE operations seem to be hitting a ceiling and I have a feeling it might be due to me having SYNC enabled on the dataset.

          Screenshot 2025-07-22 122432.png

            olivierlambert Vates 🪐 Co-Founder CEO

            Indeed, sync is likely the cause of that. And since you have to write the RAM, it can be a lot of GiB for big VMs.

            You can try disabling sync temporarily and then test again.

            Note that a future improvement will be to save the VM RAM while the VM is running (instead of pausing it), reducing the "downtime". But this won't change the fact you must write the RAM somewhere, and this takes time.

              kagbasi-wgsdac @olivierlambert

              @olivierlambert Yes, I do plan on testing with SYNC disabled and then again with several permutations of dataset changes on the TrueNAS side (like compression on/off, etc.).

              Do you guys have a best practices document for setting up an NFS SR using TrueNAS? I browsed through the published XCP-ng documentation site but didn't find anything specific to TrueNAS or maybe I missed it.

                olivierlambert Vates 🪐 Co-Founder CEO

                Nothing very specific. Sync vs. async should make the biggest difference.

                  kagbasi-wgsdac @olivierlambert

                  @olivierlambert Oh okay, thanks for responding.

                  So I turned off SYNC and COMPRESSION on the dataset and retested by suspending 11 VMs. I immediately noticed a whopping performance improvement (essentially sustained wire speed AND roughly half the completion time):

                  • Roughly 984 Mb/s sustained WRITE speeds (during VM suspension)
                  • Roughly 984 Mb/s sustained READ speeds (during VM resumption)
                  • Transfer time for both READ and WRITE is about 20 minutes (down from 40-45 mins)
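
For a sense of scale, the figures above can be turned into an approximate data volume. This is only arithmetic on the reported numbers (984 Mb/s, about 20 minutes, previously 40-45 minutes). It assumes "Mb/s" means megabits per second and that the rate was constant for the whole transfer; the helper names are made up for illustration.

```python
RATE_MBIT_S = 984        # reported sustained rate, taken as megabits per second
NEW_MINUTES = 20         # reported transfer time with SYNC disabled
OLD_MINUTES = (40, 45)   # reported transfer time before the change

def gigabytes_moved(rate_mbit_s: float, minutes: float) -> float:
    """Decimal gigabytes transferred at a constant rate."""
    megabytes_per_s = rate_mbit_s / 8
    return megabytes_per_s * minutes * 60 / 1000

def average_rate_mbit_s(gigabytes: float, minutes: float) -> float:
    """Average rate needed to move `gigabytes` in `minutes`."""
    return gigabytes * 1000 * 8 / (minutes * 60)

total_gb = gigabytes_moved(RATE_MBIT_S, NEW_MINUTES)
print(f"~{total_gb:.1f} GB moved in {NEW_MINUTES} min")
for minutes in OLD_MINUTES:
    rate = average_rate_mbit_s(total_gb, minutes)
    print(f"same data in {minutes} min implies ~{rate:.0f} Mb/s on average")
```

Under those assumptions, roughly 148 GB moved in 20 minutes, and the same volume spread over 40-45 minutes would correspond to an average of roughly 437-492 Mb/s, about half of wire speed.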

                  Screenshot 2025-07-22 170106.png

                  Gonna retest with SYNC disabled and COMPRESSION re-enabled to see if it degrades performance; stand by for another report.

                    kagbasi-wgsdac @kagbasi-wgsdac

                    Here's the latest, and probably last, test. Disabling compression had no appreciable impact on performance. I am now fully convinced that SYNC is the major player here.

                    Screenshot 2025-07-23 012851.png

                      olivierlambert Vates 🪐 Co-Founder CEO

                      Yes it is, and it's pretty logical: with sync disabled, you don't have to wait for confirmation that each block has actually been written to the drive.
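
The effect can be sketched locally with a small Python comparison: one run calls `os.fsync` after every block (waiting for confirmation each time), the other only flushes once at the end. This is an analogy on a local file, not a reproduction of how NFS or ZFS handle sync writes; the block size, block count and function name are arbitrary choices for illustration.

```python
import os
import tempfile
import time

def write_blocks(path: str, blocks: int, block_size: int, sync_each: bool) -> float:
    """Write `blocks` blocks of `block_size` bytes to `path`; return elapsed seconds."""
    data = b"\0" * block_size
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        for _ in range(blocks):
            os.write(fd, data)
            if sync_each:
                os.fsync(fd)   # wait until this block is reported as on disk
        os.fsync(fd)           # final flush so both runs end up durable
    finally:
        os.close(fd)
    return time.perf_counter() - start

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "blocks.bin")
    for sync_each in (True, False):
        elapsed = write_blocks(target, blocks=200, block_size=64 * 1024, sync_each=sync_each)
        print(f"sync_each={sync_each}: {elapsed:.3f}s for {os.path.getsize(target)} bytes")
```

On many disks the per-block `fsync` run is noticeably slower, though the size of the gap depends heavily on the hardware and filesystem.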

                        kagbasi-wgsdac @olivierlambert

                        @olivierlambert Yes sir, it is, and I'm glad I confirmed this for myself. Thanks also for helping me understand how the VM Suspend process works. Hopefully this post helps other newbies reach the same understanding in the future.

                        • olivierlambert marked this topic as a question
                        • olivierlambert has marked this topic as solved