XCP-ng

    Storage IOPS and throughput dropouts?

    • paco

      In my throughput/IO graphs on Xen Orchestra, I'm seeing these spikes of 0 IOPS, 0 throughput, 0 latency, 0 IOWait. What I don't know is whether this is some kind of graphing artifact (i.e., it's not really happening, but the graph makes it look like it's happening), or whether I'm experiencing some kind of underlying hard disk throughput issue.

      I'm in the middle of migrating a VM and it's estimating 12 hours to move 100G, which is long. I moved a couple 20G VMs this morning at 15-20 minutes each.
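For scale, the figures above can be turned into rough average transfer rates. This is only a back-of-the-envelope sketch: it assumes "G" means GiB and that the 12-hour estimate is accurate, and the helper name `throughput_mib_s` is made up for illustration.

```python
def throughput_mib_s(size_gib, seconds):
    """Average transfer rate in MiB/s for size_gib moved in the given time."""
    return size_gib * 1024 / seconds


# Figures quoted in the post; "G" is assumed to mean GiB.
slow = throughput_mib_s(100, 12 * 3600)    # 100G at the 12-hour estimate
fast_low = throughput_mib_s(20, 20 * 60)   # 20G in 20 minutes
fast_high = throughput_mib_s(20, 15 * 60)  # 20G in 15 minutes

print(f"current migration:  {slow:.1f} MiB/s")
print(f"earlier migrations: {fast_low:.1f} to {fast_high:.1f} MiB/s")
print(f"slowdown factor:    {fast_low / slow:.1f} to {fast_high / slow:.1f}")
```

Under those assumptions the current migration averages about 2.4 MiB/s against roughly 17 to 23 MiB/s earlier in the day, so it is running about 7 to 10 times slower.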

      My question is, where could I look in XCP-ng logs to investigate? How could I get a sense of whether I'm really running into problems here? I've looked at these graphs many times over the years, and these spikes are new. This hasn't always looked like this.

      Any thoughts?

      The underlying storage is an LSI 9266-8i SAS RAID controller with 3 hard disks in a RAID5 array (hardware RAID). I don't think I'm experiencing any other major issues, but maybe I am and don't know how to see it?

      I'm running XCP-ng 8.2.1 and Xen Orchestra built from sources (well, from ronivay's Docker container): Xen Orchestra commit e75cc, Master commit 70014.

      Screenshot 2024-08-09 at 16.04.07.png

      • nikade Top contributor

        I see the same thing in the "Last 10 minutes" view; if I switch to any other view, the graphs are more even. I'm 100% sure I don't have a performance issue with my RAID6 SSD array, yet my graphs look like yours, with the "spikes":

        b1b6032c-12c5-4abe-9866-4405c7ecc746-bild.png
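One possible reason the coarser views look more even is plain averaging: when fine-grained samples are consolidated into longer buckets, an isolated 0 is diluted by its neighbours. The sketch below illustrates the effect with made-up numbers. The 5-second and 60-second intervals and the function name `downsample_mean` are assumptions for illustration, not confirmed settings of the Xen Orchestra graphs.

```python
def downsample_mean(samples, factor):
    """Average consecutive groups of `factor` samples.

    A trailing partial group is dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


# Hypothetical IOPS readings, one every 5 seconds, with isolated zeros.
fine = [120, 118, 0, 121, 119, 122, 117, 120, 0, 121, 118, 120] * 2

# 12 samples of 5 s each become one value per 60 s.
coarse = downsample_mean(fine, 12)

print("min at 5 s resolution: ", min(fine))
print("min at 60 s resolution:", round(min(coarse), 1))
```

This only shows why zeros can disappear at lower resolution. It does not say whether the zeros in the fine-grained view are real stalls or missing samples.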

        • olivierlambert Vates 🪐 Co-Founder CEO

          Hi,

          Do you have any functional issues in the first place?

          • paco @nikade

            @nikade You misunderstand. I am calling them "spikes", a word most people use to mean things going dramatically up. Your graph shows spikes going up. Look closely at my graph, including the point I have highlighted. These "spikes" are 0. That is, it's humming along and then suddenly there's a spike down to 0. That's unlikely because the system is plenty active. So your graph shows spikes up, which is normal. Mine is showing occasional drops to 0, which is what I'm asking about.

            • paco @olivierlambert

              @olivierlambert Nothing dramatic. Just things "feeling slower than usual." That's qualitative, just a gut hunch. That's why I'm asking about ways to test or investigate.
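One way to test whether the zeros are real is to bypass the graphs and sample the kernel's own block-device counters on the host. The sketch below parses `/proc/diskstats` snapshots and turns two of them into per-interval IOPS and throughput, flagging intervals in which no I/O completed. It assumes a Linux host with Python 3.6 or later available; the device name `sda`, the one-second interval, and the function names are placeholders, not details taken from this setup.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def parse_diskstats(text, device):
    """Return (reads, writes, sectors_read, sectors_written) for one device."""
    for line in text.splitlines():
        fields = line.split()
        # Layout: major minor name reads merged sectors ms writes merged sectors ...
        if len(fields) >= 10 and fields[2] == device:
            return (int(fields[3]), int(fields[7]),
                    int(fields[5]), int(fields[9]))
    raise ValueError("device not found: " + device)


def rates(prev, curr, seconds):
    """Turn two counter snapshots into (IOPS, MiB/s) over the interval."""
    ios = (curr[0] - prev[0]) + (curr[1] - prev[1])
    sectors = (curr[2] - prev[2]) + (curr[3] - prev[3])
    return ios / seconds, sectors * SECTOR_BYTES / seconds / 2**20


def read_snapshot(device):
    """Read the current counters for `device` from /proc/diskstats."""
    with open("/proc/diskstats") as f:
        return parse_diskstats(f.read(), device)


def monitor(device="sda", interval=1.0, count=60):
    """Print one line per interval; the device name is a placeholder."""
    prev = read_snapshot(device)
    for _ in range(count):
        time.sleep(interval)
        curr = read_snapshot(device)
        iops, mib_s = rates(prev, curr, interval)
        flag = "  <-- no I/O completed" if iops == 0 else ""
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  {iops:8.1f} IOPS  {mib_s:8.2f} MiB/s{flag}")
        prev = curr

# Example call (runs for about a minute): monitor("sda", interval=1.0, count=60)
```

If intervals with no completed I/O show up here while VMs are busy, the drops are probably real and worth following up in the host logs and the RAID controller's own tools. If they never appear, a graphing artifact becomes more likely.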

              • nikade Top contributor @paco

                @paco said in Storage IOPS and throughput dropouts?:

                @nikade You misunderstand. I am calling them "spikes", a word most people use to mean things going dramatically up. Your graph shows spikes going up. Look closely at my graph, including the point I have highlighted. These "spikes" are 0. That is, it's humming along and then suddenly there's a spike down to 0. That's unlikely because the system is plenty active. So your graph shows spikes up, which is normal. Mine is showing occasional drops to 0, which is what I'm asking about.

                What happens if you choose another period, for example hourly or daily?
                I agree about the spiking up vs. down. I didn't even think about that; it does look a bit strange.

                I've now checked every SR at work and I can only see a similar graph on one of our hosts, which is pretty idle. It is a local RAID6 and the VMs on there feel pretty snappy:

                55b9f7ff-47a5-4451-a73c-5ed216cdcc07-image.png

                I'm starting to wonder whether this has anything to do with the write cache on the RAID controller. When I check our shared SRs, there is nothing like this.

                • paco @nikade

                  @nikade Short answer is that the spikes seem to have smoothed out. I'm not really sure what happened. I don't have them at any resolution now. Perhaps some kind of heisenbug.

                  • nikade Top contributor @paco

                    @paco said in Storage IOPS and throughput dropouts?:

                    @nikade Short answer is that the spikes seem to have smoothed out. I'm not really sure what happened. I don't have them at any resolution now. Perhaps some kind of heisenbug.

                    Happy to hear that 🙂
