
    VDI_IO_ERROR Continuous Replication on clean install.

      yomono @olivierlambert

      @olivierlambert not really. This time it's just local ext storage on SATA drives.

        olivierlambert Vates 🪐 Co-Founder CEO

        In LVM or thin? It might be 2 different problems, so I'm trying to sort this out.

          yomono @olivierlambert

          @olivierlambert both! I have a mix of both on my servers, and I tried both when I ran the tests.

            Tristis Oris Top contributor

            I just remembered I have one server with a fresh 8.2.1 install and NFS backups to TrueNAS. It's working.
            I'll do other tests tomorrow.

              Tristis Oris Top contributor @olivierlambert

              @olivierlambert
              sr_not_supported is not an error and not the cause. It comes from the default Dell multipath config for the 3xxx series. It also appears on 8.2.0, where CR works, so it's just a warning.
              Since we had no problems before, we never looked into this setting. My bad again 😃 yay.

              I replaced it with the official config for the 4xxx series and the warning is gone. I see that in 8.3 the default is already more universal and covers any generation.

                      device {
                              vendor "DellEMC"
                              product "ME4"
                              path_grouping_policy "group_by_prio"
                              path_checker "tur"
                              hardware_handler "1 alua"
                              prio "alua"
                              failback immediate
                              path_selector "service-time 0"
                      }
              

              There is no default config for Huawei, so we have always used the official one.

                      device {
                              vendor                  "HUAWEI"
                              product                 "XSG1"
                              path_grouping_policy    multibus
                              path_checker            tur
                              prio                    const
                              path_selector           "round-robin 0"
                              failback                immediate
                              fast_io_fail_tmo        5
                              dev_loss_tmo            30
                      }
              
              
              • 8.2.1:
                • CR not working:
                  Huawei and Dell iSCSI, multipath enabled
                  Huawei and Dell iSCSI, multipath disabled
                • CR working:
                  NFS VM disk
                  local thin (ext)
                  local thick (LVM)

              • 8.3:
                • CR working:
                  Huawei and Dell iSCSI, multipath enabled
                  local thick (LVM)

              And now the interesting part. After I fixed this false warning, detached the extra hosts from the pool, and removed all additional links (trunk, backup) to reduce traffic and log noise, no SMlog entries at all are generated during the backup task.

              MP enabled, with a 2nd link for backup: https://pastebin.com/URcnDckR
              MP enabled, management link only, no SMlog generated: https://pastebin.com/RHw40uzg

                olivierlambert Vates 🪐 Co-Founder CEO

                🤔 I have the impression it's good news, but I'm not 100% sure I get it. Can you rephrase your conclusion a bit?

                  Tristis Oris Top contributor @olivierlambert

                  if i have no smlog - xen\dom0 not related with backup task. right?
                  The SMlog I usually get during these 5 minutes has no errors anyway, only some locking operations.
                  And it always takes 5 minutes. Is some timing hardcoded?

                  Don't forget that the problem also happens over an FC connection, so it may concern any block-based storage type.

                    olivierlambert Vates 🪐 Co-Founder CEO

                    I don't understand your sentence. Can you take the time to re-read it or rephrase it? It doesn't make sense to me, sorry 😞

                    What do you mean by "if i have no smlog - xen\dom0 not related with backup task. right?"?

                      Tristis Oris Top contributor @olivierlambert

                      I mean it could be an XO issue, since it isn't communicating with Xen. Otherwise it would write some logs.

                        olivierlambert Vates 🪐 Co-Founder CEO

                        I don't see the logical connection with XO, since it works on some SR and not on others. XO has no idea (or doesn't care) about the underlying storage.

                          Tristis Oris Top contributor

                          Well, I just ran some tests and got some results. I have no idea how it should work :)

                            Tristis Oris Top contributor

                            I don't understand what's happening.
                            I reinstalled the host with 8.2.0 and CR succeeded a few times, but now I get this error again.

                            I ran a few more tests: 2-3 failures in a row, then it succeeds again.
                            The only option is to never use this pool for CR.

                              EddieCh08666741

                              I also have a fresh 8.2.1 installation with a similar error, at 5 min 2 s and 5 min 1 s 😞

                                yomono

                                On my side, yesterday I ran the only test I hadn't done so far: installing Xen Orchestra on a non-XCP-ng server.
                                Until now I had always used a separate XCP-ng server with a single VM on it: the XO VM. (Over time that VM ran Ubuntu, CentOS, and Debian, so the guest OS has nothing to do with this.)
                                My solution was simple: bare-metal Linux. So the problem wasn't the XCP-ng version on the source server, nor on the destination server. It was the host server of the XO VM itself.
                                Why? I have no idea, but it's definitely working now: yesterday I started a CR task for a 1 TB VM, with the destination server reached over the internet, and it's still exporting after 14 hours without any issues:
                                7f769e0c-6880-4e21-8977-7e1688db24f5-imagen.png

                                  yomono @yomono

                                  @yomono And to clarify: for the host running the XO VM, I tested fresh installs of 8.2.0, 8.2.1, and 8.3.0, and all of them failed at exactly 5 minutes.
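
A failure that always lands at the same duration can be checked mechanically. The following is only a rough sketch: the function name `failedAtTimeout`, the 5-second tolerance, and the candidate timeout value are illustrative assumptions, and the sample durations are the 5 min 2 s and 5 min 1 s figures reported earlier in this thread.

```javascript
// Candidate fixed timeout to test against: 5 minutes, in milliseconds (assumption).
const CANDIDATE_TIMEOUT_MS = 5 * 60 * 1000;

// Hypothetical helper: returns true when a job failed at, or shortly after,
// the candidate timeout. `toleranceMs` absorbs startup and teardown overhead.
function failedAtTimeout(durationMs, timeoutMs = CANDIDATE_TIMEOUT_MS, toleranceMs = 5000) {
  return durationMs >= timeoutMs && durationMs - timeoutMs <= toleranceMs;
}

// Durations of failed CR jobs reported in this thread: 5 min 2 s and 5 min 1 s.
const failures = [302000, 301000];

// If every failure sits right at the candidate timeout, a fixed timer is a
// more likely culprit than the storage backend.
const allAtTimeout = failures.every((durationMs) => failedAtTimeout(durationMs));
console.log(allAtTimeout); // true
```

If the check comes back true for every failed run, regardless of SR type or XCP-ng version, that points at a timer somewhere in the transfer path rather than at the storage itself.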

                                    olivierlambert Vates 🪐 Co-Founder CEO

                                    This means your Node version was still using the default timeout.

                                      yomono @olivierlambert

                                      @olivierlambert when you say "node", do you mean Node.js? How can that timeout be changed? Thanks

                                        EddieCh08666741

                                        Will changing Node fix this VDI error?

                                        e86eee8c-b1da-4d69-9776-11fd126348cd-image.png

                                          EddieCh08666741 @EddieCh08666741

                                          @EddieCh08666741 The one which works is the fresh install without any updates.

                                            olivierlambert Vates 🪐 Co-Founder CEO @yomono

                                            @yomono NodeJS, yes. In Node 18, they made a breaking change that sets a 5-minute timeout by default.

                                            We fixed that by adding a specific config to get a longer timeout, see https://github.com/vatesfr/xen-orchestra/commit/f6fd1db1ef12633cc5bb8ec8ab5bc84682dd3fe7

                                            Without this piece of config, any HTTP stream gets cut off after exactly 5 minutes.

                                            julien-f committed to vatesfr/xen-orchestra
                                            feat(xo-server): increase HTTP server request timeout to 1 day
                                            
                                            Fixes #6590
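
For readers who want to see the underlying knob, here is a minimal sketch at the plain Node.js level. It is not the xo-server code from the commit. Node's `http.Server` has a `requestTimeout` property, in milliseconds, that limits how long the server waits to receive an entire request from the client. The helper name `createLongRunningServer` and the trivial handler are hypothetical; the 1-day value simply mirrors the commit title.

```javascript
const http = require('node:http');

// One day in milliseconds, mirroring the "1 day" in the commit title.
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: creates an HTTP server whose request timeout is raised
// so that long uploads are not cut off by the default.
function createLongRunningServer(handler, requestTimeoutMs = ONE_DAY_MS) {
  const server = http.createServer(handler);
  // requestTimeout is the maximum time allowed for receiving the whole
  // request from the client; 0 disables the limit entirely.
  server.requestTimeout = requestTimeoutMs;
  return server;
}

// The server is created but never listens, so this script exits immediately.
const server = createLongRunningServer((req, res) => res.end('ok'));
console.log(server.requestTimeout); // 86400000
```

On an XO installed from sources, the practical fix is to update to a build that includes the commit above rather than to patch Node itself.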