XCP-ng

    XO task watcher issue/CR broken

      Andrew Top contributor @julien-f

      @julien-f Any new thoughts on this issue?

        julien-f Vates 🪐 Co-Founder XO Team @Andrew

        @Andrew Yes, I'm working on this and should have something to test tomorrow 🙂

          julien-f Vates 🪐 Co-Founder XO Team @julien-f

          @Andrew @Gheppy, I have something new to test. It's an important change to a low-level component of XO, so hopefully I did not break anything 🤞

          The cr-issue branch has been rebased, make sure to reset it properly, re-install dependencies and rebuild:

          cd xen-orchestra/
          git checkout cr-issue
          git fetch
          git reset --hard origin/cr-issue
          yarn
          yarn build
          

          Let me know if you have any issues 🙂

            Gheppy @julien-f

            @julien-f
            I'll test it shortly; I am in the middle of a CR run for the next 6 hours.

              Andrew Top contributor @julien-f

              @julien-f Broken. Job failed. Lots of Error: Premature close. Data not replicated.

              Job report: job.txt (text format)

              Journal XO output: out.txt.gz (GZ format).

                Gheppy

                same error

                CR log
                CR-log.txt

                XOCE log
                Srv-log.txt

                  julien-f Vates 🪐 Co-Founder XO Team @Gheppy

                  I still don't understand the issue exactly, and I'm not sure whether it comes from XO or from XCP-ng/XenServer, but the latest version integrates a workaround: if you encounter a Premature close error during CR, you can add the following to your xo-server configuration file (usually /etc/xo-server/config.toml):

                  [xapiOptions]
                  ignorePrematureClose = true
                  

                  It's not enabled by default until we completely understand the root cause and it's properly fixed.

                  @Gheppy Thanks a lot for your tests and feedback 🙂

                  @Andrew Thank you very much for the test appliance, it was an invaluable help investigating this. If you can keep it online for the time being I'll probably have further tests to do with it next week 🙏

                    Andrew Top contributor @julien-f

                    @julien-f I updated XO Source to current master and added the new ignorePrematureClose=true option. Backup ran the CR correctly again.

                    Yes, I can leave the XOA test tunnel up for testing. I'm happy to help you help me!

                      olivierlambert Vates 🪐 Co-Founder CEO

                      @Andrew do you still have the issue without the ignorePrematureClose?

                        Andrew Top contributor @olivierlambert

                        @olivierlambert Yes. There are still problems on the new code without the option set: 90% of the VMs fail and 10% finish correctly on CR.

                          olivierlambert Vates 🪐 Co-Founder CEO

                          Thanks for your valuable feedback 👍

                            Gheppy

                            Just for info:
                            I installed the latest version of XOCE, commit 083db67df9e1645a2f8fe2fac564b3aecf30d55e.
                            CR is OK and can be used in case of disaster, with ignorePrematureClose = true.
                            I managed to start the VM itself (that is, the CR copy) without problems.
                            At the moment, copying the VM that CR created on the second server is very slow (18Mb max), even though the same connection reaches 400Mb for a CR copy. I want to make a copy of the CR VM and start it.

                              magicker @Gheppy

                              @Gheppy weird.. DR is now working for me but CR is still never ending

                                Gheppy @magicker

                                @magicker
                                You need to add this to the config file.
                                For me, the location is /opt/xen-orchestra/packages/xo-server/.xo-server.toml

                                [xapiOptions]
                                ignorePrematureClose = true
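Editor's sketch (not part of the original post): the thread mentions two different locations for the xo-server configuration, so a quick way to see which of them exists on a given machine is to test each candidate path. The two paths are the ones quoted in this thread; the names `CANDIDATE_PATHS` and `existing_configs` are hypothetical, and other installs may use other locations.

```python
from pathlib import Path

# The two locations mentioned in this thread; other setups may differ.
CANDIDATE_PATHS = [
    "/etc/xo-server/config.toml",
    "/opt/xen-orchestra/packages/xo-server/.xo-server.toml",
]


def existing_configs(candidates):
    """Return the candidate paths that exist as regular files, in order."""
    return [Path(p) for p in candidates if Path(p).is_file()]


if __name__ == "__main__":
    found = existing_configs(CANDIDATE_PATHS)
    if not found:
        print("No configuration file found at the known locations")
    for path in found:
        print(path)
```

This only reports which files exist; it makes no claim about which one xo-server reads when more than one is present.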
                                
                                  magicker @Gheppy

                                  @Gheppy Yes.. putting that in place seemed to fix DR.. but not CR.

                                    olivierlambert Vates 🪐 Co-Founder CEO

                                    🤔 Any log or something @magicker ? Are you sure it's a CR started after modifying the config and restarting xo-server?

                                      magicker @olivierlambert

                                      @olivierlambert actually.. just tried CR again... and this time it is working.. will test with a few more vms.

                                        magicker @olivierlambert

                                        @olivierlambert spoke too soon.. CR works in one direction but not the other
                                        https://images.dx3webs.com/kgSC4E.png

                                          julien-f Vates 🪐 Co-Founder XO Team @magicker

                                          @magicker Any differences between the two hosts? XCP-ng versions maybe?

                                            JamfoFL @julien-f

                                            Jumping in as another "victim"... I have lost almost all ability to run any kinds of backups.

                                            System information:

                                            XO from Sources
                                            XO-Server: 5.109.0
                                            XO-Web: 5.111.0
                                            Commit: d4bbf

                                            I had no issues running any kinds of backups prior to applying the latest commits two weeks ago, Monday, February 6, 2023. After applying that commit, everything seemed to be OK until late that evening. At that time, one of my CR jobs started and never completed. All of the subsequent jobs failed, of course, because another job was already running. I was able to delete the job and restarted the toolstacks and rebuilt a new job. This started to spread until, eventually, all of my CR jobs would fail. That issue has now spread and my DR jobs are now failing, too. Out of the seven backup jobs that were working with no issues prior to that date, I am now only able to run one job successfully, and that is a DR job.

                                            The DR and CR jobs are written to different repositories (the CR jobs are written to the storage repository on one of the other XCP-ng servers and the DR jobs are written to an NFS share).

                                            I tried running the cr-issue branch update that @julien-f recommended, and modified my config.toml file with the ignorePrematureClose = true code that was also recommended. Nothing works.

                                            If I try to run a DR job now, even a brand-new one, it looks like it wants to start and create the snapshot, but then fails and actually shows it will start in 53.086 years.

                                            Anything I can get, just let me know what to pull and I'll add it here... hopefully that will help. Right now, I'm able to only back up a single VM!
