
    XOA Failing on Check-in

    Xen Orchestra
    19 Posts 6 Posters 782 Views 5 Watching
    • DustinB

      Is anyone else having issues with XOA failing on check-in? See below. Everything else is functioning as expected, so I'm not sure whether I should be worried.

      fd283691-7815-4881-9a8d-f35c0dbf9d8b-image.png

      • Danp (Pro Support Team)

        Not for me. Maybe a peering issue with your ISP. Have you tried running a tracert?

        • olivierlambert (Vates 🪐 Co-Founder CEO)

          No issues here either.

          • acomav

            Hi,
            I have also started having this issue.

            My error:

            ✖ 15/16 - Internet connectivity: AggregateError [ETIMEDOUT]: 
                at internalConnectMultiple (node:net:1118:18)
                at internalConnectMultiple (node:net:1186:5)
                at Timeout.internalConnectMultipleTimeout (node:net:1712:5)
                at listOnTimeout (node:internal/timers:583:11)
                at process.processTimers (node:internal/timers:519:7) {
              code: 'ETIMEDOUT',
              url: 'http://xen-orchestra.com/',
              [errors]: [
                Error: connect ETIMEDOUT 185.78.159.93:80
                    at createConnectionError (node:net:1648:14)
                    at Timeout.internalConnectMultipleTimeout (node:net:1707:38)
                    at listOnTimeout (node:internal/timers:583:11)
                    at process.processTimers (node:internal/timers:519:7) {
                  errno: -110,
                  code: 'ETIMEDOUT',
                  syscall: 'connect',
                  address: '185.78.159.93',
                  port: 80
                },
                Error: connect ENETUNREACH 2a01:240:ab08::4:80 - Local (:::0)
                    at internalConnectMultiple (node:net:1182:16)
                    at Timeout.internalConnectMultipleTimeout (node:net:1712:5)
                    at listOnTimeout (node:internal/timers:583:11)
                    at process.processTimers (node:internal/timers:519:7) {
                  errno: -101,
                  code: 'ENETUNREACH',
                  syscall: 'connect',
                  address: '2a01:240:ab08::4',
                  port: 80
                }
              ]
            }
            
            

            I have two XOA appliances running in different locations. One works fine, but its XOA version is 5.95.1.
            The one that has started failing is running the latest version, 5.100.2.

            Traceroute from the working XOA (I'm in Australia, hence the long response times) gets as far as:

            ...
            16 prs-b1-link.ip.twelve99.net (62.115.125.167) 282.574 ms 282.700 ms freeprosas-ic-367227.ip.twelve99-cust.net (80.239.167.129) 303.985 ms
            17 freeprosas-ic-367227.ip.twelve99-cust.net (80.239.167.129) 302.850 ms 302.835 ms be1.er02.lyo03.jaguar-network.net (85.31.194.151) 309.182 ms
            18 cpe-et008453.cust.jaguar-network.net (85.31.197.135) 310.999 ms be1.er02.lyo03.jaguar-network.net (85.31.194.151) 308.157 ms 308.477 ms
            19 * cpe-et008453.cust.jaguar-network.net (85.31.197.135) 318.785 ms 309.982 ms

            From the non-working XOA:

            ...
            10 * be803.lsr01.prth.wa.vocus.network (103.1.76.147) 106.498 ms be803.lsr01.stpk.wa.vocus.network (103.1.76.145) 109.750 ms
            11 * * *
            12 * * *
            13 * * *
            14 mei-b5-link.ip.twelve99.net (62.115.134.228) 244.552 ms mei-b5-link.ip.twelve99.net (62.115.113.2) 243.988 ms mei-b5-link.ip.twelve99.net (62.115.124.123) 258.259 ms
            15 freeprosas-ic-373578.ip.twelve99-cust.net (62.115.35.93) 256.427 ms * *
            16 be1.er02.lyo03.jaguar-network.net (85.31.194.151) 279.685 ms 276.070 ms *

            On the new XOA, I can manually telnet to 185.78.159.93 on port 80 and get a response, so I am at a loss.
            It is not affecting day-to-day work.
            I was going to download the latest version of the XOA appliance and import my config to see if that does the trick, unless anyone here has other tests to run?

            • acomav (replying to @acomav)

              Replying to myself here for an update.

              I reinstalled the XOA appliance (on a different host in a different pool) and imported my config.
              That took me back to XOA v5.98.1. Internet connectivity was fine.
              I stayed on the Stable Channel and went up to 5.99.1. Internet connectivity was fine.

              I have a Pool issue where the XOA was and I can't fix it until tonight. (I need to upgrade and reboot the master.)

              • acomav (replying to @acomav)

                @acomav Replying to myself again. After it worked for a few days, the issue came back. I'll raise a ticket.

                • turnt (replying to @acomav)

                  @acomav Same issue here. I upgraded to 5.100.2 and this appeared. As with you, it doesn't appear to be affecting anything, and I changed nothing on any of my systems or network; the only change was the XOA upgrade.

                  • DustinB

                    Well then, I guess I'm glad to say I'm not insane. I was thinking maybe it was network latency or something else (which is completely plausible), but as far as I could tell, XOA is unable to ping the domain.

                    That doesn't make any sense; it's not like the website has gone offline.

                    • acomav

                      I was able to fix it on mine by disabling IPv6 (which we don't run).

                      sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
                      sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1

                      To verify that IPv6 is disabled, run:

                      cat /proc/sys/net/ipv6/conf/all/disable_ipv6

                      If the output is 1, IPv6 is disabled.

                      This is a temporary fix that only lasts until the next reboot. Read here for a permanent solution:

                      https://bobcares.com/blog/debian-12-disable-ipv6/

                      After disabling IPv6, 'xoa check' immediately started working.
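
One common way to make the two sysctl settings above survive a reboot on Debian-based systems is a drop-in file under /etc/sysctl.d. This is only a minimal sketch: the function name `write_ipv6_dropin` and the file name `99-disable-ipv6.conf` are assumptions for illustration, and the linked article may describe a different method.

```shell
#!/bin/sh
# Sketch: write the two settings from the post into a sysctl drop-in file.
# The function name and the file name 99-disable-ipv6.conf are hypothetical.
write_ipv6_dropin() {
    # Target directory; defaults to the standard drop-in location.
    dropin_dir="${1:-/etc/sysctl.d}"
    dropin_file="$dropin_dir/99-disable-ipv6.conf"
    printf '%s\n' \
        'net.ipv6.conf.all.disable_ipv6 = 1' \
        'net.ipv6.conf.default.disable_ipv6 = 1' > "$dropin_file"
    # Print the path that was written so the caller can inspect it.
    printf '%s\n' "$dropin_file"
}
```

Called as root with no argument, it writes to /etc/sysctl.d; running `sysctl --system` afterwards should load the file without a reboot.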

                      • olivierlambert (Vates 🪐 Co-Founder CEO)

                        So it could be a peering issue in IPv6. On my side, no problem. Where are you based?

                        • acomav (replying to @olivierlambert)

                          @olivierlambert

                          I am in Australia.

                          • Danp (Pro Support Team)

                            Isn't this the same issue caused by the Node upgrade? https://github.com/nodejs/node/issues/54359

                            notr1ch created this issue in nodejs/node

                            closed Happy eyeballs implementation times out prematurely #54359

                            • acomav (replying to @Danp)

                              @Danp Interesting. That will be it. Thanks for linking this.
                              In the meantime, I've put in a request with the Australian government to move us closer to Europe.
                              😄

                              • HamiltonWDS

                                I can confirm that I have had the same issue described above, and I have a fix for it as well.
                                XOA did work after deploying. But after I updated to the latest version ("5.101.0 - XOA build: 20241004") and rebooted, updates were no longer possible, along with some other minor things.

                                I applied the fix from the "Happy eyeballs implementation times out prematurely" link above by editing the "/etc/systemd/system/env" file.
                                Commands:

                                sudo nano /etc/systemd/system/env
                                

                                Add the NODE_OPTIONS line after the HOME line, as shown below:

                                HOME=/tmp
                                NODE_OPTIONS='--network-family-autoselection-attempt-timeout=500'
                                

                                Write out (CTRL-O) and exit (CTRL-X).
                                Reboot XOA.
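
For anyone who prefers not to edit the file by hand, the same change can be sketched as a small shell function. This is an illustration only: the function name `set_attempt_timeout` is hypothetical, the env file path is the one given in the post, and 500 is simply the value used there.

```shell
#!/bin/sh
# Sketch: append the NODE_OPTIONS line from the post to an env file, once.
# set_attempt_timeout is a hypothetical helper, not part of XOA.
set_attempt_timeout() {
    env_file="$1"
    timeout_ms="${2:-500}"
    # Refuse to touch a file that already sets NODE_OPTIONS, so that
    # existing options are not duplicated or silently overridden.
    if grep -q '^NODE_OPTIONS=' "$env_file" 2>/dev/null; then
        echo "NODE_OPTIONS already set in $env_file; merge the flag by hand" >&2
        return 1
    fi
    # Make sure the existing last line ends with a newline before appending.
    if [ -s "$env_file" ] && [ -n "$(tail -c 1 "$env_file")" ]; then
        printf '\n' >> "$env_file"
    fi
    printf "NODE_OPTIONS='--network-family-autoselection-attempt-timeout=%s'\n" \
        "$timeout_ms" >> "$env_file"
}
```

Run as root against /etc/systemd/system/env, it would be called as `set_attempt_timeout /etc/systemd/system/env 500`, followed by the XOA reboot described above.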

                                • olivierlambert (Vates 🪐 Co-Founder CEO)

                                  So it's the case in an IPv6 context I assume, right?

                                  • DustinB (replying to @olivierlambert)

                                    @olivierlambert In my case, I found an errant DNS setting, configured before I worked here, that was causing some issues. This issue became obvious after some XOA update, which likely exacerbated it.

                                    After I fixed my DNS issue, network-wide performance greatly improved, and XOA hasn't reported this specific issue since.

                                    • olivierlambert (Vates 🪐 Co-Founder CEO)

                                      @julien-f what do you think?

                                      • HamiltonWDS (replying to @olivierlambert)

                                        @olivierlambert said in XOA Failing on Check-in:

                                        So it's the case in an IPv6 context I assume, right?

                                        I think the timeout setting is the primary cause of the problem, with IPv6 as a secondary cause.
                                        I am also in Australia, and just pinging the address that I see the XOA Updater use, over IPv4, takes about 320 ms on average.
                                        As the initial message in that "Happy eyeballs" link says: "Node tries to connect to the A address with only 250ms timeout, insufficient for many real-world cases (cellular/satellite links, poorly connected ISPs, far away servers, packet loss, etc). This times out, so node proceed to the last candidate which is supposed to have a longer timeout, however the last candidate is an AAAA address and the host has no IPv6 connectivity so it immediately fails"

                                        So it seems that when the IPv4 attempt times out on networks with latency beyond 250 ms, Node then moves on to an IPv6 address. For those on IPv4-only networks, the connection never succeeds.

                                        The resolution, which does not appear to be implemented in Node.js, would be for Node to try an IPv4 address last, or at least to give the last IPv4 address the longer timeout.
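
To check whether a given site is in the situation described above, the IPv4 TCP connect time can be compared against the 250 ms attempt timeout quoted from the Node issue. The following is a rough sketch, assuming `curl` and `awk` are available on the appliance; both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: measure the IPv4 TCP connect time to a URL, in seconds.
# curl's time_connect includes DNS lookup, so the lookup time is subtracted.
# Needs network access; connect_time_v4 is a hypothetical helper name.
connect_time_v4() {
    curl -4 -s -o /dev/null -w '%{time_namelookup} %{time_connect}\n' "$1" |
        awk '{ printf "%.3f\n", $2 - $1 }'
}

# Succeeds (exit 0) when a connect time given in seconds is longer than
# an attempt timeout given in milliseconds.
exceeds_attempt_timeout() {
    awk -v t="$1" -v limit_ms="$2" 'BEGIN { exit !(t * 1000 > limit_ms) }'
}

# Example usage (requires network, so it is left commented out):
#   t=$(connect_time_v4 http://xen-orchestra.com/)
#   if exceeds_attempt_timeout "$t" 250; then
#       echo "IPv4 connect took ${t}s, longer than the 250 ms attempt timeout"
#   fi
```

With the roughly 320 ms round trip reported from Australia, the check would flag the 250 ms value and pass the 500 ms value used in the NODE_OPTIONS workaround.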

                                        • olivierlambert (Vates 🪐 Co-Founder CEO)

                                          Thank you very much for the clear explanation @HamiltonWDS! This will be really helpful for finding a solution. We might default to IPv4 to cope with much higher latency, and in the future maybe have mirrors in closer locations. Let me discuss this internally and schedule an action for that.
