    HA failover reaction time question

      dsmteam @olivierlambert

      @olivierlambert Thanks a lot.
      We have no SPOF and a full-fiber 100Gb spine/leaf network infrastructure, so I will give it a go (we are currently only on a test platform, so I can experiment as much as I need 🙂).

        olivierlambert Vates 🪐 Co-Founder CEO

        Great, keep us posted!

          dsmteam @olivierlambert

          @olivierlambert I just tried it, but there is no change in reaction time.
          After googling this parameter, I found this page you wrote (small world) on the xcp-ng.org website: https://xcp-ng.org/blog/2024/08/22/xcp-ng-high-availability-a-guide/. It indicates that the purpose of this timeout is self-fencing in case of a loss of network/storage. (I actually had this page open in my browser already but missed that line.)
          It doesn't seem to influence the restart timer in case of a full host failure.

            Danp Pro Support Team @dsmteam

            @dsmteam Did you try disabling and then enabling HA again to be sure that the new setting was being used?
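A minimal sketch of that disable/re-enable cycle, assuming the setting in question is the HA timeout passed as `ha-config:timeout` to `xe pool-ha-enable` (the thread does not show the exact command that was used). The helper name `ha_reenable_commands`, the SR UUID placeholder, and the timeout value are hypothetical. The script only builds and prints the commands; it does not run them.

```python
import shlex


def ha_reenable_commands(heartbeat_sr_uuid, timeout_seconds):
    """Build the xe commands to turn HA off and back on with a given timeout.

    Hypothetical helper: it only returns argument lists and never executes them.
    """
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return [
        # Turn HA off so the pool drops its current HA configuration.
        ["xe", "pool-ha-disable"],
        # Turn HA back on; the timeout is only taken into account at this point.
        [
            "xe",
            "pool-ha-enable",
            f"heartbeat-sr-uuids={heartbeat_sr_uuid}",
            f"ha-config:timeout={timeout_seconds}",
        ],
    ]


# Placeholder UUID and timeout value, neither taken from the thread.
for command in ha_reenable_commands("<heartbeat-sr-uuid>", 20):
    print(shlex.join(command))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on a real pool you would review them and run them on the pool master.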

              dsmteam @Danp

              @Danp Oh..................
              Indeed, much faster now. Down from 2:00 minutes to 1:20 minutes
              Less than 10 seconds might be too aggressive.
              This is closer to what we expect.
              I can see in the GUI that when I bring a host down, the pool still takes a minute to consider the host down. Is there any way to decrease this timer further, or are there too many dependencies?

                olivierlambert Vates 🪐 Co-Founder CEO

                That's good progress 😄 For the other number, let me ask around 🙂

                  dsmteam @olivierlambert

                  @olivierlambert I think I found what I need in the following documentation:
                  https://xapi-project.github.io/features/HA/HA.html
                  It describes various parameters in /etc/xensource/xhad.conf, which must be the same on every host:

                  <parameters>
                    <HeartbeatInterval>4</HeartbeatInterval>
                    <HeartbeatTimeout>30</HeartbeatTimeout>
                    <StateFileInterval>4</StateFileInterval>
                    <StateFileTimeout>30</StateFileTimeout>
                    <HeartbeatWatchdogTimeout>30</HeartbeatWatchdogTimeout>
                    <StateFileWatchdogTimeout>45</StateFileWatchdogTimeout>
                    <BootJoinTimeout>90</BootJoinTimeout>
                    <EnableJoinTimeout>90</EnableJoinTimeout>
                    <XapiHealthCheckInterval>60</XapiHealthCheckInterval>
                    <XapiHealthCheckTimeout>10</XapiHealthCheckTimeout>
                    <XapiRestartAttempts>1</XapiRestartAttempts>
                    <XapiRestartTimeout>30</XapiRestartTimeout>
                    <XapiLicenseCheckTimeout>30</XapiLicenseCheckTimeout>
                  </parameters>
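
Since these values have to match on every host, one way to check that is to parse the `<parameters>` block from each host's copy of the file and compare the results. This is only a sketch: `read_parameters` and `diff_parameters` are hypothetical helpers, the sample text is the fragment quoted above, and it assumes every value is an integer. How the file contents are fetched from each host is left out.

```python
import xml.etree.ElementTree as ET

# The <parameters> fragment quoted in the post above.
SAMPLE = """<parameters>
  <HeartbeatInterval>4</HeartbeatInterval>
  <HeartbeatTimeout>30</HeartbeatTimeout>
  <StateFileInterval>4</StateFileInterval>
  <StateFileTimeout>30</StateFileTimeout>
  <HeartbeatWatchdogTimeout>30</HeartbeatWatchdogTimeout>
  <StateFileWatchdogTimeout>45</StateFileWatchdogTimeout>
  <BootJoinTimeout>90</BootJoinTimeout>
  <EnableJoinTimeout>90</EnableJoinTimeout>
  <XapiHealthCheckInterval>60</XapiHealthCheckInterval>
  <XapiHealthCheckTimeout>10</XapiHealthCheckTimeout>
  <XapiRestartAttempts>1</XapiRestartAttempts>
  <XapiRestartTimeout>30</XapiRestartTimeout>
  <XapiLicenseCheckTimeout>30</XapiLicenseCheckTimeout>
</parameters>"""


def read_parameters(xml_text):
    """Return {name: int} for each child of the first <parameters> element."""
    root = ET.fromstring(xml_text)
    # Accept either a bare <parameters> fragment or a document that nests it.
    params = root if root.tag == "parameters" else root.find(".//parameters")
    if params is None:
        raise ValueError("no <parameters> element found")
    return {child.tag: int(child.text) for child in params}


def diff_parameters(reference, other):
    """Return {name: (reference_value, other_value)} for every mismatch."""
    names = sorted(reference.keys() | other.keys())
    return {
        name: (reference.get(name), other.get(name))
        for name in names
        if reference.get(name) != other.get(name)
    }


host_a = read_parameters(SAMPLE)
# Simulate a second host whose file was edited by hand.
host_b = read_parameters(
    SAMPLE.replace("<HeartbeatTimeout>30<", "<HeartbeatTimeout>20<")
)
print(diff_parameters(host_a, host_b))  # {'HeartbeatTimeout': (30, 20)}
```

An empty result from `diff_parameters` means the two hosts agree on every parameter.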
                  
                    olivierlambert Vates 🪐 Co-Founder CEO

                    Explanations here: https://github.com/xapi-project/xen-api/pull/4169

                    No idea how to tinker with it, but happy to hear about your experiments 🙂

                    (Pull request #4169 in xapi-project/xen-api, opened by lindig, now closed: "Improve HA parameter derived from timeout")

                      dsmteam @olivierlambert

                      @olivierlambert Unfortunately, the parameters revert to their default values when I turn on HA. They might be hard-coded somewhere.

                        dsmteam @dsmteam

                        @dsmteam I am still browsing the web and various XO forums, but it looks like those parameters are set in the .c files and other compiled sources, so the XCP-ng builds are probably using those default values.
