XCP-ng

    Adding new host to pool fails - Stunnel SSL certificate verification failure

    • olivierlambert Vates 🪐 Co-Founder CEO

      Ping @Team-OS-Platform-Release

      • semarie Vates 🪐 XCP-ng Team XAPI & Network Team

        Just my 2 cents, but with SSL involved, time is important: could you check that the date is accurate on both hosts?

        Having the output of the following commands might help too:

        • stat /etc/stunnel/xapi-stunnel-ca-bundle.pem
        • openssl x509 -in /etc/stunnel/xapi-stunnel-ca-bundle.pem -noout -text
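As a small sketch of the first check, the function below reports whether a certificate file is missing, empty, or non-empty before it is handed to openssl. The name cert_file_state is made up for illustration; only the path comes from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of XCP-ng): classify a certificate file
# so an empty file is spotted before openssl reports a load error.
cert_file_state() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -s "$1" ]; then
        echo "empty"
    else
        echo "non-empty"
    fi
}

# Path taken from the commands above.
cert_file_state /etc/stunnel/xapi-stunnel-ca-bundle.pem
```

An "empty" result would explain why openssl x509 cannot load the file, without saying anything yet about why the file is empty.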
        • Bryanvh @semarie

          @semarie

          Maybe this points at an issue. It looks like the cert file is empty? And this is after I ran that command to refresh the cert. I get this same output for both the pool master and the host I am trying to add.

          84fc3624-7777-4f6a-b81f-c09586a63d05-image.jpeg

          Then the openssl x509 command says it's unable to load the cert or read it. I assume that's because it's empty?

          As for the time and date, yes the pool master and this server are in sync. At first, I had forgotten to set the new host to use the NTP pool during setup and Xen Orchestra helpfully yelled at me about that. Haha

          • semarie Vates 🪐 XCP-ng Team XAPI & Network Team

            Yes, if the file is empty, the openssl x509 command is expected to fail.
            Is it the same on the master?

            • Bryanvh @semarie

              @semarie
              Yes. This screenshot is from the pool master. But, both it and the new host had the same output.

              For clarity's sake, I have never applied an SSL cert to these hosts. This seems to be whatever built-in certs the system is using and signing.

              Is there a way to fix these certs? Was the xe host-refresh-server-certificate host=hostname command not the correct command to fix this?

              • semarie Vates 🪐 XCP-ng Team XAPI & Network Team

                Sorry, but this is outside my area of expertise. I'd prefer not to tell you to try something whose exact consequences I don't know.

                Could someone else reply?

                • LucienLassalle Vates 🪐 XCP-ng Team @semarie

                  @semarie I'll try to investigate to help you.

                  Is it possible to run:

                  • stat /etc/xensource/xapi-pool-tls.pem
                  • openssl x509 -in /etc/xensource/xapi-pool-tls.pem -noout -text
                  • stat /etc/xensource/xapi-ssl.pem
                  • openssl x509 -in /etc/xensource/xapi-ssl.pem -noout -text

                  (This file must exist; if not, I'd like the output of cat /etc/stunnel/xapi.conf.)
                  And I'd like the same output for /etc/xensource/xapi-ssl.pem.

                  If the certificate in /etc/xensource/xapi-pool-tls.pem has expired or is empty, you can run:
                  xe host-refresh-server-certificate host=$(hostname)
                  If the certificate in /etc/xensource/xapi-ssl.pem has expired or is empty, you can run:
                  xe host-emergency-reset-server-certificate

                  After running either of the two commands above, I recommend running: xe-toolstack-restart
                  (This should also restart stunnel@xapi.service.)
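The decision described above can be sketched as a script that only prints which command applies and never runs it. The helper names needs_renewal and suggest_fix are invented for this example. A file that openssl cannot parse is treated the same as an expired one, which is an assumption of this sketch rather than something stated in the post.

```shell
#!/bin/sh
# Sketch only: prints the command suggested in the post, never runs it.
# needs_renewal and suggest_fix are hypothetical helper names.

# Exit 0 if the file is missing or empty, or if openssl reports the
# certificate as expired (or cannot read it at all).
needs_renewal() {
    [ -s "$1" ] || return 0
    # -checkend 0 exits non-zero when the certificate has already expired.
    ! openssl x509 -in "$1" -noout -checkend 0 >/dev/null 2>&1
}

# $1: pool TLS certificate, $2: server certificate
suggest_fix() {
    if needs_renewal "$1"; then
        echo "xe host-refresh-server-certificate host=\$(hostname)"
    fi
    if needs_renewal "$2"; then
        echo "xe host-emergency-reset-server-certificate"
    fi
}

suggest_fix /etc/xensource/xapi-pool-tls.pem /etc/xensource/xapi-ssl.pem
```

If either line is printed, the toolstack restart mentioned above would follow the chosen command.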

                  I hope this helps.

                  • Bryanvh @LucienLassalle

                    @LucienLassalle

                    Here's the output from the pool master.
                    The xapi-pool-tls cert at least isn't empty.
                    2c80406a-8398-45bc-aa49-ce16acae9912-image.jpeg

                    And it still appears to be valid
                    0db145ec-4a02-41e2-9b82-aa79524ba966-image.jpeg

                    The xapi-ssl cert also looks correct and un-expired
                    b2004ec3-cd55-4c4a-a9c8-58cb2f01deb2-image.jpeg
                    5e0a2bc4-54b7-42f1-ba54-5007f670550c-image.jpeg

                    • LucienLassalle Vates 🪐 XCP-ng Team @Bryanvh

                      @Bryanvh Thank you for your feedback,
                      Your previous certificates look correct. I have not been able to reproduce the issue on my side, but I will try to diagnose it based on the code.

                      [MASTER]
                      I have a few preliminary commands. The first one is to retrieve the MASTER_UUID:
                      cat /etc/xensource-inventory | grep INSTALLATION_UUID | cut -d'=' -f2 | tr -d "'"

                      Then we can compare fingerprints between the master certificate and the one stored for the pool:
                      openssl x509 -in /etc/xensource/xapi-pool-tls.pem -noout -fingerprint -sha256
                      openssl x509 -in /etc/stunnel/certs-pool/{MASTER_UUID}.pem -noout -fingerprint -sha256
                      (please replace {MASTER_UUID} with the value retrieved above)

                      Normally, both fingerprints should match.
                      Also check that the CA bundle exists and is not empty:
                      ls -l /etc/stunnel/xapi-pool-ca-bundle.pem

                      If you previously ran:
                      xe host-refresh-server-certificate
                      you should probably run:
                      xe pool-certificate-sync
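The master-side comparison above could be scripted roughly as follows. This is a sketch only: installation_uuid, fingerprint and same_fingerprint are names invented here, and it assumes the inventory file stores the value as INSTALLATION_UUID='...', as the cut/tr pipeline above implies.

```shell
#!/bin/sh
# Hypothetical helpers; file locations are the ones quoted in the post.

# Print the INSTALLATION_UUID value from an inventory file, without quotes.
installation_uuid() {
    grep '^INSTALLATION_UUID=' "$1" | cut -d'=' -f2 | tr -d "'"
}

# Print the SHA-256 fingerprint of a certificate, or nothing on failure.
fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint -sha256 2>/dev/null
}

# Exit 0 only if both files yield the same, non-empty fingerprint.
same_fingerprint() {
    a=$(fingerprint "$1")
    b=$(fingerprint "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

inventory=/etc/xensource-inventory
if [ -r "$inventory" ]; then
    uuid=$(installation_uuid "$inventory")
    if same_fingerprint /etc/xensource/xapi-pool-tls.pem \
                        "/etc/stunnel/certs-pool/$uuid.pem"; then
        echo "fingerprints match"
    else
        echo "fingerprints differ, or a certificate could not be read"
    fi
fi
```

The second message deliberately covers both cases, because a missing or empty file in /etc/stunnel/certs-pool/ produces no fingerprint at all.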

                      [JOINER]
                      Based on the code, the first phase has already been completed. You should therefore have files under /etc/stunnel/certs-pool/, including the master certificate:
                      openssl x509 -in /etc/stunnel/certs-pool/{MASTER_UUID}.pem -noout -fingerprint -sha256

                      [Additional checks]
                      Are all hosts synchronized to the same NTP server? (Check with date and timedatectl.)
                      Are all hosts fully updated to XCP-ng 8.3 and rebooted after updates?
                      Do you see the same error when joining the pool using XCP-ng (via Console or CLI) instead of Xen Orchestra?
                      Is there any more detailed error in /var/log/xensource.log?
                      How many hosts are in your pool?
                      Is stunnel running correctly on all hosts? systemctl status stunnel@xapi

                      Do certificate chains validate correctly?
                      openssl verify -CAfile /etc/stunnel/xapi-pool-ca-bundle.pem /etc/stunnel/certs-pool/{MASTER_UUID}.pem
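For the log question above, a filter along these lines might help narrow things down. The function name tls_log_lines and the keyword list are guesses made for this sketch, not the exact wording XAPI or stunnel use in their messages.

```shell
#!/bin/sh
# Hypothetical helper: show the most recent log lines that mention
# TLS-related keywords. The keyword list is a guess.
tls_log_lines() {
    # $1: log file, $2: maximum number of lines to show (default 20)
    grep -iE 'stunnel|certificate|ssl|tls' "$1" | tail -n "${2:-20}"
}

log=/var/log/xensource.log
if [ -r "$log" ]; then
    tls_log_lines "$log"
fi
```

Running it on both the master and the joining host right after a failed join attempt keeps the output short enough to paste into a reply.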

                      Respectfully,

                      • Bryanvh @LucienLassalle

                        @LucienLassalle
                        I'm not sure if this points toward an issue, but when I run the openssl command to check the pool cert, using the UUID retrieved in the first step, I get this error:
                        b65969a9-ec32-4b94-9aef-6ed1fe1e202a-image.jpeg

                        I get the same error when trying to check for the pool cert on the host that is trying to join the pool. Even if the pool cert was copied to the joining host, if this points to an issue with that cert, then I suppose that might be the cause of the error?

                        For the additional questions:
                        Yes, they are time synchronized and are all using pool.ntp.org
                        Yes, they are all up to date. 3 of the hosts (the existing pool) were previously on 8.2 but were updated to 8.3 and the new host I am trying to join was set up fresh on 8.3.
                        Yes, the stunnel service reports that it is running correctly.

                        And, as expected based on the previous error, verifying the cert fails with the same error as shown when trying to check the pool's cert fingerprint.

                        Here's what I see in the logs after trying to join the host to the pool:
                        Pool Master
                        26265d3f-bb1b-44cb-b8b4-901a30c0a18e-image.jpeg
                        Joining Host
                        be868a34-1efb-4809-90bb-c199982231ea-image.jpeg

                        • LucienLassalle Vates 🪐 XCP-ng Team @Bryanvh

                          @Bryanvh I think I've managed to reproduce the issue. The fact that the master's certificate is missing from /etc/stunnel/certs-pool/ seems to be the problem.

                          On the master, run xe host-refresh-server-certificate host=$(hostname) and then xe pool-certificate-sync.

                          Then, if you run ls -l /etc/stunnel/certs-pool, you should see a certificate named after your master's UUID. Its name should end with .pem. If it ends with .new.pem instead, I recommend copying it to a file name without the .new part, since the .new suffix can apparently cause problems.

                          You should then be able to join the pool from your host.
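The check on /etc/stunnel/certs-pool described above can be sketched like this. The function name master_cert_state is invented for the example, and {MASTER_UUID} is left as a placeholder to fill in. The script only prints a suggestion and does not copy anything itself.

```shell
#!/bin/sh
# Hypothetical helper: report how the master's certificate appears in a
# certs-pool directory. $1: directory, $2: master UUID.
master_cert_state() {
    if [ -s "$1/$2.pem" ]; then
        echo "present"
    elif [ -s "$1/$2.new.pem" ]; then
        echo "new-only"
    else
        echo "missing"
    fi
}

# Placeholder: replace with the UUID of your master.
MASTER_UUID="{MASTER_UUID}"
dir=/etc/stunnel/certs-pool

case $(master_cert_state "$dir" "$MASTER_UUID") in
    present)  echo "master certificate found" ;;
    new-only) echo "only .new.pem found; consider: cp $dir/$MASTER_UUID.new.pem $dir/$MASTER_UUID.pem" ;;
    missing)  echo "master certificate missing; refresh and sync on the master first" ;;
esac
```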

                          I hope this works. Please let me know how it goes.
                          Respectfully,
