XCP-ng

    Adding new host to pool fails - Stunnel SSL certificate verification failure

      semarie Vates 🪐 XCP-ng Team XAPI & Network Team

      Sorry, but this is outside my area of expertise. I'd rather not tell you to try something when I don't know its exact consequences.

      Could someone else reply?

        LucienLassalle Vates 🪐 XCP-ng Team Security Team @semarie

        @semarie I'll try to investigate to help you.

        Is it possible to run:

        • stat /etc/xensource/xapi-pool-tls.pem
        • openssl x509 -in /etc/xensource/xapi-pool-tls.pem -noout -text
        • stat /etc/xensource/xapi-ssl.pem
        • openssl x509 -in /etc/xensource/xapi-ssl.pem -noout -text

        (This file must exist; if not, I'd like the output of cat /etc/stunnel/xapi.conf.)
        Please send the output for both /etc/xensource/xapi-pool-tls.pem and /etc/xensource/xapi-ssl.pem.

        If the certificate in /etc/xensource/xapi-pool-tls.pem has expired or the file is empty, you can run:
        xe host-refresh-server-certificate host=$(hostname)
        If the certificate in /etc/xensource/xapi-ssl.pem has expired or the file is empty, you can run:
        xe host-emergency-reset-server-certificate

        After running either of the two commands above, I recommend running: xe-toolstack-restart
        (This should also restart stunnel@xapi.service.)
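The "expired or empty" check above can be sketched as a small shell function. This is only a sketch, assuming a POSIX shell and the openssl CLI on the host; cert_status is a hypothetical helper name, not an XCP-ng tool. It does not check for certificates that are not yet valid.

```shell
#!/bin/sh
# cert_status is a hypothetical helper, not part of XCP-ng.
# Prints one of: missing, empty, valid, "expired or unreadable".
cert_status() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -s "$1" ]; then
        echo "empty"
    # -checkend 0 exits 0 only if the certificate has not expired yet
    elif openssl x509 -in "$1" -noout -checkend 0 >/dev/null 2>&1; then
        echo "valid"
    else
        echo "expired or unreadable"
    fi
}

# Example use on a host:
#   for f in /etc/xensource/xapi-pool-tls.pem /etc/xensource/xapi-ssl.pem; do
#       echo "$f: $(cert_status "$f")"
#   done
```

If a file is reported as anything other than valid, the matching xe command above is the next step.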

        I hope this helps.

          Bryanvh @LucienLassalle

          @LucienLassalle

          Here's the output from the pool master.
          The xapi-pool-tls cert at least isn't empty.
          2c80406a-8398-45bc-aa49-ce16acae9912-image.jpeg

          And it still appears to be valid
          0db145ec-4a02-41e2-9b82-aa79524ba966-image.jpeg

          The xapi-ssl cert also looks correct and un-expired
          b2004ec3-cd55-4c4a-a9c8-58cb2f01deb2-image.jpeg
          5e0a2bc4-54b7-42f1-ba54-5007f670550c-image.jpeg

            LucienLassalle Vates 🪐 XCP-ng Team Security Team @Bryanvh

            @Bryanvh Thank you for your feedback.
            The certificates you posted look correct. I have not been able to reproduce the issue on my side, but I will try to diagnose it based on the code.

            [MASTER]
            I have a few preliminary commands. The first one is to retrieve the MASTER_UUID:
            cat /etc/xensource-inventory | grep INSTALLATION_UUID | cut -d'=' -f2 | tr -d "'"

            Then we can compare fingerprints between the master certificate and the one stored for the pool:
            openssl x509 -in /etc/xensource/xapi-pool-tls.pem -noout -fingerprint -sha256
            openssl x509 -in /etc/stunnel/certs-pool/{MASTER_UUID}.pem -noout -fingerprint -sha256
            (please replace {MASTER_UUID} with the value retrieved above)

            Normally, both fingerprints should match.
            Also check that the CA bundle exists and is not empty:
            ls -l /etc/stunnel/xapi-pool-ca-bundle.pem

            If you previously ran:
            xe host-refresh-server-certificate
            you should probably run:
            xe pool-certificate-sync
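The master-side fingerprint comparison above could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the paths are the ones quoted in this post, and it only reports whether the two fingerprints are identical.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names exist in XCP-ng.

# Print INSTALLATION_UUID from an inventory file (default: the host's own).
inventory_uuid() {
    grep '^INSTALLATION_UUID=' "${1:-/etc/xensource-inventory}" | cut -d'=' -f2 | tr -d "'"
}

# Print the SHA-256 fingerprint line of a PEM certificate.
cert_fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint -sha256
}

# Return 0 if the master's own certificate matches the copy stored for the
# pool, 1 if they differ, 2 if the UUID or either certificate cannot be read.
master_cert_matches() {
    uuid=$(inventory_uuid)
    [ -n "$uuid" ] || return 2
    own=$(cert_fingerprint /etc/xensource/xapi-pool-tls.pem) || return 2
    stored=$(cert_fingerprint "/etc/stunnel/certs-pool/${uuid}.pem") || return 2
    [ "$own" = "$stored" ]
}
```

On the master, `master_cert_matches && echo match || echo "mismatch or missing"` gives a one-line answer; a status of 2 usually means one of the files could not be read at all.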

            [JOINER]
            Based on the code, the first phase has already been completed. You should therefore have files under /etc/stunnel/certs-pool/, including the master certificate:
            openssl x509 -in /etc/stunnel/certs-pool/{MASTER_UUID}.pem -noout -fingerprint -sha256

            [Additional checks]
            Are all hosts synchronized to the same NTP server? (Check with date and timedatectl.)
            Are all hosts fully updated to XCP-ng 8.3 and rebooted after updates?
            Do you see the same error when joining the pool using XCP-ng (via Console or CLI) instead of Xen Orchestra?
            Is there a more detailed error in /var/log/xensource.log?
            How many hosts are in your pool?
            Is stunnel running correctly on all hosts? systemctl status stunnel@xapi

            Do certificate chains validate correctly?
            openssl verify -CAfile /etc/stunnel/xapi-pool-ca-bundle.pem /etc/stunnel/certs-pool/{MASTER_UUID}.pem
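For the /var/log/xensource.log question above, a quick filter can narrow the log down to lines that look TLS-related. This is a sketch only: tls_log_lines is a hypothetical helper, and the search terms are my guess at what is relevant, not an official list of XAPI log markers.

```shell
#!/bin/sh
# tls_log_lines is a hypothetical helper, not part of XCP-ng.
# $1: log file (default /var/log/xensource.log)
# $2: how many of the most recent matching lines to print (default 20)
tls_log_lines() {
    grep -iE 'stunnel|certificate|tls|ssl' "${1:-/var/log/xensource.log}" | tail -n "${2:-20}"
}

# Example use right after a failed join attempt:
#   tls_log_lines /var/log/xensource.log 50
```

Running it on both the master and the joining host right after a failed join keeps the output short enough to paste here as text.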

            Respectfully,

              Bryanvh @LucienLassalle

              @LucienLassalle
              I'm not sure if this points toward an issue, but when I run the openssl command to check the pool cert, using the UUID retrieved in the first step, I get this error:
              b65969a9-ec32-4b94-9aef-6ed1fe1e202a-image.jpeg

              I get the same error when trying to check for the pool cert on the host that is trying to join the pool. Even if the pool cert was copied to the joining host, if this points to an issue with that cert, then I suppose that might be the cause of the error?

              For the additional questions:
              Yes, they are time synchronized and are all using pool.ntp.org
              Yes, they are all up to date. 3 of the hosts (the existing pool) were previously on 8.2 but were updated to 8.3 and the new host I am trying to join was set up fresh on 8.3.
              Yes, the stunnel service reports that it is running correctly.

              And, as expected based on the previous error, verifying the cert fails with the same error as shown when trying to check the pool's cert fingerprint.

              Here's what I see in the logs after trying to join the host to the pool:
              Pool Master
              26265d3f-bb1b-44cb-b8b4-901a30c0a18e-image.jpeg
              Joining Host
              be868a34-1efb-4809-90bb-c199982231ea-image.jpeg

                LucienLassalle Vates 🪐 XCP-ng Team Security Team @Bryanvh

                @Bryanvh I think I've managed to reproduce the issue. The fact that the master's certificate is missing from /etc/stunnel/certs-pool/ seems to be the problem.

                On the master, run xe host-refresh-server-certificate host=$(hostname) and then xe pool-certificate-sync.

                Then, if you run ls -l /etc/stunnel/certs-pool, you should see a certificate named after your master's UUID. Its name should end with .pem. If it ends with .new.pem instead, I recommend copying it to a file with the same name minus the .new part (the .new suffix can apparently cause problems).
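The .new.pem step above can be sketched like this. It is an illustration only: promote_new_cert is a hypothetical helper, and it copies rather than renames, so the original .new.pem file stays in place.

```shell
#!/bin/sh
# promote_new_cert is a hypothetical helper, not part of XCP-ng.
# $1: certificate directory, $2: master UUID
# Prints "present", "copied" or "missing"; returns 1 when nothing usable exists.
promote_new_cert() {
    dir=$1
    uuid=$2
    if [ -s "$dir/$uuid.pem" ]; then
        echo "present"
    elif [ -s "$dir/$uuid.new.pem" ]; then
        # Copy instead of move so the original .new.pem is left untouched
        cp -p "$dir/$uuid.new.pem" "$dir/$uuid.pem" && echo "copied"
    else
        echo "missing"
        return 1
    fi
}

# Example use on the master, with the UUID retrieved earlier:
#   promote_new_cert /etc/stunnel/certs-pool "$MASTER_UUID"
```

If it prints "missing", the certificate refresh and sync commands above have probably not produced a certificate for the master yet.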

                You should then be able to join the pool from your host.

                I hope this helps. Please let me know whether it works.
                Respectfully,

                  Bryanvh @LucienLassalle

                  @LucienLassalle

                  Thanks for the quick response and the effort in recreating the issue!

                  It all played out exactly as you laid it out, even the cert showing up as a .new.pem at first.

                  Out of curiosity, what in your testing caused this issue? Is it possible that my upgrade from 8.2 to 8.3 caused the underlying problem?

                    LucienLassalle Vates 🪐 XCP-ng Team Security Team @Bryanvh

                    @Bryanvh Looking at the code, I saw that an exchange was taking place via this certificate.

                    So when you told me that the master certificate was missing, I put myself in the same situation (by removing the certificate) and then tried to join the pool.
                    I hit the same error as you, and found that running these commands fixed the problem.

                    I do think the upgrade from 8.2 to 8.3 is the cause. To be more precise, a change was made to XAPI's certificate exchange during version 8.2, and I think it's possible that your 8.2 host wasn't up to date when it was upgraded to 8.3 (I'm not sure).

                    In any case, I'm glad your problem is solved.

                      Bryanvh @LucienLassalle

                      @LucienLassalle

                      Interesting. I'm not sure I was all the way up to date when I upgraded to 8.3 and it's possible I was a month or two behind. I only upgraded because I ran across a need for the virtualized TPM support (which is cool to see implemented!).

                      Thanks again for all the effort in looking at this!

                        LucienLassalle marked this topic as a question, then as solved.

                        LucienLassalle Vates 🪐 XCP-ng Team Security Team @Bryanvh

                        @Bryanvh No problem 🙂

                        The issue you encountered wasn't very clear. Therefore, I've proposed a change to the XAPI to make the error more explicit (this will likely be implemented in future XAPI releases).

                        So instead of SSL Certification failure the message will be: POOL_JOINING_MASTER_CERTIFICATE_NOT_IN_POOL_BUNDLE.

                        Thank you very much for your patience and for bringing this issue to our attention.

                        References:
                        https://github.com/xapi-project/xen-api/pull/7112

                        Pull request #7112 in xapi-project/xen-api, opened by LucienLassalle (status shown: closed): "xapi: Improve error reporting when pool join fails on TLS verification"
