XCP-ng

    psafont

    @psafont

    Vates 🪐 XAPI & Network Team
    Reputation: 32
    Profile views: 26
    Posts: 29
    Followers: 0
    Following: 0
    Website github.com/psafont


    Best posts made by psafont

    • RE: IPv6 support in XCP-ng for the management interface - feedback wanted

      @bnerickson
      About 1: that's something I'd like to support in the future, but it will take some time to get there. About 2: currently link-local addresses (LLAs) are not allowed to be used as management IPs, and I wouldn't like them to be used as such, because it can lead to hosts not being able to connect to one another if they're not in the same physical network. They could still be configured for the interfaces and shown to the user, but not as the only IP for the interface, or the main one for that matter.

      posted in News
    • RE: What metadata restore really do?

      @Tristis-Oris said in What metadata restore really do?:

      My pool died, so I decided to restore it from backup.

      • fresh Xen install, restore metadata, everything looks fine, but I can't join any new hosts into this pool.
        https://paste.vates.tech/?ca81f43fd2d10456#Ay6yKRNTS8LDDSHTkin7eLppP33eAMS5grLrevuqf5rL

      It looks like the restored metadata references a certificate named 'sdn-controller-ca', but this certificate does not exist in the filesystem (/etc/stunnel/certs/sdn-controller-ca.pem). I believe it is installed by Xen Orchestra whenever the SDN controller is turned on.
      To remove it from the database, run:

      xe pool-uninstall-ca-certificate name=sdn-controller-ca --force

      The --force flag should allow the certificate to be removed from the database even if the file is missing.
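
      A minimal sketch of that check, assuming certificate files follow the /etc/stunnel/certs/<name>.pem pattern seen above. The function name stale_ca_cert_hint and its optional directory argument are hypothetical, added only for illustration; the function prints the suggested command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: report whether a CA certificate file is missing from
# the certificate directory and, if so, print the removal command quoted above.
# $1: certificate name (e.g. sdn-controller-ca)
# $2: certificate directory (assumed default: /etc/stunnel/certs)
stale_ca_cert_hint() {
    name="$1"
    dir="${2:-/etc/stunnel/certs}"
    if [ -f "$dir/$name.pem" ]; then
        echo "present: $dir/$name.pem"
    else
        echo "missing: $dir/$name.pem"
        echo "suggested: xe pool-uninstall-ca-certificate name=$name --force"
    fi
}

# Example, on the restored coordinator:
#   stale_ca_cert_hint sdn-controller-ca
```

      Printing the command instead of executing it keeps the database change a deliberate manual step.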

      posted in Backup
    • RE: Native Ceph RBD SM driver for XCP-ng

      @benapetr You're right. Unfortunately, there's no VDI revert operation that allows the revert to happen. This is shown in the documentation: https://xapi-project.github.io/new-docs/toolstack/features/snapshots/index.html (see the revert section)

      There's an old proposal to add this: https://xapi-project.github.io/new-docs/design/snapshot-revert/index.html

      But the effort fizzled out because imports currently do not set the snapshot_of field correctly, and the operation needs to work even when the field is not set correctly, as is the case now (falling back to the current code seems sensible): https://github.com/xapi-project/xen-api/pull/2058

      This needs some effort to get fixed, I'll set up some ticketing so it can be prioritized accordingly.

      Pull request #2058 in xapi-project/xen-api (opened by djs55, closed): VDI.revert pull request + extra bits

      posted in Development
    • RE: Native Ceph RBD SM driver for XCP-ng

      @benapetr This is driven by hacky logic from 16 years ago:

      • on revert, unserialize the previous state, and update the VM record with its saved values. As we do not want to modify that each time we add a field in the datamodel, use some low-level database functions to iterate over the fields of a record. Not very nice as it makes some assumptions on the database layer, but seems to work allright and I don't think that database layer will change a lot in the future.

      I think it might be a good idea to add a revert RPC call to the storage interface that xapi can call, with a fallback to the current logic if necessary; xapi should be able to clean up the database afterwards. I'll ask other maintainers about this or possible alternatives, but since SMAPIv1 is considered deprecated, I doubt it will happen.

      I have to say that SMAPIv3 was finally fixed upstream in June by XenServer (migrations were finally done!) and XCP-ng should get the update that fixes it in the coming weeks. Given this, I would encourage you to take everything you've learned while writing the driver and port it to SMAPIv3. SMAPIv1 simply has too many problems, some of them architectural, so in general XenServer and XCP-ng maintainers would like to see it finally go away.

      For now I am still targeting XCP-ng 8.2 as that's what I use in production, and I haven't seen many SMAPIv3 drivers there.

      8.2 is out of support for XenServer, and for XCP-ng yesterday was the last day it was supported, so you really should update 😛

      posted in Development
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @olivierlambert

      So we probably need to tell XO team the "right way" to enable HA because there's no way to know from "outside it's not meant to, xapi makes the call automatically.

      I don't think so, it's xapi's responsibility to make that call

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik I have a test build that you can try; it should provide better error messages by raising an internal error with a reason.

      The code is based on the newest builds, so I recommend updating to the latest version of XCP-ng beforehand:

      yum update
      reboot
      

      Once that is done, the test packages can be installed by creating the file /etc/yum.repos.d/xcp-test.repo:

      [xcp-ng-psafont1]
      name=xcp-ng-psafont1
      baseurl=https://koji.xcp-ng.org/repos/user/8/8.3/psafont1/x86_64/
      enabled=0
      gpgcheck=1
      repo_gpgcheck=1
      metadata_expire=0
      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-xcpng
      

      Then update the host using the test repository:

      yum update --enablerepo=xcp-ng-psafont1
      

      Finally, restart all the toolstack daemons:

      xe-toolstack-restart
      

      Note: the repository will only be available for a limited amount of time, after which I will repurpose it and delete the instructions so it's not used anymore by accident.
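
      The repository file from the steps above could be written in one go, as in this sketch. The function name write_test_repo is hypothetical; the file contents are copied from the post. Since the repository is defined with enabled=0, it is only used when passed explicitly with --enablerepo.

```shell
#!/bin/sh
# Hypothetical helper: write the test repository definition from the post.
# $1: destination path; on a host this would be /etc/yum.repos.d/xcp-test.repo
write_test_repo() {
    cat > "$1" <<'EOF'
[xcp-ng-psafont1]
name=xcp-ng-psafont1
baseurl=https://koji.xcp-ng.org/repos/user/8/8.3/psafont1/x86_64/
enabled=0
gpgcheck=1
repo_gpgcheck=1
metadata_expire=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-xcpng
EOF
}

# On the host, as root, followed by the commands from the post:
#   write_test_repo /etc/yum.repos.d/xcp-test.repo
#   yum update --enablerepo=xcp-ng-psafont1
#   xe-toolstack-restart
```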

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik said in Unable to enable High Availability - INTERNAL_ERROR(Not_found):

      @andriy.sultanov @psafont
      https://drive.google.com/file/d/1aJyCYSAuRIBb0X-23gJ6ORtrHSciYH8a/view?usp=sharing
      Here is the log file

      The condition that causes the exception isn't crystal clear, but I can see an unprotected exception being raised in the host.ha_join_liveset path when it tries to look up the host UUID and doesn't find it. I'll investigate.

      posted in XCP-ng
    • RE: XCP-ng 8.3 betas and RCs feedback 🚀

      @rmaclachlan This looks awfully similar to https://github.com/xapi-project/xen-api/pull/5451

      Pull request #5451 in xapi-project/xen-api (opened by freddy77, closed): CP-47754: Do not report errors attempting to read PCI vendor:product

      posted in News
    • RE: XCP-ng 8.3 betas and RCs feedback 🚀

      @stormi said in XCP-ng 8.3 beta 🚀:

      @psafont Will a 8.2 to 8.3 upgrade (through the installation ISO) leave TLS verification disabled, or will it enable it by default?

      It's not enabled by default. Enabling it by default is not possible with the current update procedure, where the pool coordinator is updated before the other pool members: the members do not expose the new certificates to XML-RPC clients until they are upgraded, which would break communications.

      In other words: must we expect any user who upgrades from 8.2 or lower and then later wants to add a new host to the pool to see this error (and likely ask for help, even if we document it properly - and we would of course)?

      Yes

      posted in News
    • RE: XCP-ng 8.3 betas and RCs feedback 🚀

      @gsrfan01 The error happens because the joining host has TLS certificate checking enabled for pool connections, while the host being joined doesn't.

      This mismatch happens because TLS certificate checking is enabled on fresh installs, while on hosts updated from previous versions it is not.

      To enable TLS certificate checking in a pool, simply run xe pool-enable-tls-verification.

      The emergency command is not needed in this case; it's useful for re-enabling certificate checking on a single host after it has been disabled using the emergency disable.
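
      The mismatch described above can be sketched as a simple comparison of two flags. The helper names tls_states_match and describe_join are hypothetical, and the parameter name tls-verification-enabled mentioned in the comment is an assumption that should be confirmed with xe pool-param-list before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the joining host and the pool being joined
# have the same TLS verification state.
# $1: "true" or "false" for the joining host
# $2: "true" or "false" for the pool being joined
tls_states_match() {
    [ "$1" = "$2" ]
}

# Print a one-line summary of the comparison.
describe_join() {
    if tls_states_match "$1" "$2"; then
        echo "TLS verification settings match"
    else
        echo "TLS verification mismatch (joining=$1, pool=$2)"
    fi
}

# Assumption: each side's state can be read with something like
#   xe pool-param-get uuid=POOL_UUID param-name=tls-verification-enabled
# Fresh install joining an upgraded pool, as in the post:
describe_join true false
```

      When the settings differ in this way, running xe pool-enable-tls-verification on the pool being joined is the fix given in the post.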

      posted in News

    Latest posts made by psafont

    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik Ah, indeed. Do you know which server / interface holds the IP 192.168.30.13? I suspect it is still VMHost13, but on a different interface.

      Until the members have their master configured as 192.168.30.13, you'll keep getting this error. This can be fixed with a single command, but since it's a delicate operation, it's better if no other operations are running on the pool. SSH into VMHost13 and run:

      xe host-list name-label=VMHost13 --minimal | xargs -I _ xe pool-designate-new-master host-uuid=_
      

      This should write the new IP to the pool role files of all the pool members, so that this issue stops blocking HA from being enabled.

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik The IPs match, so now I don't have an explanation for why this is happening. I'll take another look at the code path, but that will take a while, as work is piling up.

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik Could you collect the contents of /etc/xensource/pool.conf from all the other hosts? The command is failing on one of them, not on the master host.

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @olivierlambert The call is indeed hidden from the docs, and only callable from inside a pool... it's called as part of Pool.enable_ha

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik
      So the problem goes like this:

      • HA uses a local-only database to avoid depending on the normal database
      • This database contains a mapping from UUID to the IP host_address for all hosts in an HA cluster / pool. This information should be gathered right before HA is enabled, from the normal database.
      • When trying to enable HA, the host fetches the coordinator's address from the filesystem. Then it uses the previous mapping and the coordinator address to find the coordinator's UUID. This step fails.

      I'm not sure what is actually happening, but some scenarios come to mind:

      • XO isn't calling the API function Host.preconfigure_ha, which means the local database is not created (unlikely)
      • The coordinator's address has somehow changed between the local database being written and the HA being enabled

      Things to check:

      • inspect the values that the failing host has for the host_address of the coordinator / master host, both in:
        1. the normal database. SSH into the failing host and run the following command, replacing POOL_UUID with the actual UUID. To autocomplete it, delete POOL_UUID, place the cursor after the = and press Tab twice.
      xe pool-param-get uuid=POOL_UUID param-name=master | xargs -I _ xe host-param-get uuid=_ param-name=address
      
        2. the pool role file. As with the previous command, SSH into the failing host and run
      cat /etc/xensource/pool.conf
      

      Let us know how it goes. If the IPs don't match, there's a problem with the configuration of the member, and otherwise it's because the local database is outdated and should be refreshed before enabling HA. I don't know how XO handles it.
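
      The comparison of the two values can be sketched as follows. This assumes the pool role file contains either "master" (on the coordinator) or "slave:<address>" (on a member); the function names are hypothetical and only illustrate the check.

```shell
#!/bin/sh
# Extract the coordinator address from the contents of a pool role file.
# Assumed format: "master" on the coordinator, "slave:<address>" on members.
# Prints the address for a member, "master" for the coordinator, and fails
# on anything else.
coordinator_from_pool_conf() {
    case "$1" in
        master) echo "master" ;;
        slave:*) echo "${1#slave:}" ;;
        *) return 1 ;;
    esac
}

# Compare the address from the normal database with the pool role file.
# $1: address reported by the database, $2: contents of pool.conf
compare_coordinator_address() {
    from_file=$(coordinator_from_pool_conf "$2") || {
        echo "unrecognised pool.conf"
        return 2
    }
    if [ "$from_file" = "master" ]; then
        echo "this host is the coordinator"
    elif [ "$from_file" = "$1" ]; then
        echo "match: $1"
    else
        echo "mismatch: database=$1 pool.conf=$from_file"
    fi
}

# On the failing host, the inputs would come from the two commands above:
#   db_addr=$(xe pool-param-get uuid=POOL_UUID param-name=master | xargs -I _ xe host-param-get uuid=_ param-name=address)
#   compare_coordinator_address "$db_addr" "$(cat /etc/xensource/pool.conf)"
```

      A "mismatch" result points at the member's configuration, while a "match" points at an outdated local HA database, as described above.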

      posted in XCP-ng
    • RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)

      @jmannik said in Unable to enable High Availability - INTERNAL_ERROR(Not_found):

      @olivierlambert

      [18:15 vmhost13 ~]# xe pool-ha-enable heartbeat-sr-uuids=381caeb2-5ad9-8924-365d-4b130c67c064
      The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
      message: Not_found

      That message comes from an exception, Not_found. It's commonly raised by List.find and List.assoc; in this case the exception wasn't caught.

      It's usually difficult to find out which call raised it, since these functions are used frequently and the exception may be caught in a caller of the function that uses them.

      Could you provide the xenserver.log, as Andriy has asked? Otherwise I don't think we'll be able to find the exact cause.

      posted in XCP-ng
    • RE: ISO Importing Results in .img Files

      @acebmxer I tried to force it by uploading the same ISO twice, and couldn't reproduce the issue. XO shouldn't allow uploading an ISO to an SR that already contains one with the same name 🙂

      posted in Management