psafont
@psafont
Best posts made by psafont
-
RE: IPv6 support in XCP-ng for the management interface - feedback wanted
@bnerickson
About 1: that's something I'd like to support in the future, but it will take some time to get there. About 2: currently LLAs are not allowed to be used as the management IPs, and I wouldn't like them to be used as such: it can lead to hosts not being able to connect to one another if they're not in the same physical network. They could still be configured for the interfaces and shown to the user, but not as the only IP for the interface, or the main one for that matter.
-
RE: What metadata restore really do?
@Tristis-Oris said in What metadata restore really do?:
My pool died so i decied to restore it from backup.
- fresh xen install, retore metadata, everything looks fine, but i can't join any new hosts into this pool.
https://paste.vates.tech/?ca81f43fd2d10456#Ay6yKRNTS8LDDSHTkin7eLppP33eAMS5grLrevuqf5rL
It looks like the metadata restored has a certificate with name 'sdn-controller-ca', but this certificate does not exist in the filesystem (/etc/stunnel/certs/sdn-controller-ca.pem). I believe this is installed by Xen Orchestra whenever the SDN controller is turned on.
To remove it from the database, run
xe pool-uninstall-ca-certificate name=sdn-controller-ca --force
The force flag should allow the certificate to be removed from the database even if the file is missing.
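As a rough sketch of that check, the snippet below looks for the certificate file first and prints the command that would apply, rather than running it. The helper names `ca_cert_file_missing` and `suggest_removal` are made up for illustration, the directory is the one quoted above, and using the command without `--force` when the file is present is my assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the PEM file for a CA
# certificate name is absent from the certificate directory.
# The default directory is the one quoted in the post above.
ca_cert_file_missing() {
    name="$1"
    cert_dir="${2:-/etc/stunnel/certs}"
    [ ! -e "${cert_dir}/${name}.pem" ]
}

# Hypothetical helper: print the removal command instead of running it,
# so the sketch is safe to try on any machine.
suggest_removal() {
    name="$1"
    cert_dir="${2:-/etc/stunnel/certs}"
    if ca_cert_file_missing "$name" "$cert_dir"; then
        # File is gone: the database entry needs the force flag.
        echo "xe pool-uninstall-ca-certificate name=${name} --force"
    else
        # Assumption: without --force the command handles the normal case.
        echo "xe pool-uninstall-ca-certificate name=${name}"
    fi
}
```

Calling `suggest_removal sdn-controller-ca` on the affected host prints the command to review before running it by hand.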
-
RE: Native Ceph RBD SM driver for XCP-ng
@benapetr You're right. Unfortunately, there's no VDI revert operation that allows the revert to happen. This is shown in the documentation: https://xapi-project.github.io/new-docs/toolstack/features/snapshots/index.html (see the revert section)
There's an old proposal to add this: https://xapi-project.github.io/new-docs/design/snapshot-revert/index.html
But the effort fizzled out because currently the imports do not set snapshot_of correctly, and the operation needs to work even if the field is not set correctly, as is the case now (falling back to the current code seems sensible). https://github.com/xapi-project/xen-api/pull/2058
This needs some effort to get fixed, I'll set up some ticketing so it can be prioritized accordingly.
-
RE: Native Ceph RBD SM driver for XCP-ng
@benapetr This is driven by hacky logic from 16 years ago:
- on revert, unserialize the previous state, and update the VM record with its saved values. As we do not want to modify that each time we add a field in the datamodel, use some low-level database functions to iterate over the fields of a record. Not very nice as it makes some assumptions on the database layer, but seems to work allright and I don't think that database layer will change a lot in the future.
I think it might be a good idea to add a revert RPC call to the storage interface that xapi can call, falling back to the current logic if necessary; xapi should be able to clean up the database afterwards. I'll ask other maintainers about this or possible alternatives, but since SMAPIv1 is considered deprecated, I doubt it will happen.
I have to say that SMAPIv3 was finally fixed upstream in June by XenServer (migrations were finally done!) and XCP-ng should get the update that fixes it in the coming weeks. Given this, I would encourage you to take everything you've learned while writing the driver and port it to SMAPIv3. SMAPIv1 simply has too many problems, some of them architectural, so in general XenServer and XCP-ng maintainers would like to see it finally go away.
for now I am still targetting XCP-ng 8.2 as that's what I use in production, and I haven't seen many SMAPIv3 drivers there.
8.2 is out of support for XenServer, and for XCP-ng yesterday was the last day it was supported. You really should update.

-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
So we probably need to tell XO team the "right way" to enable HA because there's no way to know from "outside" it's not meant to, xapi makes the call automatically.
I don't think so; it's xapi's responsibility to make that call.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik I have a test build that you can test, it will hopefully provide better error messages by raising an internal error with a reason.
The code is based on the newest builds, so I recommend updating to the latest version of XCP beforehand:
yum update
reboot
Once that is done, the test packages can be installed by creating the file /etc/yum.repos.d/xcp-test.repo:
[xcp-ng-psafont1]
name=xcp-ng-psafont1
baseurl=https://koji.xcp-ng.org/repos/user/8/8.3/psafont1/x86_64/
enabled=0
gpgcheck=1
repo_gpgcheck=1
metadata_expire=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-xcpng
then updating the host using the test developer repo
yum update --enablerepo=xcp-ng-psafont1
and finally restarting all the daemons
xe-toolstack-restart
Note: the repository will only be available for a limited amount of time, after which I will repurpose it and delete the instructions so it's not used anymore by accident.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik said in Unable to enable High Availability - INTERNAL_ERROR(Not_found):
@andriy.sultanov @psafont
https://drive.google.com/file/d/1aJyCYSAuRIBb0X-23gJ6ORtrHSciYH8a/view?usp=sharing
Here is the log file
It's not crystal clear which condition causes the exception, but I can see an unprotected exception being raised in the host.ha_join_liveset path when trying to recover the host UUID and it's not found. I'll investigate.
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@rmaclachlan This looks awfully similar to https://github.com/xapi-project/xen-api/pull/5451
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@stormi said in XCP-ng 8.3 betas and RCs feedback 🚀:
@psafont Will a 8.2 to 8.3 upgrade (through the installation ISO) leave TLS verification disabled, or will it enable it by default?
It's not enabled by default. Enabling it by default is not possible with the current update procedure, where the pool coordinator is updated before the other pool members: these do not expose the new certificates to xmlrpc clients before upgrading, which would break communications.
In other words: must we expect any user who upgrades from 8.2 or lower and then later wants to add a new host to the pool to see this error (and likely ask for help, even if we document it properly - and we would of course)?
Yes
-
RE: XCP-ng 8.3 betas and RCs feedback 🚀
@gsrfan01 The error happens because the joining host has TLS certificate checking enabled for pool connections while the joined host doesn't.
This mismatch happens because TLS certificate checking is enabled on fresh installs, while it is not for updates from previous versions.
To enable TLS certificate checking in a pool, simply run
xe pool-enable-tls-verification
The emergency command is not needed in this case; it's useful to re-enable certificate checking on a single host after it has been disabled using the emergency disable.
Latest posts made by psafont
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik ah, indeed. Do you know which server / interface holds the IP 192.168.30.13? I suspect it's still VMHost13, but a different interface.
Until the members have configured their master as 192.168.30.13, you'll have this error. This can be done by a call, but since it's a delicate operation, it's better if there are no operations running on the pool. SSH into VMHost13 and run
xe host-list name-label=VMHost13 --minimal | xargs -I _ xe pool-designate-new-master host-uuid=_
This should write the new IP to the files of all the pool members and stop this issue from blocking HA from being enabled.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik The IPs match, and now I don't have an explanation for why this is happening. I'll take another look at the codepath, but that will take a while, as work is piling up.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik Could you collect the file contents of /etc/xensource/pool.conf from all the other hosts? The command is failing in one of them, not on the master host.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
So we probably need to tell XO team the "right way" to enable HA because there's no way to know from "outside it's not meant to, xapi makes the call automatically.
I don't think so, it's xapi's responsibility to make that call
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@olivierlambert The call is indeed hidden from the docs, and only callable from inside a pool... it's called as part of Pool.enable_ha
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik
So the problem goes like this:
- HA uses a local-only database to avoid depending on the normal database.
- This database contains a mapping from UUID to the IP host_address for all hosts in an HA cluster / pool. This information should be gathered right before HA is enabled, from the normal database.
- When trying to enable HA, the host fetches the coordinator's address from the filesystem. Then it uses the previous mapping and the coordinator address to find the coordinator's UUID. This step fails.
I'm not sure what is actually happening, but some scenarios come to mind:
- XO isn't calling the API function Host.preconfigure_ha, which means the local database is not created (unlikely)
- The coordinator's address has somehow changed between the local database being written and HA being enabled
Things to check out:
- Inspect the values that the failing host has for the host_address of the coordinator / master host, both in:
  - the normal database. You can SSH into the failing host and run the following command, replacing POOL_UUID with the actual UUID. This can be done by deleting POOL_UUID, placing the cursor after the = and pressing tab twice.
    xe pool-param-get uuid=POOL_UUID param-name=master | xargs -I _ xe host-param-get uuid=_ param-name=address
  - and the pool role file. Similar to the previous command, SSH into the failing host and run
    cat /etc/xensource/pool.conf
Let us know how it goes. If the IPs don't match, there's a problem with the configuration of the member; otherwise it's because the local database is outdated and should be refreshed before enabling HA. I don't know how XO handles it.
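The comparison between the two values can be sketched in shell as below. This is only an illustration: the function names are made up, and it assumes pool.conf holds `master` on the coordinator and `slave:<address>` on members.

```shell
#!/bin/sh
# Hypothetical helper: extract the coordinator address from the contents
# of pool.conf. Assumed format: "master" on the coordinator and
# "slave:<address>" on pool members.
pool_conf_master_address() {
    case "$1" in
        slave:*) printf '%s\n' "${1#slave:}" ;;
        *) return 1 ;;   # "master" or anything unexpected: no address to report
    esac
}

# Hypothetical helper: compare the address found in pool.conf with the
# address reported by the normal database.
# Returns 0 on match, 1 on mismatch, 2 when pool.conf holds no address.
compare_master_addresses() {
    conf_contents="$1"
    db_address="$2"
    conf_address=$(pool_conf_master_address "$conf_contents") || {
        echo "no coordinator address in pool.conf (this host may be the coordinator)"
        return 2
    }
    if [ "$conf_address" = "$db_address" ]; then
        echo "match: $conf_address"
    else
        echo "mismatch: pool.conf=$conf_address database=$db_address"
        return 1
    fi
}

# Usage on the failing host (replace POOL_UUID as described above):
#   db=$(xe pool-param-get uuid=POOL_UUID param-name=master \
#        | xargs -I _ xe host-param-get uuid=_ param-name=address)
#   compare_master_addresses "$(cat /etc/xensource/pool.conf)" "$db"
```

A mismatch points at the member's configuration; a match suggests the local HA database is outdated.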
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik I have a test build that you can test, it will hopefully provide better error messages by raising an internal error with a reason.
The code is based on the newest builds, so I recommend updating to the latest version of XCP beforehand:
yum update rebootOnce that is done, the test packages can be installed by creating the file
/etc/yum.repos.d/xcp-test.repo:[xcp-ng-psafont1] name=xcp-ng-psafont1 baseurl=https://koji.xcp-ng.org/repos/user/8/8.3/psafont1/x86_64/ enabled=0 gpgcheck=1 repo_gpgcheck=1 metadata_expire=0 gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-xcpngthen updating the host using the test developer repo
yum update --enablerepo=xcp-ng-psafont1and finally restarting all the daemons
xe-toolstack-restartNote: the repository will only be available for a limited amount of time, after which I will repurpose it and delete the instructions so it's not used anymore by accident.
-
RE: Unable to enable High Availability - INTERNAL_ERROR(Not_found)
@jmannik said in Unable to enable High Availability - INTERNAL_ERROR(Not_found):
[18:15 vmhost13 ~]# xe pool-ha-enable heartbeat-sr-uuids=381caeb2-5ad9-8924-365d-4b130c67c064
The server failed to handle your request, due to an internal error. The given message may give details useful for debugging the problem.
message: Not_found
That message is created by an exception. It's commonly raised by List.find and List.assoc; in this case the exception wasn't caught.
It's usually difficult to find out which one, since these functions are frequently used and catching the exception can happen in a caller of the function that uses it.
Could you provide the xenserver.log, as Andriy has asked? Otherwise I don't think we'll be able to find the exact cause.
-
RE: ISO Importing Results in .img Files
@acebmxer I tried to force it by uploading the same ISO twice, and couldn't reproduce the issue. XO shouldn't allow uploading an ISO with the same name to an SR.
