@Biggen At the moment, XCP-ng Center provides some views and overviews not yet available in XO. Hoping the next major version fixes this.
Best posts made by Forza
-
RE: [WARNING] XCP-ng Center shows wrong CITRIX updates for XCP-ng Servers - DO NOT APPLY - Fix released
-
RE: Best CPU performance settings for HP DL325/AMD EPYC servers?
Sorry for spamming the thread.
I have two identical servers (srv01 and srv02) with AMD EPYC 7402P 24-core CPUs. On srv02 I enabled the "LLC as NUMA Node" option. I've done some quick benchmarks with sysbench on Ubuntu 20.10 with 12 assigned cores. Command line: sysbench cpu run --threads=12
It would seem that in this test the NUMA option is much faster, 194187 events vs 103769 events. Perhaps I am misunderstanding how sysbench works?
With 7-zip the gain is much less, but still meaningful. A little slower in single-threaded performance but quite a bit faster in multi-threaded mode.
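For reference, a minimal sketch of the commands used for this comparison (the 7-zip thread counts are my own assumption, adjust to the vCPU count of the test VM):
# CPU benchmark; compare the "total number of events" line between srv01 and srv02
sysbench cpu run --threads=12 | grep "total number of events"
# 7-zip built-in benchmark: single-threaded and multi-threaded MIPS
7z b -mmt1
7z b -mmt12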
-
RE: Host stuck in booting state.
Problem was a stale connection with the NFS server. A reboot of the NFS server fixed the issue.
-
RE: Restoring a downed host ISNT easy
@xcprocks said in Restoring a downed host ISNT easy:
So, we had a host go down (OS drive failure). No big deal right? According to instructions, just reinstall XCP on a new drive, jump over into XOA and do a metadata restore.
Well, not quite.
First during installation, you really really must not select any of the disks to create an SR as you could potentially wipe out an SR.
Second, you have to do the sr-probe and sr-introduce and pbd-create and pbd-plug to get the SRs back.
Third, you then have to use XOA to restore the metadata which according to the directions is pretty simple looking. According to: https://xen-orchestra.com/docs/metadata_backup.html#performing-a-restore
"To restore one, simply click the blue restore arrow, choose a backup date to restore, and click OK:"
But this isn't quite true. When we did it, the restore threw an error:
"message": "no such object d7b6f090-cd68-9dec-2e00-803fc90c3593",
"name": "XoError",Panic mode sets in... It can't find the metadata? We try an earlier backup. Same error. We check the backup NFS share--no its there alright.
After a couple of hours scouring the internet and not finding anything, it dawns on us... The object XOA is looking for is the OLD server not a backup directory. It is looking for the server that died and no longer exists. The problem is, when you install the new server, it gets a new ID. But the restore program is looking for the ID of the dead server.
But how do you tell XOA, to copy the metadata over to the new server? It assumes that you want to restore it over an existing server. It does not provide a drop down list to pick where to deploy it.
In an act of desperation, we copied the backup directory to a new location and named it with the ID number of the newly recreated server. Now XOA could restore the metadata and we were able to recover the VMs in the SRs without issue.
This long story is really just a way to highlight the need for better host backup in three ways:
A) The first idea would be to create better instructions. It ain't nowhere as easy as the documentation says it is and it's easy to mess up the first step so bad that you can wipe out the contents of an SR. The documentation should spell this out.
B) The second idea is to add to the metadata backup something that reads the states of SR to PBD mappings and provides/saves a script to restore them. This would ease a lot of the difficulty in the actual restoring of a failed OS after a new OS can be installed.
C) The third idea is to provide a dropdown during the restoration of the metadata that allows the user to target a particular machine for the restore operation, instead of blindly assuming you want to restore it over a machine that is dead and gone.
I hope this helps out the next person trying to bring a host back from the dead, and I hope it also helps make XOA a better product.
Thanks for a good description of the restore process.
I was wary of the metadata-backup option. It sounds simple and good to have, but as you said it is in no way a comprehensive restore of a pool.
I'd like to add my own opinion here. What I'd want is a full pool restore, including networks, re-attaching SRs and everything else that is needed to quickly get back up and running. A pool-backup restore should also be available from the boot media. It could look for an NFS/CIFS mount or a USB disk with the backup files on it. This would avoid things like issues with bonded networks not working.
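For anyone following the same path, the manual SR re-attach described in the quoted post looks roughly like this. A sketch only, shown for a local LVM SR; the device path, SR type, name and UUIDs are placeholders for your own values:
# Find the SR metadata on the disk that held the SR
xe sr-probe type=lvm device-config:device=/dev/sdb
# Re-introduce the SR under its original UUID
xe sr-introduce uuid=<sr-uuid> type=lvm name-label="Local storage" content-type=user
# Create a PBD linking the SR to the freshly installed host, then plug it
host=$(xe host-list --minimal)
pbd=$(xe pbd-create sr-uuid=<sr-uuid> host-uuid=$host device-config:device=/dev/sdb)
xe pbd-plug uuid=$pbd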
-
RE: Remove VUSB as part of job
Might a different solution be to use a USB network bridge instead of directly attached USB? Something like this: https://www.seh-technology.com/products/usb-deviceserver/utnserver-pro.html (there are different options available). We use the my-utn-50a with hardware USB keys and it has proven to be very reliable over the years.
-
RE: Citrix or XCP-ng drivers for Windows Server 2022
@dinhngtu Thank you. I think it is clear to me now.
The docs at https://xcp-ng.org/docs/guests.html#windows could be improved to cover all three options, and also made a little more concise so they are easier to read.
-
RE: ZFS for a backup server
Looks like you want the disaster recovery option. It creates a ready-to-use VM on a separate XCP-ng server. If your main server fails, you can start the VM directly on the second server.
In any case, backups can be restored with XO to any server and storage available in XCP-ng.
-
RE: Need some advice on retention
@rtjdamen you could simply make two backup jobs, one for daily backups and one for monthly backups.
-
RE: All NFS remotes started to timeout during backup but worked fine a few days ago
Since the NFS shares can be mounted on other hosts, I'd guess an fsid/clientid mismatch.
In the share, always specify the fsid export option. If you do not use it, the NFS server tries to derive a suitable ID from the underlying mounts, which is not always reliable, for example after an upgrade or other changes. If you combine this with a client that uses the hard mount option and the fsid changes, the mount cannot recover because the client keeps asking for the old ID.
NFSv3 uses rpcbind and NFSv4 doesn't, though this shouldn't matter if your NFS server supports both protocols. With NFSv4 you should not export the same directory twice, i.e. do not export the root directory /mnt/datavol if /mnt/datavol/dir1 and /mnt/datavol/dir2 are exported.
So to fix this, adjust your exports (fsid, nesting) and the NFS mount option (to soft), reboot the NFS server and client, and see if it works.
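As an illustration, a small sketch of what the server and client side could look like (paths, subnet and mount point are made-up examples):
# /etc/exports on the NFS server: give every export an explicit, unique fsid
/mnt/datavol/dir1  192.168.1.0/24(rw,sync,no_subtree_check,fsid=101)
/mnt/datavol/dir2  192.168.1.0/24(rw,sync,no_subtree_check,fsid=102)
# Re-export after editing
exportfs -ra
# On the client, a soft mount with bounded retries instead of hard
mount -t nfs -o vers=4.1,soft,timeo=100,retrans=3 nfs-server:/mnt/datavol/dir1 /mnt/remote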
-
RE: Netdata package is now available in XCP-ng
@andrewm4894 said in Netdata package is now available in XCP-ng:
Qq, what would be the best way for me to try spin up a sort of test or dev XCP-ng env for me to try things out on? Or is there sort of hardware involved such that this might not be so easy. In my mind I'm imagining spinning up a VM lol which probably shows my level of naivety
You can run XCP-ng inside a VM, as long as the hypervisor underneath exposes nested virtualisation. The actual installation of XCP-ng is very easy. Mostly click and run.
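A quick, purely illustrative check from inside the VM to see whether the outer hypervisor actually exposes the virtualisation extensions before you install XCP-ng:
# Intel VT-x shows up as vmx, AMD-V as svm; no output means nested virt is not exposed
grep -E -o 'vmx|svm' /proc/cpuinfo | sort -u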
Latest posts made by Forza
-
RE: Epyc VM to VM networking slow
@TeddyAstie That is interesting. I had a look. The default seems to be cubic, but bbr is available using modprobe tcp_bbr. I also wonder if different queuing disciplines (tc qdisc) can help, for example mqprio, which spreads packets across the available NIC HW queues?
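A rough sketch of how this could be tested (the interface name eth0 is an assumption; mqprio needs traffic-class parameters matched to the NIC, so the simpler mq qdisc is shown instead):
# Load and enable BBR congestion control
modprobe tcp_bbr
sysctl -w net.ipv4.tcp_congestion_control=bbr
sysctl net.ipv4.tcp_available_congestion_control
# Spread traffic across the NIC's hardware queues with the mq qdisc
tc qdisc replace dev eth0 root mq
# Check how many TX queues the NIC exposes
ls /sys/class/net/eth0/queues/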
-
RE: Migrating an offline VM disk between two local SRs is slow
@olivierlambert said in Migrating an offline VM disk between two local SRs is slow:
80/100MiB/s for one storage migration is already pretty decent. You might go faster by migrating more disks at once.
I'm not sure I understand what difference you are referring to? It has always been in that ballpark, per disk.
This is not over a network, only between local ext4 SRs on the same server. I tried the same migration using XCP-ng Center and it is at the moment twice as fast.
I can't really see any difference though. It is the same sparse_dd and NBD connection. Perhaps it's a fragmentation issue, though doing a plain copy of the same VHD file gives close to 500MB/s.
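One way to sanity-check the fragmentation theory on a file-based SR (sketch only; the SR and VDI UUIDs are placeholders):
# Count extents of the VHD on the ext SR; a very high count suggests fragmentation
filefrag /run/sr-mount/<sr-uuid>/<vdi-uuid>.vhd
# Compare with a plain sequential read of the same file
time dd if=/run/sr-mount/<sr-uuid>/<vdi-uuid>.vhd of=/dev/null bs=1M iflag=direct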
-
RE: Migrating an offline VM disk between two local SRs is slow
@DustinB said in Migrating an offline VM disk between two local SRs is slow:
Separate question, why are you opting to use RAID 0, purely for the performance gain?
Yes, performance for bulk/temp data stuff.
-
RE: Migrating an offline VM disk between two local SRs is slow
@DustinB, no, these are local disks, not networked disks. I used to get around 400-500MB/s or so for a plain migration between the two SRs.
-
RE: Backup folder and disk names
@jebrown you could export XVA or OVA copies, but those are not exactly identical to backups.
-
Migrating an offline VM disk between two local SRs is slow
Hi!
I had to migrate a VM from one local SR (SSD) to another SR (4x HDD HW-RAID0 with cache) and it is very slow. It is not often I do this, but I think that in the past this migration was a lot faster.
When I look in iotop I can see roughly 80-100MiB/s.
I think the issue is that sparse_dd connects to localhost (10.12.9.2) over NBD/IP, and that this is what makes it slow?
/usr/libexec/xapi/sparse_dd -machine -src /dev/sm/backend/55dd0f16-4caf-xxxxx2/46e447a5-xxxx -dest http://10.12.9.2/services/SM/nbd/a761cf8a-c3aa-6431-7fee-xxxx -size 64424509440 -good-ciphersuites ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES256-GCM-SHA384:AES256-SHA256:AES128-SHA256 -prezeroed
I know I have migrated disks before on this server at multiple hundreds of MB/s, so I am curious what the difference is.
This is XOA stable channel on XCP-ng 8.2.
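For anyone hitting the same thing, a couple of illustrative checks on the host (the SR path is a placeholder):
# Watch per-process I/O while the migration runs; sparse_dd should show up here
iotop -oP
# Rough write-speed test against the destination SR to rule out the array itself
dd if=/dev/zero of=/run/sr-mount/<sr-uuid>/speedtest.tmp bs=1M count=4096 oflag=direct
rm /run/sr-mount/<sr-uuid>/speedtest.tmp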
-
RE: XCP-NG 9, Dom0 considerations
My wishlist for a new XCP-ng:
- Recent kernel for dom0.
- Guest TRIM support, i.e. a TRIM in the guest translates to a shrunk VDI file.
- Better VM console support: Spice and/or RDP support with client USB access, shared clipboard and file transfers. Note, I do not mean remoting into the guest itself, but providing console access via XCP-ng, like the current VM console access in XO and XCP-ng Center.
- Implement VirtIO support: virtio-net, virtio-gpu, virtio-blk/scsi. (There are also serial and socket VirtIO devices; I do not personally use these much, but they could be important for management of some types of VMs.)
- Support multiqueue in guests for net and blk (would be possible with VirtIO).
- VirGL support. This is important as an alternative to GPU passthrough or SR-IOV. It would in theory support migration too.
-
RE: CBT: the thread to centralize your feedback
@Tristis-Oris We've had the same problem, so we are not using CBT for now.
-
RE: Socket topology in a pool
@mauzilla I do not think NUMA is exposed to the guests, so they will only see the number of cores assigned, i.e. you can migrate them just fine.
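A quick way to confirm this from inside a guest (illustrative; numactl may need to be installed first):
# Both should report a single NUMA node containing all assigned vCPUs
lscpu | grep -i numa
numactl --hardware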
-
RE: Epyc VM to VM networking slow
It's a good discovery that having XOA outside the pool can make the backup performance much better.
How is the root-cause investigation going? We too have quite poor network performance and would really like to see the end of this. Can we get a summary of the actions taken so far and of the prognosis for a solution?
Did anyone try plain Xen on a new 6.x kernel to see if the networking is the same there?
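In case it helps with the comparison, this is the kind of VM-to-VM test we run (iperf3 between two VMs on the same host; the IP address is an example):
# On VM 1 (receiver)
iperf3 -s
# On VM 2 (sender); single stream, then 4 parallel streams
iperf3 -c 192.168.1.10 -t 30
iperf3 -c 192.168.1.10 -t 30 -P 4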