Workload Balancing
-
Hi,
can anyone please give me a hint as to what I'm doing wrong? I'm trying to use workload balancing on a test cluster of two XCP-ng 8 nodes running HA-lizard and HA-iSCSI to make them really HA capable. In XOCE I've set up a load-balancer v0.3.2 plan with a threshold of 90% CPU usage and 15001 free memory, out of 24 GB total.
I've got 11 VMs on the first node and 0 VMs on the second node. The VMs use up 17 GB of memory, but syslog says:
Dec 9 07:52:00 xoc-test xo-server[704]: [load-balancer]Execute plans!
Dec 9 07:52:00 xoc-test xo-server[704]: [load-balancer]No hosts to optimize.
Now I create load by running
cat /dev/urandom > /dev/null
I see that the load increases to 100% CPU usage and syslog says:
Dec 9 07:58:00 xoc-test xo-server[704]: [load-balancer]Try to optimize Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 07:58:02 xoc-test xo-server[704]: 2019-12-09T07:58:02.255Z xo:perf INFO blocked for 892ms
Dec 9 07:58:02 xoc-test xo-server[704]: 2019-12-09T07:58:02.938Z xo:perf INFO blocked for 581ms
Dec 9 07:58:04 xoc-test xo-server[704]: 2019-12-09T07:58:04.154Z xo:perf INFO blocked for 832ms
Dec 9 07:58:04 xoc-test xo-server[704]: [load-balancer]Performance mode: 0 optimizations for Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 07:58:47 xoc-test xo-server[704]: 2019-12-09T07:58:47.676Z xo:main INFO - WebSocket connection (::ffff:192.168.205.146)
Dec 9 07:59:00 xoc-test xo-server[704]: [load-balancer]Execute plans!
Dec 9 07:59:00 xoc-test xo-server[704]: [load-balancer]Try to optimize Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 07:59:00 xoc-test xo-server[704]: [load-balancer]Performance mode: 0 optimizations for Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
No VMs are relocated to the empty second node.
What am I missing?
Best regards,
Alexander
-
It's getting weird. Now some VMs were migrated, including the XOCE VM. But why now?
Dec 9 08:43:00 xoc-test xo-server[704]: 2019-12-09T08:43:00.911Z xo:perf INFO blocked for 536ms
Dec 9 08:43:01 xoc-test xo-server[704]: [load-balancer]Try to optimize Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:43:01 xoc-test xo-server[704]: [load-balancer]Performance mode: 0 optimizations for Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:44:00 xoc-test xo-server[704]: [load-balancer]Execute plans!
Dec 9 08:44:00 xoc-test xo-server[704]: [load-balancer]Try to optimize Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:44:01 xoc-test xo-server[704]: 2019-12-09T08:44:01.103Z xo:perf INFO blocked for 856ms
Dec 9 08:44:02 xoc-test xo-server[704]: 2019-12-09T08:44:02.033Z xo:perf INFO blocked for 728ms
Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM (9a2c9dfc-02e0-62b8-5a1f-f02d1be64af8) to Host (5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM (8cdaa9de-ce6f-9259-3961-e201adc2b931) to Host (5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM (d51423c9-b2a7-2bd2-e79a-ff6a63fd3bd8) to Host (5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM (3f202851-cea8-1242-3aa3-96fe479a00ba) to Host (5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:45:34 xoc-test kernel: [236817.924318] Freezing user space processes ... (elapsed 0.005 seconds) done.
Dec 9 08:45:34 xoc-test kernel: [236817.929674] OOM killer disabled.
Dec 9 08:45:34 xoc-test kernel: [236817.929675] Freezing remaining freezable tasks ... (elapsed 0.005 seconds) done.
Dec 9 08:45:34 xoc-test kernel: [236817.967276] suspending xenstore...
Dec 9 08:45:34 xoc-test kernel: [236818.014068] Xen Platform PCI: I/O protocol version 1
Dec 9 08:45:34 xoc-test kernel: [236818.014113] xen:grant_table: Grant tables using version 1 layout
Dec 9 08:45:34 xoc-test kernel: [236818.014284] xen: --> irq=9, pirq=16
Dec 9 08:45:34 xoc-test kernel: [236818.014316] xen: --> irq=8, pirq=17
Dec 9 08:45:34 xoc-test kernel: [236818.014347] xen: --> irq=12, pirq=18
Dec 9 08:45:34 xoc-test kernel: [236818.014378] xen: --> irq=1, pirq=19
Dec 9 08:45:34 xoc-test kernel: [236818.014408] xen: --> irq=6, pirq=20
Dec 9 08:45:34 xoc-test kernel: [236818.014439] xen: --> irq=4, pirq=21
Dec 9 08:45:34 xoc-test kernel: [236818.014469] xen: --> irq=7, pirq=22
Dec 9 08:45:34 xoc-test kernel: [236818.014499] xen: --> irq=23, pirq=23
Dec 9 08:45:34 xoc-test kernel: [236818.014531] xen: --> irq=28, pirq=24
Dec 9 08:45:34 xoc-test kernel: [236818.051408] usb usb1: root hub lost power or was reset
Dec 9 08:45:34 xoc-test kernel: [236818.219361] ata2.01: configured for MWDMA2
Dec 9 08:45:34 xoc-test kernel: [236818.408719] usb 1-2: reset full-speed USB device number 2 using uhci_hcd
Dec 9 08:45:34 xoc-test kernel: [236818.648494] OOM killer enabled.
Dec 9 08:45:34 xoc-test kernel: [236818.648500] Restarting tasks ... done.
Dec 9 08:45:34 xoc-test kernel: [236818.670135] Setting capacity to 20971520
Dec 9 08:45:34 xoc-test systemd-networkd[603]: eth0: Lost carrier
Dec 9 08:45:34 xoc-test systemd-networkd[603]: eth0: Gained carrier
Dec 9 08:45:34 xoc-test xo-server[704]: 2019-12-09T08:45:34.636Z xo:perf INFO blocked for 659ms
Dec 9 08:45:34 xoc-test systemd-networkd[603]: eth0: Configured
Dec 9 08:45:35 xoc-test xe-daemon: Trigger refresh after system resume
Dec 9 08:45:41 xoc-test xo-server[704]: [load-balancer]Performance mode: 4 optimizations for Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:46:00 xoc-test xo-server[704]: [load-balancer]Execute plans!
Dec 9 08:46:00 xoc-test xo-server[704]: [load-balancer]Try to optimize Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Dec 9 08:46:00 xoc-test xo-server[704]: [load-balancer]Performance mode: 0 optimizations for Host (749ccd3a-2613-4e83-96f0-042dd5464e51).
Best regards,
Alexander
-
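For anyone comparing logs like the ones above, here is a minimal Python sketch that pulls the load-balancer migration entries out of syslog and counts them per source and destination host. It assumes the line format is exactly what the excerpts show; the function name `summarize_migrations` and the regular expression are illustrative and not part of xo-server.

```python
import re
from collections import Counter

# Matches the "[load-balancer]Migrate VM (...) to Host (...) from Host (...)"
# entries as they appear in the syslog excerpts in this thread.
MIGRATE_RE = re.compile(
    r"\[load-balancer\]Migrate VM \((?P<vm>[0-9a-f-]+)\) "
    r"to Host \((?P<dst>[0-9a-f-]+)\) from Host \((?P<src>[0-9a-f-]+)\)"
)


def summarize_migrations(lines):
    """Count migrations per (source host, destination host) pair."""
    counts = Counter()
    for line in lines:
        match = MIGRATE_RE.search(line)
        if match:
            counts[(match["src"], match["dst"])] += 1
    return counts


# Three lines copied from the log excerpt above; only two are migrations.
sample = [
    "Dec 9 08:44:00 xoc-test xo-server[704]: [load-balancer]Execute plans!",
    "Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM "
    "(9a2c9dfc-02e0-62b8-5a1f-f02d1be64af8) to Host "
    "(5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host "
    "(749ccd3a-2613-4e83-96f0-042dd5464e51).",
    "Dec 9 08:44:02 xoc-test xo-server[704]: [load-balancer]Migrate VM "
    "(8cdaa9de-ce6f-9259-3961-e201adc2b931) to Host "
    "(5fc41a81-6985-4421-a91b-0aeef3a7ca97) from Host "
    "(749ccd3a-2613-4e83-96f0-042dd5464e51).",
]

counts = summarize_migrations(sample)
for (src, dst), n in counts.items():
    print(f"{n} migration(s) from {src} to {dst}")
```

To run it against a real log, pass an open file object instead of `sample`, for example `summarize_migrations(open("/var/log/syslog"))`; the path is an assumption and depends on the distribution.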
You probably need to wait a bit longer so it can migrate.
Also, please edit your post and use Markdown syntax for your blocks of logs; it's far easier to read that way.
-
Unfortunately the time for editing is over. No, waiting does not help. Does anyone know how it is supposed to work? Migrating some VMs after 45 minutes is not what you expect from a load balancer. Additionally, it may have been triggered by my working on another VM on that host.
-
Time for editing is over? What? You can't edit your post anymore?
Also, if you need this for a production scenario, please create a support ticket. This is community support, offered on a best-effort basis in people's free time.
-
You can only edit posts within 3600 seconds. It tells you so when you try to submit your changes. The tests I'm currently doing are part of an evaluation for production use. Is there a difference in functionality between XOCE and XOA regarding workload balancing?
-
XOA is professionally supported; XO from the sources is not. That means you can create support tickets that let us access your setup remotely, even with XOA Free.
-
So, would anyone be so kind as to tell me whether it is worth trying XOA instead of XOCE for workload balancing? Does anyone know if there are differences?