Posts made by billcouper | XCP-ng and XO forum

billcouper

If you run backups outside of business hours, any impact on the pool hosts' CPU/memory performance is likely irrelevant (and limited by how many resources the XO VM is provisioned with anyway). The bigger potential impact is likely on your production storage, which again could be irrelevant outside of business hours.

However, if you want to perform backups more frequently and/or during business hours, in my experience storage performance is more likely to suffer a noticeable impact. Unless your hosts are very highly utilized, the additional CPU/memory load of a single VM shouldn't tip the scales. And at 500MB/sec your network shouldn't struggle either (I am assuming 10+Gbps links to get that speed).

And regardless of backing up during or outside business hours, or how long your backup window is, always consider the restore times! Performance of backup storage is always low priority until something needs to be restored. If your backup takes 8 hours your restore will take 8 hours. Or longer. Don't cut any corners on backup storage, it is very important!
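
As a rough illustration of the restore-time point, here is a minimal sketch that estimates transfer time from data size and sustained throughput. The function name and the 14.4 TB data size are hypothetical; only the 500MB/sec rate comes from the discussion above, and a real restore will rarely hold a constant rate.

```python
def transfer_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Estimate hours needed to move size_gb at a sustained throughput_mb_s.

    Assumes decimal units (1 GB = 1000 MB) and a constant transfer rate.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (size_gb * 1000) / throughput_mb_s
    return seconds / 3600


# Hypothetical example: 14.4 TB of VM data.
print(transfer_hours(14_400, 500))  # 8.0 hours at 500 MB/sec
print(transfer_hours(14_400, 250))  # 16.0 hours if the backup storage reads at half that speed
```

The second call is the "or longer" case: if the backup storage can only be read at half the speed it was written, the restore window doubles.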

billcouper

@TS79 In your backup/replication notes you say that remote to remote backup replication doesn't put any load on the pools, except that it does? The XO VM is doing the heavy lifting and will use CPU and network resources, provided by the pool, to accomplish the replication. There is no "agent"-like service running ON the remote that can offload the transfer.

If you haven't locked in your production storage, and have <25ms latency between your sites, consider PureStorage Flash Array //X. We use these for production storage (and backups on //E arrays)... Long story short, if you deploy a PureStorage Flash Array into both sites, you can set them up in ActiveCluster, like we have done. You can then split your production cluster between the two sites. Hosts in both sites have paths to storage on each side, but use the array closest to them - the arrays then replicate the writes between themselves. It's truly active-active synchronous replication. This allows a design where the failover between sites is automated. If one of the sites goes down, the workloads will automatically restart in the other site. There's a lot of network design that goes into this, but it's definitely possible as we're doing it.

billcouper

@simonebertucelli SNMP? No. Xen Orchestra via API. Zabbix Agent on hosts for additional detail.

Edit: It works just like monitoring a vCenter Server via API - all of the hosts and VMs appear in Zabbix. I am using this Zabbix template https://github.com/bufanda/zabbix--template-xenorchestra/tree/main

billcouper

@olivierlambert This post was created more than six months ago to find out about exposing what I thought was existing functionality in the xcp-ng hosts. XCP-ng Center windows app allows you to configure SMTP settings for alerts, and it fires off emails about things like multipath count etc. I assumed the alert mechanism is already built into xcp-ng and 'center' is just configuring it. If that is true, then it would be nice to expose this existing functionality via xen orchestra, so we no longer feel the need to keep xcp-ng center hanging around.

@olivierlambert Please don't go and develop a completely new monitoring solution based on my request. However, IF you were to undertake such a mammoth task, then I am more than happy to put forth my wish list of what I would like to see alerts for (almost none of which are performance related).

@splastunov I looked at netdata but it's primarily performance metrics. I'm not too interested in performance metrics at the host level (other than memory consumption). Plus, I don't see a way to create triggers/thresholds and fire alerts off.

In regards to more comprehensive monitoring that can fire off alerts, I have worked on my own methods. I am happy at the moment using Zabbix to monitor Xen Orchestra, with agents on the XCP-ng hosts for additional monitoring. I am happy to provide info about the Zabbix setup if anyone wants it.

billcouper

@olivierlambert There already seem to be some alerts configured inside the xcp-ng servers? Once I configure email alerts using XCP-ng Center (just setting the SMTP server and email addresses) I do get notifications about items like multipath and bond status (eg path lost, nic disconnected, bond status changed, etc). I had thought that initially it would be good if we could simply enable the built-in alerts using XOA, the same way that XCP-ng Center enables them.

If you were going to develop a standalone alarm/alert system in XOA, that's a giant kettle of fish.

These are just some of the things I'd like to see alerts for (and be able to trigger some type of action, eg 'send email' or an api call).

NIC down (that was previously up)
Bond link count changed
Host unresponsive
Host patches required
Host restart required
Host memory usage
Pool memory usage
SR path lost (one of multiple paths went down)
SR all paths down
SR not connected to all pool hosts
SR capacity (with configurable thresholds as % or GB)
SR latency
Remote storage connectivity lost
Remote storage capacity (% or GB)
VM CPU co-stop (or xen equivalent)
VM Management Agent detected
VM Snapshot count
VDI remains attached to control domain after backup

Being able to apply the monitoring granularly and hierarchically is very important. For example, I might set a global latency alarm that applies to all SRs. But maybe one of my pools is 'special' so I apply a stricter latency alarm at that pool level (the rest of the pools can use the global threshold). Then inside that pool, one SR in particular might be unique and need its own latency alarm configured at the SR level.
So the inheritance of the alarms for SRs should be Global > Pool > SR
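
To make the Global > Pool > SR inheritance concrete, here is a minimal sketch of how a threshold lookup could resolve. This is purely illustrative and not how XO works today: the policy dictionaries, the IDs, the metric name and the millisecond values are all hypothetical.

```python
from typing import Optional

# Hypothetical policy store: latency thresholds in milliseconds, keyed by scope.
GLOBAL_POLICY = {"sr_latency_ms": 20}
POOL_POLICIES = {"pool-special": {"sr_latency_ms": 10}}
SR_POLICIES = {"sr-unique": {"sr_latency_ms": 5}}


def resolve_threshold(metric: str, pool_id: str, sr_id: str) -> Optional[int]:
    """Return the most specific threshold: SR first, then pool, then global."""
    scopes = (
        SR_POLICIES.get(sr_id, {}),
        POOL_POLICIES.get(pool_id, {}),
        GLOBAL_POLICY,
    )
    for policy in scopes:
        if metric in policy:
            return policy[metric]
    return None  # no alarm defined for this metric at any level


print(resolve_threshold("sr_latency_ms", "pool-a", "sr-1"))             # 20 (global)
print(resolve_threshold("sr_latency_ms", "pool-special", "sr-2"))       # 10 (pool override)
print(resolve_threshold("sr_latency_ms", "pool-special", "sr-unique"))  # 5 (SR override)
```

The lookup walks from the most specific scope to the least specific one, so a pool or SR only needs its own entry when it differs from the level above.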

So thinking of this type of granular requirement with hierarchical inheritance, it seems like it could be possible in XO6, due to the new tree view which could potentially allow setting alert policies at each level.

billcouper

I am trying to use the API to produce a list of backup jobs and their most recent status (success/fail/warn/etc).

====================

So far, I am going to this API node

https://xo.fqdn/rest/v0/backup/jobs/vm

Then following all those references to find the name of each job

https://xo.fqdn/rest/v0/backup/jobs/vm/510be304-3777-4765-825d-5f1d78e98bb0

Then getting the backup job logs

https://xo.fqdn/rest/v0/backup/logs

Then following all those references to find the logs where the id matches the list of jobs I know about

https://xo.fqdn/rest/v0/backup/logs/1717495512575

And from that output pulling the status value and compiling it all together

====================

What I really need is something like the output of this query

https://xo.fqdn/rest/v0/backup/logs?fields=id,status,jobName&limit=11

But the problem with this query is that I have to limit the result to the number of jobs and it only works if each job runs once a day etc etc. Right now, I have 11 jobs and they all only run once a day, so I believe that the query will work for me (assuming the most recent log entries are at the start of the list)... but as soon as I add or remove a job, or a job runs more than once a day, it breaks.
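
One client-side workaround, sketched below, is to fetch the log list without tying `limit` to the job count and then reduce it to the newest entry per job. This is only a sketch under assumptions: the field names `id`, `status` and `jobName` are the ones from the query above, but the helper name, the sample data and the status strings are hypothetical, and it assumes the log `id` is a timestamp-like number where larger means newer (which the example id above suggests but I have not confirmed).

```python
from typing import Dict, Iterable


def latest_status_per_job(logs: Iterable[dict]) -> Dict[str, str]:
    """Reduce backup log entries to the status of the newest entry per job name.

    Assumes each entry has 'id', 'status' and 'jobName', and that a larger
    numeric 'id' means a more recent run.
    """
    latest: Dict[str, dict] = {}
    for entry in logs:
        name = entry["jobName"]
        current = latest.get(name)
        if current is None or int(entry["id"]) > int(current["id"]):
            latest[name] = entry
    return {name: entry["status"] for name, entry in latest.items()}


# Hypothetical sample of what the query might return, in no particular order.
sample_logs = [
    {"id": "1717495512575", "status": "success", "jobName": "Daily-Prod"},
    {"id": "1717409112575", "status": "failure", "jobName": "Daily-Prod"},
    {"id": "1717495600000", "status": "failure", "jobName": "Hourly-DB"},
    {"id": "1717499200000", "status": "success", "jobName": "Hourly-DB"},
]

print(latest_status_per_job(sample_logs))
# {'Daily-Prod': 'success', 'Hourly-DB': 'success'}
```

This no longer depends on the number of jobs or how often they run, but it still pulls more log entries than strictly needed, so it is not the single efficient query asked about below.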

@julien-f Is there a more elegant method to retrieve ALL backup jobs and their last 'status' value? I'd like to do this in a single, fast, efficient query if it's possible

billcouper

@rtjdamen I'm doing something similar using the "xo-cli" utility on the XO server directly. I run this command remotely and it returns a JSON array. Using this output I can identify which restore points exist for each VM.

xo-cli backupNg.listVmBackups --json remotes="json:'[`"02fcd7e4-7172-46ec-a43f-2eb79fd00009`",`"f2c432dc-eddb-41bc-828f-67121529904c`"]'"

Note that I am limiting the output to only show me backups that exist in specific "remotes" but you may not need to do this.

I am interested if there is a REST API call that produces the same output? @julien-f

billcouper

I have done some more testing with this.

I am able to generate test alerts/alarms from the xcp-ng cli:

xe message-create pool-uuid=<my-pool-uuid> name="More Test" body="More Test Message. Very Message. Much Test." priority=3
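
If you want to script test alerts like this, here is a minimal sketch that builds the same command as an argument list. The helper name is hypothetical; the `xe message-create` parameters are the ones used in the command above, and actually running it would need to happen on a pool host.

```python
import shlex
from typing import List


def build_message_create(pool_uuid: str, name: str, body: str, priority: int) -> List[str]:
    """Build the argv for the 'xe message-create' call shown above."""
    return [
        "xe",
        "message-create",
        f"pool-uuid={pool_uuid}",
        f"name={name}",
        f"body={body}",
        f"priority={priority}",
    ]


argv = build_message_create("<my-pool-uuid>", "More Test", "More Test Message.", 3)
# Print a shell-quoted version for inspection. On a host you could instead
# pass argv to subprocess.run(argv, check=True).
print(shlex.join(argv))
```

Passing the arguments as a list avoids shell quoting problems with spaces in the name and body.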

I found that if I use XCP-ng Center and configure a pool to send email notifications, that configuration is saved to each host in the pool - the email alerts are sent even when XCP-ng Center application is no longer running. I get the notifications that I am after.

As I am instructing my engineers to move away from using XCP-ng Center and instead do everything via XOA, it would be great if this pool-wide email alert configuration could be set up using XOA.

billcouper

Hi all,

I can only see performance and backup alerts in XOA. How do I get alerts/alarms for health related events?

For example, I would like to know when "the bad things" happen in my environment, such as... Network configuration issue... Multipath Alert... Bond status... etc.

I see in XCP-ng Center that it raises Alerts for these types of events, and can send emails too... however I believe I need to leave the software running for this to work? If I close the app, log off the machine, or it reboots for updates or something, then I no longer have email alerts for these important health events... is that correct?

Regards,
Bill