Moving management network to another adapter and backups now fail

nikade

One thing I learned over the years with XenServer and now XCP-NG:
Don't change anything. If you have a pool, all hosts need to have the same specs on their NICs. Make sure to sync the clock with NTP, don't use DHCP, and don't mess with iptables or similar, since it will mess up XAPI.

From the error message I'd say you have missed some vital part. What does a traceroute or ping from the XCP-NG to the host look like? Can you telnet to the destination and port?
I know the traceroute binary isn't present, but you can install it from the EPEL repo.
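
If telnet isn't installed either, the same port check can be sketched in a few lines of Python. This is only a minimal sketch: the function name `can_connect` is made up for illustration, and the address in the usage comment is a placeholder, not a value from this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


# Usage (placeholder address, replace with the real destination and port):
# print(can_connect("192.0.2.10", 443))
```

A `False` here only says the TCP handshake failed; it doesn't tell you whether routing, a firewall, or the service itself is the reason.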

syscon-chrisl

@nikade That's what it took... restoring the management interface to the original
Backups work now, stats now work as well. Thank you

CodeMercenary

I'm bummed to hear that it isn't tolerant of changes. When I set up xcp originally, I gave it the first 10Gb port as the management interface and that's on the main LAN, no VLAN. Now I was wanting to move management off of the main LAN and onto a dedicated VLAN on the second 10Gb port. I've been nervous to make that change because I don't want to break something, and it seems that concern was well founded. I was actually planning on posting today to ask about how to best move the management interface into a VLAN on a separate port.

Feels like I just have to live with everything on the same port and I won't be able to isolate the management or backup traffic like I want to. Maybe I could move the backups onto a separate VLAN or does that happen through the management interface? I think I need to dive back into the docs.

Danp

Have you checked to see if there is a backup network configured on the pool (bottom of the pool's Advanced tab in XO)?

nikade

Changing the mgmt interface on a pool is not something I've had good experiences with, but on a standalone host it is rather easy:

Make sure that the new interface is connected and has the correct VLAN. If you're using DHCP, it will automatically grab an IP. If not, you'll have to go on the console and configure the IP manually after changing the interface.
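
For the manual route on the console, the two `xe` commands involved are `xe pif-reconfigure-ip` and `xe host-management-reconfigure`. The sketch below only builds the command lines so they can be reviewed before running anything. The helper name `management_move_commands` and all addresses are hypothetical, and the PIF UUID is assumed to come from `xe pif-list`.

```python
def management_move_commands(pif_uuid, ip=None, netmask=None, gateway=None, dns=None):
    """Build the xe commands that address a PIF and make it the management interface.

    With no ip given, the PIF is set to DHCP; otherwise a static address is used.
    Returns a list of argv lists; nothing is executed here.
    """
    reconfigure = ["xe", "pif-reconfigure-ip", f"uuid={pif_uuid}"]
    if ip is None:
        reconfigure.append("mode=dhcp")
    else:
        if netmask is None:
            raise ValueError("a static IP needs a netmask")
        reconfigure += ["mode=static", f"IP={ip}", f"netmask={netmask}"]
        if gateway:
            reconfigure.append(f"gateway={gateway}")
        if dns:
            reconfigure.append(f"DNS={dns}")
    # Point the management interface at the newly addressed PIF.
    switch = ["xe", "host-management-reconfigure", f"pif-uuid={pif_uuid}"]
    return [reconfigure, switch]


# Usage (placeholder UUID and addresses):
# for cmd in management_move_commands("<pif-uuid>", ip="192.0.2.20",
#                                     netmask="255.255.255.0", gateway="192.0.2.1"):
#     print(" ".join(cmd))
```

Run them from the physical console rather than over SSH, since the session will drop when the management IP moves.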

syscon-chrisl

@nikade Yeah, this was a standalone host. It had an IP which was working. The only thing I did was change the mgmt interface from eth3 to eth2, both 10G fiber, and it stopped backing up and stopped giving me stats. I tried rebooting, restarting toolstacks, etc., with no luck.

nikade

@syscon-chrisl OK - So you just changed the mgmt interface to another physical NIC?
That should definitely work. Did you have any IP configuration on any other interface?

syscon-chrisl

@nikade Yes, that's all I did. Changed from one subnet to a different one on the other NIC. There were no other IPs configured on the host.
It was actually 2 hosts (2 different pools) that I did the same thing on. Both just swapped NICs and they stopped working.

nikade

@syscon-chrisl OK that's another story, totally weird.
It should totally work if the networking is correct. We've done this at work many times (migrating from a 1G to a 10G NIC, for example).

xrm

@syscon-chrisl Did you find a solution for this?

I am asking because I ran into the very same error by switching my storage network. I tested a few different setups and found that I can modify the interface for all non-master nodes in my pool to the new interface without any issues; but as soon as I switch the master node, the backups will stop working while normal operation continues without any issues.

I also tried changing the pool master afterwards to a node that has the modified storage interface, but the error would then reappear.

It also doesn't seem to make a difference if I change the interface from XO (by setting the new interface to the same IP settings and then disabling the old interface and removing its IP) or through XCP-ng Center; the result seems to be the same both times.

I set up a new XOA as well as a new XO (from sources) instance (and ran different versions of XO there) as I thought it was something with my current XO instance, but to no avail ...

xrm

Also, the actual error in the detailed report is in my case:

...
  "tasks": [
    {
      "data": {
        "type": "VM",
        "id": "74ed4e04-3bd9-1c68-f1e0-3c7a0fb567f7"
      },
      "id": "1748127565992",
      "message": "backup VM",
      "start": 1748127565992,
      "status": "failure",
      "end": 1748127565993,
      "result": {
        "call": {
          "duration": 91,
          "method": "session.login_with_password",
          "params": "* obfuscated *"
        },
        "message": "unexpect statusCode 302",
        "name": "Error",
        "stack": "Error: unexpect statusCode 302\n    at file:///opt/xen-orchestra/packages/xen-api/transports/json-rpc.mjs:24:13\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)"
      }
    },
...
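
A 302 is an HTTP redirect, so it may help to see where the host is trying to send the login request. Below is a minimal sketch that posts a throwaway JSON-RPC body and reports the status code and `Location` header without following the redirect. Assumptions: the `/jsonrpc` path is inferred from the `json-rpc.mjs` transport in the stack trace, the method name `probe.noop` is invented and expected to be rejected, and certificate checks are disabled because hosts often use self-signed certificates.

```python
import http.client
import json
import ssl


def probe_login_endpoint(host, port=443, use_tls=True, path="/jsonrpc", timeout=5.0):
    """POST a dummy JSON-RPC request and return (status, Location header).

    http.client never follows redirects, so a 302 is reported as-is.
    """
    if use_tls:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False       # must be disabled before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE  # assumption: self-signed certificate
        conn = http.client.HTTPSConnection(host, port, timeout=timeout, context=ctx)
    else:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
    # "probe.noop" is a made-up method name; only the HTTP-level answer matters.
    body = json.dumps({"jsonrpc": "2.0", "method": "probe.noop", "params": [], "id": 1})
    try:
        conn.request("POST", path, body=body, headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Usage (placeholder address):
# print(probe_login_endpoint("192.0.2.10"))
```

Running this from the machine where XO lives, against both the old and the new address, should show whether the redirect target still points at the previous interface.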

tjkreidl

@syscon-chrisl Note that when changing the management interface, it's highly recommended to reduce the PMI down to just one NIC on all your hosts before you make the change. That said, it's always a scary thing to do and, as others have stated, best avoided if at all possible. Making sure all hosts are at the same hotfix level and that their NICs are all in the same order and at the same speeds is essential.