Seem to have lost a host along the way.

pnunn

Hi Guys,
I seem to have generated a problem for myself.

In the process of trying to migrate VMs between hosts with private networks attached, it seems that I have managed to knock a host out of the pool or something, and I can no longer do anything with it.

One of the VMs crashed, in that we had to use the command line to turn it off (it became totally non-responsive to XEO). Now it seems to exist on both hosts at the same time, or something like that.

I restarted the toolstack on both hosts and rebooted the troublesome one, and I still can't do anything with it. I tried making it the pool master, and all sorts of other things, with no joy.
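For anyone following along, one quick check after fiddling with the pool master is which role the host itself believes it has. The sketch below reads a pool.conf-style file, which on these hosts normally lives at /etc/xensource/pool.conf and holds either "master" or "slave:<master address>". The helper name `pool_role` is made up for illustration, and the default path is an assumption about a standard install.

```shell
#!/bin/sh
# Hypothetical helper: report the pool role recorded in a pool.conf-style file.
# Assumption: the file contains either "master" or "slave:<master address>".
pool_role() {
    conf=${1:-/etc/xensource/pool.conf}
    [ -r "$conf" ] || { echo "unreadable: $conf"; return 1; }
    # Strip whitespace and newlines so the comparison below is exact.
    role=$(tr -d '[:space:]' < "$conf")
    case "$role" in
        master)  echo "master" ;;
        slave:*) echo "slave of ${role#slave:}" ;;
        *)       echo "unrecognised: $role" ;;
    esac
}

# On the host itself this would be called with no argument:
# pool_role
```

If the two hosts disagree about who the master is, that would at least be consistent with one of them showing as disconnected.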

Currently it shows as "disconnected" and has apparently been up for 22 days despite having just been rebooted. (see attached).

So... where to from here? It has nothing important on it yet, but there must be a better way to move machines :).

Ta
Peter.
Screenshot from 2019-06-21 11-27-44.png

OK, more information. I've managed to connect to the host via its IPMI interface. It seems there are NO network interfaces at all on the host any more.

Configure Management Interface tells me there are no interfaces... Display NICs is blank.

Yet.. I can ssh in and they all seem to be there


ssh me@191.167.11.188                                                                            28.4m  Fri 21 Jun 2019 11:17:41 AEST
root@192.168.33.154's password: 
Last login: Fri Jun 21 11:12:07 2019 from 1y.x.44.88
[root@xcp-ng-ME1H2 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP qlen 1000
    link/ether 0c:c4:7a:ff:12:14 brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP qlen 1000
    link/ether 0c:c4:7a:ff:12:15 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP qlen 1000
    link/ether 3c:fd:fe:77:26:70 brd ff:ff:ff:ff:ff:ff
5: eth3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master ovs-system state DOWN qlen 1000
    link/ether 3c:fd:fe:77:26:71 brd ff:ff:ff:ff:ff:ff
6: eth4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq master ovs-system state DOWN qlen 1000
    link/ether 3c:fd:fe:77:26:72 brd ff:ff:ff:ff:ff:ff
7: eth5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq master ovs-system state UP qlen 1000
    link/ether 3c:fd:fe:77:26:73 brd ff:ff:ff:ff:ff:ff
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1
    link/ether c6:cc:0f:db:34:7e brd ff:ff:ff:ff:ff:ff
9: xenbr3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1
    link/ether 3c:fd:fe:77:26:71 brd ff:ff:ff:ff:ff:ff
10: xenbr5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN qlen 1
    link/ether 3c:fd:fe:77:26:73 brd ff:ff:ff:ff:ff:ff
    inet 192.xxx.yy..9/24 brd 192.xxx.yy.255 scope global xenbr5
       valid_lft forever preferred_lft forever
11: xenbr2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1
    link/ether 3c:fd:fe:77:26:70 brd ff:ff:ff:ff:ff:ff
12: xenbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1
    link/ether 0c:c4:7a:ff:12:14 brd ff:ff:ff:ff:ff:ff
    inet 192.aaa.zz.154/24 brd 192.aaa.zz.255 scope global dynamic xenbr0
       valid_lft 5326sec preferred_lft 5326sec
13: xenbr4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1
    link/ether 3c:fd:fe:77:26:72 brd ff:ff:ff:ff:ff:ff
14: xenbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1
    link/ether 0c:c4:7a:ff:12:15 brd ff:ff:ff:ff:ff:ff
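
Since the kernel clearly still has the NICs, it may be worth comparing that list with what xapi has recorded as PIFs. A rough sketch follows: `kernel_nics` is a hypothetical helper that pulls the ethN names and link states out of `ip` output, and the `xe pif-list` call assumes the xe CLI is present and that xapi is actually answering, which may not be the case on this host.

```shell
#!/bin/sh
# Hypothetical helper: print "name state" for each ethN device found in
# `ip a` / `ip link` style output supplied on stdin.
kernel_nics() {
    awk '/^[0-9]+: eth[0-9]+:/ {
        name = $2
        sub(/:$/, "", name)              # "eth0:" -> "eth0"
        state = "?"
        for (i = 1; i <= NF; i++)        # the word after "state" is the link state
            if ($i == "state") state = $(i + 1)
        print name, state
    }'
}

# Only attempted where the xe CLI exists. If xapi is not responding,
# the xe call may fail or hang, which is itself a useful data point.
if command -v xe >/dev/null 2>&1; then
    ip link | kernel_nics
    xe pif-list params=device,host-name-label,currently-attached
fi
```

If the first list shows eth0 to eth5 and the second is empty or errors out, the problem would seem to be on the xapi side rather than with the hardware or drivers.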

Emergency Network Reset?? What's the way forward from here?

These are popping up on the console. No idea if they help.


[root@xcp-ng-ME1H2 ~]# 
Broadcast message from systemd-journald@xcp-ng-ME1H2 (Fri 2019-06-21 13:40:12 AEST):

xapi-nbd[28751]: main: Failed to log in via xapi's Unix domain socket in 300.000000 seconds


Broadcast message from systemd-journald@xcp-ng-ME1H2 (Fri 2019-06-21 13:40:12 AEST):

xapi-nbd[28751]: main: Caught unexpected exception: (Failure


Broadcast message from systemd-journald@xcp-ng-ME1H2 (Fri 2019-06-21 13:40:12 AEST):

xapi-nbd[28751]: main:   "Failed to log in via xapi's Unix domain socket in 300.000000 seconds")