XCP-ng

    Network Management lost, No Nic display Consol

    Management
    14 Posts 9 Posters 1.4k Views 7 Watching
    • FMOTrust

      Good evening, legends.

      I'm truly sorry, but I need some assistance with a minor (or major) mess I made.

      After an Emergency Network Reset, I am unable to see any NICs displayed in the XE console. Going into the shell, I can see the network cards are there using ip a and ifconfig -a.

      If I run ip link set <Interface> up, the interface comes up and I'm able to assign an IP to it, which then also allows me to ping another hypervisor host within the same subnet.

      Yet there is still no management interface, and no NICs are displayed in the console.

      I have attempted to do the xe-management-reconfiguration uuid, and it just keeps stating that it cannot get an IP from the master.

      This has me truly stumped and I have no idea where to start looking. I have done a toolstack restart and confirmed XAPI is active on the system, yet I am still not able to create a management interface to assign the host back to the pool.

      What would I have to do?
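For readers following along, here is a minimal sketch of the xe commands usually involved in rescanning NICs and re-pointing the management interface. It assumes that the command referred to above is `xe host-management-reconfigure`, and that DHCP is acceptable for the management address; both are assumptions, not details confirmed in this thread. The function names, the `run` wrapper and the `DRY_RUN` variable are hypothetical helpers. By default nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a real host to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Ask XAPI to rescan the physical NICs of a host, then list the PIFs it knows.
# The host UUID is a placeholder supplied by the caller.
rescan_pifs() {
    host_uuid="$1"
    run xe pif-scan host-uuid="$host_uuid"
    run xe pif-list host-uuid="$host_uuid" params=uuid,device,management,IP
}

# Give a PIF an address via DHCP (an assumption of this sketch) and
# make it the management interface of the host.
set_management_pif() {
    pif_uuid="$1"
    run xe pif-reconfigure-ip uuid="$pif_uuid" mode=dhcp
    run xe host-management-reconfigure pif-uuid="$pif_uuid"
}
```

These commands need a responsive XAPI. As reported further down in this thread, xe commands can hang on a slave that cannot reach its master, in which case this sketch will not help.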

      • kassebasse @FMOTrust

        @FMOTrust Hey, I think I am suffering from the same issue as you. Could you please check whether your problem matches mine? https://xcp-ng.org/forum/topic/10687/after-an-update-the-nic-s-has-disappeared-but-still-works-somehow?_=1743438466779

        It might be a bug or something wrong with an update?

        • coolsport00 @FMOTrust

          @FMOTrust I had this issue as well on a Slave host in my test environment. I was able to do Network Reset from xsconsole, but then after the Host rebooted to finish the reset, the Host reverted back to not having/seeing the Mgmt network, and...like you, I couldn't see any NICs either.

          Since this was for testing, and I had no VMs on it, I just reinstalled XCP. I think I saw a way to reset XCP in another forum post, by removing the Local SR? Olivier shared that in the post, but I don't remember which post it was.

          Not sure if this is a test or prod Host/environment for you...but if you can, unless someone from the Vates team comments, I think all that can be done is reinstall. 😕

          • msgerbs

            I know this is old, but I also had this issue. I updated my pool from 8.2.1 to 8.3 using the installer ISO. Both hosts came back up, I did an update on the pool master and then the slave, and now the slave has lost its network interfaces. In ifconfig, I see eth0, lo, and xenbr0 with no IP, along with a "xentemp" which has the IP address I assigned to the management interface. I did an xe-toolstack-restart and an emergency network reset, which did not seem to help. xe pif-list, network-list and all other xe commands simply hang forever.

            • nikade (Top contributor)

              This is a classic issue with XAPI: once you have hosts in a pool and the slave cannot reach the master, it will go crazy. Never seen this issue with standalone hosts, though.

              We usually had this issue when upgrading XenServer, so we simply stopped doing that and never had any issues again. We moved to "new" versions by standing up a new pool and migrating all the VMs over to it 🙂

              • acebmxer @nikade

                @nikade Can you elaborate on that for us? Do you create a new pool with an existing host? If so, how? Or do you pull a host, make a new pool from that host, and migrate the rest over?

                • nikade (Top contributor) @acebmxer

                  @acebmxer It's not pretty, but it's failsafe. The procedure looked like this in our case:

                  1. Disable HA in the "old pool"
                  2. Put a host in the "old pool" into maintenance mode
                  3. Reinstall that host, connect it to XOA and then patch it
                  4. Create a "new pool" from that host
                  5. Create a new LUN or NFS share on the SAN for "new pool" and attach it to "new pool"
                  6. Live migrate VMs over from "old pool" to "new pool"
                  7. Once you've freed up another host, repeat steps 2 and 3 and then join that host to "new pool". It is important that you patch it before joining it to the pool; that is done by going to Settings -> Servers in XOA and connecting to it manually.

                  And then just continue until you're done. Live migration is pretty reliable nowadays, so this works pretty well, and since we had a 10G network it doesn't take as long as it used to with a 1G network.
                  We did this after a major incident at our primary production site where 2 out of 4 hosts in a pool "suddenly" lost their NICs after updating them. Since then we never updated the pools again. Standalone hosts are fine though, they never did this.

                  Luckily we had 2 other pools where we could migrate the VMs to, but we couldn't really trust the updating after that.

                  • DustyArmstrong @nikade
                    last edited by DustyArmstrong

                    @nikade sorry to drag this up but, is there a particular process or methodology to avoid this in the first place? Just had it happen on two brand new hosts, I had to re-install XCP from scratch. Bit worried if I reboot one of them now for any reason this will happen again. It happened on both the pool master and the slave, network completely wiped out on both.

                    Genuinely one of the most bizarre series of events I've ever experienced with server infrastructure, I could not understand what was going on until I found this thread.

                    • Pilow @DustyArmstrong
                      last edited by Pilow

                      @DustyArmstrong On your slave host, run:

                      # cat /etc/xensource/pool.conf
                      slave:xxx.xxx.xxx.xxx
                      

                      You should see the IP address of the master. If not, correct it.
                      The master must be pingable and reachable from the management interface of the slaves in order for the slaves to get the correct network configuration propagated.

                      You can try the command on the MASTER host; there you should see

                      master
                      

                      If you corrected the file on the slave host, reboot it; it should come back normally.

                      • Danp (Pro Support Team) @Pilow

                        @Pilow said in Network Management lost, No Nic display Consol:

                        you can try the command on MASTER host, you should see

                        master:ip_of_the_master

                        Slight correction. On the master, you should only have

                        master
                        

                        without the colon and IP address.
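The check described above can be sketched as a small read-only script. This is a minimal sketch, assuming a POSIX shell and the file format given in this thread (`master` on the pool master, `slave:<master IP>` on a slave). The helper names `pool_role` and `pool_master_ip` are hypothetical and not part of XCP-ng. Reporting any other content as `unknown` is a choice of this sketch, not a statement about what XAPI accepts.

```shell
#!/bin/sh
# Hypothetical read-only helpers for a pool.conf-style file.
# Format taken from this thread: "master" on the master, "slave:<ip>" on a slave.
POOL_CONF_DEFAULT=/etc/xensource/pool.conf

# Print the first line of the given file (or of the default path).
_pool_conf_line() {
    _conf="${1:-$POOL_CONF_DEFAULT}"
    _line=""
    if [ -r "$_conf" ]; then
        # read returns non-zero when the file has no trailing newline,
        # but _line is still filled in, so the status is ignored.
        IFS= read -r _line < "$_conf" || true
    fi
    printf '%s\n' "$_line"
}

# Print "master", "slave" or "unknown" depending on the file content.
pool_role() {
    case "$(_pool_conf_line "${1:-}")" in
        master)   echo master ;;
        slave:?*) echo slave ;;
        *)        echo unknown ;;
    esac
}

# Print the master address recorded on a slave; print nothing otherwise.
pool_master_ip() {
    _l="$(_pool_conf_line "${1:-}")"
    case "$_l" in
        slave:?*) printf '%s\n' "${_l#slave:}" ;;
    esac
}

# Example use on a slave (ping is only a basic reachability test):
#   ip=$(pool_master_ip); [ -n "$ip" ] && ping -c 3 "$ip"
```

The functions never write to the file, so any correction still has to be made by hand as described in the posts above.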

                        • Pilow @Danp

                          @Danp Ha, thanks for the correction. I was so certain I had seen it that I didn't check on my master.

                          • nikade (Top contributor) @DustyArmstrong

                            @DustyArmstrong said in Network Management lost, No Nic display Consol:

                            @nikade sorry to drag this up but, is there a particular process or methodology to avoid this in the first place? Just had it happen on two brand new hosts, I had to re-install XCP from scratch. Bit worried if I reboot one of them now for any reason this will happen again. It happened on both the pool master and the slave, network completely wiped out on both.

                            Genuinely one of the most bizarre series of events I've ever experienced with server infrastructure, I could not understand what was going on until I found this thread.

                            What exactly happened? Could you try to explain it in 1-2-3 steps?

                            • DustyArmstrong @nikade

                              @nikade Sure.

                              I have 2 mini machines running as XCP hosts, decided to upgrade them to newer hardware as they were struggling. I have:

                              XC1 - Brand new pool master on pool XC1
                              XC2 - Brand new pool slave on pool XC1

                              XCP1 - Old pool master on pool XCP1
                              XCP2 - Old pool slave on pool XCP1

                              Installed XCP on both the new hosts, imported to Xen Orchestra, designated the first as pool master, added the second to the pool. Performed updates, did some nominal checks and called them good. Wasn't ready to migrate my VMs yet so powered both off.

                              Come yesterday, I unplugged my old hosts and set them aside, then plugged my new hosts into power (on my UPS) and network. I plugged my old hosts in at my temporary working area to migrate VMs. Powered everything on (old, then new). XO stayed online the whole time, though DNS is on a VM that went down.

                              Only a single host came up (XCP2 - slave of XCP1). XCP1 had to be rebooted 3 or 4 times before it finally came alive, but with a high ping latency (4ms, usually <1ms). XC1 and XC2 never came alive; their NICs showed physical activity but got nothing. Eventually I plugged in a monitor (awkward, so I didn't do it immediately), rebooted them both and saw no network configuration on either, just empty. I found this thread and decided that, as they were basically fresh, I would just re-install and start over. Eventually got everything back.

                              I am assuming based on this thread that XC2 may have come up before XC1 and thus couldn't connect so it obliterated itself. Don't know why XC1 also did the same.

                              During testing I powered both XC1 and XC2 on and off multiple times without this happening, it was only when I powered everything on after moving them that it occurred. Thought I was going nuts or had caused a major network loop somehow.

                              • nikade (Top contributor) @DustyArmstrong

                                @DustyArmstrong That's super strange. I actually have the same setup at home: 2 HP Z240 machines running XCP-ng in a small pool.
                                xcp1 is always up and running, and xcp2 is powered down when I don't need it. Everything important is running on xcp1, so maybe that's the reason I don't run into these issues.
