XCP-ng

    Lost access to all servers

    • F Offline
      fred974 @fred974
      last edited by

      Log from master

      [11:57 uk ~]# tail /var/log/SMlog
      Apr 27 11:56:41 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:42 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:43 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:44 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:45 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:47 uk SM: message repeated 2 times: [ [9991] Raising exception [150, Failed to initialize XMLRPC connection]]
      Apr 27 11:56:48 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:49 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:50 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
      Apr 27 11:56:50 uk SM: [9991] Connecting from config to LINSTOR controller using: 172.16.10.47
      
      [11:58 uk ~]# tail /var/log/VMSSlog
      Apr 27 09:45:01 uk VMSS: [12832] ===Kicking cron job for VMSS===
      Apr 27 09:45:01 uk VMSS: [12832] VMSS policy not enabled for this pool, Exiting cron job.
      Apr 27 10:00:01 uk VMSS: [28354] ===Kicking cron job for VMSS===
      Apr 27 10:00:01 uk VMSS: [28354] VMSS policy not enabled for this pool, Exiting cron job.
      Apr 27 10:15:10 uk VMSS: [12571] ===Kicking cron job for VMSS===
      Apr 27 10:15:10 uk VMSS: [12571] VMSS policy not enabled for this pool, Exiting cron job.
      Apr 27 10:30:11 uk VMSS: [28160] ===Kicking cron job for VMSS===
      Apr 27 10:30:11 uk VMSS: [28160] VMSS policy not enabled for this pool, Exiting cron job.
      Apr 27 10:45:01 uk VMSS: [12520] ===Kicking cron job for VMSS===
      Apr 27 10:45:01 uk VMSS: [12520] VMSS policy not enabled for this pool, Exiting cron job.
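For anyone reading along, the SMlog excerpt above is easier to judge once the repeated lines are tallied. Below is a minimal sketch, not an official tool: it counts each `Raising exception` entry and lists the LINSTOR controller addresses that were tried. It assumes the syslog line `message repeated N times` stands for N further occurrences of the quoted message; the function name `summarize` and the `SAMPLE` constant are made up for illustration.

```python
import re
from collections import Counter

# Sample lines copied from the SMlog excerpt in this post.
SAMPLE = """\
Apr 27 11:56:45 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
Apr 27 11:56:47 uk SM: message repeated 2 times: [ [9991] Raising exception [150, Failed to initialize XMLRPC connection]]
Apr 27 11:56:48 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
Apr 27 11:56:50 uk SM: [9991] Connecting from config to LINSTOR controller using: 172.16.10.47
"""

RAISE_RE = re.compile(r"Raising exception \[(\d+), ([^\]]+)\]")
REPEAT_RE = re.compile(r"message repeated (\d+) times")
CONTROLLER_RE = re.compile(r"LINSTOR controller using: (\S+)")


def summarize(lines):
    """Return (Counter keyed by (code, message), controller addresses in order seen)."""
    exceptions = Counter()
    controllers = []
    for line in lines:
        raised = RAISE_RE.search(line)
        if raised:
            # Assumption: a "message repeated N times" line counts as N occurrences.
            repeat = REPEAT_RE.search(line)
            weight = int(repeat.group(1)) if repeat else 1
            exceptions[(int(raised.group(1)), raised.group(2))] += weight
        controller = CONTROLLER_RE.search(line)
        if controller and controller.group(1) not in controllers:
            controllers.append(controller.group(1))
    return exceptions, controllers


if __name__ == "__main__":
    counts, addresses = summarize(SAMPLE.splitlines())
    for (code, message), total in counts.most_common():
        print(f"{total:5d}  [{code}] {message}")
    print("controllers tried:", ", ".join(addresses))
```

To run it against the real file, replace `SAMPLE.splitlines()` with an open handle on `/var/log/SMlog`.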
      
      • F Offline
        fred974 @fred974
        last edited by fred974

        1b3dc9f6-2029-41e8-9146-2f600e284daa-image.png @fred974

        • AtaxyaNetworkA Offline
          AtaxyaNetwork Ambassador @fred974
          last edited by

          @fred974 Hi !

          You can try running xe-toolstack-restart on the master; it will not harm your running VMs.

          • F Offline
            fred974 @AtaxyaNetwork
            last edited by

            @AtaxyaNetwork said in Lost access to all servers:

            @fred974 Hi !
            You can try running xe-toolstack-restart on the master; it will not harm your running VMs.

            I am running it now, but the command is not returning to the prompt.
            I also got this:

            [12:09 uk ~]# xe host-is-in-emergency-mode
            true
            [12:09 uk ~]# xe pool-recover-slaves
            The server could not join the liveset because the HA daemon could not access the heartbeat disk.
            
            • F Offline
              fred974 @fred974
              last edited by

              Forgot to say the cluster has HA enabled.

              • F Offline
                fred974 @fred974
                last edited by

                Should I follow this?
                https://support.citrix.com/article/CTX131127/unable-to-connect-to-high-availability-enabled-xensever-pool-and-all-servers-in-pool-are-in-emergency-mode

                • olivierlambertO Offline
                  olivierlambert Vates 🪐 Co-Founder CEO
                  last edited by

                  "Forget to say HA was enabled": that's the main information here 😆

                  Yes, disable HA first 🙂

                  • F Offline
                    fred974 @olivierlambert
                    last edited by

                    @olivierlambert I disabled HA and set host 2 as the new master. The NICs are showing again, but I cannot SSH into or access any VM, including XO. In XCP-ng Center, all the hosts seem to be in maintenance mode.
                    633c06a7-f79e-4523-8430-bcaaaa982e74-image.png

                    • F Offline
                      fred974 @fred974
                      last edited by

                      [12:37 uk ~]# xe task-list
                      uuid ( RO)                : c8fc2549-9939-8ced-2ab6-cd2b5b1d6a7d
                                name-label ( RO): server_init
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : 30e9fb68-8326-df55-505d-39a5de71f9cd
                                name-label ( RO): host.call_plugin
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : 662298de-07e3-c3a2-6559-61e5d79c6d31
                                name-label ( RO): server_init
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : 24b360e3-dc0c-b05b-61b2-1352f23d4b44
                                name-label ( RO): server_init
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : 24d698b2-dec9-2c20-81a2-9d75a3118705
                                name-label ( RO): host.call_plugin
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : 11fa18e5-3d4a-7036-7eb8-d91dc7a399c0
                                name-label ( RO): host.call_plugin
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
                      
                      
                      uuid ( RO)                : d8651251-306a-22d6-fd74-231cfb570d12
                                name-label ( RO): server_init
                          name-description ( RO):
                                    status ( RO): pending
                                  progress ( RO): 0.000
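With this many stuck entries, it can help to group the `xe task-list` output by task name and status. The following is a rough sketch that assumes the plain key/value layout shown above (blank lines between records, a `( RO)` or `( RW)` marker after each key); `parse_records` and `SAMPLE` are illustrative names, not part of any XCP-ng tooling.

```python
from collections import Counter

# Two records copied from the `xe task-list` output in this post.
SAMPLE = """\
uuid ( RO)                : c8fc2549-9939-8ced-2ab6-cd2b5b1d6a7d
          name-label ( RO): server_init
    name-description ( RO):
              status ( RO): pending
            progress ( RO): 0.000


uuid ( RO)                : 30e9fb68-8326-df55-505d-39a5de71f9cd
          name-label ( RO): host.call_plugin
    name-description ( RO):
              status ( RO): pending
            progress ( RO): 0.000
"""


def parse_records(text):
    """Split xe-style output into a list of dicts, one per blank-line-separated record."""
    records, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        # Drop the "( RO)" / "( RW)" access marker that follows the field name.
        name = key.split("(")[0].strip()
        current[name] = value.strip()
    if current:
        records.append(current)
    return records


if __name__ == "__main__":
    tasks = parse_records(SAMPLE)
    summary = Counter((t.get("name-label"), t.get("status")) for t in tasks)
    for (label, status), total in summary.items():
        print(f"{total} x {label} ({status})")
```

The same parser should work on output piped from the command, since it only relies on the `key: value` layout.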
                      
                      • F Offline
                        fred974 @fred974
                        last edited by

                        @olivierlambert I got all the systems back up and running now with HA disabled. I think I need HA enabled to get my XOSTOR SR working again, but I am not happy about not understanding what happened today, as XCP-ng went wrong out of the blue. Could you please tell me what I need to do to investigate the source of the issue?

                        • AtaxyaNetworkA Offline
                          AtaxyaNetwork Ambassador @fred974
                          last edited by

                          @fred974 You can check whether /var/crash contains anything, and look at /var/log/xensource.log and /var/log/SMlog.
                          A dmesg might help as well.
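When going through those logs, it is usually quickest to cut them down to the minutes around the incident. Here is a small sketch of that idea, assuming traditional syslog timestamps (`Apr 27 10:57:41`) and an English/C locale for month names. Syslog lines carry no year, so `YEAR` below is only a placeholder needed to build comparable timestamps; `in_window` is a hypothetical helper name.

```python
from datetime import datetime

# Placeholder only: the log lines do not record a year.
YEAR = 2000

# Lines copied from the SMlog excerpts in this thread.
SAMPLE = """\
Apr 27 10:57:41 uk SMGC: [26291] GC process exiting, no work left
Apr 27 11:02:22 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
Apr 27 11:56:41 uk SM: [9991] Raising exception [150, Failed to initialize XMLRPC connection]
"""


def in_window(line, start, end, year):
    """True if the line's syslog timestamp falls within [start, end]."""
    try:
        # The first 15 characters of a traditional syslog line hold the timestamp.
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        # Not a timestamped line (for example a wrapped continuation line).
        return False
    return start <= stamp <= end


if __name__ == "__main__":
    start = datetime(YEAR, 4, 27, 10, 55)
    end = datetime(YEAR, 4, 27, 11, 5)
    for line in SAMPLE.splitlines():
        if in_window(line, start, end, YEAR):
            print(line)
```

The same filter can be pointed at `/var/log/xensource.log`, provided its lines start with the same timestamp format.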

                          • F Offline
                            fred974 @AtaxyaNetwork
                            last edited by

                            @AtaxyaNetwork The crash happened at 11am; here are the relevant extracts for that timeframe.
                            /var/crash is empty.

                            more /var/log/SMlog -f

                            Apr 27 10:57:41 uk SM: [26334] sr_update {'sr_uuid': 'a20ee08c-40d0-9818-084f-282bbca1f217', 'subtask_of': 'DummyRef:|129646ab-9048-4d66-b873-789ffd07fb00|SR.stat', 'args': [], 'host_ref': 'OpaqueRef:359a920d-7bb1-4088-8b3e-42254f111f51', 'session_ref': 'OpaqueRef:69e52662-3118-4cf4-8b03-9741dbf3b312', 'device_config': {'group-name': 'linstor_group/thin_device', 'redundancy': '3', 'hosts': 'uk.dc1.xcp-ng-hyper1,uk.dc1.xcp-ng-hyper2,uk.dc1.xcp-ng-hyper3,uk.dc1.xcp-ng-hyper4', 'SRmaster': 'true', 'provisioning': 'thin'}, 'command': 'sr_update', 'sr_ref': 'OpaqueRef:f62acb08-116b-42e4-90df-e7d2153ed610', 'local_cache_sr': '28b8eb58-a6a2-c2fa-ad1e-b339b531330f'}
                            Apr 27 10:57:41 uk SM: [25812]   pread SUCCESS
                            Apr 27 10:57:41 uk SM: [25812] lock: released /var/lock/sm/.nil/lvm
                            Apr 27 10:57:41 uk SM: [25812] Updating metadata : {'objtype': 'sr', 'name_description': 'iSCSI Storage on TrueNAS Core - HDD', 'name_label': 'TrueStoreHDD_iSCSI'}
                            Apr 27 10:57:41 uk SM: [25812] entering updateSR
                            Apr 27 10:57:41 uk SM: [25812] lock: released /var/lock/sm/f7d16827-19e0-c57d-a720-c7fba180d4af/sr
                            Apr 27 10:57:41 uk SMGC: [26291] GC process exiting, no work left
                            Apr 27 10:57:41 uk SM: [26291] lock: released /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/gc_active
                            Apr 27 10:57:41 uk SMGC: [26291] In cleanup
                            Apr 27 10:57:41 uk SMGC: [26291] SR a20e ('XOSTOR') (23 VDIs in 16 VHD trees): no changes
                            Apr 27 10:57:41 uk SMGC: [26291] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
                            Apr 27 10:57:41 uk SMGC: [26291]          ***********************
                            Apr 27 10:57:41 uk SMGC: [26291]          *  E X C E P T I O N  *
                            Apr 27 10:57:41 uk SMGC: [26291]          ***********************
                            Apr 27 10:57:41 uk SMGC: [26291] gc: EXCEPTION <class 'XenAPI.Failure'>, ['UUID_INVALID', 'VDI', 'DELETED_267dfbbd-bc85-4f61-92ad-0fb2703fdd49']
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 3413, in gc
                            Apr 27 10:57:41 uk SMGC: [26291]     _gc(None, srUuid, dryRun)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 3298, in _gc
                            Apr 27 10:57:41 uk SMGC: [26291]     _gcLoop(sr, dryRun)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 3209, in _gcLoop
                            Apr 27 10:57:41 uk SMGC: [26291]     if not sr.hasWork():
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 1652, in hasWork
                            Apr 27 10:57:41 uk SMGC: [26291]     if self.findLeafCoalesceable():
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 1734, in findLeafCoalesceable
                            Apr 27 10:57:41 uk SMGC: [26291]     self.gatherLeafCoalesceable(candidates)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 1766, in gatherLeafCoalesceable
                            Apr 27 10:57:41 uk SMGC: [26291]     if vdi.getConfig(vdi.DB_ONBOOT) == vdi.ONBOOT_RESET:
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 531, in getConfig
                            Apr 27 10:57:41 uk SMGC: [26291]     config = self.sr.xapi.getConfigVDI(self, key)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 385, in getConfigVDI
                            Apr 27 10:57:41 uk SMGC: [26291]     cfg = self.session.xenapi.VDI.get_on_boot(vdi.getRef())
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 527, in getRef
                            Apr 27 10:57:41 uk SMGC: [26291]     self._vdiRef = self.sr.xapi.getRefVDI(self)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 356, in getRefVDI
                            Apr 27 10:57:41 uk SMGC: [26291]     return self._getRefVDI(vdi.uuid)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/opt/xensource/sm/cleanup.py", line 353, in _getRefVDI
                            Apr 27 10:57:41 uk SMGC: [26291]     return self.session.xenapi.VDI.get_by_uuid(uuid)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/usr/lib/python2.7/site-packages/XenAPI.py", line 264, in __call__
                            Apr 27 10:57:41 uk SMGC: [26291]     return self.__send(self.__name, args)
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/usr/lib/python2.7/site-packages/XenAPI.py", line 160, in xenapi_request
                            Apr 27 10:57:41 uk SMGC: [26291]     result = _parse_result(getattr(self, methodname)(*full_params))
                            Apr 27 10:57:41 uk SMGC: [26291]   File "/usr/lib/python2.7/site-packages/XenAPI.py", line 238, in _parse_result
                            Apr 27 10:57:41 uk SMGC: [26291]     raise Failure(result['ErrorDescription'])
                            Apr 27 10:57:41 uk SMGC: [26291]
                            Apr 27 10:57:41 uk SMGC: [26291] *~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*
                            Apr 27 10:57:41 uk SMGC: [26291] * * * * * SR a20ee08c-40d0-9818-084f-282bbca1f217: ERROR
                            Apr 27 10:57:41 uk SMGC: [26291]
                            Apr 27 10:57:41 uk SM: [26334] Failed to join node(s): set([u'uk.dc1.xcp-ng-hyper3'])
                            Apr 27 10:57:41 uk SM: [26334] Synchronize metadata...
                            Apr 27 10:57:41 uk SM: [26334] LinstorSR.update for a20ee08c-40d0-9818-084f-282bbca1f217
                            Apr 27 11:02:20 uk SM: [2783] lock: opening lock file /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/sr
                            Apr 27 11:02:20 uk SM: [2783] lock: acquired /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/sr
                            Apr 27 11:02:22 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:24 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:25 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:26 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:28 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:02:29 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:30 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:32 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:02:32 uk SM: [2783] Connecting from config to LINSTOR controller using: 172.16.10.48
                            Apr 27 11:02:36 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:37 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:39 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:02:40 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:41 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:43 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:02:44 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:45 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:45 uk SM: [2783] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:02:45 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:46 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:47 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:48 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:49 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:51 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:52 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:54 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:02:55 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:55 uk SM: [2783] Connecting from config to LINSTOR controller using: 172.16.10.49
                            Apr 27 11:02:55 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:56 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:02:59 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:00 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:01 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:03 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:04 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:04 uk SM: [2783] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:03:04 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:06 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:09 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:10 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:11 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:14 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:14 uk SM: [2783] Connecting from config to LINSTOR controller using: 172.16.10.47
                            Apr 27 11:03:15 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:16 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:18 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:19 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:21 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:22 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:24 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:25 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:25 uk SM: [2783] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:03:25 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:26 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:29 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:30 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:31 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:33 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:34 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:34 uk SM: [2783] Connecting from config to LINSTOR controller using: 172.16.10.46
                            Apr 27 11:03:34 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:36 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:39 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:40 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:41 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:42 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:43 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:44 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:44 uk SM: [2783] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:03:44 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:45 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:46 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:48 uk SM: message repeated 2 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:49 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:50 uk SM: [2783] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:03:54 uk SM: message repeated 3 times: [ [2783] Raising exception [150, Failed to initialize XMLRPC connection]]
                            Apr 27 11:03:54 uk SM: [2783] Raising exception [47, The SR is not available [opterr=No valid controller URI to attach/detach from config]]
                            Apr 27 11:03:54 uk SM: [2783] lock: released /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/sr
                            Apr 27 11:03:54 uk SM: [2783] ***** generic exception: vdi_attach_from_config: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=No valid controller URI to attach/detach from config]
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
                            Apr 27 11:03:54 uk SM: [2783]     return self._run_locked(sr)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/SRCommand.py", line 153, in _run_locked
                            Apr 27 11:03:54 uk SM: [2783]     target = sr.vdi(self.vdi_uuid)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/LinstorSR", line 634, in wrap
                            Apr 27 11:03:54 uk SM: [2783]     return load(self, *args, **kwargs)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/LinstorSR", line 504, in load
                            Apr 27 11:03:54 uk SM: [2783]     opterr='No valid controller URI to attach/detach from config'
                            Apr 27 11:03:54 uk SM: [2783]
                            Apr 27 11:03:54 uk SM: [2783] ***** LINSTOR resources on XCP-ng: EXCEPTION <class 'SR.SROSError'>, The SR is not available [opterr=No valid controller URI to attach/detach from config]
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/SRCommand.py", line 378, in run
                            Apr 27 11:03:54 uk SM: [2783]     ret = cmd.run(sr)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/SRCommand.py", line 110, in run
                            Apr 27 11:03:54 uk SM: [2783]     return self._run_locked(sr)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/SRCommand.py", line 153, in _run_locked
                            Apr 27 11:03:54 uk SM: [2783]     target = sr.vdi(self.vdi_uuid)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/LinstorSR", line 634, in wrap
                            Apr 27 11:03:54 uk SM: [2783]     return load(self, *args, **kwargs)
                            Apr 27 11:03:54 uk SM: [2783]   File "/opt/xensource/sm/LinstorSR", line 504, in load
                            Apr 27 11:03:54 uk SM: [2783]     opterr='No valid controller URI to attach/detach from config'
                            Apr 27 11:03:54 uk SM: [2783]
                            Apr 27 11:03:59 uk SM: [4037] Warning: vdi_[de]activate present for dummy
                            Apr 27 11:04:00 uk SM: [4164] lock: opening lock file /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/sr
                            Apr 27 11:04:00 uk SM: [4164] lock: acquired /var/lock/sm/a20ee08c-40d0-9818-084f-282bbca1f217/sr
                            Apr 27 11:04:00 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:01 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:02 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:03 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:04 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:05 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:06 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:07 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:09 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:10 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:10 uk SM: [4164] Connecting from config to LINSTOR controller using: 172.16.10.48
                            Apr 27 11:04:10 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:11 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:12 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:13 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:14 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:15 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:16 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:17 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:18 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:19 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:19 uk SM: [4164] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:04:19 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:20 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:21 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:22 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:24 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:25 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:26 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:27 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:28 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:29 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:29 uk SM: [4164] Connecting from config to LINSTOR controller using: 172.16.10.49
                            Apr 27 11:04:29 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:30 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:31 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:32 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:33 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:34 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:35 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:36 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:37 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:39 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:39 uk SM: [4164] Got exception: Unable to find controller uri.... Retry number: 0
                            Apr 27 11:04:39 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:40 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:41 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]
                            Apr 27 11:04:42 uk SM: [4164] Raising exception [150, Failed to initialize XMLRPC connection]                     
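With a log this repetitive, a short summary can be easier to read than the raw tail. The following is a minimal sketch, assuming the log lives at /var/log/SMlog as shown in this thread; the function name `summarise_smlog` is made up for illustration and is not part of XCP-ng.

```shell
# Summarise an SMlog-style file: count the repeated XMLRPC failures and
# list the LINSTOR controller addresses the SM driver tried to reach.
# The default path comes from the thread; pass another file as $1.
# Note: grep -c counts lines, so a syslog "message repeated N times"
# line is counted once, not N times.
summarise_smlog() {
    log="${1:-/var/log/SMlog}"
    printf 'XMLRPC failures: %s\n' \
        "$(grep -c 'Failed to initialize XMLRPC connection' "$log")"
    printf 'Controller lookups failed: %s\n' \
        "$(grep -c 'Unable to find controller uri' "$log")"
    printf 'Controllers tried:\n'
    # Keep only the address that follows the "using:" marker, de-duplicated.
    sed -n 's/.*Connecting from config to LINSTOR controller using: //p' "$log" | sort -u
}
```

Calling `summarise_smlog` with no argument reads the default path; pass a copy of the log to analyse it on another machine.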
                            
                            • F Offline
                              fred974 @fred974
                              last edited by

                              /var/log/xensource.log

                              https://pastebin.com/NUqGsSk6

                              I am not sure what to look for, so I hope this is right.

                              • F Offline
                                fred974 @fred974
                                last edited by

                                Hope someone can help me understand what the issue is

                                • olivierlambertO Offline
                                  olivierlambert Vates 🪐 Co-Founder CEO
                                  last edited by olivierlambert

                                  Ronan is on vacation now, but he'll take a look when he's back 🙂 (maybe tomorrow, Monday I'm pretty sure)

                                  • F Offline
                                    fred974 @olivierlambert
                                    last edited by

                                    @olivierlambert Thank you very much for letting me know

                                    • F Offline
                                      fred974 @fred974
                                      last edited by

                                      @ronan-a are you able to help me with this problem? I have added more information and log files to this thread too.

                                      • ronan-aR Offline
                                        ronan-a Vates 🪐 XCP-ng Team @fred974
                                        last edited by

                                        @fred974 Hi, well first, how many hosts do you have?
                                        We recommend using at least 3 hosts (4 is more robust). Also, what's the replication count on your LINSTOR SR?
                                        I ask because a problem on one host may have caused reboots across the whole pool and, eventually, the emergency state.

                                        Now, can you share the kern.log files from each host? And please run this command on each machine:

                                        drbdsetup status xcp-persistent-database
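
To gather this from every host in one pass, a loop over SSH might look like the sketch below. The function name `collect_drbd_status`, the `REMOTE_SH` override, and working root SSH access to each host are all assumptions for illustration; none of them are stated in the thread.

```shell
# Run `drbdsetup status <resource>` on each host given as an argument and
# label the output per host. REMOTE_SH is a hypothetical override so the
# remote runner can be swapped out; it defaults to ssh.
collect_drbd_status() {
    resource="xcp-persistent-database"
    for host in "$@"; do
        printf '=== %s ===\n' "$host"
        # A non-zero exit (for example "No such resource") is reported
        # and the loop carries on with the next host.
        ${REMOTE_SH:-ssh} "$host" drbdsetup status "$resource" </dev/null 2>&1 \
            || printf '(command failed on %s)\n' "$host"
    done
}
```

A call would then be `collect_drbd_status host1 host2 host3 host4`, with the placeholder names replaced by addresses that resolve in your pool.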
                                        
                                        • F Offline
                                          fred974 @ronan-a
                                          last edited by fred974

                                          @ronan-a said in Lost access to all servers:

                                          well first, how many hosts do you have?

                                          We have 4 hosts.
                                          Host1 was the original master (host2 is the new master), and I think the DRBD replication count is 3 (how can I double-check?).
                                          Host1:

                                          [21:15 uk ~]# drbdsetup status xcp-persistent-database
                                          xcp-persistent-database role:Secondary
                                            disk:Diskless quorum:no
                                            uk.dc1.xcp-ng-hyper2 connection:Connecting
                                            uk.dc1.xcp-ng-hyper3 connection:Connecting
                                            uk.dc1.xcp-ng-hyper4 connection:Connecting
                                          

                                          Host2, 3 and 4 show:

                                          [21:18 uk ~]# drbdsetup status xcp-persistent-database
                                          # No currently configured DRBD found.
                                          xcp-persistent-database: No such resource
                                          

                                          kern.log files host1
                                          host1_kern.log.txt

                                          kern.log files host2
                                          host2_kern.log.txt

                                          kern.log files host3
                                          host3_kern.log.txt

                                          kern.log files host4
                                          host4_kern.log.txt

                                          Our monitoring reported the first VM being down at 11am, which is reflected in the log file. We also take hourly snapshots, so I was wondering whether that could be the cause. I hope the files above help us understand the issue. Also, should I put host1 back as master?

                                          Thank you

                                          • ronan-aR Offline
                                            ronan-a Vates 🪐 XCP-ng Team @fred974
                                            last edited by

                                            @fred974 I'll take a look at the logs. Thanks. What's the output of lvs? If the database is not active, run: vgchange -ay linstor_group.
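
One way to see at a glance which volumes are inactive is to filter the `lv_attr` column, whose fifth character is `a` for an active logical volume. This is a rough sketch rather than an official procedure: the helper name `list_inactive_lvs` is made up, and the volume group name is the one mentioned in this thread.

```shell
# Read `lvs --noheadings -o lv_name,lv_attr <vg>` output on stdin and print
# the names of logical volumes whose state flag (5th lv_attr character)
# is not "a" (active).
list_inactive_lvs() {
    awk 'NF >= 2 && substr($2, 5, 1) != "a" { print $1 }'
}

# Intended use on a host (not executed here):
#   lvs --noheadings -o lv_name,lv_attr linstor_group | list_inactive_lvs
```

If the list is empty, every volume in the group is already active and `vgchange -ay` should have nothing left to do.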
