XCP-ng

    johannes

    @johannes

    Best posts made by johannes

    • RE: Lost access to all servers

      @ronan-a My pool is mostly working now, though I'm still seeing some quirks. I believe the problems I've seen thus far are due to the hardware instability of one of my hosts, as noted before. While frustrating, it has given me the opportunity to work through some host and SR recovery processes. That's the silver lining. 🙂

      The main issue I'm still seeing occurs when I attempt to start a specific stopped VM on a specific host (xcp-ng3). The VM fails to start and throws this error:

      SR_BACKEND_FAILURE_46(, The VDI is not available [opterr=VDI
      a9d53abc-f492-4826-b659-744ee87d4d94 not detached cleanly], )
      

      This VM will start without issue on the other two hosts. XO is not showing any other issues with this VM and is not showing any orphaned VDIs.

      I am also seeing these errors in the /var/log/linstor-controller directory:

      Host xcp-ng1:

      ERROR REPORT 648FDBE6-00000-000000
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:39:06
      Node:                               xcp-ng1
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           Exception
      Class name:                         SocketException
      Class canonical name:               java.net.SocketException
      Generated at:                       Method 'bind0', Source file 'Net.java, Unknown line number
      
      Error message:                      Protocol family unavailable
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          bind0                                    Y      sun.nio.ch.Net:unknown
          bind                                     N      sun.nio.ch.Net:461
          bind                                     N      sun.nio.ch.Net:453
          bind                                     N      sun.nio.ch.ServerSocketChannelImpl:222
          bind                                     N      sun.nio.ch.ServerSocketAdaptor:85
          bindToChannelAndAddress                  N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:107
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:64
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:215
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:195
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:186
          start                                    N      org.glassfish.grizzly.http.server.NetworkListener:711
          start                                    N      org.glassfish.grizzly.http.server.HttpServer:256
          start                                    N      com.linbit.linstor.api.rest.v1.config.GrizzlyHttpService:314
          initialize                               N      com.linbit.linstor.systemstarter.GrizzlyInitializer:88
          startSystemServices                      N      com.linbit.linstor.core.ApplicationLifecycleManager:87
          start                                    N      com.linbit.linstor.core.Controller:365
          main                                     N      com.linbit.linstor.core.Controller:613
      
      
      END OF ERROR REPORT.
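
A `SocketException` with the message "Protocol family unavailable" raised from `bind0` is typically what Java reports when it tries to bind an IPv6 socket on a host where IPv6 is disabled or unavailable. Whether that is the cause on these hosts is an assumption; the report itself does not say which address the controller tried to bind. A minimal Python sketch for checking whether an IPv6 listener can be bound at all on a given host (the function name `can_bind_ipv6` is purely illustrative):

```python
import socket


def can_bind_ipv6(port: int = 0) -> bool:
    """Return True if an IPv6 TCP socket can be created and bound on this host.

    Port 0 asks the kernel for any free port, so the check does not
    collide with a service that is already listening.
    """
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
            sock.bind(("::", port))  # wildcard IPv6 address
        return True
    except OSError:
        # Raised when the address family is unsupported or the bind fails.
        return False


if __name__ == "__main__":
    print("IPv6 bind possible:", can_bind_ipv6())
```

If this prints `False` on xcp-ng1 and xcp-ng3, the bind error above would be consistent with an IPv6 listener failing at controller startup rather than with the VDI problem.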
      

      Host xcp-ng2 (each truncated for post limit):

      ERROR REPORT 648E4932-00000-000012
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng1'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API UpdateFreeCapacity
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000013
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng3'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         ErrorCallbackNotImplemented
      Class canonical name:               reactor.core.Exceptions.ErrorCallbackNotImplemented
      Generated at:                       <UNKNOWN>
      
      Error message:                      com.linbit.linstor.transaction.TransactionException: Failed to start transaction
      
      Call backtrace:
      
          Method                                   Native Class:Line number
      
      ERROR REPORT 648E4932-00000-000014
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Auto-Quorum and -Tiebreaker after node create
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000015
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng2'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API ApplyPropsFromStlt
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000016
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000017
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         ErrorCallbackNotImplemented
      Class canonical name:               reactor.core.Exceptions.ErrorCallbackNotImplemented
      Generated at:                       <UNKNOWN>
      
      Error message:                      com.linbit.linstor.transaction.TransactionException: Failed to start transaction
      
      Error context:
          Exception thrown by connection observer when outbound connection established
      
      Call backtrace:
      
          Method                                   Native Class:Line number
      
      ERROR REPORT 648E4932-00000-000018
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng1'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         IllegalStateException
      Class canonical name:               java.lang.IllegalStateException
      Generated at:                       Method 'assertOpen', Source file 'BaseGenericObjectPool.java', Line #759
      
      Error message:                      Pool not open
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API NotifyDevMgrRunCompleted
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          assertOpen                               N      org.apache.commons.pool2.impl.BaseGenericObjectPool:759
      
      ERROR REPORT 648E4932-00000-000019
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Restore node
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000020
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000021
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000023
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000024
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      

      Host xcp-ng3:

      ERROR REPORT 64905A1B-00000-000000
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 09:37:36
      Node:                               xcp-ng3
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           Exception
      Class name:                         SocketException
      Class canonical name:               java.net.SocketException
      Generated at:                       Method 'bind0', Source file 'Net.java, Unknown line number
      
      Error message:                      Protocol family unavailable
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          bind0                                    Y      sun.nio.ch.Net:unknown
          bind                                     N      sun.nio.ch.Net:461
          bind                                     N      sun.nio.ch.Net:453
          bind                                     N      sun.nio.ch.ServerSocketChannelImpl:222
          bind                                     N      sun.nio.ch.ServerSocketAdaptor:85
          bindToChannelAndAddress                  N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:107
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:64
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:215
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:195
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:186
          start                                    N      org.glassfish.grizzly.http.server.NetworkListener:711
          start                                    N      org.glassfish.grizzly.http.server.HttpServer:256
          start                                    N      com.linbit.linstor.api.rest.v1.config.GrizzlyHttpService:314
          initialize                               N      com.linbit.linstor.systemstarter.GrizzlyInitializer:88
          startSystemServices                      N      com.linbit.linstor.core.ApplicationLifecycleManager:87
          start                                    N      com.linbit.linstor.core.Controller:365
          main                                     N      com.linbit.linstor.core.Controller:613
      
      
      END OF ERROR REPORT.
      

      Sorry, that was a lot of logs, but I don't know exactly what you'd be interested in seeing. It's possible that all of the logs from host xcp-ng2 are from the last time host xcp-ng1 melted down. I'm still trying to determine exactly when that happened overnight. 🙂

      Let me know if you need to see anything else! Thank you!
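
To narrow down when the failures happened, the reports can be condensed into a timeline. The following is a rough sketch that assumes only the layout visible in the reports quoted above (an `ERROR REPORT <id>` header followed by `Key:   value` lines); the function name `summarize_reports` and the choice of fields are illustrative, not part of any LINSTOR tooling.

```python
import re
import sys
from typing import Dict, List

# Header line that opens each report, e.g. "ERROR REPORT 648E4932-00000-000012".
# Anchored at line start, so "END OF ERROR REPORT." does not match.
REPORT_HEADER = re.compile(r"^[ \t]*ERROR REPORT (\S+)[ \t]*$", re.MULTILINE)

# A handful of "Key:    value" lines worth keeping for a timeline.
FIELD = re.compile(
    r"^[ \t]*(Error time|Node|Peer|Class name|Error message):[ \t]+(.*\S)[ \t]*$",
    re.MULTILINE,
)


def summarize_reports(text: str) -> List[Dict[str, str]]:
    """Split concatenated report text on its headers and extract key fields.

    Returns one dict per report, sorted by the 'Error time' string
    (the 'YYYY-MM-DD HH:MM:SS' format sorts correctly as plain text).
    """
    headers = list(REPORT_HEADER.finditer(text))
    reports = []
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        entry = {"id": header.group(1)}
        for key, value in FIELD.findall(text[header.end():end]):
            entry.setdefault(key, value)  # keep the first occurrence only
        reports.append(entry)
    return sorted(reports, key=lambda r: r.get("Error time", ""))


if __name__ == "__main__":
    # Usage: python summarize.py <report file> [<report file> ...]
    chunks = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            chunks.append(handle.read())
    for report in summarize_reports("\n".join(chunks)):
        print(
            report.get("Error time", "?"),
            report.get("Node", "?"),
            report.get("Class name", "?"),
            report.get("Error message", "?"),
            sep=" | ",
        )
```

Run against the report files from all three hosts, this prints one line per report in time order, which makes it easier to see whether the xcp-ng2 transaction failures cluster around a single moment.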

      posted in Compute
      johannes

    Latest posts made by johannes

    • RE: RTL8153 Compile

      @Andrew said in RTL8153 Compile:

      @etomm I compiled the code mostly as-is (with one change to support the different XCP kernel header version). I also added three udev rule changes. One is included with the driver code. One stops the XCP scripts from changing USB ethernet names. And one supports static USB ethernet names and renames unknown USB ethernet devices to protect the system from conflicts.

      I now have it working correctly with multiple USB ethernet adapters and stable names that don't interfere with regular ethernet adapters. It still requires manually adding the MAC addresses and ethernet device names because XCP/Xen does not support USB ethernet devices.

      @Andrew would you be willing to share the udev rule changes that you made? Also, are you using 8.2 or 8.3 beta? Thank you!

      posted in Development
      johannes
    • RE: Lost access to all servers

      @ronan-a My pool is mostly working now, though I'm still seeing some quirks. I believe the problems I've seen thus far is due to the hardware instability of one of my hosts, as noted before. While frustrating, it has given me the opportunity to work through some host and SR recovery processes. That's the silver lining. šŸ™‚

      The main issue that I'm still seeing is when I attempt to load a specific stopped VM on a specific host (xcp-ng3). The VM will fail to start and throw this error:

      SR_BACKEND_FAILURE_46(, The VDI is not available [opterr=VDI
      a9d53abc-f492-4826-b659-744ee87d4d94 not detached cleanly], )
      

      This VM will start without issue on the other two hosts. XO is not showing any other issues with this VM and is not showing any orphaned VDIs.

      I am also seeing these errors in the /var/log/linstor-controller directory:

      Host xcp-ng1:

      ERROR REPORT 648FDBE6-00000-000000
      
      ============================================================
      
      Application:                        LINBITĀ® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:39:06
      Node:                               xcp-ng1
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           Exception
      Class name:                         SocketException
      Class canonical name:               java.net.SocketException
      Generated at:                       Method 'bind0', Source file 'Net.java, Unknown line number
      
      Error message:                      Protocol family unavailable
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          bind0                                    Y      sun.nio.ch.Net:unknown
          bind                                     N      sun.nio.ch.Net:461
          bind                                     N      sun.nio.ch.Net:453
          bind                                     N      sun.nio.ch.ServerSocketChannelImpl:222
          bind                                     N      sun.nio.ch.ServerSocketAdaptor:85
          bindToChannelAndAddress                  N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:107
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:64
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:215
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:195
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:186
          start                                    N      org.glassfish.grizzly.http.server.NetworkListener:711
          start                                    N      org.glassfish.grizzly.http.server.HttpServer:256
          start                                    N      com.linbit.linstor.api.rest.v1.config.GrizzlyHttpService:314
          initialize                               N      com.linbit.linstor.systemstarter.GrizzlyInitializer:88
          startSystemServices                      N      com.linbit.linstor.core.ApplicationLifecycleManager:87
          start                                    N      com.linbit.linstor.core.Controller:365
          main                                     N      com.linbit.linstor.core.Controller:613
      
      
      END OF ERROR REPORT.
      

      Host xcp-ng2 (each truncated for post limit):

      ERROR REPORT 648E4932-00000-000012
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng1'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API UpdateFreeCapacity
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000013
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng3'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         ErrorCallbackNotImplemented
      Class canonical name:               reactor.core.Exceptions.ErrorCallbackNotImplemented
      Generated at:                       <UNKNOWN>
      
      Error message:                      com.linbit.linstor.transaction.TransactionException: Failed to start transaction
      
      Call backtrace:
      
          Method                                   Native Class:Line number
      
      ERROR REPORT 648E4932-00000-000014
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Auto-Quorum and -Tiebreaker after node create
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000015
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng2'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API ApplyPropsFromStlt
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000016
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000017
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         ErrorCallbackNotImplemented
      Class canonical name:               reactor.core.Exceptions.ErrorCallbackNotImplemented
      Generated at:                       <UNKNOWN>
      
      Error message:                      com.linbit.linstor.transaction.TransactionException: Failed to start transaction
      
      Error context:
          Exception thrown by connection observer when outbound connection established
      
      Call backtrace:
      
          Method                                   Native Class:Line number
      
      ERROR REPORT 648E4932-00000-000018
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               Node: 'xcp-ng1'
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         IllegalStateException
      Class canonical name:               java.lang.IllegalStateException
      Generated at:                       Method 'assertOpen', Source file 'BaseGenericObjectPool.java', Line #759
      
      Error message:                      Pool not open
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Execute single-stage API NotifyDevMgrRunCompleted
              |_ checkpoint ⇢ Fallback error handling wrapper
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          assertOpen                               N      org.apache.commons.pool2.impl.BaseGenericObjectPool:759
      
      ERROR REPORT 648E4932-00000-000019
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Restore node
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000020
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000021
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000023
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      Peer:                               RestClient(127.0.0.1; 'PythonLinstor/1.15.1 (API1.0.4): Client 1.15.1')
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Error context:
          Modification of node 'xcp-ng1' failed due to an unknown exception.
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Reconnect node(s)
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      
      ERROR REPORT 648E4932-00000-000024
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 00:38:51
      Node:                               xcp-ng2
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           RuntimeException
      Class name:                         TransactionException
      Class canonical name:               com.linbit.linstor.transaction.TransactionException
      Generated at:                       Method 'startTransaction', Source file 'ControllerSQLTransactionMgrGenerator.java', Line #32
      
      Error message:                      Failed to start transaction
      
      Asynchronous stage backtrace:
      
          Error has been observed at the following site(s):
              |_ checkpoint ⇢ Fetch thin capacity info
          Stack trace:
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          startTransaction                         N      com.linbit.linstor.transaction.manager.ControllerSQLTransactionMgrGenerator:32
      

      Host xcp-ng3:

      ERROR REPORT 64905A1B-00000-000000
      
      ============================================================
      
      Application:                        LINBIT® LINSTOR
      Module:                             Controller
      Version:                            1.21.1
      Build ID:                           a677db312062add13e9b230b8b902d43a69caf13
      Build time:                         2023-03-22T14:05:41+00:00
      Error time:                         2023-06-19 09:37:36
      Node:                               xcp-ng3
      
      ============================================================
      
      Reported error:
      ===============
      
      Category:                           Exception
      Class name:                         SocketException
      Class canonical name:               java.net.SocketException
      Generated at:                       Method 'bind0', Source file 'Net.java, Unknown line number
      
      Error message:                      Protocol family unavailable
      
      Call backtrace:
      
          Method                                   Native Class:Line number
          bind0                                    Y      sun.nio.ch.Net:unknown
          bind                                     N      sun.nio.ch.Net:461
          bind                                     N      sun.nio.ch.Net:453
          bind                                     N      sun.nio.ch.ServerSocketChannelImpl:222
          bind                                     N      sun.nio.ch.ServerSocketAdaptor:85
          bindToChannelAndAddress                  N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:107
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOBindingHandler:64
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:215
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:195
          bind                                     N      org.glassfish.grizzly.nio.transport.TCPNIOTransport:186
          start                                    N      org.glassfish.grizzly.http.server.NetworkListener:711
          start                                    N      org.glassfish.grizzly.http.server.HttpServer:256
          start                                    N      com.linbit.linstor.api.rest.v1.config.GrizzlyHttpService:314
          initialize                               N      com.linbit.linstor.systemstarter.GrizzlyInitializer:88
          startSystemServices                      N      com.linbit.linstor.core.ApplicationLifecycleManager:87
          start                                    N      com.linbit.linstor.core.Controller:365
          main                                     N      com.linbit.linstor.core.Controller:613
      
      
      END OF ERROR REPORT.
      

      Sorry, that was a lot of logs, but I don't know exactly what you'd be interested in seeing. It's possible that all of the logs from host xcp-ng2 are from the last time host xcp-ng1 melted down. I'm still trying to determine exactly when that happened overnight. 🙂
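Since most of the reports above repeat the same failure, a short script can collapse them into counts. This is only a rough sketch: it assumes the report text has been copied into a string, and the function names (`summarize_reports`, `_field`) are made up for illustration and are not part of LINSTOR. It keys each report on the `Class name`, `Error message`, and first `checkpoint` lines as they appear in the reports above.

```python
import re
from collections import Counter

# Header line that starts each report, e.g. "ERROR REPORT 648E4932-00000-000012".
# "END OF ERROR REPORT." does not match because the line must start with "ERROR".
REPORT_RE = re.compile(r"^[ \t]*ERROR REPORT (\S+)[ \t]*$", re.MULTILINE)


def _field(body, label):
    """Return the value of a 'Label:   value' line, or None if it is absent."""
    pattern = rf"^[ \t]*{re.escape(label)}:[ \t]+(.+?)[ \t]*$"
    match = re.search(pattern, body, re.MULTILINE)
    return match.group(1) if match else None


def summarize_reports(text):
    """Count reports by (class name, error message, first checkpoint)."""
    counts = Counter()
    # re.split with one capture group yields:
    # [preamble, id1, body1, id2, body2, ...]
    parts = REPORT_RE.split(text)
    for _report_id, body in zip(parts[1::2], parts[2::2]):
        checkpoint = re.search(r"checkpoint[ \t]+\S+[ \t]+(.+)", body)
        key = (
            _field(body, "Class name"),
            _field(body, "Error message"),
            checkpoint.group(1).strip() if checkpoint else None,
        )
        counts[key] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): cat reports.txt | python summarize.py
    for key, n in summarize_reports(sys.stdin.read()).most_common():
        print(n, key)
```

Grouping this way should make it easier to see which reports are the same `Failed to start transaction` error and which are something different, such as the `SocketException` on bind.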

      Let me know if you need to see anything else! Thank you!

      posted in Compute
      johannes
    • RE: Lost access to all servers

      To make matters more interesting, my primary host (xcp-ng1) running XO crashed due to overheating. (I just love the fun curveballs my lab environment throws me! 😆) When I powered that machine back up, I received the same "Unable to connect to linstor://localhost:3370" error as above on xcp-ng1 and of course the XOSTOR SR was not operating with nodes 1 & 3 offline.

      After some cursory poking around, I rebooted xcp-ng1 and lo and behold... my XOSTOR SR was fully restored for all three nodes!

      Part of this is my ignorance on how the underlying elements of XOSTOR work, so I'm going to throw out some ideas here that may or may not be true. Feel free to correct / educate me!

      When I forced a reboot of xcp-ng3, the XOSTOR (LINSTOR?) controller did not automatically rejoin the xcp-ng3 node into the SR for whatever reason when it came back online. Then when xcp-ng1 powered off unexpectedly, the controller (which I believe was running on xcp-ng1 - again I'm not fully sure how this functionality works when distributed across the cluster) again did not resume when that machine powered back on. After a controlled, graceful reboot of xcp-ng1, the controller started and synced back up with all 3 nodes. Does this make sense or seem plausible? 🙂 Any logs that would be valuable?

      Edit: One additional detail that I left out: when xcp-ng1 came back online, running 'drbdadm status' resulted in this message: "No currently configured DRBD found." After the graceful reboot, all was working properly again.
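For anyone wanting to check quickly whether anything is answering on the controller port, here is a minimal sketch. It assumes only what the error message shows (host `localhost`, TCP port 3370); the function name `controller_reachable` is made up for illustration. It only tests that a TCP connection can be opened, not that the LINSTOR controller behind it is healthy.

```python
import socket


def controller_reachable(host="localhost", port=3370, timeout=2.0):
    """Try to open a TCP connection; return (ok, detail)."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, timeouts, and address errors.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    ok, detail = controller_reachable()
    print("reachable" if ok else "not reachable", "-", detail)
```

Running it on each host in turn would show which node, if any, currently has something listening on that port.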

      posted in Compute
      johannes
    • RE: Lost access to all servers

      @ronan-a

      I may be having a similar issue to the one you helped @fred974 with last month. One of the three hosts in my lab environment (xcp-ng3) dropped out of the SR in the past day or so. I believe it may have coincided with a failed live migration to that host and/or a forced reboot of the host after that failed migration. Here's what I am seeing on the affected host:

      [09:14 xcp-ng3 ~]# linstor node list
      Error: Unable to connect to linstor://localhost:3370: [Errno 99] Cannot assign requested address
      [09:14 xcp-ng3 ~]# drbdadm status
      xcp-persistent-database role:Secondary
        disk:UpToDate quorum:no
        xcp-ng1 connection:StandAlone
        xcp-ng2 connection:StandAlone
      

      The other two hosts, xcp-ng1 and xcp-ng2, are still operating without issue. XO sees xcp-ng3 and does not throw any errors unless I attempt any action that utilizes the SR (makes sense). It seems apparent that the linstor controller is not running, as any linstor command results in the connection error above. Thoughts? Any other logs you need?
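To spot this state without reading the output by eye, the `drbdadm status` text above can be parsed with a few lines of Python. This is only a sketch under some assumptions: it handles the simple layout shown above (a resource line, indented `key:value` lines, and indented peer lines), the function names `parse_drbd_status` and `find_problems` are made up for illustration, and it treats a peer line with no `connection:` field as connected.

```python
import textwrap


def parse_drbd_status(text):
    """Parse `drbdadm status` text into {resource: {"attrs": {}, "peers": {}}}."""
    resources = {}
    res = None     # resource currently being parsed
    target = None  # dict that a following key:value-only line belongs to
    for line in textwrap.dedent(text).splitlines():
        if not line.strip():
            continue
        tokens = line.split()
        attrs = dict(t.split(":", 1) for t in tokens if ":" in t)
        if not line[0].isspace():
            # Resource header, e.g. "xcp-persistent-database role:Secondary"
            res = {"attrs": attrs, "peers": {}}
            resources[tokens[0]] = res
            target = res["attrs"]
        elif res is not None and ":" not in tokens[0]:
            # Peer line, e.g. "xcp-ng1 connection:StandAlone"
            res["peers"][tokens[0]] = attrs
            target = attrs
        elif target is not None:
            # Continuation line, e.g. "disk:UpToDate quorum:no"
            target.update(attrs)
    return resources


def find_problems(resources):
    """List lost quorum and peers whose connection is not 'Connected'."""
    problems = []
    for name, res in resources.items():
        if res["attrs"].get("quorum") == "no":
            problems.append(f"{name}: quorum lost")
        for peer, attrs in res["peers"].items():
            state = attrs.get("connection")
            if state is not None and state != "Connected":
                problems.append(f"{name}: peer {peer} is {state}")
    return problems


SAMPLE = """\
xcp-persistent-database role:Secondary
  disk:UpToDate quorum:no
  xcp-ng1 connection:StandAlone
  xcp-ng2 connection:StandAlone
"""

if __name__ == "__main__":
    for problem in find_problems(parse_drbd_status(SAMPLE)):
        print(problem)
```

On the sample above it reports the lost quorum and both StandAlone peers. Multi-volume resources print extra lines that this sketch does not handle specially.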

      FWIW, I'd normally just wipe the host and reinstall, but I wanted to bring it to your attention in case there's any value to the project. 😁

      posted in Compute
      johannes