XCP-ng

    darabontors

    @darabontors

    Reputation: 3
    Profile views: 6
    Posts: 35
    Followers: 0
    Following: 0

    Best posts made by darabontors

    • RE: Very scary host reboot issue

      @olivierlambert Just produced another reboot. I'm closing in on the way to replicate this issue.

      posted in XCP-ng
      darabontors
    • RE: Very scary host reboot issue
      [ 334371.865769]  ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      [ 334371.865787]   INFO: PGD 2250ed067 P4D 2250ed067 PUD 228c9f067 PMD 0
      [ 334371.865803]   WARN: Oops: 0000 [#1] SMP NOPTI
      [ 334371.865810]   WARN: CPU: 9 PID: 57 Comm: ksoftirqd/9 Tainted: G           O      4.19.0+1 #1
      [ 334371.865818]   WARN: Hardware name: Dell Inc. PowerEdge R720/0C4Y3R, BIOS 2.9.0 12/06/2019
      [ 334371.865832]   WARN: RIP: e030:skb_copy_ubufs+0x19c/0x5f0
      [ 334371.865839]   WARN: Code: 90 cc 00 00 00 48 03 90 d0 00 00 00 48 63 44 24 40 48 83 c0 03 48 c1 e0 04 48 01 d0 48 89 18 c7 40 08 00 00 00 00 44 89 78 0c <48> 8b 43 08 a8 01 0f 85 3f 04 00 00 48 8b 44 24 30 48 83 78 20 ff
      [ 334371.865858]   WARN: RSP: e02b:ffffc9004026b6f8 EFLAGS: 00010282
      [ 334371.865864]   WARN: RAX: ffff888099621ae0 RBX: 0000000000000000 RCX: 00000000000000c0
      [ 334371.865873]   WARN: RDX: ffff888099621ac0 RSI: ffff888099621ac0 RDI: ffffea00031da880
      [ 334371.865881]   WARN: RBP: 0000000000000000 R08: ffff888099621a00 R09: ffff8881f0d43e98
      [ 334371.865890]   WARN: R10: ffffc9004026b8b0 R11: 0000000000000000 R12: ffff888096e61c00
      [ 334371.865898]   WARN: R13: 0000000000000000 R14: ffff88822b867a80 R15: 0000000000000000
      [ 334371.865918]   WARN: FS:  0000000000000000(0000) GS:ffff88822d440000(0000) knlGS:0000000000000000
      [ 334371.865927]   WARN: CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 334371.865935]   WARN: CR2: 0000000000000008 CR3: 00000002281aa000 CR4: 0000000000040660
      [ 334371.865949]   WARN: Call Trace:
      [ 334371.865958]   WARN:  skb_clone+0x71/0xa0
      [ 334371.865968]   WARN:  do_execute_actions+0x4ec/0x1750 [openvswitch]
      [ 334371.865978]   WARN:  ? ovs_dp_process_packet+0x7d/0x110 [openvswitch]
      [ 334371.865988]   WARN:  ? ovs_vport_receive+0x6e/0xd0 [openvswitch]
      [ 334371.865997]   WARN:  ? arch_local_irq_restore+0x5/0x10
      [ 334371.866005]   WARN:  ? get_page_from_freelist+0xa4f/0xf00
      [ 334371.866012]   WARN:  ? arch_local_irq_restore+0x5/0x10
      [ 334371.866020]   WARN:  ? get_page_from_freelist+0xa4f/0xf00
      [ 334371.866031]   WARN:  ovs_execute_actions+0x47/0x120 [openvswitch]
      [ 334371.866040]   WARN:  ovs_dp_process_packet+0x7d/0x110 [openvswitch]
      [ 334371.866050]   WARN:  ? key_extract+0xa53/0xd60 [openvswitch]
      [ 334371.866058]   WARN:  ovs_vport_receive+0x6e/0xd0 [openvswitch]
      [ 334371.866066]   WARN:  ? __alloc_skb+0x4e/0x270
      [ 334371.866075]   WARN:  ? notify_remote_via_irq+0x4a/0x70
      [ 334371.866085]   WARN:  ? __raw_callee_save_xen_vcpu_stolen+0x11/0x20
      [ 334371.866091]   WARN:  ? __alloc_skb+0x76/0x270
      [ 334371.866100]   WARN:  ? arch_local_irq_restore+0x5/0x10
      [ 334371.866108]   WARN:  ? __slab_alloc.constprop.81+0x42/0x4e
      [ 334371.866114]   WARN:  ? __alloc_skb+0x4e/0x270
      [ 334371.866120]   WARN:  ? __kmalloc_track_caller+0x58/0x200
      [ 334371.866127]   WARN:  ? __slab_alloc.constprop.81+0x42/0x4e
      [ 334371.866136]   WARN:  ? __kmalloc_reserve.isra.48+0x29/0x70
      [ 334371.866146]   WARN:  netdev_frame_hook+0x105/0x180 [openvswitch]
      [ 334371.866154]   WARN:  __netif_receive_skb_core+0x211/0xb30
      [ 334371.866163]   WARN:  __netif_receive_skb_one_core+0x36/0x70
      [ 334371.866170]   WARN:  netif_receive_skb_internal+0x34/0xe0
      [ 334371.866179]   WARN:  xenvif_tx_action+0x55c/0x990
      [ 334371.866187]   WARN:  xenvif_poll+0x27/0x70
      [ 334371.866193]   WARN:  net_rx_action+0x2a5/0x3e0
      [ 334371.866200]   WARN:  __do_softirq+0xd1/0x28c
      [ 334371.866208]   WARN:  run_ksoftirqd+0x26/0x40
      [ 334371.866215]   WARN:  smpboot_thread_fn+0x10e/0x160
      [ 334371.866223]   WARN:  kthread+0xf8/0x130
      [ 334371.866229]   WARN:  ? sort_range+0x20/0x20
      [ 334371.866235]   WARN:  ? kthread_bind+0x10/0x10
      [ 334371.866242]   WARN:  ret_from_fork+0x35/0x40
      [ 334371.866250]   WARN: Modules linked in: tun bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat 8021q garp mrp stp llc dm_multipath ipt_REJECT nf_reject_ipv4 xt_tcpu$
      [ 334371.866374]   WARN:  scsi_mod efivarfs ipv6 crc_ccitt
      [ 334371.866384]   WARN: CR2: 0000000000000008
      [ 334371.866396]   WARN: ---[ end trace 8b74661a79be8268 ]---
      [ 334371.868712]   WARN: RIP: e030:skb_copy_ubufs+0x19c/0x5f0
      [ 334371.868721]   WARN: Code: 90 cc 00 00 00 48 03 90 d0 00 00 00 48 63 44 24 40 48 83 c0 03 48 c1 e0 04 48 01 d0 48 89 18 c7 40 08 00 00 00 00 44 89 78 0c <48> 8b 43 08 a8 01 0f 85 3f 04 00 00 48 8b 44 24 30 48 83 78 20 ff
      [ 334371.868740]   WARN: RSP: e02b:ffffc9004026b6f8 EFLAGS: 00010282
      [ 334371.868748]   WARN: RAX: ffff888099621ae0 RBX: 0000000000000000 RCX: 00000000000000c0
      [ 334371.868759]   WARN: RDX: ffff888099621ac0 RSI: ffff888099621ac0 RDI: ffffea00031da880
      [ 334371.868769]   WARN: RBP: 0000000000000000 R08: ffff888099621a00 R09: ffff8881f0d43e98
      [ 334371.868778]   WARN: R10: ffffc9004026b8b0 R11: 0000000000000000 R12: ffff888096e61c00
      [ 334371.868788]   WARN: R13: 0000000000000000 R14: ffff88822b867a80 R15: 0000000000000000
      [ 334371.868805]   WARN: FS:  0000000000000000(0000) GS:ffff88822d440000(0000) knlGS:0000000000000000
      [ 334371.868815]   WARN: CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 334371.868823]   WARN: CR2: 0000000000000008 CR3: 00000002281aa000 CR4: 0000000000040660
      [ 334371.868837]  EMERG: Kernel panic - not syncing: Fatal exception in interrupt
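
      To make a trace like the one above easier to compare across crashes, here is a minimal sketch that pulls out the faulting function (the RIP line) and the verified call frames with their modules. The names `SAMPLE` and `parse_oops` are made up for illustration, and the regular expressions assume the exact line layout shown in this log. Frames prefixed with `?` are skipped because the kernel marks them as unreliable stack entries.

```python
import re

# A few lines copied from the trace above (SAMPLE is a hypothetical name).
SAMPLE = """\
[ 334371.865832]   WARN: RIP: e030:skb_copy_ubufs+0x19c/0x5f0
[ 334371.865958]   WARN:  skb_clone+0x71/0xa0
[ 334371.865968]   WARN:  do_execute_actions+0x4ec/0x1750 [openvswitch]
[ 334371.865978]   WARN:  ? ovs_dp_process_packet+0x7d/0x110 [openvswitch]
[ 334371.866179]   WARN:  xenvif_tx_action+0x55c/0x990
"""

# Faulting instruction pointer, e.g. "RIP: e030:skb_copy_ubufs+0x19c/0x5f0".
RIP_RE = re.compile(r"RIP: \w+:(?P<func>[\w.]+)\+0x")

# Call-trace frame: optional "? " marker, symbol+offset/size, optional [module].
FRAME_RE = re.compile(
    r"WARN:\s+(?P<guess>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?\s*$"
)


def parse_oops(text):
    """Return (faulting_function, [(function, module), ...]) for verified frames."""
    rip = None
    frames = []
    for line in text.splitlines():
        m = RIP_RE.search(line)
        if m and rip is None:
            rip = m.group("func")
            continue
        m = FRAME_RE.search(line)
        if m and not m.group("guess"):
            # Frames without a [module] suffix belong to the core kernel image.
            frames.append((m.group("func"), m.group("mod") or "kernel"))
    return rip, frames


rip, frames = parse_oops(SAMPLE)
print("faulting function:", rip)
for func, mod in frames:
    print(f"  {func} ({mod})")
```

      On the sample lines this reports `skb_copy_ubufs` as the faulting function, reached through `skb_clone` from the `openvswitch` module on the `xenvif_tx_action` receive path, which matches how the trace reads by eye.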
      
      posted in XCP-ng
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      I found the problem.
      I am using OPNsense and forgot to disable TX checksum offloading. Interestingly, checksum offloading caused catastrophic network disruptions on a Realtek NIC but no noticeable performance hit on Intel NICs. This was an old host with a Realtek card; all my recent hosts have only Intel NICs, which is why I forgot about the whole offloading thing.
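
      For reference, a minimal sketch of one way TX checksum offloading is commonly turned off for a firewall VM on the XCP-ng side: setting `other-config:ethtool-tx` to `off` on the VM's VIF with `xe vif-param-set`. This is an assumption about which setting is meant here, since the post does not say whether the change was made in OPNsense or on the host. The helper name `disable_tx_offload_cmd` and the UUID are placeholders, and the code only builds the command rather than running it.

```python
import shlex


def disable_tx_offload_cmd(vif_uuid):
    """Build the xe command that sets ethtool-tx=off on a single VIF."""
    return [
        "xe",
        "vif-param-set",
        f"uuid={vif_uuid}",
        "other-config:ethtool-tx=off",
    ]


# Placeholder UUID for illustration only, not a real VIF.
cmd = disable_tx_offload_cmd("00000000-0000-0000-0000-000000000000")

# Print a copy-pasteable command line for the host's shell.
print(shlex.join(cmd))
```

      As far as I know, the setting only takes effect once the VIF is replugged or the VM is restarted.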

      Thanks for the tips.

      Best wishes to the whole community!

      posted in Management
      darabontors

    Latest posts made by darabontors

    • RE: Xen Orchestra cannot connect to XCP-ng Host

      Yes.

      posted in Management
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      @Danp
      There is a WireGuard site-to-site VPN set up. If ping works from inside the VM hosting Xen Orchestra, how can Xen Orchestra have no access?

      I am almost sure it is a certificate issue of some kind. I would like to generate a new certificate or somehow make Xen Orchestra ignore the certificate. I think XCP-ng Center ignores it by default, which is why it works from XCP-ng Center.

      What do you think?

      posted in Management
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      @Danp Thanks for responding.
      I don't use an HTTP proxy. I can ping the host from Xen Orchestra and Xen Orchestra from the host.

      I did get this error message in the logs:

      server.enable
      {
        "id": "XXXXXXXXXXXXX"
      }
      {
        "originalUrl": "https://X.X.X.X/jsonrpc",
        "url": "https://X.X.X.X/jsonrpc",
        "call": {
          "method": "session.login_with_password",
          "params": "* obfuscated *"
        },
        "message": "408 Request Timeout",
        "name": "Error",
        "stack": "Error: 408 Request Timeout
          at Object.assertSuccess (/opt/xen-orchestra/node_modules/http-request-plus/index.js:162:19)
          at httpRequestPlus (/opt/xen-orchestra/node_modules/http-request-plus/index.js:217:22)
          at file:///opt/xen-orchestra/packages/xen-api/transports/json-rpc.mjs:13:17"
      }

      I can connect via XCP-ng Center to the host, no problem. It's just Xen Orchestra that can't connect.
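
      A working ping only proves that ICMP gets through; the request in the log above goes over HTTPS (TCP port 443) and can still time out. A minimal sketch of a TCP-level check that could be run from the Xen Orchestra VM is below. The function name `tcp_port_open` is made up for illustration, and a successful connect is only a first hint, since it does not exercise TLS or larger payloads.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (X.X.X.X stands for the host's management IP, as in the log):
#   tcp_port_open("X.X.X.X", 443)
```

      If this returns False while ping works, the problem is likely somewhere on the TCP path rather than in the certificate.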

      posted in Management
      darabontors
    • Xen Orchestra cannot connect to XCP-ng Host

      Dear community,

      I have a strange connection problem. I have the following situation:
      I need to install XCP-ng with a DHCP-assigned IP address so that I can connect it to my Xen Orchestra. I can connect to the host with this DHCP IP address. After I finish setting up XCP-ng from Xen Orchestra, I need to give the host a new management IP: a static IP on a VLAN network.

      After the IP change, I could connect to the host with the new IP. After moving the host to a different location, there is suddenly an unspecified connection error when connecting to the host. The problem exists only between Xen Orchestra and the host. I can connect to the host with XCP-ng Center, no problem. All networking works as it should.

      Note that when I changed the IP of the host, I also changed the root password.

      I suspect it is a certificate issue. The certificate is the self-signed one that XCP-ng generated during installation.

      The host is not exposed to the public internet. I use a VPN to connect it to Xen Orchestra.

      I'm using Xen Orchestra from the sources.

      Please help me fix this issue. This is a remote host and I already reinstalled XCP-ng, but the issue came back.

      posted in Management
      darabontors
    • RE: Very scary host reboot issue

      @tuxen

      1. I have absolutely no idea how to do this in Windows. I looked for an MTU setting but couldn't find one.
      2. This is not a viable workaround for me, though it might help pin the issue on the Xen PV driver. Maybe I'll do some more testing on spare hardware.
      3. I read this, but I don't know how to test it. I didn't have any MTUs set manually, so I don't know what the values were before the update.

      What most definitely fixed the issue for me was using PCIe passthrough for the WAN interface. I used a 10 GbE NIC. It uses the ix driver (ix0), so I don't know if this is related. Somehow PPPoE + WG + a Windows client on the virtual interface (Xen PV driver) in OPNsense produces this issue.
      At the moment I am happy with this mitigation.

      I'm spread a little thin on free time at the moment. Anyone care to test this further?

      posted in XCP-ng
      darabontors
    • RE: Very scary host reboot issue

      @Andrew That makes sense. I think I'll do just this. In the meantime I'll try to replicate the phenomenon on test hardware. I really need a permanent fix for this.

      posted in XCP-ng
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert I'm thinking of a quick workaround. What if I use PCIe passthrough for the LAN and WAN interfaces, physically connect the LAN port to another, non-passthrough port of the server, and use that port to interface with my other VMs via OVS? Does that make any sense? Does it seem viable as a mitigation for this issue?

      posted in XCP-ng
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert said in Very scary host reboot issue:

      FreeBSD PV driver inside OPNsense or Pfsense.

      Who is maintaining the FreeBSD PV drivers?

      posted in XCP-ng
      darabontors
    • RE: Very scary host reboot issue

      I found the MTU parameter. This time it was 1420 on both OPNsense WG interface and in Windows (client side). I was happy for about 5 minutes as I wasn't able to reproduce the crash, but then it happened again. My "favorite" way to trigger it is by pausing the file transfer, waiting for a couple of minutes and then resuming it. The transfer's MB/s jumps up like crazy in Windows, then freezes until it gets in sync with the real progress of the transfer. After two tries of pausing and resuming, the crash happened.
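
      For context on the 1420 value, here is a small worked calculation of the tunnel MTU that fits inside a given link MTU. The header sizes are the standard ones (IPv4 20, IPv6 40, UDP 8, WireGuard data header plus authentication tag 32, PPPoE plus PPP 8); the function and constant names are made up for illustration. It is only arithmetic and does not claim that MTU is the cause of the crash.

```python
WG_OVERHEAD = 32     # WireGuard data message header (16) + Poly1305 tag (16)
UDP_HEADER = 8
IPV4_HEADER = 20
IPV6_HEADER = 40
PPPOE_OVERHEAD = 8   # PPPoE header (6) + PPP protocol field (2)


def wireguard_mtu(link_mtu, outer_ipv6=True):
    """Largest tunnel MTU that fits in link_mtu without fragmentation."""
    ip_header = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return link_mtu - ip_header - UDP_HEADER - WG_OVERHEAD


ethernet_mtu = 1500
pppoe_mtu = ethernet_mtu - PPPOE_OVERHEAD  # 1492 on a typical PPPoE WAN

print(wireguard_mtu(ethernet_mtu))                 # plain Ethernet, IPv6 outer
print(wireguard_mtu(pppoe_mtu))                    # PPPoE WAN, IPv6 outer
print(wireguard_mtu(pppoe_mtu, outer_ipv6=False))  # PPPoE WAN, IPv4 outer
```

      On plain Ethernet with the worst-case IPv6 outer header this gives 1420, the value mentioned above. Behind a PPPoE WAN the same calculation gives 1412 (or 1432 if the tunnel only ever runs over IPv4), so 1420 can still exceed the path MTU in that setup.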

      @olivierlambert I have used this setup on my own infrastructure and my clients' for at least 4 years. I never experienced this issue until as recently as September this year. You guys saw this issue ~6 months ago. Isn't there a way to backtrack any recent updates to Open vSwitch? I know it might be some updates on the FreeBSD side that made this Open vSwitch bug surface only recently... I know there was little to no development on the WireGuard side of things this year.

      posted in XCP-ng
      darabontors