XCP-ng

    Posts

    • RE: Xen Orchestra cannot connect to XCP-ng Host

      I found the problem.
      I am using OPNsense and forgot to disable TX checksum offloading. Interestingly, checksum offloading caused catastrophic network disruptions on a Realtek NIC but no noticeable performance hit on Intel NICs. This was an old host with a Realtek card; all my recent hosts have only Intel NICs, which is why I forgot about the offloading setting.

      Thanks for the tips.

      Best wishes to the whole community!
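
      For anyone landing here with the same symptom, here is a minimal sketch of the fix described above. It assumes the `xe vif-param-set ... other-config:ethtool-tx="off"` form given in the XCP-ng guidance for pfSense/OPNsense guests. The helper name and the placeholder UUID are hypothetical, and the script only prints the command rather than running it.

      ```python
      import shlex
      from typing import List


      def build_disable_tx_offload_cmd(vif_uuid: str) -> List[str]:
          """Build the xe command that sets ethtool-tx=off on one VIF.

          Assumption: the other-config:ethtool-tx key is honoured by the host,
          as described in the XCP-ng notes for pfSense/OPNsense guests.
          """
          if not vif_uuid:
              raise ValueError("vif_uuid must not be empty")
          return [
              "xe",
              "vif-param-set",
              f"uuid={vif_uuid}",
              "other-config:ethtool-tx=off",
          ]


      if __name__ == "__main__":
          # Placeholder UUID for illustration only; list the real ones with
          # `xe vif-list vm-name-label=<name of the OPNsense VM>`.
          cmd = build_disable_tx_offload_cmd("00000000-0000-0000-0000-000000000000")
          print(shlex.join(cmd))
      ```

      As far as I know, the setting only takes effect once the VIF is unplugged and plugged back in, or the VM is restarted, so it is worth repeating for every VIF of the firewall VM before retesting.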

      posted in Management
      D
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      Yes.

      posted in Management
      D
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      @Danp
      There is a WireGuard site-to-site VPN set up. If ping works from inside the VM hosting Xen Orchestra, how can Xen Orchestra have no access?

      I am almost sure it is a certificate issue of some kind. I would like to generate a new certificate or somehow make Xen Orchestra ignore the certificate. I think XCP-ng Center ignores it by default, which is why it works from XCP-ng Center.

      What do you think?

      posted in Management
      D
      darabontors
    • RE: Xen Orchestra cannot connect to XCP-ng Host

      @Danp Thanks for responding.
      I don't use an HTTP proxy. I can ping the host from Xen Orchestra and Xen Orchestra from the host.

      I did get this error message in the logs:

      server.enable
      {
      "id": "XXXXXXXXXXXXX"
      }
      {
      "originalUrl": "https://X.X.X.X/jsonrpc",
      "url": "https://X.X.X.X/jsonrpc",
      "call": {
      "method": "session.login_with_password",
      "params": "* obfuscated *"
      },
      "message": "408 Request Timeout",
      "name": "Error",
      "stack": "Error: 408 Request Timeout
      at Object.assertSuccess (/opt/xen-orchestra/node_modules/http-request-plus/index.js:162:19)
      at httpRequestPlus (/opt/xen-orchestra/node_modules/http-request-plus/index.js:217:22)
      at file:///opt/xen-orchestra/packages/xen-api/transports/json-rpc.mjs:13:17"
      }

      I can connect via XCP-ng Center to the host, no problem. It's just Xen Orchestra that can't connect.
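
      Since ping only exercises ICMP, a quick TCP check against port 443 from the Xen Orchestra VM can help separate a transport problem from a certificate problem. This is a minimal sketch; the function name is hypothetical and the host address must be supplied on the command line.

      ```python
      import socket
      import sys


      def tcp_port_open(host: str, port: int = 443, timeout: float = 5.0) -> bool:
          """Return True if a TCP connection to host:port completes within timeout."""
          try:
              # create_connection performs the full TCP handshake, unlike ping.
              with socket.create_connection((host, port), timeout=timeout):
                  return True
          except OSError:
              # Covers refused connections, unreachable hosts and timeouts.
              return False


      if __name__ == "__main__":
          # Pass the XCP-ng host address as the first argument.
          host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
          print(f"{host}:443 reachable over TCP: {tcp_port_open(host)}")
      ```

      A successful handshake only shows that small packets get through. Larger transfers, such as the TLS exchange and the JSON-RPC login, can still stall if something on the path mangles or drops full-size packets.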

      posted in Management
      D
      darabontors
    • Xen Orchestra cannot connect to XCP-ng Host

      Dear community,

      I have a strange connection problem. I have the following situation:
      I need to install XCP-ng with a DHCP-assigned IP address so that I can connect it to my Xen Orchestra. I can connect to the host with this DHCP IP address. After I finish setting up XCP-ng from Xen Orchestra, I need to give the host a new management IP: a static IP on a VLAN network.

      After the IP change, I could connect to the host with the new IP. After moving the host to a different location, there is suddenly an unspecified connection error when connecting to the host. This problem is only between Xen Orchestra and the host. I can connect to the host with XCP-ng Center, no problem. All networking works as it should.

      I should mention that when I changed the IP of the host, I also changed the root password.

      I suspect it is a certificate issue. It is the self-signed certificate that XCP-ng generated during installation.

      The host is not exposed to the public internet. I use a VPN to connect it to Xen Orchestra.

      I'm using Xen Orchestra from the sources.

      Please help me fix this issue. This is a remote host and I already reinstalled XCP-ng, but the issue came back.

      posted in Management
      D
      darabontors
    • RE: Very scary host reboot issue

      @tuxen

      1. Absolutely no idea how to do this in Windows. I looked for any MTU setting but couldn't find any.
      2. This is not a viable workaround for me, but it might be useful for pinning the issue on the Xen PV driver. Maybe I'll do some more testing on spare hardware.
      3. I read this, but I don't know how to test it. I didn't have any manual MTUs set so I don't know what values were before the update.

      What most definitely fixed the issue for me was using PCIe passthrough for the WAN interface. I used a 10 GbE NIC. It uses the ix driver (ix0) so IDK if this is related. Somehow PPPoE + WG + Windows Client on the virtual interface (Xen PV driver) in OPNsense produces this issue.
      At the moment I am happy with this mitigation.

      I'm spread a little thin on free time at the moment. Anyone care to test this further?

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @Andrew That makes sense. I think I'll do just this. In the meantime I'll try to replicate the phenomenon on test hardware. I really need a permanent fix for this.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert I'm thinking of a quick workaround. What if I use PCIe passthrough for the LAN and WAN interfaces, physically connect the LAN port to another, non-passthrough port on the server, and use that port to interface with my other VMs via OVS? Does it make any sense? Does it seem viable to mitigate this issue?

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert said in Very scary host reboot issue:

      FreeBSD PV driver inside OPNsense or Pfsense.

      Who is maintaining the FreeBSD PV drivers?

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      I found the MTU parameter. This time it was 1420 on both OPNsense WG interface and in Windows (client side). I was happy for about 5 minutes as I wasn't able to reproduce the crash, but then it happened again. My "favorite" way to trigger it is by pausing the file transfer, waiting for a couple of minutes and then resuming it. The transfer's MB/s jumps up like crazy in Windows, then freezes until it gets in sync with the real progress of the transfer. After two tries of pausing and resuming, the crash happened.

      @olivierlambert I have used this setup on my infrastructure and my clients' for at least 4 years. I never experienced this issue until as recently as September this year. You guys saw this issue ~6 months ago. Isn't there a way to backtrack any recent updates to Open vSwitch? I know it might be some updates on the FreeBSD side that made this Open vSwitch bug surface only recently... I know there was little to no development on the WireGuard side of things this year.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @tuxen You might be on to something. I need to clarify something. I am positive this issue is related to the Windows WireGuard client. On the same host, with the same OPNsense VM, I have 10+ site-to-site WireGuard connections configured, moving 100+ GB daily, and the host never reboots. I can only trigger it from a Windows WG connection.

      How do I verify MTU size for the WG connection in Windows 11? I cannot find it for the life of me...

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert I can confirm 100% that a DELL workstation that I use at one of my clients did the same thing.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      It is the DELL X540 2 x 10 GbE and 2 x 1 GbE daughter board in a DELL R720.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @Andrew
      eth4 is LAN:
      driver: igb
      version: 5.3.5.20
      firmware-version: 1.67, 0x80000fc9, 19.5.12
      expansion-rom-version:
      bus-info: 0000:0a:00.0
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: yes
      supports-register-dump: yes
      supports-priv-flags: no

      eth5 is WAN:
      driver: igb
      version: 5.3.5.20
      firmware-version: 1.67, 0x80000fc9, 19.5.12
      expansion-rom-version:
      bus-info: 0000:0a:00.1
      supports-statistics: yes
      supports-test: yes
      supports-eeprom-access: yes
      supports-register-dump: yes
      supports-priv-flags: no

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      I do have an update. I tried it from a Windows 10 VM. Same issue. I uninstalled VMware Player on the Windows 10 VM just to be sure. The reboot happened.

      I tried copying the same file from my fileserver to my laptop and I couldn't cause the reboot. It only happens when I transfer files from my laptop to my server. So only traffic sent from the laptop to my OPNsense VM produces the reboot.

      I checked, TX checksumming is disabled on my OPNsense VM VIFs.

      I can confirm 100% I didn't have this issue before September this year. Maybe it is related to the WireGuard version on the server or client side.

      OPNsense version 23.7.2
      wireguard-kmod 0.0.20220615_1
      wireguard-tools 1.0.20210914_1
      OPNsense has xn0 for WAN and xn1 for LAN

      On my other host that also produced the reboot, the hardware setup is different. The metal itself is different, but more notably WAN is connected through a dual-port Intel NIC via PCIe passthrough. The host reboot happened while copying an ISO through the WG tunnel to the host's local ISO repository. So potentially the LAN xn0 produced the vSwitch crash. It couldn't have happened on the WAN interface.

      In summary: I managed to reproduce the issue 4 times within 2 hours. It should be replicable. Maybe I'll spin up a completely new setup to try to replicate this outside my current production host.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert It's the least I can do. I really like XCP-ng and Xen Orchestra. I have around 15 clients with XCP-ng stacks in production. I run an MSP company. You understand this issue scares me a lot. Right now I'm randomly rebooting my own production server where a bunch of VM and TrueNAS backups land. I am fully motivated to mitigate this issue.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      @olivierlambert Just produced another reboot. I'm closing in on the way to replicate this issue.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      I continued with the transfer capped at 100 Mb/s (most probably capped by WireGuard) and after ~8 GB transferred, my tunnel suddenly collapsed. After a short while, less than 2 minutes, it came back up, and no host reboot happened. WireGuard crashed somehow but didn't cause the Dom0 crash.

      Another detail that might be unrelated: my PPPoE connection to my ISP has MTU 1492. The WireGuard connection also has MTU 1492. Is this relevant in any way?
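
      On the MTU question, a rough calculation shows why a tunnel MTU equal to the link MTU leaves no room for encapsulation. The sketch below assumes the usual WireGuard data-packet overhead (outer IP header, 8-byte UDP header, 16-byte WireGuard header and 16-byte authentication tag); the function name is hypothetical.

      ```python
      IPV4_HEADER = 20          # outer IPv4 header, no options
      IPV6_HEADER = 40          # outer IPv6 header
      UDP_HEADER = 8
      WIREGUARD_OVERHEAD = 32   # 16-byte data header + 16-byte Poly1305 tag


      def wireguard_mtu(link_mtu: int, outer_ipv6: bool = False) -> int:
          """Largest tunnel MTU that fits in link_mtu without outer fragmentation."""
          ip_header = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
          return link_mtu - ip_header - UDP_HEADER - WIREGUARD_OVERHEAD


      if __name__ == "__main__":
          print(wireguard_mtu(1492))                   # PPPoE link, IPv4 endpoint -> 1432
          print(wireguard_mtu(1492, outer_ipv6=True))  # PPPoE link, IPv6 endpoint -> 1412
          print(wireguard_mtu(1500, outer_ipv6=True))  # plain Ethernet, IPv6 endpoint -> 1420
      ```

      By this arithmetic a 1492-byte tunnel packet cannot fit inside a 1492-byte PPPoE frame once encapsulated, so full-size packets would have to be fragmented or dropped on the outer path. Whether that is connected to the host reboot is not established here.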

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      I just triggered the reboot with the setup I detailed above. I started transferring 26 GB worth of video files through my tunnel. My host restarted. I continued the transfer and now, strangely, my tunnel is capped at 100 Mb/s.

      During the transfer when the host reboot happened I was getting 300 Mb/s.

      Very strange behavior.

      posted in XCP-ng
      D
      darabontors
    • RE: Very scary host reboot issue

      Guys, I might be onto something.

      I started having this issue in September this year, right after switching to a new laptop with Windows 11.

      I also have VMware Player and VirtualBox installed on my laptop.

      I often have a weird issue where WG cannot bring up the tunnel and shows an error message. I googled the error and it was related to the other virtual network interfaces that VirtualBox and VMware Player install.

      I think the issue could be related to Windows 11 and my other type 2 virtualization platforms.

      I did try on my other laptop, which runs Windows 10 and has VirtualBox installed, and the host reboot isn't triggered.

      Could someone help replicate this specific combo that I have?

      posted in XCP-ng
      D
      darabontors