XCP-ng

    All NICs on XCP-NG Node Running in Promiscuous Mode

• carldotcliff

      This is more an information gathering post since I have scoured the forums, asked the AIs, and really haven't gotten much feedback.

      XCP-NG 8.3.0

I've set up a 3-node cluster managed by XOA with NFS shared storage. While working through some tuning options for the storage, we noticed that all of the parent NICs on the hypervisors run in promiscuous mode, and that this traffic is passed on to the VIFs of the VMs, where it can be seen with tcpdump or nethogs. This may well be expected behavior, but it seems odd to flood the VMs with this traffic. We have a similar setup in a VMware environment we're migrating off of, and we do not see this behavior there. Do the interfaces need to run in promiscuous mode, or can that be disabled somehow? I tried some xe commands to disable it, but they didn't seem to change anything.
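For anyone checking the same thing, the promiscuous state is visible directly in dom0; a rough sketch (eth0 is just an example PIF name, adjust to your setup):

    # On the XCP-ng host (dom0): show whether the physical NIC is in promiscuous mode
    ip -d link show eth0 | grep -o 'promiscuity [0-9]*'
    # List the OVS bridges the PIFs end up attached to
    ovs-vsctl list-br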

The NICs in use are Mellanox ConnectX-5 MT27800 dual-port 25GbE cards. One port carries an open trunk used for creating VLAN networks on the hypervisors; the other port is a single native-VLAN configuration used for storage traffic.
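(For context, the VLAN networks on the trunk side are just the usual xe/XOA-created ones; a rough sketch with placeholder UUIDs, XOA does the equivalent from the UI:)

    # Create a pool-wide network for VLAN 208 on the trunk PIF
    xe network-create name-label="VLAN 208"
    xe pool-vlan-create network-uuid=<network-uuid> pif-uuid=<trunk-pif-uuid> vlan=208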

I also noticed some drops registering in the OS RX counters (we've only deployed AlmaLinux 8.10 VMs so far). I have not tried to track down where those drops are coming from, but given the minimal traffic and load on this environment, it's surprising to see them.
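For clarity, the drops I'm referring to are the ones in the guests' interface counters, e.g. (eth0 being the VM's interface):

    # Inside the AlmaLinux guest: per-interface counters, including RX drops
    ip -s link show eth0
    # Or just the raw RX drop counter
    cat /sys/class/net/eth0/statistics/rx_dropped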

      We do have Professional support and can open a ticket, but I figured I'd ask these forums before going that route. Thanks in advance everyone.

• olivierlambert (Vates 🪐 Co-Founder & CEO)

        Not sure about this, asking @bleader

• bleader (Vates 🪐 XCP-ng Team)

I think the promiscuous mode is due to the fact that the interfaces end up in OVS bridges; without it, traffic coming from outside to the VMs' MAC addresses would be dropped.

Once it reaches the OVS bridge the interface is in, it is up to OVS to act as a switch and forward packets only to the MACs it knows on its ports, so not all the traffic should be forwarded to all the VIFs.
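If you want to double-check that on a host, the bridge's MAC learning table can be dumped directly (xenbr0 is just an example bridge name):

    # On the host: MACs OVS has learned on each port of the bridge
    ovs-appctl fdb/show xenbr0
    # Which VIFs/PIFs are ports of that bridge
    ovs-vsctl list-ports xenbr0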

          I just tested on 8.2 and 8.3:

• tcpdumping ICMP on 2 VMs: pinging VM1 does not show traffic on VM2, pinging VM2 does not show traffic on VM1, and pinging the host shows no traffic on the VMs
• tcpdumping everything, only ignoring SSH (as I was logged in to both VMs over SSH): the only traffic I see is the multicast traffic on the network (the commands are sketched below)
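For reference, those captures boil down to something like this, run inside the test VMs (eth0 being their interface):

    # Capture only ICMP, then ping the other VM / the host
    tcpdump -ni eth0 icmp
    # Capture everything except the SSH session used to run the test
    tcpdump -ni eth0 not port 22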

So to answer your question: yes, it is normal that the NICs are in promiscuous mode, but that should not lead to all traffic going to all the VMs.

• olivierlambert (Vates 🪐 Co-Founder & CEO)

            Thanks for the test @bleader

If you think it's worth documenting somewhere, let us know so I can ask Thomas 🙂

• bleader (Vates 🪐 XCP-ng Team)

@carldotcliff if you are 100% positive you see traffic on the VMs that should not reach them, it is worth opening a ticket, as this is not intended behavior. If you do, mention in the ticket that this was discussed on the forum with David (me), so our support team can assign it to me if they want to.

For the dropped packets, I do not see any on my home setup, which is a pretty "small" network; in our lab, we do have some on our hosts. On a bigger network, that could be pretty much anything: broadcast or multicast reaching the host that the NIC chooses to drop itself, and some NICs will also drop certain discovery protocol frames. It would be hard to identify, unfortunately, but it would not worry me as long as the count stays low and performance is not impacted.
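If you do want to dig into them at some point, the NIC/driver counters on the host usually give a hint about what is being discarded; something along the lines of (eth1 being the physical NIC):

    # On the XCP-ng host: driver statistics, filtered to drop/discard counters
    ethtool -S eth1 | grep -iE 'drop|discard'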

• carldotcliff

                @bleader

                Thanks for the quick replies! Below are some tcpdump examples of what I am seeing on the XCP-NG nodes as well as the VMs:

small tcpdump from one of the trunk ports on an XCP-NG server:

                09:41:18.814082 IP 10.10.20.1 > vrrp.mcast.net: VRRPv2, Advertisement, vrid 20, prio 200, authtype none, intvl 1s, length 20
                09:41:18.814500 ARP, Request who-has 21-WEST-SCANNER.belvederetrading.com tell chivmprdtemp002.belvederetrading.com, length 46
                09:41:18.821352 ARP, Request who-has chisrvpbx001.belvederetrading.com tell 10.0.1.78, length 46
                09:41:18.827963 IP 10.10.158.10 > chisrvgerrit001man.belvederetrading.com:  exptest-253 46
                09:41:18.836743 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev213man.belvederetrading.com.37330: Flags [.], ack 1116740055, win 1574, length 0
                09:41:18.844230 ARP, Request who-has CHIWKSDEV589.belvederetrading.com tell chiqfx5200-001.belvederetrading.com, length 46
                09:41:18.852142 IP chiqfx5200-001.belvederetrading.com > vrrp.mcast.net: VRRPv2, Advertisement, vrid 25, prio 200, authtype none, intvl 1s, length 20
                09:41:18.853338 IP 192.168.130.48.mdns > mdns.mcast.net.mdns: 0 [5q] PTR (QU)? _hap._tcp.local. PTR (QU)? _companion-link._tcp.local. PTR (QU)? _rdlink._tcp.local. PTR (QU)? _hap._udp.local. PTR (QU)? _sleep-proxy._udp.local. (104)
                09:41:18.853880 IP6 fe80::4d8:5005:7c5:9c77.mdns > ff02::fb.mdns: 0 [5q] PTR (QU)? _hap._tcp.local. PTR (QU)? _companion-link._tcp.local. PTR (QU)? _rdlink._tcp.local. PTR (QU)? _hap._udp.local. PTR (QU)? _sleep-proxy._udp.local. (104)
                09:41:18.868284 ARP, Reply CHIWKSDEV445.belvederetrading.com is-at 58:6c:25:ca:4c:d9 (oui Unknown), length 46
                09:41:18.878062 IP 10.10.158.10 > chisrvgerrit001man.belvederetrading.com:  exptest-253 46
                09:41:18.888001 ARP, Request who-has 192.168.130.190 tell chiqfx5200-002.belvederetrading.com, length 46
                09:41:18.888014 ARP, Request who-has chivmdevtst088.belvederetrading.com tell chiqfx5200-002.belvederetrading.com, length 46
                09:41:18.888022 ARP, Request who-has CHIWKSDEV407.belvederetrading.com tell chiqfx5200-001.belvederetrading.com, length 46
                09:41:18.888025 ARP, Request who-has CHIWKSADM211.belvederetrading.com tell chiqfx5200-002.belvederetrading.com, length 46
                09:41:18.888193 IP 192.168.130.187.mdns > mdns.mcast.net.mdns: 0 PTR (QM)? lb._dns-sd._udp.local. (39)
                09:41:18.888322 IP6 fe80::18f4:8194:26e4:aa31.mdns > ff02::fb.mdns: 0 PTR (QM)? lb._dns-sd._udp.local. (39)
                09:41:18.894247 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.124.38420: Flags [P.], seq 1578138028:1578138277, ack 3622223145, win 2132, length 249
                09:41:18.894292 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.124.38420: Flags [F.], seq 249, ack 1, win 2132, length 0
                09:41:18.894828 STP 802.1w, Rapid STP, Flags [Learn, Forward, Agreement], bridge-id 8020.00:1c:73:ac:15:57.8033, length 42
                09:41:18.905751 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.86.36826: Flags [.], ack 2910688230, win 1492, length 0
                09:41:18.905804 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.86.36826: Flags [.], ack 16, win 1492, length 0
                09:41:18.907760 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.86.36826: Flags [.], ack 262, win 1491, length 0
                09:41:18.907802 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.86.36826: Flags [.], ack 754, win 1488, length 0
                

As you can see, traffic from all subnets is visible on that NIC, which is expected with the NIC running in promiscuous mode, since it's open to all VLANs.
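As a sanity check, the same capture can be restricted to a single tag to confirm which VLAN a given frame arrived on (interface name is just an example):

    # On the host's trunk port: only frames tagged VLAN 208, with the link-level header shown
    tcpdump -nei eth1 vlan 208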

                small tcpdump from one VM with an interface using a VIF on VLAN 208 from the NIC mentioned above:

                09:47:32.030953 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.124.38530: Flags [.], ack 62746, win 1619, length 0
                09:47:32.030954 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.124.38530: Flags [.], ack 63272, win 1619, length 0
                09:47:32.030955 IP chisrvprdflx004.belvederetrading.com.d-s-n > 10.10.208.124.38530: Flags [.], ack 63277, win 1619, length 0
                09:47:32.031001 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev210man.belvederetrading.com.57040: Flags [.], ack 104812, win 5836, length 0
                09:47:32.031047 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev210man.belvederetrading.com.57040: Flags [.], ack 106534, win 5852, length 0
                09:47:32.031177 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev210man.belvederetrading.com.57040: Flags [.], ack 109470, win 5830, length 0
                09:47:32.031210 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev210man.belvederetrading.com.57040: Flags [.], ack 110716, win 5863, length 0
                09:47:32.041909 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev203man.belvederetrading.com.57316: Flags [.], ack 262, win 2826, length 0
                09:47:32.041942 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev203man.belvederetrading.com.57316: Flags [.], ack 508, win 2825, length 0
                09:47:32.041975 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev203man.belvederetrading.com.57316: Flags [.], ack 754, win 2824, length 0
                09:47:32.162042 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev199man.belvederetrading.com.52024: Flags [.], ack 81426, win 2340, length 0
                09:47:32.162052 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev199man.belvederetrading.com.52024: Flags [.], ack 82426, win 2340, length 0
                09:47:32.162073 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev199man.belvederetrading.com.52024: Flags [.], ack 85378, win 2340, length 0
                09:47:32.162330 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev199man.belvederetrading.com.52024: Flags [.], ack 98908, win 2235, length 0
                09:47:32.162474 IP chisrvprdflx004.belvederetrading.com.d-s-n > chisrvdev199man.belvederetrading.com.52024: Flags [.], ack 101959, win 2212, length 0
                

                All of that traffic is from other hosts on VLAN 208, but looking at the NIC config, promiscuity is set to 0, so it should not be getting that traffic passed to it:

                2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
                    link/ether 0e:2a:8d:d8:3b:b0 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 addrgenmode none numtxqueues 8 numrxqueues 8 gso_max_size 65536 gso_max_segs 65535 parentbus xen parentdev vif-0
                

While typing this up, though, I got in touch with one of my network admins. It appears our QFX switches have a bug where ARP entries are not being discovered properly, leading to this kind of behavior. We were able to add entries manually and prevent this from happening, so I'll likely be on the hunt for missing ARP entries to clean up what I am seeing, and will go from there if the issue persists.
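In case it helps anyone else hitting the same kind of switch issue: from inside a VM, flooded unknown-unicast traffic is easy to spot by capturing only unicast frames that are not addressed to the VIF's own MAC (the MAC below is the one from the ip link output above):

    # Inside the VM: any unicast frame showing up here was flooded to the VIF
    tcpdump -ni eth0 'not ether host 0e:2a:8d:d8:3b:b0 and not ether broadcast and not ether multicast'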

Thanks again for the follow-up!

• bleader (Vates 🪐 XCP-ng Team), in reply to @carldotcliff

Running tcpdump switches the interface to promiscuous mode so that all traffic reaching the NIC can be dumped. So I assume the issue you had on your switches allowed traffic to reach the host, which was forwarding it to the VMs, and it wasn't dropped because tcpdump had switched the VIF into promiscuous mode.
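Side note: if you want to capture only what the kernel would normally accept, tcpdump can be told not to enable promiscuous mode:

    # Capture without putting the interface into promiscuous mode
    tcpdump -p -ni eth0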

If it seems resolved, that's good; otherwise, let us know if we need to investigate further on this 🙂
