@manilx It is definitely interesting to see more proof that this bug may be wider than expected.
Posts
-
RE: Epyc VM to VM networking slow
-
RE: Epyc VM to VM networking slow
@john-c That would only work for someone who can have non-EPYC machines handle this, though. Unfortunately that is not an option for us.
-
RE: Epyc VM to VM networking slow
@olivierlambert Sure, but the traffic still goes to a VM, through it, and then out, and that path is also affected by the issue. Any traffic through a VM is affected by this bug, as we established earlier.
-
RE: Epyc VM to VM networking slow
@olivierlambert Define "outside of master"? Backup traffic passes through the XOA VM, so if it is hitting this issue with VM traffic then that does not help, does it? Or do you mean outside of the pool entirely, as in a physical machine?
-
RE: Epyc VM to VM networking slow
@manilx Oh, I absolutely agree that it is an issue. Since backups are handled by the XOA VM, I could see that whatever is causing our slowdowns for networking between VMs (and out of VMs to external hosts) might also impact the networking and/or the processing of the XOA backups.
What do you think, @olivierlambert, are these perhaps directly related? It sure would explain the very low backup speeds we see as well (we have fully loaded Synology FS2500s (all flash) with write-intensive SSDs in them).
-
RE: Epyc VM to VM networking slow
@manilx I don't think it is directly related, given just how low it is. But we also see similar "lower than expected" speeds on backups.
-
RE: Epyc VM to VM networking slow
We spread network-heavy VMs across the cluster, as it is a per-physical-host limit. We also changed our design a bit: where we had intended all levels of routers to be virtual, we split out the core routers and made them physical.
-
RE: Epyc VM to VM networking slow
@LennertvdBerg They are still trying to figure this one out.
An estimated full fix is not in sight just yet, from what I know. At least I haven't been informed of one in my ticket with them. But I do know they are still working very hard on this.
-
RE: Epyc VM to VM networking slow
@sluflyer06 This test does not say anything other than that you have a 10G NIC, and we already knew that the limit for the latest-gen AMDs is just above 10G. If you insert a 25G NIC then you can likely only use half of that capacity, and for those of us using this in actual datacenters that is a pretty critical issue. Even more so when the limit seems to be shared per host: if the limit is 12 Gbit, then 4 VMs running on the same host get 3 Gbit per VM. And since lots of us may have 20-40 VMs per server that all use a decent portion of network, it gets really scary when you realize that is 300-600 Mbit per VM.
Or even worse, for those on earlier generations of the AMD platform the limit is 2-4 Gbit or so. Now you are looking at 100-200 Mbit per VM, which is suddenly not hard to hit for even a smaller provider during peak use times.
It is great that the issue is not triggered for you as your bottleneck is elsewhere, but it is a very serious issue for several of us.
With that said, Vates is handling it as well as anyone could request, and I thank them for the attention given and the dedication to solving it.
It is a NASTY bug, and it took a very specific situation for it to be discovered.
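To make the per-VM arithmetic above easy to rerun with your own numbers, here is a minimal Python sketch. It assumes the host-wide limit is split evenly between all VMs that are busy at the same time, which is a simplification, and the function name `per_vm_mbit` is made up for illustration. The host limits used are the rough figures mentioned in this thread, not new measurements.

```python
def per_vm_mbit(host_limit_gbit: float, vm_count: int) -> float:
    """Even share of a host-wide network limit, in Mbit/s (1 Gbit = 1000 Mbit).

    Assumption: every VM on the host pushes traffic at the same time and the
    shared limit is divided equally between them.
    """
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return host_limit_gbit * 1000 / vm_count


# Rough host limits from the thread: ~12 Gbit on latest gen, 2-4 Gbit on older gens.
for limit in (12, 4, 2):
    for vms in (4, 20, 40):
        share = per_vm_mbit(limit, vms)
        print(f"{limit} Gbit host limit, {vms} VMs -> {share:.0f} Mbit per VM")
```

Real traffic is rarely split this evenly, so treat the output as a worst-case estimate for peak hours rather than a prediction.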
-
RE: Epyc VM to VM networking slow
I can unfortunately share that, from the ongoing ticket investigations into this, it is far more deeply rooted than something that moving from one major kernel to another will "just fix". There are multiple leads being investigated and multiple vendors involved.
-
RE: Epyc VM to VM networking slow
@alex821982 No, we have seen no such indication. With that said, we have not tested for that either, so it is "possible". The main issue at hand is that all EPYC processors hit an upper limit with network traffic going in and out of a VM. And it is a per-physical-host limit, so if you have both VMs on the same machine the limit is halved.
-
RE: Epyc VM to VM networking slow
@bleader said in Epyc VM to VM networking slow:
We're still actively working on it, we're still not a 100% sure what the root cause is unfortunately.
I can also vouch that they are taking it seriously and working on it. I have had a ticket with them essentially since the start of this, and work is definitely being done in said ticket to solve this once and for all.
-
RE: Epyc VM to VM networking slow
@JamesG Sure, but none of those do concrete troubleshooting and digging to establish where the problem is, and they also look like isolated issues rather than something broad (it is broad, but people didn't look at it as such).
-
RE: Epyc VM to VM networking slow
@JamesG This bug has not been reported for two years. This thread is 6 months old and our big report has been open for about the same amount of time.
It has had excellent attention since day one of us reporting it.
-
RE: Epyc VM to VM networking slow
@olivierlambert As @bleader mentioned, all testing shows that it is any VM networking at all. VM to VM, VM to host, and VM to external appliances are all equally affected. It is just that in the VM-to-VM case the bandwidth is half that of all other use cases, since the host has to handle traffic for both VMs, and as such it is found faster. But no matter how the VM communicates, there is an upper bandwidth limit that is VERY low.
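The halving described above can be written down as a tiny model. This is only a sketch of the behaviour reported in this thread, not an explanation of the root cause (which is still unknown), and the function name `vm_path_ceiling_gbit` is hypothetical. It assumes one shared per-host limit that is consumed twice when sender and receiver VMs sit on the same host.

```python
def vm_path_ceiling_gbit(host_limit_gbit: float, both_vms_on_same_host: bool) -> float:
    """Expected bandwidth ceiling for one flow under the per-host limit model.

    Assumption: when both VMs run on the same physical host, that host
    handles the traffic twice (out of one VM and into the other), so the
    flow only sees half of the host-wide limit.
    """
    if both_vms_on_same_host:
        return host_limit_gbit / 2
    return host_limit_gbit


# Example with a 12 Gbit host limit (rough latest-gen figure from the thread).
print(vm_path_ceiling_gbit(12, both_vms_on_same_host=True))   # VM to VM, same host
print(vm_path_ceiling_gbit(12, both_vms_on_same_host=False))  # VM to host / external
```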
-
RE: Epyc VM to VM networking slow
@manilx @florent It would make sense, since the XOA VM used for backups is after all a VM and should be impacted by the VM network issues on EPYCs.
-
RE: Epyc VM to VM networking slow
@Forza We have tried all the settings available to tweak on our hardware. We have a full big twin chassis that we have dedicated to Vates for testing this issue, and roughly the first month was spent going over settings together with Vates and making sure everything was tweaked properly.
AMD is involved themselves, and if anyone knows AMD firmware settings and tweaking it would be AMD.
In all seriousness, it is a very good suggestion, but it has been looked into and unfortunately makes minimal or no difference.
-
RE: Epyc VM to VM networking slow
@nicols Hello, you are not alone in this. We have delayed the deployment of our core ISP routing as virtual routers due to this.
We have an XOA Premium and XCP-ng Enterprise subscription and an open case on this.
I cannot go into detail about what is being said in there, but I can say that the approach @olivierlambert and Vates have is very good and they have my full confidence on this one.
@manilx I can say this affects all VM traffic: we have a VM going out to a switch and over to a second hardware host into a VM there, and we see the limit there as well. The ceiling is just twice as high as inside the same hardware box.
Either way, Vates is VERY well aware of how serious this is and is making a good effort to fix it.
-
RE: Epyc VM to VM networking slow
Heya!
Just chiming in that we (WDMAB) are keeping tabs on this thread as well as our ongoing support ticket with you guys.
Saw our result up on the list.
If we can do ANYTHING further to assist, then please do tell us. We are available 24/7 to solve this issue since it is very heavily impacting our new production deployment.
Regards.
Mathias W.