The past couple of days have been pretty nuts, but I've dabbled with testing this and in my configuration with XCP-ng 8.3 with all currently released patches, I top out at 15Gb/s with 8 threads on Win 10. Going further to 16 threads or beyond doesn't really improve things.
Killing core boost, SMT, and setting deterministic performance in BIOS added nearly 2Gb/s on single-threaded iperf.
When running the iperf and watching htop on the XCP-ng server, I see nearly all cores running at 15-20% for the duration of the transfer. That seems excessive.
Iperf on the E3-1230v2...Single thread, 9.27Gbs. Neglibile improvement for more threads. Surprisingly, a similar hit on CPU performance. Not as bad though. 10Gbps traffic hits about 10% or so. Definitely not as bad as on the Epyc system.
I'll do more thorough testing tomorrow.