DNS queries during backup job
-
Although it is nice that there are workarounds for the DNS spikes, with either nscd or the in-process DNS cache, I think the DNS spikes are a symptom of a different issue.
I think we can safely assume that each DNS lookup corresponds to one attempt at establishing a TCP connection. If so, there is some code somewhere that spawns an awful lot of short-lived connections instead of reusing or pooling them, with all the issues that follow in that area (insufficient ulimit NOFILE, connections in TIME_WAIT, exhaustion of client ports, etc.).
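To illustrate what reusing or pooling connections could look like, here is a minimal sketch assuming the requests go through Node.js's built-in https module. The thread does not say which HTTP client the backup code actually uses, and pooledAgent, hostRequestOptions and the maxSockets value of 8 are made-up names and numbers for illustration only.

```javascript
'use strict'
const https = require('https')

// One shared agent: idle sockets are kept open and reused for later
// requests to the same host instead of being opened and closed per request.
const pooledAgent = new https.Agent({
  keepAlive: true, // keep idle sockets around for reuse
  maxSockets: 8, // assumed cap on concurrent sockets per host
})

// Hypothetical helper: builds request options that all share the same agent.
function hostRequestOptions(hostname, path) {
  return { hostname, path, method: 'GET', agent: pooledAgent }
}
```

Passing these options to https.request() for every call to the same host should let the agent reuse its sockets, provided each response body is fully consumed so the socket can return to the pool.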
-
@hoerup I agree with your analysis. Not sure how easy it will be to fix, but we'll investigate.
-
I did some further testing to see whether the number of DNS queries correlates with the number of actual connections made to the host. This doesn't seem to be the case, which is even more interesting. Some results below.
Ran an incremental from a delta backup, which took 9 minutes in total:
- Number of DNS queries: close to 7k
- Number of HTTPS connects logged to the host IP address: 478
- Number of HTTPS connects/disconnects logged in total to the host IP address: 955
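To put the mismatch in numbers, a quick back-of-the-envelope calculation. It uses 7000 as a stand-in for "close to 7k", so the result is only approximate.

```javascript
// Figures from the list above; 7000 is an assumed stand-in for "close to 7k".
const dnsQueries = 7000
const httpsConnects = 478

// If every connection triggered exactly one lookup, this ratio would be near 1.
const queriesPerConnect = dnsQueries / httpsConnects

console.log(queriesPerConnect.toFixed(1)) // about 14.6 DNS queries per connection
```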
Connection counts were about the same with an installation from the dns.lookup branch provided by @julien-f above, without the DNS queries, obviously.
-
@ronivay are all DNS queries for the same host and record?
-
Yep. Same domain, asks A and AAAA at the same time, both being individual queries obviously.
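Since every query is for the same name, a small in-process cache in front of dns.lookup would collapse most of them. The following is only a rough sketch of that idea, not the implementation from the branch mentioned above: createCachedLookup, the 60 second TTL and the injectable now clock are all assumptions made for illustration.

```javascript
'use strict'
const dns = require('dns')

// Hypothetical helper: wraps a dns.lookup-style function with a TTL cache.
// `lookup` and `now` are injectable so the behaviour can be checked offline.
function createCachedLookup(lookup = dns.lookup, ttlMs = 60000, now = Date.now) {
  const cache = new Map() // key -> { expires, result }

  return function cachedLookup(hostname, options, callback) {
    // dns.lookup accepts (hostname, callback) and (hostname, family, callback)
    if (typeof options === 'function') {
      callback = options
      options = {}
    } else if (typeof options === 'number') {
      options = { family: options }
    }

    // Other options (for example hints) are ignored by this sketch's cache key.
    const key = JSON.stringify([hostname, options.family || 0, Boolean(options.all)])
    const hit = cache.get(key)
    if (hit !== undefined && hit.expires > now()) {
      // Deferred so that cache hits still call back asynchronously.
      process.nextTick(callback, null, ...hit.result)
      return
    }

    lookup(hostname, options, (err, ...result) => {
      // Only successful answers are cached; errors are always retried.
      if (!err) cache.set(key, { expires: now() + ttlMs, result })
      callback(err, ...result)
    })
  }
}
```

The returned function can be passed as the lookup option of http.request() or https.request(), which Node.js otherwise defaults to dns.lookup. A real cache would also need to decide how to handle address changes within the TTL and concurrent lookups for the same key.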
-
Also, XAPI (so on the host side) doesn't support HTTP/2.
-
The DNS cache has been merged, keep us posted if you have any issues.