Sporadic external DNS issues on domain machines

Don
Joined
21 Oct 2002
Posts
46,750
Location
Parts Unknown
Have an odd issue at work: between roughly 9am and 5pm, Monday to Friday, external DNS lookups fail sporadically. It will be fine for a couple of minutes, then very patchy for a while, causing chaos as you'd imagine.


One minute nslookup for bbc.co.uk will work, the next minute it won't resolve the IPs. Same for all websites.



During this time, we don't lose pings to the DNS server or firewall, and pings to 8.8.8.8 work perfectly.

The DC is 2012 R2 with DNS running on it, same spec for the secondary DC / DNS. CPU/RAM usage is fine on both. nslookup on any local machine is instantly replied to.

This wasn't an issue in March, I've been on furlough and came back to this problem. It's my problem to sort out.

Between 100-200 machines, hundreds of switches.

Changes since I've been off: Multiple Windows updates, Core switch replaced, new Linux based phone system installed, many desk moves :/

I want to blame the firewall, but when this issue occurs, DNS on the 'guest' wifi which goes through the same network on a different VLAN works perfectly fine.

I've set up nslookup to test around 10 websites per minute and log the failures to a file. It seems to have zero issues overnight, but plays up from about 9am (which coincides with people arriving at work). The offices are spread wide across the business, so it's hard to pinpoint visually.
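For anyone wanting to reproduce this kind of probe, here is a rough Python sketch of the same idea. It is not the nslookup script described above, just an illustration: the host list, the log file name `dns_probe.log`, and the function names are all made up, and it goes through the OS resolver (`socket.gethostbyname`) rather than nslookup, so local caching may hide some failures. It also records how long each lookup took, which may help tell a slow timeout from a fast error.

```python
import socket
import time
from datetime import datetime

# Hypothetical host list; substitute the sites you already test.
HOSTS = ["bbc.co.uk", "example.com"]

def probe(hosts, resolve=socket.gethostbyname):
    """Resolve each host once; return (host, elapsed_seconds, error_or_None) tuples."""
    results = []
    for host in hosts:
        start = time.monotonic()
        try:
            resolve(host)
            error = None
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            error = str(exc)
        results.append((host, time.monotonic() - start, error))
    return results

def format_line(now, host, elapsed, error):
    """Format one log line: timestamp, host, lookup time, OK or FAIL plus reason."""
    status = "FAIL " + error if error else "OK"
    return f"{now.isoformat(timespec='seconds')} {host} {elapsed:.3f}s {status}"

def run(log_path="dns_probe.log", interval=60):
    """Probe all hosts every `interval` seconds and append results to the log."""
    while True:
        now = datetime.now()
        with open(log_path, "a") as log:
            for host, elapsed, error in probe(HOSTS):
                log.write(format_line(now, host, elapsed, error) + "\n")
        time.sleep(interval)
```

Calling `run()` starts the loop; it runs until interrupted. Logging successes as well as failures keeps the timestamps continuous, which should make it easier to line the gaps up against firewall or switch logs later.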

-

It's not as full-on as a broadcast storm, so it'll be hard to tell immediately whether a change has fixed the issue. So far I've tried looking for rogue DHCP servers (in case someone has brought one in for some reason); none found.

Also tried disconnecting specific switches dotted around the site to identify a problematic area, to no avail. I have a handful more to try.

Any pointers on what to try next?

I'm thinking turn on full firewall logs, full DNS server logs, Wireshark on DNS server. In all my years of working in IT, I've never seen a DNS issue like this before.
 
Soldato
Joined
15 Sep 2009
Posts
2,886
Location
Manchester
Check STP as mentioned, but can you also ping external DNS reliably from each VLAN on the core switch? Are there any dropped-packet warnings on the switch (you may need to enable debugging if you don't send its logs to syslog)? I would check the same on the firewall.
 
Associate
Joined
13 Oct 2009
Posts
238
Location
Cumbria
Are you having problems with the DNS forwarders configured on the DCs running DNS? I would try a process of elimination if possible, e.g. run nslookup on a laptop connected directly to your external network on a public IP, i.e. temporarily bypassing your firewall. It's not recommended to leave it like this, but you could also try dropping down to one forwarder at a time to see if the problem lies with one particular forwarder. Bit of a long shot, but I'd make sure all of the DNS-related Windows services are running too.
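To make the process of elimination a bit more concrete, here is a minimal Python sketch that sends the same A-record query straight to one named server at a time, so a DC, each forwarder, and a public resolver can be compared side by side. The function names are made up for illustration, it uses plain UDP with only the standard library, and it only reads the response code and answer count from the header (it does not check the response ID or parse the answers), so treat it as a starting point rather than a proper DNS client.

```python
import socket
import struct

def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def parse_rcode(response):
    """Return (rcode, answer_count) from a DNS response header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount

def query_server(server, name, timeout=2.0):
    """Send one query to one server; return 'timeout' or (rcode, answer_count)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        try:
            response, _ = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
    return parse_rcode(response)
```

For example, `query_server("8.8.8.8", "bbc.co.uk")` from a client, compared with the same call against the DC's address, would show whether failures only appear when going via the DC. An rcode of 0 means no error, 2 is SERVFAIL and 3 is NXDOMAIN; a "timeout" when the server still answers pings points more towards something dropping DNS traffic along the path. This assumes the firewall lets clients send DNS out directly, which may not be the case.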
 