Last night, TechCrunch reported that LinkedIn and Fidelity.com faced an outage due to a DNS error. ISC staff and colleagues observed that the error was caused by a change to nameserver information at the registry, which directed DNS queries to nameservers that did not answer them correctly. Suzanne Woolf, ISC’s Director of Strategic Partnerships, points out “the problem was apparently aggravated by a long TTL on the incorrect nameserver records, causing the bad information to persist in resolver caches and the misdirection of queries to continue far longer than such information is usually held”. Eric Ziegast, a member of the security products group at ISC, also observed via DNSDB that more than 40 other domains were affected.
Based on current observations, this incident does not appear to have been due to malicious activity. However, ISC staff have identified multiple domains hosted by the registrar whose DNS queries are still being directed to the wrong nameservers, because caches in recursive DNS resolvers all over the Internet continue to hold incorrect records for them. By some reports, this erroneously distributed data could persist in recursive servers for up to two days after the original incident before it times out, meaning that end users who rely on those servers might remain unable to reach the affected domains.
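To make the persistence concrete, here is a minimal sketch of how long a cached record survives under a given TTL. The two-day TTL reflects the "up to two days" figure reported above; the timestamp is arbitrary, and `cache_expiry` and `seconds_remaining` are hypothetical helper names used only for illustration.

```python
from datetime import datetime, timedelta, timezone


def cache_expiry(cached_at: datetime, ttl_seconds: int) -> datetime:
    """Return the moment a record cached at `cached_at` times out."""
    return cached_at + timedelta(seconds=ttl_seconds)


def seconds_remaining(cached_at: datetime, ttl_seconds: int, now: datetime) -> int:
    """Return how many seconds the record will stay cached (never negative)."""
    remaining = (cache_expiry(cached_at, ttl_seconds) - now).total_seconds()
    return max(0, int(remaining))


# Illustrative assumptions only: a two-day TTL and an arbitrary caching time.
TWO_DAYS = 2 * 24 * 60 * 60
cached_at = datetime(2000, 1, 1, 0, 0, tzinfo=timezone.utc)

# Twelve hours after the bad record was cached, it still has 36 hours to live.
now = cached_at + timedelta(hours=12)
print(seconds_remaining(cached_at, TWO_DAYS, now))
```

A resolver that cached the incorrect record shortly before the registry data was corrected would keep serving it for nearly the full TTL, which is why the effects can outlast the original error by so long.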
ISC Support staff have identified some steps that operators of BIND-based recursive servers can take to mitigate this issue by removing the bad data from their caches. The steps are described in a publicly available article in our Knowledgebase, titled ‘How do I flush or delete incorrect records from my recursive server cache?’
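As a rough sketch of what such a cleanup might look like on a BIND recursive server, the helper below builds `rndc flushname` commands (or `rndc flushtree`, which also clears names beneath the given one) for a list of affected domains. It assumes `rndc` is installed and already configured to control the local server, and that the BIND version in use supports `flushtree`. The function names and the `example.com` domain are placeholders, not taken from the Knowledgebase article, which remains the authoritative guidance.

```python
import subprocess
from typing import Iterable, List


def build_flush_command(name: str, subtree: bool = False) -> List[str]:
    """Build the rndc argument list that flushes one name from the cache."""
    # flushname removes records for the name itself; flushtree also removes
    # everything cached below it (availability depends on the BIND version).
    verb = "flushtree" if subtree else "flushname"
    return ["rndc", verb, name]


def flush_names(names: Iterable[str], subtree: bool = False,
                dry_run: bool = True) -> List[List[str]]:
    """Flush each name; with dry_run=True only return the commands."""
    commands = [build_flush_command(n, subtree) for n in names]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if rndc reports a failure.
            subprocess.run(cmd, check=True)
    return commands


# Placeholder domain; substitute the domains known to be affected.
for cmd in flush_names(["example.com"], subtree=True):
    print(" ".join(cmd))
```

The dry-run default is deliberate: it lets an operator review the exact commands before anything is removed from the cache. Flushing only the affected names, and not the whole cache, avoids discarding good data that the resolver would then have to fetch again.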
It’s also worth pointing out that this case is not exactly the same as ‘cache poisoning,’ because the cached data was not introduced by a third party but was, at one time, published as authoritative data.