do not stupidly delete ZSK files
Lawrence K. Chen, P.Eng.
lkchen at ksu.edu
Fri Aug 7 05:16:08 UTC 2015
On 2015-08-06 19:26, Heiko Richter wrote:
>> Though back then I was still building bind 32-bit, and the hardware
>> was much slower. A full signing took more than 10x longer than on our
>> current hardware...which can get it done in just under a minute
>> (usually). The need for speed is that some people expect DNS changes
>> to be near instantaneous.
>
> So either you have very slow servers, or a really big zone, if it
> takes a whole minute to sign it.
>
> Just use inline-signing and the changes will be instantaneous. As soon
> as nsupdate delivers a change to the master server, it will sign it
> automatically and send out notifies. It doesn't even take a second, as
> only the changes need to be signed, not the whole zone.
>
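For reference, a minimal inline-signing setup along the lines of what's
described above looks roughly like this in named.conf (zone name, file
paths, and key directory here are placeholders, not our actual config):

```
zone "example.edu" {
    type master;
    file "example.edu.db";           // unsigned zone, maintained by hand/nsupdate
    key-directory "/etc/bind/keys";  // where the ZSK/KSK key files live
    auto-dnssec maintain;            // named re-signs and maintains RRSIGs itself
    inline-signing yes;              // named keeps a signed copy alongside
};
```

With this, named signs changes incrementally as they arrive rather than
requiring a full re-sign of the zone.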
It's big, and probably full of a lot of stuff that isn't needed anymore, etc.
Though there's something weird about the zones, too.
Our ksu.edu zone has more entries than the k-state.edu one, even though by
policy they should be the same. Though I just fixed up a delegated subdomain
that was only doing the .ksu.edu form (they also don't list us as secondaries
or allow us to do transfers anymore...which they're supposed to according to
policy, and to ensure external resolution...especially if all their
129.130.x.y addresses become 10.42.x.y or something).
Internally we're probably running out of open blocks of IPv4, especially for
anything that wants a /27 or bigger (such as a /21). It caused problems when
the first chunk from a reclaimed block was used. The reclaimed block used to
be our guest wireless network (which is now a growing number of blocks in
10.x.x.x space). The switch to WPA2 Enterprise versus open guest made it too
tempting to take the easy way to get online, so it was required that campus
resources block access from the guest networks. There was no notification
that the old guest network wasn't the guest network anymore...and it's been
years now.
But I often hear that it would be nice if I filled these various network
blocks with generated forward/reverse records...I'm rarely in the loop on
what and where the blocks are.
Anyway, the odd thing I was getting at with ksu.edu vs. k-state.edu: the raw
slave zone files end up fairly close in size, so you wouldn't expect a huge
difference in viewing the zones.
But named-compilezone took a few seconds to convert k-state.edu back into
text, while it took minutes to do ksu.edu...same machine, etc. I wonder
why, and wonder to what extent I should investigate...
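(For anyone following along, the raw-to-text conversion in question looks
something like the following; the filenames are made up for illustration.)

```
# raw -> text; -f gives the input format, -F the requested output format
named-compilezone -f raw -F text -o k-state.edu.txt k-state.edu k-state.edu.raw
```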
But our master server is a Sun Fire X4170 M2 (dual Xeon E5620s)...it's bored
and wasted most of the time...until a full signing needs to get done. Though
it isn't as much fun to watch as when I was using a T5120 (64 threads): the
load average would break 100 and set off all kinds of monitoring alerts...
but it chugged along fine...though the apps (and their admins) in other
containers on it weren't as happy.
Years ago, loads exceeding 100 were often fatal and messy, since they used
to be caused by problems between ZFS and our old SAN (9985)...as much as
they didn't want us to, turning off the ZIL was often the fix to make it not
happen anymore. The problem went away after we switched to the new SAN
(which isn't so new anymore...as its end is nearing).
I've thought about looking for a solution that I could throw our zone
configs at and have it just work, but I largely haven't had time to do that.
Or I was hoping to get more backing on enforcing good behavior in my zones
(stop the vanity of wanting 10.x.x.x servers at the same level as your
subdomain's public ones). Not sure how preprocessing zone files to generate
internal / external (/ guest / DR) versions translates into a free,
ready-to-go solution :)
I commented out the latter two, as the first never did what they wanted, and
I heard that the official DR plan was something that got written up back in
2000 and then shelved to be revisited when there's funding... So eventually
we got secondaries outside of our netblock (we had vanished completely a few
times when our Internet connection broke, and by the last major outage quite
a number of sites plus our email were externally hosted)...
During the recent DNS outage, I couldn't send replies to co-workers...our
Office365 tenant said I was an invalid sender :..( It also apparently
knocked me off of Jabber, stopped my deskphone forwarding to my cellphone,
and stopped my SMS notifications of voicemail...
But FreeNode continued to work...before Jabber we had a private channel that
we hung out in (while it's been a long time since we ran a node, we still
have...well, maybe not, since the co-workers who had those friends have all
left now...which is probably why ownership of the channel hasn't transferred
to me...)
>>
>> For those I do have a script that can run afterward and ssh into all
>> my caching servers to have them flush....
>
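The script in question is nothing fancy; a sketch of what it amounts to
(the hostnames here are hypothetical) is:

```
#!/bin/sh
# after a zone push, flush a just-changed name from every caching server
NAME="$1"
for host in cache1.example.edu cache2.example.edu cache3.example.edu; do
    ssh "$host" rndc flushname "$NAME"
done
```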
> You don't need to manually sync your servers. Just activate NOTIFY and
> your master will inform all slaves of any zone changes. If you also
> activate IXFR transfers, the slaves will only transfer the records
> that have changed; there's no need to transfer the whole zone.
> Combined with inline-signing, your updates will propagate to all
> servers within a second.
>
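For the archives, the relevant named.conf knobs look roughly like this
(zone names and addresses are placeholders; note that NOTIFY and IXFR are
generally on by default in BIND 9):

```
// on the master
zone "example.edu" {
    type master;
    file "example.edu.db";
    notify yes;                       // send NOTIFY to slaves on changes
    also-notify { 192.0.2.10; };      // slaves not listed in NS records
};

// on a slave: it requests IXFR by default when the master can provide it
zone "example.edu" {
    type slave;
    file "example.edu.db";
    masters { 192.0.2.1; };
};
```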
Well, we do have our caching servers acting as slaves for some zones, but
frequently it's not reliable for getting our busiest server (the server
that's listed first on our DNS configuration page, and what DHCP gives out
first) to not continue with its cached answer... I've made suggestions to
try to get them to spread things out...there are 6 servers, not just two,
and some areas now get the second server first, resulting in the
second-listed server being my second busiest. After that it's a split
between numbers 3 and 5. We used to list our datacenter DNS as 'backup',
though we had an outage of our student information system due to the
datacenter DNS getting swamped by a few computers across campus (that were
being hammered by a DDoS attack)...
Number 3 used to be the 3rd busiest, but its popularity has gone down...
since it only has a 100M connection, while the others have gigabit. All the
campus servers used to be only 100M, but people who know say it matters...
But it's in the power plant and has one leg on inverter power...the
batteries for the old phone system are there...next to a large empty room...
Though at the moment there are no incremental capabilities...so I can hit a
slave a few times before the transfer finishes and the info updates. (Just
as I can hit the master server a few times after it does 'rndc reload' after
the signing...before it reflects the change...)
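One quick way to see whether a given server has picked up a change yet is
to compare SOA serials on the master and the slave (server names here are
hypothetical); the third field of the SOA record is the serial, and they
match once the transfer has completed:

```
dig +short SOA ksu.edu @master.ksu.edu | awk '{print $3}'
dig +short SOA ksu.edu @slave1.ksu.edu | awk '{print $3}'
```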
But it was actually hard getting to the amount of automation that I have
now...though occasionally people fight the automation (some more than
others).
>>
>> Now if only I could figure out how to do that to the rest of the
>> world to satisfy those other requests.
>
> It's just a matter of lowering your ttl. Resolvers all over the world
> will cache your records according to your ttl. If you really have
> 86400 set as ttl, any given record will be queried only once per day.
>
> Just lower the default ttl to a reasonable number and your updates will
> propagate faster to the resolvers. It's just a question of how much
> bandwidth and resources you are willing/able to give to DNS. Lower it
> step-by-step until you either hit the limit of your bandwidth or the
> system resources of your servers.
>
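As a sanity check on the quoted arithmetic, a rough sketch of the
per-record query rate (this assumes every resolver re-queries a record
exactly once per TTL, and ignores prefetching and cache eviction; the
10,000-resolver figure is invented for illustration):

```shell
# queries/second for one record = resolvers / TTL
echo "10000 86400" | awk '{printf "TTL %5d: %6.2f qps\n", $2, $1/$2}'
echo "10000 300"   | awk '{printf "TTL %5d: %6.2f qps\n", $2, $1/$2}'
```

So going from a 1-day TTL to 5 minutes multiplies the steady-state load
for that record by roughly 288x, which is the bandwidth trade-off being
described.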
>>
>> Recently saw in incident....a department that has full control of
>> their subdomain made a typo on an entry with TTL 86400. They had
>> fixed the typo, but the world still wasn't seeing the correction.
>> Asked us if we could lower the TTL for it, to maybe 300.
>>
>> Hmmm... no.
>
> If they have full control of their subdomain, why don't they just
> change the ttl themselves?
>
That's basically what my co-worker said...in responding to the ticket.
But what they're asking is that we lower the TTL of the already-cached
value.
> Setting a ttl of 1 day seems a bit high, but of course it always
> depends on your zone. If the data is static, 1 day is fine, but for
> dynamic zones it is a bit high.
>
There are lots who seem to feel that 1 day is what things need to be at,
except for temporary reasons...though people often forget to have it
lowered in advance of a server upgrade or something. And in this case they
had made a typo in where the new server was...so instead of traffic
shifting from old to new as their update spread out, it all disappeared...
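The usual drill, for what it's worth, is to drop the TTL at least one old
TTL ahead of the cutover, something like the following (the name and
addresses are made up):

```
; a week before the move: lower the TTL, keep the old target
www.example.edu.    300  IN  A  192.0.2.10
; at cutover: caches now expire within 5 minutes instead of a day
www.example.edu.    300  IN  A  198.51.100.20
; after the dust settles: raise the TTL back up
www.example.edu.  86400  IN  A  198.51.100.20
```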
All my domains are static, and I just have forwarding set to the servers
that have dynamic subdomains (though I'm a slave to them...which this new
bind has me a bit stumped on as to what the correct way to go is).
> When you use inline-signing, your updates will be signed on-the-fly,
> as they come in, so you can lower the ttl to a few minutes without any
> problems. This helps a lot in keeping outdated data out of any
> resolver's cache.
>
Hopefully a solution will suddenly appear that can replace the scripts I've
mashed together over the years to do what we do now...
I had thought I'd have a solution to our current DNS problem in place by
now...
--
Who: Lawrence K. Chen, P.Eng. - W0LKC - Sr. Unix Systems Administrator
with LOPSA Professional Recognition.
For: Enterprise Server Technologies (EST) -- & SafeZone Ally
More information about the bind-users
mailing list