Why forwarding is a Bad Thing
Brad Knowles
brad.knowles at skynet.be
Sun Mar 25 23:53:09 UTC 2001
At 12:45 PM +0100 3/25/01, Jim Reid wrote:
> So, all things being
> equal, overall DNS lookup times will have no significant difference on
> the delivery of subsequent messages to the list (assuming there were
> any significant differences for the initial lookups, which I doubt,
> but won't quibble about).
Thing is, all things are almost never equal.
> If you factor in the overhead of
> replenishing expired RRs -- with or without forwarding -- they're
> likely to be lost in the noise. I doubt if anyone could measure the
> difference in this scenario between a forwarding and non-forwarding
> server. The chances are that the names will have expired from the name
> server that's the forwarding target at the same time they expired from
> the local server. Therefore the lookup overhead will be the same apart
> from the extra delay of the local server waiting for the target server
> to answer a lookup that the forwarding server could have done for
> itself if it didn't forward.
You're ignoring second-level caching effects resulting from
multiple clients hitting the same set of central caching servers. If
L2 caches never worked under any circumstances whatsoever, then we
wouldn't have the term "L2 cache".
The real question is whether L2 caches provide a significant,
measurable advantage in this particular situation. My anecdotal
personal experience is that yes, they do make a significant,
measurable difference. However, I do not have any hard, quantifiable
numbers to back this up.
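
As a rough illustration of the second-level caching argument (not a
measurement), the toy model below counts how many queries have to
leave the site when several local caches resolve names on their own,
versus going through one shared central cache. The function name, the
client and name counts, and the absence of TTL expiry are all
assumptions made for the sketch. It counts upstream queries only and
says nothing about the extra hop through the forwarding target.

```python
import random


def count_upstream_queries(num_clients, lookups_per_client, num_names,
                           shared_cache, seed=0):
    """Count queries that must leave the site (toy model, no TTL expiry)."""
    rng = random.Random(seed)  # fixed seed: both runs see the same lookups
    local_caches = [set() for _ in range(num_clients)]
    central_cache = set()  # hypothetical central forwarding target
    upstream = 0
    for _ in range(lookups_per_client):
        for cache in local_caches:
            name = rng.randrange(num_names)
            if name in cache:
                continue  # answered from the client's own cache
            cache.add(name)
            if shared_cache:
                if name in central_cache:
                    continue  # answered by the shared central cache
                central_cache.add(name)
            upstream += 1  # had to ask the outside world
    return upstream


direct = count_upstream_queries(10, 200, 500, shared_cache=False)
forwarded = count_upstream_queries(10, 200, 500, shared_cache=True)
print("upstream queries without forwarding:", direct)
print("upstream queries with a shared cache:", forwarded)
```

In this model the shared cache can never send more queries upstream
than the independent caches do, because a name fetched for one client
is reused by the others. Whether that saving outweighs the extra hop
in a real deployment is exactly the open question.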
> IIUC, the biggest latency problem for mailing lists is not DNS
> lookups. It's the tardiness of the remote mail servers. The main
> performance factor is having smart mail software which can parallelise
> delivery: ie the same message can be sent simultaneously to several
> recipients. There is a very interesting paper on tuning sendmail for
> large mailing lists by Rob Kolstad. It barely mentions DNS and none of
> his tuning tricks relied on anything the name servers did. The URL is:
>
> http://www.usenix.org/publications/library/proceedings/lisa97/full_papers/21.kolstad
I talked to Rob extensively before writing my "Sendmail
Performance Tuning for Large Systems" paper that I presented at
SANE'98. I carefully re-read and reviewed his paper for the "Design
and Implementation of Highly Scalable E-mail Systems" paper I
co-wrote with Nick Christenson, and presented at LISA 2000. I also
carefully read and reviewed Strata Chalup's paper (among all the
others I could find on related subjects, all listed in my
bibliography):
Chalup, S. R., Hogan, C., Kulosa, G., et al.
"Drinking from the Fire(walls) Hose: Another Approach to Very
Large Mailing Lists"
USENIX, LISA XII Proceedings, December 1998
<http://www.usenix.org/events/lisa98/full_papers/chalup/chalup_html/chalup.html>
Unfortunately, the summaries I wrote of their papers for the LISA
2000 paper had to be omitted from the presentation, but you can read
them at <http://www.shub-internet.org/brad/papers/dihses/mta-review/>.
Rob is of the opinion that sendmail gets within a hair's breadth
of the theoretical maximum performance possible on a local network,
and therefore no further work ever need be done on it. It certainly
doesn't need parallelization, etc. Once you apply the
optimizations he discovered (very few of which are directly
applicable to sendmail itself), he feels that there is little that
can be done to improve its performance with respect to large mailing
lists.
I disagree strongly with Rob, and feel that adding
parallelization to the mix will significantly help improve the
performance of handling large mailing lists, which is a large part of
the reason why qmail and postfix have been so successful in this role.
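
To make the parallelization argument concrete, here is a minimal
sketch that estimates how long a list run takes over one connection
versus several at once. The latencies are invented for illustration,
and wall_clock_time is a hypothetical helper, not anything taken from
sendmail, qmail, or postfix.

```python
import heapq


def wall_clock_time(latencies, workers):
    """Estimate finish time when each delivery goes to the next free worker."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for latency in latencies:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + latency)
    return max(free_at)


# Five quick remote hosts and one tardy one (seconds, made-up numbers).
latencies = [1, 1, 1, 30, 1, 1]
serial = wall_clock_time(latencies, workers=1)
parallel = wall_clock_time(latencies, workers=3)
print("serial:", serial, "parallel:", parallel)
```

With a single worker the slow host holds up everything queued behind
it. With several workers the run is bounded mostly by the slowest
host, which is the effect smart parallel delivery is after.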
I also feel that optimizations of the sort you've suggested, and
which I've also heard from Bryan Beecher, are a good idea. One
example is having a "fast" machine with very low timeouts handle the
initial delivery attempt, with anything that doesn't make it in the
initial attempt dumped on a set of "slow" machines with more normal
timeouts.
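
The fast/slow split can be sketched as a simple classification, shown
here under stated assumptions: response times are simulated rather
than measured, and the timeout values, addresses, and function name
are hypothetical.

```python
def two_tier_delivery(response_times, fast_timeout, slow_timeout):
    """Sort recipients by which tier (if any) would reach them in time."""
    delivered_fast, delivered_slow, deferred = [], [], []
    for recipient, seconds in response_times.items():
        if seconds <= fast_timeout:
            delivered_fast.append(recipient)   # first attempt succeeds
        elif seconds <= slow_timeout:
            delivered_slow.append(recipient)   # handed off to a slow machine
        else:
            deferred.append(recipient)         # stays queued for a retry
    return delivered_fast, delivered_slow, deferred


# Simulated seconds each remote mail server takes to respond.
response_times = {
    "a@example.org": 2,
    "b@example.org": 45,
    "c@example.org": 900,
}
fast, slow, deferred = two_tier_delivery(response_times,
                                         fast_timeout=5, slow_timeout=300)
print(fast, slow, deferred)
```

The point of the design is that the fast machine never waits long on
a tardy host, so responsive recipients get the message quickly while
the slow machines absorb the long timeouts.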
However, I also believe in optimizations that can (and should) be
applied at the DNS level.
While I do not have any concrete proof that a second-level
caching/forwarding design significantly improves overall performance,
my personal experience is that this is the case.
> I'd be delighted if you or anyone else could point me at another
> serious analysis of mailing list throughput and how a forwarding name
> server "improved" performance.
As you know, I am interviewing with certain companies where I
might be able to test theories like this, if I end up getting hired
by them. This could potentially lead to another paper to be
presented at an upcoming conference.
If I were to be hired by a suitable company, and did have the
opportunity to conduct tests of this sort, would you be willing to
join me as co-author of the paper, to explore what does and does not
work, and to try to come up with suitable explanations as to why we
believe these things to be true?
--
Brad Knowles, <brad.knowles at skynet.be>
/* efdtt.c Author: Charles M. Hannum <root at ihack.net> */
/* Represented as 1045 digit prime number by Phil Carmody */
/* Prime as DNS cname chain by Roy Arends and Walter Belgers */
/* */
/* Usage is: cat title-key scrambled.vob | efdtt >clear.vob */
/* where title-key = "153 2 8 105 225" or other similar 5-byte key */
dig decss.friet.org|perl -ne'if(/^x/){s/[x.]//g;print pack(H124,$_)}'