BIND - sorting of reverse domain.

Tue Jul 9 01:20:54 UTC 2002

> On Mon, 8 Jul 2002 Mark.Andrews at isc.org wrote:
> >> On Mon, 8 Jul 2002 Mark.Andrews at isc.org wrote:
> >> >> >...
> >> >> >	At that point you also have the node that immediately preceeds 
> the
> >> >> >	node you were looking for if you cared to save it as you traver
> sed
> >> >> >	the tree.  Try it with any binary tree.  You either hit the nod
> e
> >> >> >	you are looking for, you traverse the node immediately preceedi
> ng
> >> >> >	it or there is no preceeding node.  The same can be said for th
> e
> >> >> >	node after the one you were looking up.
> >> >>
> >> >> In a binary tree, one may have the node that precedes it, but there is 
> no
> >> >> guarentee that the CONTENTS of that node will also happen to be the imm
> edi
> >> ate
> >> >> ly
> >> >> adjacent sequential element, as would be true for a sorted list.
> >> >
> >> >	I never said that.  I said you would pass it and that you could save
> >> >	it.
> >>
> >> But by saying "immediately preceding", you indicated that it would be adja
> cen
> >> t.
> >> If it were merely passed, it would not be IMMEDIATELY preceding; just simp
> ly
> >> preceding.
> >
> >	The node that immediately preceeds the key in sort order.
> >	I also said if you cared to save it as you traverse the
> >	tree.  There is no point in saving it if it would always
> >	be the current node.
> >
> >	I did start out saying that DNSSEC needs to know the node
> >	before the missing name from which you assumed that the
> >	zone was stored as a array.
> 
> Which is most efficient with an ordered list, not a tree.  For any element, g
> o
> to it directly, then move one (doubly linked list or array).  That operation 
> is
> not as trivial with a tree.
> 
> I properly selected a list when you stated that the search stops when one has
> encountered a value greater (assuming an ascending order search) than the
> target.  That termination condition does not exist in a tree arrangement.
> 

	You really think a list is the way to go with a multi-million node
	zone?

> >> >>  The tree
> >> >> contents could represent an element for an item which sequentially FOLL
> OWS
> >>  th
> >> >> e
> >> >> current target just as well as it could represent one which PRECEDES, a
> nd
> >> >> neither need be adjacent.  The sequentially adjacent element could be i
> n A
> >> NY
> >> >> of
> >> >> the nodes in the tree in the path from the target back toward the tree 
> roo
> >> t -
> >> >> it need not be in the immediately preceding level.
> >> >>
> >> >> Do you even understand what a balanced binary search tree is?  Your sta
> tem
> >> ent
> >> >> implies that you do not; I clearly infer such.  Perhaps you need to rev
> iew
> >>  th
> >> >> e
> >> >> characteristics of such.
> >> >>
> >> >>                      A
> >> >>                   B     C
> >> >>                 D   E F   G
> >> >>
> >> >> In this simple tree, the equivalent sorted list (left to right) is "DBE
> AFC
> >> G".
> >> >> Here, E and F have an adjacent node which (although in the path from th
> e t
> >> ree
> >> >> root) is NOT adjacent in the tree:  E-A, and A-F.  A is NOT an "immedia
> tel
> >> y
> >> >> preceding node" to any of D, E, F, or G.
> >> >>
> >> >> One has to remember the ENTIRE path from the root to the leaf, not just
>  th
> >> e
> >> >> immediately preceding node, for your algorithm to work.
> >> >
> >> >	No.  One just has to remember the appropriate node.  This can be
> >> >	determined *as* you perform the traversal.
> >>
> >> That is not what your original statement said.  Remembering the path remem
> ber
> >> s
> >> every node in that path - even if only one of those will be needed later, 
> we
> >> don't know which one until we hit the leaf (and run out of tree).
> >>
> >> >		H
> >> >	   D          L
> >> >	B     F    J     N
> >> >
> >> >	Now look for A, C, E, G, I, K, M and O.  In all cases your
> >> >	search will pass the node immediately before and after the
> >> >	name being looked up or you can determine there isn't one.
> >> >	The same applies for unbalanced trees.  You also don't need
> >> >	to remember the whole path.  Just the appropriate nodes.
> >>
> >> Actually, NO.  In the case where a node is actually found, the adjacent no
> des
> >> need not be encountered if the target is a non-leaf.  I agree that where t
> he
> >> target is not represented in the tree, the neighbors will be found in the 
> pat
> >> h
> >> (in either order, with possibly others which we will not know are not the
> >> nearest neighbors until we hit the end).  However, obviously, at most ONE 
> wil
> >> l
> >> be adjacent; the other could be anywhere between the leaf and the root.  T
> he
> >> adjacent one could be "higher" or "lower" in the sort sequence.  That's no
> t t
> >> he
> >> guarentee implied by your original statement a few posts/e-mails back that
>  sa
> >> id
> >> that the "next" (which implies "higher" and NOT "lower") would be adjacent
> .
> >>
> >> >	For A and C you will be at B and you will have traversed D.
> >> >	for E and G you will be at F and you will have traversed D and H.
> >> >	for I and K you will be at J and you will have traversed H and L.
> >> >	for M and O you will be at N and you will have traversed L.
> >> >
> >> >	To find the preceeding node everytime you would take the right
> >> >	branch you save the current node.
> >> >	To find the next node everytime you would take the left branch you
> >> >	save the current node.
> >> >
> >> >	With one traversal you either have the node you are looking for or
> >> >	you know the node before and after the key if they exist.
> >>
> >> True, but not conclusively.
> >
> >	You do know it them conclusively.  What you can't find out in one
> >	traversal is the node and the node just before it and the node just
> >	after it which is why my statement has a "or" in it.
> 
> No, you don't - because the traversal sublist may have OTHER nodes in it as w
> ell
> (if longer than a path count of 2).  You still have to perform some sort of
> operation to eliminate them.  This can be done "on the fly", but it still
> represents an extra step.  Only after that is the result conclusive.

	It was conclusive given the description.  Just because you have
	preconcieved ideas that don't match the description it doesn't
	make the conclusion wrong.
 
> What is conclusive is that the traversal sublist will CONTAIN the neighbors.
> It may go beyond that and contain MORE than just the neighbors (if not pruned
>  on
> the fly).  It is also a property that one of the neighbors will be the leaf (
> in
> the case of the target not being found) or the node adjacent to the leaf (if 
> the
> leaf represents the target), but NOT which neighbor it is ("higher" or "lower
> ")
> - in some searches, it will be the higher one; in others, the lower one.
> Without a further operation, all one knows is that this node is ONE of the
> neighbors, and the traversal sublist somewhere contains the other one.
> 
> In the case where a non-leaf node holds the target, the neighbors will both b
> e
> leaf nodes in that target's subtree.
> 
> >>  The traversal will give a subset of nodes - the
> >> FIRST step.  An additional operation is still required on that subset to e
> xtr
> >> act
> >> the two nodes representing the values "bookending" a value not found (unle
> ss
> >> of
> >> course, the traversal path is 2 or less).  Only when this subsequent opera
> tio
> >> n
> >> is performed will these neighbors be known conclusively.
> >
> >	What additional processing?  You just remember the bookends as
> >	you traverse the tree.
> 
> That's your extra step:  Replacing (when appropriate) each bookend in order t
> o
> generate the best neighbors.  This is a characteristic of the algorithm that 
> one
> selects to do the search - not a characteristic of the tree data structure
> itself.  Your "memory" step is NOT inherent to the simplist binary search.
> 
> >node_t *
> >find(ket_t key, node_t *node, node_t *prevp, node_t *nextp) {
> >	*prevp = *next = NULL;
> > again:
> >	if (node == NULL)
> >		return (NULL);
> >	if (key == node->value) {
> >		*prevp = *next = NULL;  /* invalidate */
>                 ^^^^^^^^^^^^^^^^^^^^^^ - might not be necessary!
> >		return (node);
> >	}
> >	if (key < node->value) {
> >		*nextp = node;
> >		node = node->left;
> >	} else if (key > node->value) {
>                ^^^^^^^^^^^^^^^^^^^^^^    - Unnecessary conditional.
> >		*prevp = node;
> >		node = node->right;
> >	}
> >	goto again;
> >}
> 
> Maintaining the "nextp" and "prevp" is a separate step than traversing the tr
> ee
> itself.  However, why flush their values?

	Because you don't read descriptions and I wanted to make it
	perfectly clear that the values are invalid in this case.

>  Undoubtedly, the code which calls
> this routine is going to check for the target first and never use the neighbo
> r
> pointers where the target was found (to use them would be meaningless anyway
> because the algorithm doesn't go far enough [i.e. BEYOND the target] to find 
> the
> correct values).
> 
> A "goto" instead of a while loop?  If this were actual code in use, then this
> program deserves to be scrapped and [parts] rewritten from scratch (and the
> rest, simply rewritten).

	A while loop indents the loop body.  People are to scared
	of gotos.  Yes, teaching how not use them is good because
	the can be abused.  Not using them when they are appropriate
	is also wrong.

> 
> >	node = find(key, top, &prev, &next);
> >
> >	If the tree is empty then node, prev and next will be NULL,
> >	otherwise you have the node or the bookends.
> 
> With that, I agree.
> 
> 
> I would write it as follows:
> 
> /* CAVEAT:  nextp and prevp valid only if the target is not found i.e. NULL *
> /
> node_t *find(key_t key, node_t *node, node_t **prevp, node_t **nextp)
> {
>     *prevp = *nextp = NULL;
>     while(node != NULL) {
> 	if (key == node->value) break;
> 	if (key < node->value {
> 	    *nextp = node;
> 	    node = node->left;
> 	} else {
> 	    *prevp = node;
> 	    node = node->right;
> 	}
>     }
>     return(node);
> }
> 
> 
> 
--
Mark Andrews, Internet Software Consortium
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark.Andrews at isc.org