Binary search trees are something we all know (and love??) from Algorithms 101. To search for elements of a set S, we can build a binary tree and write the elements of S in the nodes, such that the in-order traversal of the tree gives the elements of S in sorted order. Classically people say "the tree is in symmetric order", but I do not understand where the terminology comes from. Since searching is by comparisons, we may just as well assume S = {1, ..., n}.
Now, let's say I'm going to search for x1, x2, ..., xm in this order. How well can a binary tree perform? Well, it depends what you mean:
- There are sequences where you need Omega(m lg n) comparisons (every comparison reveals at most one bit of entropy), and that bound is achieved by any balanced tree. But this worst-case analysis does not answer the question of how well you can do on some given access sequence x1, ..., xm.
- If you intend to build a static tree and use it for the entire access sequence, all that matters about the access sequence are the frequencies f1, .., fn -- how many times you access every element i. Then, the famous dynamic program of Knuth finds the optimum tree [Optimum Binary Search Trees Acta Inf. 1: 14-25 (1971)]. Asymptotically, the cost is described by the entropy: sum fi lg(m/fi).
- But who says your tree needs to be fixed? We all know trees can be dynamic, e.g. being changed by rotations. The rest of the post is about computing the minimum cost needed by a dynamic binary tree to solve a given access sequence.
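For the static case in the second bullet, here is a minimal sketch of the two quantities mentioned there. It is not Knuth's algorithm as published: it is the plain O(n^3) interval dynamic program, without his speedup, and it assumes every access is a successful search with cost equal to the depth of the node (the root has depth 1). The function names are made up for illustration.

```python
import math


def entropy_bound(freqs):
    """Return sum_i f_i * lg(m / f_i), where m is the total number of accesses."""
    m = sum(freqs)
    # Elements that are never accessed contribute nothing to the sum.
    return sum(f * math.log2(m / f) for f in freqs if f > 0)


def optimal_static_cost(freqs):
    """Minimum of sum_i f_i * depth(i) over all static BSTs on keys 1..n.

    freqs[i] is the access count of key i + 1. Plain O(n^3) interval DP.
    """
    n = len(freqs)
    prefix = [0]
    for f in freqs:
        prefix.append(prefix[-1] + f)
    # cost[i][j] = optimal cost for the keys with indices i..j-1 (half-open).
    # Empty ranges cost 0.
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            # Every key in the range pays 1 for the root of this subtree.
            weight = prefix[j] - prefix[i]
            cost[i][j] = weight + min(
                cost[i][r] + cost[r + 1][j] for r in range(i, j)
            )
    return cost[0][n]
```

With frequencies [10, 1, 1], the dynamic program puts the hot key at the root. The entropy expression tracks the optimum only up to constant factors, so the two numbers should not be expected to match exactly.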
Model. We can come up with lots of models for how to charge binary trees for their work, but they are all equivalent up to constant factors. So let's stick to a simple one: charge cost 1 for every node that you "touch" during an access (which includes touching by a comparison or a rotation). Note that if you consider any fancy transformation that reconfigures a subtree of k nodes in the tree, you can implement it with about 2k rotations [Sleator, Tarjan, Thurston: Rotation Distance, Triangulations, and Hyperbolic Geometry STOC 1986].
Geometric picture. Trees are ugly and I find it totally impossible to think about them. So let's represent the access sequence as a 2D diagram, with a (red) point at coordinate (xt, t) for every access, like the following. Let A denote the set of (red) points in the diagram.
We are now going to summarize the behavior of the binary search tree in this diagram. If at time t (while performing access xt), the algorithm touches a node with value z, we are going to put a blue dot at coordinate (z, t). Let B denote the union of red and blue dots. For instance:
Lemma: Pick any two points from B not on the same row or column. The rectangle having the two points as opposite corners contains at least one other point from B. (Above you see some examples of rectangles.)
Proof: Suppose some rectangle defined by (z1, t1) and (z2, t2) contains only these two points. Essentially, you can't access z1 at time t1 and z2 at time t2 without accessing their lowest common ancestor at some point (either to go to the other node, or to move the other node through some rotations), and that would put a third point in the rectangle. See paper for details. --QED
That's nice. But here's the real surprise:
Lemma: Call a set B orthogonally satisfied if for any two points in B not on the same row or column, the rectangle defined by them contains another point of B. For any B that is a superset of A (the access sequence), with B orthogonally satisfied, B describes a valid behavior of a dynamic binary search tree.
Proof: Nontrivial, but doable. The issue is that the blue points on a row tell you what nodes to touch at that time. But you do not know what to do with those nodes -- whether and how to rotate them to be consistent with future blue points. We have an algorithm which looks at future blue points and rotates the current blue points in the best possible way. --QED
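To make the definition concrete, here is a brute-force sketch that checks whether a point set is orthogonally satisfied. To get a set B to test it on, it records the nodes touched by plain searches in a fixed balanced tree on keys 1..n. That static tree, and both function names, are assumptions chosen for illustration; a real dynamic tree would also rotate.

```python
from itertools import combinations


def static_tree_touches(n, accesses):
    """Points (z, t) touched by searches in a fixed balanced BST on keys 1..n.

    The tree is implicit: the root of the key range lo..hi is its midpoint.
    Times t are 1-based, matching x1, ..., xm.
    """
    points = set()
    for t, x in enumerate(accesses, start=1):
        lo, hi = 1, n
        while lo <= hi:
            z = (lo + hi) // 2
            points.add((z, t))  # the search compares x against node z
            if x == z:
                break
            if x < z:
                hi = z - 1
            else:
                lo = z + 1
    return points


def is_orthogonally_satisfied(points):
    """Check every pair not sharing a row or column, by brute force."""
    pts = set(points)
    for p, q in combinations(pts, 2):
        (z1, t1), (z2, t2) = p, q
        if z1 == z2 or t1 == t2:
            continue  # the definition only concerns proper rectangles
        z_lo, z_hi = min(z1, z2), max(z1, z2)
        t_lo, t_hi = min(t1, t2), max(t1, t2)
        has_third_point = any(
            r != p and r != q
            and z_lo <= r[0] <= z_hi
            and t_lo <= r[1] <= t_hi
            for r in pts
        )
        if not has_third_point:
            return False
    return True
```

The bare access sequence 1, 2 (points (1, 1) and (2, 2)) fails the check, while the set produced by `static_tree_touches` passes it, as the first lemma says it must. The check is roughly cubic in the number of points, which is fine for playing with small diagrams.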
A cute research question. We have thus shown an equivalence. To compute the minimum cost of a dynamic binary tree for a given access sequence, we have to solve the following cute, simple, geometric problem:
*** Given a set A of points in 2D, with one point per row, find a minimum superset B which is orthogonally satisfied.
So what do we know about this problem? Not much:
- there is an O(lglg n) approximation algorithm (in a later post).
- starting with arbitrary A, the problem is NP-hard. Alas, if A has more than one (red) point per row, it doesn't describe a real access sequence. Thus, this generalized problem is not very meaningful for binary trees.
- there is a very natural greedy algorithm that we can't analyze.
Plenty of open problems for your enjoyment...
The greedy algorithm. Go down the rows one by one, and at every step add points to the current row to satisfy all rectangles formed with one point from this row and one point above. A picture is worth a thousand words... The green points are the ones added in this step, and the hashed region is irrelevant (not "visible" by empty rectangles).
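The greedy step can be sketched in a few lines. This is one possible way to implement it, under an assumption worth stating: for each column it keeps only the most recent row containing a point, and walking away from the accessed key it adds a point in every column whose most recent point is strictly newer than everything seen so far. Those are the columns whose latest point would otherwise form an empty rectangle with the new access. The function name is made up for illustration.

```python
def greedy_superset(accesses):
    """Return a set of points (z, t) containing every access (xt, t).

    Rows are processed top to bottom; t is 1-based.
    """
    last = {}  # column z -> most recent row holding a point in that column
    points = set()
    for t, x in enumerate(accesses, start=1):
        row = [x]
        left = sorted((z for z in last if z < x), reverse=True)
        right = sorted(z for z in last if z > x)
        for side in (left, right):
            # Newest row seen so far between x and the current column,
            # starting with column x itself (0 means "no point yet").
            newest = last.get(x, 0)
            for z in side:
                if last[z] > newest:
                    # (z, last[z]) and (x, t) span an empty rectangle,
                    # so add a point at (z, t).
                    row.append(z)
                    newest = last[z]
        for z in row:
            last[z] = t
            points.add((z, t))
    return points
```

For the access sequence 2, 1, 3 this adds two points beyond the three accesses, at (2, 2) and (2, 3). The size of the returned set is the cost charged to the algorithm in the model above. Sorting the columns on every row is wasteful, but it keeps the sketch short.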
We like to call this algorithm IAN, as it originates in a paper of Ian Munro, where it is described in the language of trees [On the Competitiveness of Linear Search. ESA 2000]. Interpreting that algorithm in the language of 2D diagrams, we have this natural greedy approach.
There are simple examples where the algorithm is O(m) in cost away from optimum, but no worse examples are known. This means it may very well be an O(1)-approximation, but we do not know how to show any nontrivial bound.
Meet the team. These results are due to the MIT Binary-Trees Team (TM), famous around CSAIL for perfecting the art of drawing diagrams on the boards in G6, to the point of installing a projector to display a grid on the board. The team includes Erik Demaine (now with his own dot-org domain), Dion Harmon (former student of Erik's), John Iacono (a most exceptional computer scientist, both in skill and personal style, and a good friend), and more recently Daniel Kane, rising mathematical star. Right now, the best pointer to the work is Dion's thesis. A publication is forthcoming -- except that two of us just got tenure, one just graduated and went to industry, and another is busy writing blogs.