## Tuesday, August 7, 2007

### Binary Search Trees (II): Lower Bounds

In the first post on Binary Search Trees (BSTs), I talked about a geometric view of the following optimization problem: given an access sequence (x1, .., xm), what is the best running time achievable by a dynamic BST on this sequence?

The typical approach to get an approximation algorithm is to identify a lower bound on the cost, and match it as close as possible. Traditionally, we have known two lower bounds for the BST problem, coming from [Robert E. Wilber: Lower Bounds for Accessing Binary Search Trees with Rotations. SICOMP 18(1): 56-67, 1989; FOCS'86].

In the rank-time geometric view, these two bounds and any other bounds we have been able to think of fall in a framework that we call cut bounds. Definitions:

Theorem: Cost of best BST = Omega(sum Li)
Proof: There is a very meaningful proof, to be discussed in a later post. --QED

Thus, the lower bound proven by some cut system is sum Li. Wilber I is obtained by a cut system which is independent of the access sequence (discussion in a future post). Wilber II is obtained by considering a cut starting at every point, and going up to -∞. If you think about it, for each point (=cut), it counts the number of left-right transitions in a sequence of points converging towards it, when looking up (back in time): This is annoyingly close (but not quite) to the greedy IAN. The cost of the greedy is essentially equal to the total number of points in a converging sequence, while the lower bound is equal to the number of alternations. Despite the similarity, we do not know any bound on the ratio of this upper and lower bound, and we do not know any separation.

The optimal cut bound. Since all lower bounds we know are cut bounds, the obvious question is to find the best bound achievable in this framework. For a given access sequence, define the following undirected graph:
Theorem: The highest possible cut bound = the maximum independent set in the graph of rectangles.

Proof of
"": For any cut bound, we find an independent set of the same size: take every transition across a cut, and draw the unsatisfied rectangle defined by the points on the left and right. (5 minutes with pencil and paper will convince you the rectangle are independent)

Proof of
"": For any independent set, we need to construct a cut giving the same bound. For every rectangle, consider a cut with the same vertical range, that is close to the left edge for negative-type rectangle, and close to the right edge for positive-type. Order cuts by increasing Y, breaking ties by increasing x. Each cut will have a lower bound of exactly 1, paying for the rectangle (5 more minutes with pencil and paper). --QED

Note a very pleasing thing about the maximum independent set: it is flip- and rotation-invariant. Rectangle satisfiability is by definition flip- and rotation-invariant, so the optimal BST must have this property. Thus, whatever the right lower bound is, it must have this property also.

Now, it does not seem too useful to know that the best lower bound is a maximum independent set of some geometric graph, since that is both too hard to compute and too difficult to reason about. We have a neat greedy that approximates the maximum independent set within constant factors (it is similar to the IAN greedy, but taylored to this problem). Thus, the best lower bound is rather explicit. Unfortunately, we cannot construct a mathcing upper bound.

Open problems:
• can we come up with any non-cut lower bounds? (that are better?)
• is Wilber II asymptotically equal to the best bound (the maximum independent set)? Wilber conjectured so.
• does IAN come anywhere close to the best lower bound? is there a separation?