Saturday, June 7, 2008

Being Rich (II): Motivating LSD

We now continue our introduction to asymmetric communication complexity and its applications. In the first episode, we showed a deterministic lower bound for Lopsided Set Disjointness: Alice and Bob each have a set (of very different sizes), and they want to communicate to determine whether the sets are disjoint.

But are we using LSD just for entertainment, or is it a fundamental to our understading of the world? (Ahem.) In this post I want to convince you that LSD is a truly fundamental problem, something like the root of all evil in data structures. You will hopefully be motivated to continue watching this "Being Rich" series very carefully.

The central importance of LSD was understood in a recent paper entitled (Data) Structures that I submitted to FOCS. Quite possibly, it is the best paper I've written so far.

In this paper, I show that a lot of data-structure lower bounds can be shown directly by reduction from set disjointness (and by a lot, I mean almost all known ones, and several new ones). This leads to greatly simplified proofs of many known results, as well as several new and interesting lower bounds.

Since a picture is worth 1000 words, here it is:

For the problems in red, we get better results that what was previously known. For the rest, our reductions simplify existing proofs. The solid arrows are the new reductions, where all the magic happens. The dotted arrows are known or folklore reductions.

If you don't know what these problems are, or if you want to know about the reductions themselves, just hold until the sequel posts. This post is merely ment as an appetizer :)

Note that our tree contains problems from very different "classes" of data structures, for which previous lower bounds used very different techniques with with no suspected relationship:

  • dynamic problems, where we study the trade-offs between update and query times.

  • "easy" static problems, which can be solved in some poly(lg n) time with space O(n polylog(n)). One usually asks for the best possible query time with space O(n polylog(n)).

  • "hard" static problems, typically suffering for a course of dimensionality. Upper bounds typically involve very large space (often superpolynomial).
The way I see it, this paper is a classic example of scientific luck :) Each reduction is technically easy, but the overall result is conceptually hard: you need 2-3 reduction steps, and you need to magically come up with the intermediate problems in the reduction chain (the most unexpected being reachability oracles in butterfly graphs). I was very lucky to be looking at precisely the right problem at the right time, and this beautiful tree suddenly came together.

To finish on a funny note, let me summarize the 3 different feelings that I have had towards this paper:
  • Oh cool! Putting structures in "data structures." Now the complexity types have no excuse not to pay attention :)

  • Finally, I can teach some lower bounds! All I need is a very simple bound for LSD, and then it's all clean reductions instead of lower-bound mess. I can even assign some of the simple reductions as homework!

  • :( ... The beautiful theory and understanding that we've been building for 20 years collapses to one single problem. We know so little. We're now just like main-stream complexity, big dreams and few proved lower bounds.


Anonymous said...

I'm afraid to read this article because I may end up like Syd Barrett.

Anonymous said...

mip, I read your paper with interest. Indeed the results are beautiful and insightful. I think however that the proofs are extremly sketchy, and the results not properly discussed. In fact they are so sketchy to the point where if I were a FOCS reviewer (which I'm not) I would have hesitated to accept. It is a shame that you put the community, and the FOCS committee, in such a predicament, as the paper is indeed worthy.

Anonymous said...

I'd second anon 2. Looking at some of your previous papers, you take a lot of liberty with the concept of an extended abstract, more than once claiming results that seem non-trivial, and whose proofs haven't appeared in any form several years after the conference publication. It is one thing to omit or handwave a small detail from a conference paper and not write a journal version. It is quite another to claim a result without even an inkling of the proof. While your "strategy" helps you write lots of papers quickly, it discourages people from working in your area, and hurts your reputation in the long run.

And this paper is indeed sketchier than most. If I were a FOCS reviewer, I would reject the paper so that a better version appears at the next conference. Conferences are for dissemination of results, not for timestamping unwritten proofs of good results.

I hope your thesis committee encourages (encouraged?) you to write more complete proofs than the conference committees have been doing.

You mention mainstream complexity. Try following their writing standards.

Mihai said...

What's sketchy about this write-up? This is an honest question. In my own reading, it has all proofs and a healthy dose of intuition for each. Which part don't you like?

As a general comment, you seem to have some pedantic misconceptions :) Thesis readers don't actually read the thesis. And reputation is built on solving hard problems, not on explaining thing nicely.

Anonymous said...

And reputation is built on solving hard problems, not on explaining thing nicely.

Actually, reputation is based primarily on proper paper selection. Apparently you have chosen A4, whereas real theorists always use letter.

Anonymous said...

Which part don't you like?
Since you ask: The proof of Theorem 7, which is at the root of your approach, is unavailable. A weaker version of that is Theorem 6, which you claim is proved in [6]. That is a false statement. The proof of Theorem 6 in [6] is deferred to the full version, which doesn't seem to exist.

Mihai said...

Well, there's a very simple deterministic lower bound (see my last post). If you want to take the pedantic view, I'm only proving deterministic lower bounds, not randomized lower bounds. Fine by me.

The contribution of this paper is in the reductions. The set disjointness lower bound is boring.