Friday, May 23, 2008

Being Rich (I)

[Updated May 26: writing improved based on a "referee report" by an anonymous commenter. Let me also point out that use of paper and pencil while reading this text is encouraged.]

Let's switch gears from politics to science... I am going to run a short series of posts talking about asymmetric communication complexity and how it relates to data structures. In this post, I start by defining the problems and the model, and give a simple (but very interesting) example of a lower bound.

No background is assumed. I believe a high school student should be able to follow this as a gentle introduction to lower bounds. (Yes, yes, I mean a high school student educated in Eastern Europe, who knows the European "basic Math." For example, we use (1 + 1/n)^n --> e at some point.)

If there are steps that you want clarified, you are welcome to post a question.

Set disjointness. Today, we start with a simple, abstract problem: set disjointness. Let's say Alice has a set S, Bob has a set T, and they are trying to communicate somehow, to determine whether S and T intersect. To quantify the problem, say |S|=n, |T|=m, and both sets come from a universe [u]={1,...,u}.

For now, we will only be interested in deterministic communication protocols (as opposed to protocols using randomization). In "symmetric" communication complexity, we are primarily interested in the dense case, n=m=Θ(u), and we ask how many bits of communication Alice and Bob must exchange to solve the problem correctly.

The trivial upper bound is u+1: Alice sends a bit vector of size u, specifying her set, and Bob replies with the answer. Alternatively, Bob can send his bit vector, and Alice computes the answer. It can be shown easily (see below) that any protocol must use Ω(u) bits of communication in the worst case, so the trivial protocol is asymptotically optimal.
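To make the trivial protocol concrete, here is a minimal sketch in Python (the function name and encoding are my own, not from the text): Alice sends her characteristic bit vector over [u], and Bob replies with one bit.

```python
def trivial_protocol(S, T, u):
    """Trivial u+1-bit protocol for set disjointness.
    Returns (total bits exchanged, whether S and T are disjoint)."""
    # Alice sends a bit vector of length u describing her set S.
    alice_msg = [1 if x in S else 0 for x in range(1, u + 1)]
    # Bob checks whether any of his elements is marked, and replies with 1 bit.
    disjoint = all(alice_msg[x - 1] == 0 for x in T)
    return len(alice_msg) + 1, disjoint

bits, ans = trivial_protocol({1, 3}, {2, 4, 5}, u=8)
# bits == 9 (that is, u+1), ans == True
```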

In the asymmetric case, assume n<<m, and m=Θ(u). Because the input sizes are rather different, it is more interesting to measure the number of bits sent by Alice and Bob separately, instead of measuring the total communication. Denote by a the number of bits sent by Alice throughout the protocol, and b the number of bits sent by Bob.

Clearly, there is a trade-off between a and b. The trivial protocols from above give two points on the trade-off curve. Alice can describe her set using a = lg(u choose n) = O(n lg(u/n)) bits, and Bob replies with b=1 bit. Alternatively, Bob can describe his set using b=u bits, and Alice replies with a=1 bit.

Pause for a second, and try to come up with the best trade-off curve interpolating between these two extreme points.

This space intentionally left blank.

Here is what the trade-off looks like:
Values of a = o(n) are ineffective, i.e. Bob cannot save asymptotically over sending u bits. Similarly for values of b = o(n). The interesting portion of the trade-off is the curved one, which is described by: b = u / 2^{O(a/n)}.

Here is how we can achieve this trade-off. First, let k ≥ n be a parameter. We break the universe [u] into k blocks of roughly equal size, u/k. Now:

  • Alice begins by specifying the blocks containing her elements. This takes a=lg(k choose n) = O(n lg(k/n)) bits (if there are fewer than n distinct blocks, pad arbitrarily).
  • for every block containing an element of Alice, Bob sends a bit vector of size u/k, specifying which elements are in T. This takes b = nu/k bits.
  • Alice replies with one more bit giving the answer.
We have a = O(n lg(k/n)) and b = nu/k. Now eliminate the parameter: k = n·2^{O(a/n)} and thus b = u / 2^{O(a/n)}.
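The protocol above can be simulated in a few lines. This is my own sketch (names and toy parameters mine), with the bit accounting taken from the text: Alice names her blocks using lg(k choose n) bits (plus the final answer bit), and Bob sends one bit vector of size u/k per named block.

```python
from math import ceil, comb, log2

def block_protocol(S, T, u, k):
    """Sketch of the blocked protocol. Returns (alice_bits, bob_bits, disjoint?)."""
    n = len(S)
    block_size = ceil(u / k)
    block_of = lambda x: (x - 1) // block_size      # which of the k blocks x lands in
    named = {block_of(x) for x in S}                # Alice names the blocks holding her elements
    alice_bits = ceil(log2(comb(k, n))) + 1         # lg(k choose n), plus the final answer bit
    bob_bits = len(named) * block_size              # one bit vector of size u/k per named block
    # All of Alice's elements fall in named blocks, so Bob's bit vectors let
    # her decide disjointness exactly:
    disjoint = S.isdisjoint(T)
    return alice_bits, bob_bits, disjoint

print(block_protocol({1, 9}, {5, 6, 7, 8, 13, 14, 15, 16}, u=16, k=4))
```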

Lower bounds. We are now going to prove that this trade-off is the best possible for deterministic protocols. This proof originates from [Miltersen, Nisan, Safra, Wigderson STOC'95], the paper that defined asymmetric communication complexity.
Without disparaging this great paper, I must say that the presentation is awful. The authors fail to realize the vast conceptual contributions that their asymmetric communication model can make. Think of the implications to understanding social interactions between humans (potentially, we may finally understand communication inside a marriage!). Instead, the authors chose to concentrate on boring technical aspects, like showing that their concept can prove lower bounds for computational problems.
To prove lower bounds for set disjointness, we are going to look carefully at the truth table of the function. This is a gigantic object, a 2D array with (u choose n) rows (corresponding to Alice's possible inputs), and (u choose m) columns (corresponding to Bob's inputs). Every entry is 1 if the set described by the column is disjoint from the set described by the row, and 0 otherwise.

The starting point of most communication lower bounds is the concept of (combinatorial) rectangles. A rectangle is a matrix minor of the truth table, i.e. a set AxB, where A is a subset of the rows and B is a subset of the columns. [Warning: this does not look like a rectangle in the geometric sense, since the rows in A and columns in B may not be consecutive.]

Suppose you are an outside observer, who doesn't know the inputs of Alice and Bob, but watches the communication taking place and tries to guess the inputs. After seeing some transcript of the communication, what have you learned about the input? It turns out, the possible problem instances that lead to a fixed communication transcript always form a combinatorial rectangle!

This can be seen by induction on the communication protocol. Before any communication, all inputs look plausible to the outside observer, i.e. the plausible inputs are the entire truth table matrix. Say that in the next step, Alice sends a bit. This breaks the plausible inputs of Alice into 2 disjoint classes: the inputs for which she would send a 1, and the inputs for which she would send a 0. Observing the bit she sent, your belief about the input changes to a subset for Alice, and does not change in any way for Bob. Thus, your belief changes to a subrectangle that drops some of the rows of the old rectangle. By analogous reasoning, when Bob sends a bit, your belief changes by dropping some columns.
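The induction can be checked mechanically. The toy protocol below is my own example (not from the text); the point is only that grouping input pairs by transcript always yields combinatorial rectangles.

```python
from itertools import product

def transcript(x, y):
    """A toy deterministic protocol: Alice sends x mod 2,
    then Bob sends (y + Alice's bit) mod 2."""
    a = x % 2
    b = (y + a) % 2
    return (a, b)

X, Y = range(4), range(4)
classes = {}
for x, y in product(X, Y):
    classes.setdefault(transcript(x, y), []).append((x, y))

# Each transcript class is a combinatorial rectangle: it equals the full
# cross product of its row set and its column set.
for pairs in classes.values():
    rows = {x for x, _ in pairs}
    cols = {y for _, y in pairs}
    assert set(pairs) == set(product(rows, cols))
```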

Communication lower bounds follow by proving two facts:
  1. If the communication protocol is short, the rectangle reached at the end is "large". (Intuitively, there weren't many steps of the induction to shrink it.) But in the final rectangle, all entries must be identical, either zero or one, because the protocol finishes by announcing the answer to the problem. In other words, if the protocol finished by announcing an answer, and there's still a plausible input for which the answer is different, the protocol is sometimes wrong.

  2. Combinatorially, one shows that any large rectangle is bichromatic (contains both zeros and ones).
We are now going to show lower bounds for set disjointness, implementing these 2 steps.

1: Richness and large rectangles. How do we prove a useful result saying that the final rectangle must be "large"? For this, [MNSW] introduce the notion of "richness" (I don't know the story behind the name). We say (the truth table of) a function is [U,V]-rich if it contains at least V columns, each of which has at least U one-entries.

Lemma: If Alice and Bob compute a [U,V]-rich function, with Alice sending a total of a bits and Bob a total of b bits, the truth table contains a rectangle of size (U/2^a) x (V/2^{a+b}) containing only one-entries.

Proof: By induction on the length of the protocol. Let's say that we are currently in a rectangle AxB that is [U,V]-rich. We have two cases:
  • Bob communicates the next bit. Let's say B0 is the set of columns for which he sends zero, and B1 is the set for which he sends one. Since AxB contains V columns with at least U ones, either AxB0 or AxB1 contains at least V/2 columns with at least U ones. We continue the induction in the [U,V/2]-rich rectangle.

  • Alice communicates the next bit, breaking A into A0 and A1. Each of the V columns that made AxB rich has at least U/2 ones in either A0xB or A1xB. Thus, in either A0xB or A1xB we find at least V/2 columns that have at least U/2 ones. We can continue the induction in a rectangle that is [U/2, V/2]-rich.
At the end of the protocol, we reach a monochromatic rectangle that is [U/2^a, V/2^{a+b}]-rich. Since the rectangle has nonzero richness, it contains some ones, and therefore it contains only ones. Furthermore, it must have size at least U/2^a by V/2^{a+b} to accommodate the richness.

2: Rectangles in set disjointness. Remember that we are interested in the case of a dense T; let's assume m=u/2 for concreteness. A column of the truth table is a set T of m=u/2 elements from the universe [u]. For each T, there are (u/2 choose n) sets S which are disjoint from T. Thus, the problem is [(u/2 choose n), (u choose u/2)]-rich. By the richness lemma, we obtain a one-rectangle of size (u/2 choose n) / 2^a by (u choose u/2) / 2^{a+b}.
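The richness claim can be checked by brute force on a toy instance (parameters mine): every column T should have exactly (u - m choose n) = (u/2 choose n) one-entries.

```python
from itertools import combinations
from math import comb

u, n, m = 6, 2, 3           # tiny instance, with m = u/2
U = range(1, u + 1)

# For every column T (an m-subset of [u]), count the rows S (n-subsets)
# that are disjoint from it.
for T in combinations(U, m):
    ones = sum(1 for S in combinations(U, n) if set(S).isdisjoint(T))
    assert ones == comb(u - m, n)    # (u/2 choose n) ones in every column

# So the truth table has (u choose u/2) columns, each with (u/2 choose n)
# one-entries: the function is [(u/2 choose n), (u choose u/2)]-rich.
```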

Assume we have a rectangle {S1, S2, ...} x {T1, T2, ...} that is all ones. Then, every Si is disjoint from every Tj. Defining S' = S1 ∪ S2 ∪ ... and T' = T1 ∪ T2 ∪ ..., it must be that S' is disjoint from T', therefore
(1) |S'| + |T'| ≤ u.
But also, the size of the rectangle is at most (|S'| choose n) by (|T'| choose u/2), because every row is an n-subset of S', and every column an m-subset of T'. Therefore:
(2) (|S'| choose n) ≥ (u/2 choose n) / 2^a
(3) (|T'| choose u/2) ≥ (u choose u/2) / 2^{a+b}
We now pull the following binomial inequality out of our hat:
(A choose C) / (B choose C) ≥ (A/B)^C, valid whenever A ≥ B ≥ C.
Then (2) becomes (u/(2|S'|))^n ≤ 2^a, and thus |S'| ≥ u / 2^{O(a/n)}. From (1) we have |T'| ≤ u - |S'| ≤ u (1 - 2^{-O(a/n)}). Applying the binomial inequality to (3), we have:
(u / |T'|)^{u/2} ≤ 2^{a+b} => (1 + 2^{-O(a/n)})^{Θ(u)} ≤ 2^{a+b} => e^{u/2^{O(a/n)}} ≤ 2^{a+b} => b ≥ u / 2^{O(a/n)}, QED.
The first implication used 1 / (1 - ε) > 1 + ε. The second implication used (1 + 1/A)^B = ((1 + 1/A)^A)^{B/A} = e^{Θ(B/A)}.
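The binomial inequality pulled out of the hat is easy to verify numerically (the check below is mine, not from the text); it holds for A ≥ B ≥ C because each factor (A-i)/(B-i) in the product form of the ratio is at least A/B.

```python
from math import comb

# Exhaustive check of (A choose C) / (B choose C) >= (A/B)^C over a small range,
# for all A >= B >= C >= 1.
for A in range(1, 20):
    for B in range(1, A + 1):
        for C in range(1, B + 1):
            assert comb(A, C) / comb(B, C) >= (A / B) ** C
```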

The Great Conceptual Debate

If you're a theorist and you haven't been hiding at the South Pole in the past month, you probably know about the great conceptual debate, initiated by this letter, discussed at STOC (great coverage by James), and recently picked up by Michael and Luca. In general, I found the opinions of James and Luca to be close to mine, and I much enjoyed the way they expressed their thoughts. Let me now expand on a few bits of my opinion that I did not hear somewhere else already.

Philosophy. First, we begin with a joke:

While the university is planning its budget for the next year, the Physics department puts in a request for 1 billion dollars for a new accelerator. The president of the university goes to the head of the department, and complains:
"Why can't you guys be more like the Math department? All they need is pencils, paper, and waste-paper bins..."
"Or better yet, why can't you guys be like the Philosophy department? All they need is pencils and paper."
The great danger that I see in our current trends is that we may be moving towards a philosophy department (and we don't want that, if for no other reason than because we don't envy the funding and respect they get).

With every new model from a new field that we understand only partially, and with every problem open to endless interpretation over a beer (like modeling communication with aliens or social networks), it becomes harder and harder to say what should make it to the waste-paper bins.

A theoretical field is meant to generate depth, and we should never forget that depth is our only strength. I would say that in TCS, the conceptual contributions we make when introducing new models are nothing compared to the conceptual contributions we make in developing proofs.

The only place where we can make a difference is in developing brilliant ideas for hard problems. We will never beat more practical fields at their favorite game (modelling many different problems intelligently, and recasting known theoretical ideas in that framework). And this is not what we should be trying to do.

I should emphasize that my point is not חדש אסור מן התורה ("the new is forbidden by the Torah"). We need to adjust to a changing environment. But the rate of introducing new problems should be kept very low (I would say, even lower than it is now), to give us time to concentrate on deep solutions. And we should be wary of problems that are an endless modelling game.

The AI lesson. If the Philosophy department seems too far away, let us focus on something across the hall from us. Do you remember the period when AI was obsessed about the "great questions," like what intelligence really is, what knowledge is, what the universal principle is for dividing preprogrammed and learned behavior, etc etc? I was not around the CS departments in the 70s, but I hear theorists were not too deeply respectful of those perennial ideological debates.

Thus, I will not be among the people to welcome papers on a computational view of consciousness, communicating with aliens, etc. This is not because I share the sadly common opinion that this is junk research (see comment 5 here, ahem). Saying that this is junk is typical theory snobbery, and is the last thing we need to be saying. (Plus, some of the guys who were in AI 30 years ago are our colleagues, and I actually like some of those people.)

These things are a valid topic for thought --- but I just don't think we should be the community thinking about them. Theory has a huge advantage in being concrete and objective, inheriting that immortal coldness of Math. I don't want us to lose this advantage because we find questions about consciousness and aliens entertaining. There are other forums for these questions, which also have more experience on how to deal with scientific publication on such issues.

The conceptual letter. After so much debate, dissecting the conceptual letter feels like dissecting a cadaver, but for completeness' sake, here goes:

1. As everyone observed, the conceptual manifesto shot itself in the foot by being too vacuous. What is it really saying? Of course we all want simple and elegant proofs for major open problems with great practical applications. But what is this conceptual business? Since I don't like very under-specified problems rife with ideological debates, I shouldn't be the one trying to define what a "conceptual" paper is. Unfortunately, the letter doesn't do a great job either.

2. As many people have pointed out already, purely conceptual papers are not so easy to gauge, whereas author names are. Making an official push for more conceptual papers can easily politicize the process and give a lot of weight to big names. I wish the signers of the letter (most of them highly influential people) had thought about this more carefully when they were adding their names to the list.

I am told that this effect is visible even in STOC'08. Some people are making statistics on the "big name effect" at this conference (hopefully to be released soon), and I am told numbers look bad compared to previous conferences.

3. As you already know, I disagreed strongly with some decisions by this STOC PC, and I am not the only unhappy person. Unfortunately, the part where the letter is praising STOC'08 is one of the few unambiguous ones...

4. Confusing the notion of concept with the notion of new model is a major mistake that we are making. Great concepts are developed all the time in new, beautiful proofs of important theorems. To me as a theoretician, these concepts seem much more important and valuable than modelling a problem correctly.

Trying to distill concepts, explanations, taglines etc in what we have done is a great goal with major impact to our funding, recruiting, status among disciplines etc. But let's spend our energy finding a nice description for some deep understanding that we already have, not searching for some theorems that are an excuse for nice descriptions and taglines.

Referee reports lie big time, because people think being conservative and polite is more important than being honest. In my short experience as PC member, this was painfully obvious. Just because the PC said your paper is not technical enough doesn't really mean this is why they rejected it. It's just something they can say easily, as opposed to attacking your model. I wish PC chairs were much more active in combating such reviews.

That said, I do agree that we enjoy technical difficulty too much. It has happened that we accept a new problem when somebody comes up with a complicated algorithm for it, and then we reject the 2nd paper that has a simple and better algorithm. This is entirely unacceptable.

Yes, we were wrong about some crypto papers in 1982. But mentioning it so often in 2008 is like using the original sin to explain sexism. (Not to mention that the connection drawn to these papers seems wrong, as Luca aptly explains.)

Sunday, May 18, 2008

STOC'08 update

My call for confirmation worked. I got a second member of the FOCS'07 PC to confirm that 15 rejections appeared in STOC'08. So pending new developments, let's treat this number as roughly correct.

The next steps are to:

  • also get a 2nd confirmation for SODA'08.
  • look at previous FOCS / SODA, and see how many rejects appeared in next STOC to see whether the numbers are really different. If you were on a previous FOCS / SODA PC and would like to help anonymously, please let me know... I think these statistics would be an interesting factoid for everybody, regardless of what we think of the current STOC.
To clarify where I'm going with these numbers: my goal for now is simply to get them out. I'll let you, the community, interpret them. My gut feeling is that the numbers are high (15 is roughly a fifth of the program!). Coupled with the indisputably low count of algorithmic papers, I think there are good reasons for discontent.

Friday, May 16, 2008


Normally, Lance has a yearly post asking where people around the theory world ended up at the conclusion of their job search. But this year I'm going to organize a coup :). Please leave a comment and let us know where you or your friends are going next year, if graduating or changing jobs. Many decisions are still to be made, but let us know when you do decide.

For now let me begin with my personal story. The hiring season is finally over for me. It's been a very hard few weeks (deciding what I want to do), and a much harder day telling people about it. I really look forward to getting back to research after this.

The outcome is that I am going to AT&T Labs, the fittest descendant of the legendary Bell Labs and seemingly a good place to work towards a Nevanlinna prize ;). The lab has certainly had its share of corporate nightmares in the past decade, but it seems it is coming back as a major force in this confused landscape of corporate research.

Next year proper, I will be at IBM Almaden, on a Raviv fellowship. I think IBM Almaden has one of the most amazing theory groups in the US, featuring some 5 theory hires in the past few years, and some true all-time classics (including an almost mythological figure of concrete complexity, Miki Ajtai). The dedication of the relevant people there to "sell" a theory group to the corporation is truly admirable; theory would not be the same without such efforts. It was very difficult for me to reject a permanent offer from this group, but in the end AT&T was slightly better for personal reasons.

I am one of those people who treat labs as a long postdoc, and I think my path will be to return to academia after a number of years. For a second, declining university offers to go to labs felt like a step into the void. The decision was not particularly easy, since I had to turn down Georgia Tech, a place with an admirable dedication to theory, which many see as the big rising star in our field; and UCSD, a place with a truly captivating personality, a disarming feeling of friendliness, and some very ambitious plans for the future (I liked to call it the Google of academia).

Why did I choose labs? Hard to say... I could use a few years to work on some problems that I really want to solve. I value industrial experience, and (I think) I'm interested in networking -- SIGCOMM / SIGMETRICS kind of stuff. Things are too dynamic now and I want to keep my flexibility and avoid the big commitments that a university job requires. And, in the end, some of the values I admire the most are courage and ambition, and I should stick to them now.

Wednesday, May 14, 2008

STOC 2008

STOC 2008 is coming up in a few days. Though I think it is important to attend STOC/FOCS even when you don't have a paper, I was simply too disappointed by this conference to go.

Scrolling down the list of 83 accepted papers, I could not help but marvel at the almost-complete lack of (non-approximation) algorithms in the program. Alas, this was not really for lack of submissions. I know of several papers that were submitted, and in my opinion, should have been very easy STOC/FOCS accepts. (Since these are not my own papers and not yet published, I cannot really talk about them.)

Quite likely, the heavy bias against algorithms was due in part to the composition of the PC, which essentially contained nobody in the area. But another way to view it is that algorithms were collateral damage in the raging war on whether it is conceptually more important to approximate the Stochastic Traveling Dog and Pony Problem with Piecewise Linear Costs, or the Games that People Don't Play™. (IMHO, the best papers from these fields always got in; it is only the marginal papers that need a boost for being "conceptual"... But that is a topic for another discussion.)

Based on the rantings that you can hear left and right (and, during this spring's interviews, STOC'08 was a popular topic of conversation), it seems the war might have taken its toll on the quality of the conference as a whole. Reportedly, this STOC contains about 10 papers rejected from SODA'08 and 15 papers rejected from FOCS'07.

If these numbers are true, there is serious reason to worry about what this conceptual drive is doing to our academic standards (no, we don't really believe that all these papers were improved substantially from the past submission...) To quote a PC member from a past conference: "These papers were not rejected for being too conceptual, they were rejected for being too weak."

If you were on the SODA or FOCS PC and would like to confirm these numbers or post your own count, please leave an anonymous comment or send me personal email. The latter is preferable because it makes the message trustworthy. I will post your recount, but your identity will remain forever secret.