[Updated May 26: writing improved based on a "referee report" by an anonymous commenter. Let me also point out that use of paper and pencil while reading this text is encouraged.]
Let switch gears from politics to science... I am going to run a short series of posts talking about asymmetric communication complexity and how it relates to data structures. In this post, I start by defining the problems and the model, and give a simple (but very interesting) example of a lower bound.
No background is assumed. I believe a high school student should be able to follow this as a gentle introduction to lower bounds. (Yes, yes, I mean a high school student educated in Easter Europe, who knows the European "basic Math." For example, we use (1 + 1/n)n --> e at some point.)
If there are steps that you want clarified, you are welcome to post a question.
Set disjointness. Today, we start with a simple, abstract problem: set disjointness. Let's say Alice has a set S, Bob has a set T, and they are trying to communicate somehow, to determine whether S and T intersect. To quantify the problem, say |S|=n, |T|=m, and both sets come from a universe [u]={1,...,u}.
For now, we will only be interested in deterministic communication protocols (as opposed to protocols using randomization). In "symmetric" communication complexity, we are primarily interested in the dense case, n=m=Θ(u), and we ask how many bits of communication Alice and Bob must exchange to solve the problem correctly.
The trivial upper bound is u+1: Alice sends a bit vector of size u, specifying her set, and Bob replies with the answer. Alternatively, Bob can send his bit vector, and Alice computes the answer. It can be shown easily (see below) that any protocol must use Ω(u) bits of communication in the worst case, so the trivial protocol is asymptotically optimal.
In the asymmetric case, assume n<<m, and m=Θ(u). Because the input sizes are rather different, it is more interesting to measure the number of bits sent by Alice and Bob separately, instead of measuring the total communication. Denote by a the number of bits sent by Alice throughout the protocol, and b the number of bits sent by Bob.
Clearly, there is a trade-off between a and b. The trivial protocols from above give two points on the trade-off curve. Alice can describe her set using a = lg(u choose n) = O(n lg(u/n)) bits, and Bob replies with b=1 bit. Alternatively, Bob can describe his set using b=u bits, and Alice replies with a=1 bit.
Pause for a second, and try to come up with the best trade-off curve interpolating between these two extreme points.
x
x
x
x
x
This space intentionally left blank.
x
x
x
x
x
Here is how the trade-off looks like:
Values of a = o(n) are ineffective, i.e. Bob cannot save asymptotically over sending u bits. Similarly for values of b=o(n). The interesting portion of the trade-off is the curved one, which is described by: b = u / 2O(a/n).
Here is how we can achieve this trade-off. First, let k≥a be a parameter. We break the universe [u] into k blocks of roughly equal size, u/k. Now:
- Alice begins by specifying the blocks containing her elements. This takes a=lg(k choose n) = O(n lg(k/n)) bits (if there are fewer than n distinct blocks, pad arbitrarily).
- for every block containing an element of Alice, Bob sends a bit vector of size u/k, specifying which elements are in T. This takes b = nu/k bits.
- Alice replies with one more bit giving the anwer.
Lower bounds. We are now going to prove that this trade-off is the best possible for determinstic protocols. This proof originates from [Miltersen, Nisan, Safra, Wigderson STOC'95], the paper that defined asymmetric communication complexity.
Without disparaging this great paper, I must say that the presentation is awful. The authors fail to realize the vast conceptual contributions that their asymmetric communication model can make. Think of the implications to understanding social interactions between humans (potentially, we may finally understand communication inside a marriage!). Instead, the authors chose to concentrate on boring technical aspects, like showing that their concept can prove lower bounds for computational problems.To prove lower bounds for set disjointness, we are going to look carefully at the truth table of the function. This is a gigantic object, a 2D array with (u choose n) rows (corresponding to Alice's possible inputs), and (u choose m) columns (corresponding to Bob's inputs). Every entry is 1 if the set described by the column is disjoint from the set described by the row, and 0 otherwise.
The starting point of most communication lower bounds is the concept of (combinatorial) rectangles. A rectangle is a matrix minor of the truth table, i.e. a set AxB, where A is a subset of the rows and B is a subset of the columns. [Warning: this does not look like a rectangle in the geometric sense, since the rows in A and columns in B may not be consecutive.]
Suppose you are an outside observer, who doesn't know the inputs of Alice and Bob, but watches the communication taking places and tries to guess the inputs. After seeing some transcript of the communication, what have you learned about the input? It turns out, the possible problem instances that lead to a fixed communication trascript are always a combinatorial rectangle!
This can be seen by induction on the communication protocol. Before any communication, any inputs look plausible to the outside observer, i.e. the plausible inputs are the the entire truth table matrix. Say that in the next step, Alice sends a bit. This breaks the plausible inputs of Alice in 2 disjoint classes: the inputs for which she would send a 1, and the inputs for which she would send a 0. Observing the bit she sent, your belief about the input changes to a subset for Alice, and does not change in any way for Bob. Thus, your belief changes to a subrectangle, that drops some of the rows of the old rectangle. By analogous reasoning, when Bob sends a bit, your belief changes by dropping some columns.
Communication lower bounds follow by proving two facts:
- If the communication protocol is short, the rectangle reached at the end is "large". (Intuitively, there weren't many steps of the induction to shrink it.) But in the final rectangle, all entries must be identical, either zero or one, because the protocol finishes by announcing the answer to the problem. In other words, if the protocol finished by announcing an answer, and there's still a plausible input for which the answer is different, the protocol is sometimes wrong.
- Combinatorially, one shows that any large rectangle is bichromatic (contains both zeros and ones).
1: Richness and large rectangles. How do we prove a useful result saying that the final rectangle must be "large"? For this, [MNSW] introduce the notion of "richness" (I don't know the story behind the name). We say (the truth table of) a function is [U,V]-rich if it contains at least V columns, each of which has at least U one-entries.
Lemma: If Alice and Bob compute a [U,V]-rich function by communicating a total of a, respectively b bits, the function contains a rectangle of size (U/2a) x (V/2a+b) with only one entries.
Proof: By induction on the length of the protocol. Let's say that we are currently in a rectangle AxB that is [U,V]-rich. We have two cases:
- Bob communicates the next bit. Let's say B0 is the set of columns for which he sends zero, and B1 is the set for which he sends one. Since AxB contains V columns with at least U ones, either AxB0 or AxB1 contain V/2 columns with at least U ones. We continue the induction in the [U,V/2]-rich rectangle.
- Alice communicates the next bit, breaking A into A0 and A1. Each of the V columns that made AxB rich has at least U/2 ones in either A0xB or A1xB. Thus, in either A0xB or A1xB we have find at least V/2 columns that have at least U/2 ones. We can continue the induction in a rectangle that is [U/2, V/2]-rich.
2: Rectangles in set disjointness. Remember that we are interested in the case of a dense T; let's assume m=u/2 for concreteness. A column of the truth table is a set T of m=u/2 elements from the universe [u]. For each T, there are (u/2 choose n) sets S which are disjoint from T. Thus, the problem is [(u/2 choose n), (u choose u/2)]-rich. By the richness lemma, we obtain a one-rectangle of size (u/2 choose n) / 2a by (u choose u/2) /2a+b.
Assume we have a rectangle {S1, S2, ...} x { T1, T2, ...} that is all ones. Then, evey Si is disjoint from every Tj. Defining S'= S1 ∪ S2 ∪ ... and T'= T1 ∪ T2 ∪ ..., it must be that S' is disjoint from T', therefore
(1) |S'| + |T'| ≤ u.But also, the size of the rectangle is at most (|S'| choose n) by (|T'| choose u/2), because every row is an n-subset of S', and every column an m-subset of T'. Therefore:
(2) (|S'| choose n) ≥ (u/2 choose n) / 2aWe now pull the following binomial inequality out of our hat:
(3) (|T'| choose u/2) ≥ (u choose u/2) / 2a+b
(A choose C) / (B choose C) ≥ (A/B)CThen (2) becomes (u/(2|S'|))n < 2a, and thus |S'| > u / 2O(a/n). From (1) we have |T'| ≤ u - |S'| ≤ u (1 - 2-O(a/n)). Applying the binomial inequality to (3), we have:
(u / |T'|)u/2 ≤ 2a+b => (1+2-O(a/n))Θ(u) ≤ 2a+b => eu/2O(a/n) ≤ 2a+b => b ≥ u / 2O(a/n), QED.The first implication used 1 / (1 - ε) > 1 + ε. The second implication used (1 + 1/A)B = ((1 + 1/A)A)B/A = eΘ(B/A).