To wrap up this thread of posts, let me discuss some motivation and directions for asymmetric communication channels, described in post II.
In my talk in China, I mentioned the following naive motivation. We have two sensors in a river. Given the average velocity of the flow, we can deduce that the readings of the sensor upstream at some time t, will be correlated with the readings of the sensor downstream at some time t+Δt. The correlation will not be perfect (otherwise we wouldn't use two sensors), but it is enough to compress the message. However, the sensor downstream doesn't know the distribution of his own input, since he doesn't know the other sensor's reading at an earlier time.
This example is simple and intuitive enough that it should work as PR. Here are the more serious motivations that get thrown around:
- sensor networks (in some abstract way)
- internet connections (most ISPs give you faster download than upload; downloading from a satellite is significantly easier than uploading)
- stenography and combating consorship. This comes from the paper: [Feamster, Balazinska, Harfst, Balakrishnan, Karger: Infranet: Circumventing Web Censorship and Surveillance. USENIX Security Symposium 2002]
Moving on to 2., I have not seen examples where this approach might be useful at the scale of realistic ISP links, nor examples of how you might get the server to know the distribution better than you.
About 3., I have a feeling that it is simply accidental cobranding. To get your paper accepted to a practical conference, it helps to use some fancy theoretical work somewhere (maybe a place where you don't even need it), and theoretical work can claim practical importance if somebody uses it in a supposedly practical conference. I am reminded of a funny quote: "I would gladly work on the practical side of academia, but so little of practice actually applies to practice..."
Thus, unfortunately, I do not find the motivations I heard so compelling (outside of theoretical beauty with which I will fully agree).
What is probably needed is more attention towards the EE end of the discipline. Is anybody aware of any EE-work on this problem? In our abstract setting (Bob knows a distribution), any algorithm is ultimately impractical, because nobody works with perfect knowledge of a distribution (a distribution is a big object!). What we need is better modelling of what "understanding a distribution" means, and algorithms based on that. To give the prototypical sily example, the distribution may be (roughly) normal, so Bob only needs to know the mean and standard deviation.
On the lower bound side, I have some rough ideas to prove that the lower bound holds even if the distribution is the trajectory of a Markov chain of polynomal size (as opposed to the distribution in our paper, which really took exponentially many bits to describe).