Friday, September 14, 2007

Slepian, Wolf and relatives (I)

The [Slepian-Wolf'73] coding scheme is a famous result in coding theory that inspired the field of distributed source coding. I guess this topic would be better for Michael's blog, but I am curious to ask some questions, and I want to describe a slightly related beast in the next post.

Review. You have k clients, and a server. The clients have some inputs x1, ..., xk, each of n bits, and they want to communicate all inputs to the server. Now assume these inputs come from some joint distribution with entropy much less than nk. Can one use similarly little communication?

If the distribution is rectangular, i.e. the xi's are independent, each client i can perform source coding (say, Huffman coding) based on his own distribution, achieving H(xi) communication. With dependencies, if clients broadcast their messages in order, each client can perform source coding based on his distribution conditioned on the previous messages. This achieves optimal total communication, since ∑i H(xi | x1, ..., xi-1) = H(x1, ..., xk).
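
To make the accounting concrete, here is a minimal Python sketch (my own toy illustration, with a made-up joint distribution of two correlated bits) checking the chain rule H(x1) + H(x2 | x1) = H(x1, x2):

  from math import log2

  # Hypothetical joint distribution of two correlated bits (x1, x2).
  p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

  def H(dist):
      # Shannon entropy of a distribution given as {outcome: probability}.
      return -sum(q * log2(q) for q in dist.values() if q > 0)

  # Marginal of x1, and the conditional entropy H(x2 | x1).
  p1 = {a: sum(q for (u, _), q in p.items() if u == a) for a in (0, 1)}
  H2_given_1 = sum(p1[a] * H({b: p[(a, b)] / p1[a] for b in (0, 1)})
                   for a in (0, 1))

  print(H(p1) + H2_given_1)  # ~1.72 bits
  print(H(p))                # same value: H(x1) + H(x2 | x1) = H(x1, x2)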

Slepian and Wolf showed that it is possible to achieve optimal communication even if all clients communicate one message simultaneously, and the messages are only seen by the server.

In fact, there is a whole simplex of possible scenarios for optimal communication. Consider k=2 to make notation easy. Then, it is possible for client 1 to send m1 bits and client 2 to send m2 bits, as long as (a small numerical sketch follows the list):

  • m1+m2 ≥ H(x1, x2)
  • m1 ≥ H(x1 | x2)
  • m2 ≥ H(x2 | x1)
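
To put numbers on this, here is a toy sketch (my own made-up example, not from the paper): x1 is a uniform bit, and x2 is x1 flipped independently with probability p, so per symbol H(x1, x2) = 1 + h(p) and H(x1 | x2) = H(x2 | x1) = h(p), where h is the binary entropy function.

  from math import log2

  def h(p):
      # Binary entropy function.
      return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

  p = 0.11                 # crossover probability (made-up value)
  print(1 + h(p))          # m1 + m2 >= H(x1, x2)  ~ 1.50 bits/symbol
  print(h(p))              # m1 >= H(x1 | x2)      ~ 0.50 bits/symbol
  print(h(p))              # m2 >= H(x2 | x1)      ~ 0.50 bits/symbol

So one corner of the region has client 1 sending its input uncompressed (1 bit per symbol), while client 2 gets away with about 0.5 bits per symbol without ever seeing x1.
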
Now, this is the perfect kind of result if you want to start a field: it shows that something is possible in principle, but through a nonconstructive probabilistic technique which isn't really feasible to use in practice.

Since then, people have developed efficiently encodable and decodable distributed codes for all sorts of important special cases. A "special case" is defined by what makes the inputs correlated (in what metric are they close?). To give realistic examples, the inputs may be close in Hamming distance, or in edit distance, or they may be numbers that are close in value (say, two noisy observations of the same event).
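
For the Hamming-distance case, the standard trick (usually attributed to Wyner) is to send only the syndrome of your string under a good error-correcting code, and let the server decode using the nearby string it already has. Below is a minimal Python sketch of my own, under the toy assumption that the inputs are 7-bit strings differing in at most one position; with the [7,4] Hamming code, client 2 sends 3 bits instead of 7.

  import random

  # Parity-check matrix of the [7,4] Hamming code: column j is the binary
  # expansion of j+1, so a single-bit error at position j has syndrome j+1.
  H = [[(j + 1) >> b & 1 for j in range(7)] for b in range(3)]

  def syndrome(x):
      # 3-bit syndrome H*x over GF(2), packed into an integer.
      return sum((sum(H[b][j] * x[j] for j in range(7)) % 2) << b
                 for b in range(3))

  def recover(x1, syn_x2):
      # Server side: recover x2 from x1 and the 3-bit syndrome of x2.
      diff = syndrome(x1) ^ syn_x2   # = syndrome of the error e = x1 XOR x2
      x2 = list(x1)
      if diff:                       # nonzero syndrome: flip position diff-1
          x2[diff - 1] ^= 1
      return x2

  x1 = [random.randint(0, 1) for _ in range(7)]
  x2 = list(x1)
  x2[random.randrange(7)] ^= 1       # x2 differs from x1 in one position

  assert recover(x1, syndrome(x2)) == x2   # 3 bits sufficed instead of 7

Practical constructions replace the Hamming code with much stronger codes, but the binning idea is the same.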

Questions. I never actually worked on this (but on something related, see next post), so I only have some meta questions. The field is usually sold as being important in sensor networks. Is this "for real", or just theoretical PR? People interested in sensor networks on our side of theory (eg the streaming folks) do not seem too aware or interested in this field. Why?

4 comments:

Suresh Venkatasubramanian said...

The field is usually sold as being important in sensor networks. Is this "for real", or just theoretical PR? People interested in sensor networks on our side of theory (eg the streaming folks) do not seem too aware or interested in this field. Why?

Some of the major questions right now in sensor networks involve dealing with the limited battery power of sensors. This makes most computations difficult, and communication even more so, and there's also a focus on distributed methods that don't require central control.

I don't know how the SW methods work, but at least right now, they'd need to be distributed, as well as efficient on a small client, to gain some traction. Also, the data streaming from the clients is viewed as adversarial by algorithmic folks, rather than as information-theoretic.

Mihai said...

Suresh, the whole point of Slepian-Wolf is that it is distributed, as each client sends one message without hearing anything from the other clients. Plus, compression is done precisely to reduce communication, which you say is the main topic of research.

As for the worst-case vs information theoretic views, they are both equally applicable. If you assume two clients make an observation of the same phenomenon, and they get two strings of small Hamming distance, you may assume the difference between the strings is adversarial, or average case by some distribution. Of course, I am on the adversarial side :)

Anonymous said...

Most (all?) research on sensor networks in academia --- even by ostensibly "practically-minded" people --- seems to be PR. I recently saw a talk (sorry, can't remember the reference) where the speaker questioned whether most of the scenarios being studied were of any practical interest to actual users.

So, to answer your question, I think the answer is "yes, it's PR", but theory is not alone in this respect.

Anonymous said...

Hi,

I am not an expert on Slepian-Wolf, so you might want to take the rest of this post with a grain of salt. Basically, my understanding is that the SW framework has the following pros and cons:

* Pros:
- it is a probabilistic model, which can overcome the worst-case lower bounds
- information-theoretically, the model is well-investigated and well-understood

* Cons:
- at least until recently, the encoding/decoding complexity was an issue. See, e.g., this thesis.
- the data must come from a static and known distribution. This tends to hold for "prototypical" applications of SW, such as sensors on space stations. On the other hand, understanding the distribution of the data is often a big reason why people put sensor networks out there in the first place.
- often, one is not interested in all collected data, but only in some "interesting" chunks (anomalies etc). In this case, it is more efficient to do in-network data processing, instead of transmitting all data to the source. SW as is does not allow for such "functional compression". But there are steps in that direction, see e.g.,

V. Doshi, M. Medard, and S. Jaggi, "Distributed Functional Compression through Graph Coloring," Proceedings of the Data Compression Conference, Snowbird, 2007.

Cheers,

Piotr