The International Olympiad in Informatics (IOI 2010) is taking part this week at the University of Waterloo, Canada.

The olympiad often features a hard problem, which is intended to be solved by a handful of contestants. This year, the problem was solved by about 6 people. Read the problem below and give it a shot! :)

I will describe the problem in both TCS and IOI fashion.

**Asymptotic version.**You are given an unweighted, undirected graph on

*N*vertices. Some sqrt(

*N*) vertices are designated as "hubs". You have to encode the pairwise distances between all hubs and all vertices in O(

*N*

^{1.5}) bits of space.

The encoder and decoder may run in polynomial time. Of course, the decoder does

*not*see the original graph; it receives the output of the encoder and must output the explicit distances between any hub and any other vertex. (This list of explicit distances takes O(*N*^{1.5}lg*N*) bits.)**Non-asymptotic version.**You are given a graph on 1000 nodes and 36 designated hubs. You have to encode the distances between all hubs and all vertices in 70,000 bits of space.

The non-asymptotic version is a bit harder, since you have to pay more attention to some details.

**The research version.**Prove or disprove that the distances can be encoded using (1+o(1))

*N*

^{1.5 }bits of space. I don't know the answer to this question (but I find the question intriguing.)

## 9 comments:

I wonder if anyone managed to solve the non-asymptotic version by constructing the trivial encoding then compressing it by Huffman etc.

None of the competitors who tried data compression (alone) achieved more than 50 points. The data were designed to be difficult to crack with statistical compression methods.

Gordon Cormack

IOI 2010 Scientific Director

[P.S. to whomever runs this site. I don't think you should ask people to enter their google password.]

This site (blogger/blogspot) is run by Google.

Probably, I'm missing something, but...

Let's consider some spanning tree T. For root, let's encode all sqrt(N) distances explicitly. Then for each node, let's encode its parent in T, and differences between its and parent's distances. All differences are 0, +-1, so we do it by spending O(sqrt(N)) bits.

I wonder if anyone managed to solve the non-asymptotic version by constructing the trivial encoding then compressing it by Huffman etc.No. (People looked at the winning solutions so I can say this with certainty.)

The committee has, over the years, become quite adept at making test cases, and theory makes a very good predictor for reality.

ilyaraz: Yes, that works. But it seems to be a non-obvious idea :) 7 people solved it by the official count.

So, this solution gives you log(3)*n^{1.5} bits of space, plus smaller order terms.

The open research question is whether one can do c*n^{1.5} bits, for any constant c<log(3).

Any ideas?

I think I can show that the answer is log(3) n^1.5 + O(n log n), within tight upper and lower bounds. For the upper bound, pick a spanning tree (or forest) of the graph on n points. Then, for each node, and each edge of the tree, we encode the +1, 0, or -1 indicating how the distance to that node changes as we traverse the edge (away from the root).

For the lower bound, I can encode log(3) k m bits in a graph with 2k + m vertices, and recover them from the distances from k vertices to all the others. Let a_1, ... , a_k be the nodes, b_1, ... , b_k be connected to the nodes with edges (a_i, b_i) for all i.

The m vertices x_1, ..., x_m can each be placed at distance 1, 2, or >2 from each a_i independently by placing an edge from x_j to a_i, from x_j to b_i, or neither edge. This encodes mk independent choices from three possibilities. If we let k = n^{1/2} and m = n - 2 n^[1/2}, then I think we get log 3 n^1.5 - O(n) bits encoded in the distances from a_i to x_j.

Indeed, the lower bound is ridiculously simple :) Very nice!

the following note may be related (they encode all the distances though)

On compact representations of All-Pairs-Shortest-Path-Distance matrices

Paolo Ferragina, Igor Nitto, and Rossano Venturini

Theoretical Computer Science

Volume 411, Issues 34-36, 17 July 2010, Pages 3293-3300

Post a Comment