Monday, August 3, 2009

More on Problem 2

You might be wondering what caused the hiatus in posting CEOI problems. My modus operandi as a researcher is to induldge in whatever obsession I currently have. For the past week, the raging obsession has been Problem 2 (tempered, of course, by the beaurocracy of joining AT&T).

Several people found the O(n4) solution, both during the real competition and on the blog. In addition, Berman, Karpinski, and Lingas also found this solution in a recent arXiv paper. (Thanks to Nate Xiaoming Xu for telling me about this paper!) Their paper is not about rectangle covers, but something about wireless antennas. This goes to show that olympiad problems are often deep enough to become research papers. The only additional skill that we have in research is keeping a straight face when we write the introductions. (Don't worry, it comes with age.)

As I announced in the comments to the previous blog post, I found an O(n3) solution. This appears to be tricky (several people tried to reconstruct my solution in the comments, but without success). My solution is described in this post.

I am still quite optimistic that O(n2) might be possible, but so far I haven't been able to beat cubic time. If you find a way, I will be impressed.

If you don't find a way but want to discuss promissing ideas, use the comments to this post. Think of this as an experimental poly-CS project. The discussion is open to anyone. Don't be shy to post, but please be use a nickname for identification ("CarpathianBlackMetal" is much more informative than "Anonymous"). Also number your comments (e.g. "This is comment #4 from CarpathianBlackMetal"). Do not delete a comment if you posted something wrong. Explaining why it is wrong is much more informative for the audience.

Now let me describe the solutions I know for this problem.


Observation 1. In an optimal solution, any two rectangles are either disjoint or "nested" (one of them is taller and has the base included in the other). If two rectangles were to intersect properly, the shorter one can be shrunk horizontally until the rectangles become disjoint.

Let the points have coordinates (x[i], y[i]). Assume they are sorted by x.

Observation 2. We can construct an optimal solution in which every rectangle satisfies:
  1. it extends horizontally from some x[i] to some x[j];
  2. it extends vertically up to A / (x[i] - x[j]), i.e. as high as possible.
  3. it covers points i and j, i.e. y[i], y[j] ≤ A / (x[i] - x[j]).
Perhaps the only nonobvious condition is 3. But note that if the rectangle doesn't cover point i, I might as well move its edge inside (beyond point i), and thus increase the height. The only change that could happen would be to cover more points.

An O(n4) solution. We thus know that there are only O(n2) distinct rectangles to consider. Denote them by R(i,j); though not all pairs (i,j) are valid rectangles.

If R(i,j) is used in the solution, we have a well-contained subproblem to solve: cover all points between x[i] and x[j], which are not already covered by R(i,j), that is, they are above the rectangle. Let A(i,j) be the minimal number of rectangles needed to cover all such points. This is our dynamic program.

We fill the dynamic program by increasing values of j-i. To compute A(i,j), let S be the set of points k (i < k < j) which are not covered by the rectangle. Note that i and j are always covered (by condition 3. above). Thus, any information pertitent to covering S has already been computed by our dynamic program.

Let S={X1, X2, ... }. To compute A(i,j), we need to find splitters a < b < c < ... which minimize A(X1, Xa) + A(X{a+1}, Xb) + A(X{b+1}, Xc) + ...

This is a classic dynamic programming problem itself, or, if you wish, a shortest paths problem where A(x,y) is the length of the path from node x to node y. This subproblem can be solved in O(n2) time (which is "linear time", since there are n2 subproblem A(x,y)). Thus, the original dynamic program is filled in O(n4).

An O(n3) solution. A common strategy for improving the running time is to try to maintain more information. Thus, I first reformulate the dynamic program above to use O(n3) state.

Let T(i,j,k) be the optimal number of rectangles to cover all points (x,y) with x[i] ≤ x ≤ x[j] and y ≥ heights[k], where heights contains all y's in sorted order. Unlike the previous solution, I consider all heights k, not just the maximal one for a fixed i and j.

This dynamic program is easy to fill in O(n4). For some triple (i,j,k), I will either:
  • draw a rectangle from x[i] to x[j] of maximal height. If this height is more than y[k], I get the optimum for covering the remaining points from T(i,j,new height).

  • if there is no rectangle spanning x[i] to x[j], there must be a break point m, such that T(i,j,k) = T(i,m,k) + T(m+1,j,k). Iterate to find the optimal such m.
To fill this in O(n3) requires a nice trick. The order of the computation will be: by increasing value of j-i, and, for some fixed (i,j), by increasing k.

To compute the optimal T(i,j,0), proceed exactly as above, taking time O(n). The rest of T(i,j,k) for k>0 can be computed in constant time per entry. For each T(i,j,k+1), I have two possibilities:
  • T(i,j,k+1) = T(i,j,k).

  • T(i,j,k+1) < T(i,j,k). Thus, the optimal solution for k+1 is different from the optimal solution for k. But the set of points that must be covered by these two solutions only differ by one point P (with y[P]=k). Thus, point P must not be covered by the solution for T(i,j,k+1) — if it were, the better solution for T(i,j,k+1) would also hold for T(i,j,k). Thus, x[P] is a break point in the solution T(i,j,k+1), and we have T(i,j,k+1) = T(i,P-1,k+1) + T(P+1,j,k+1).

Good luck thinking about this problem!

11 comments:

  1. I had an email discussion about the cubic solution, which I'm posting here for the benefit of the readers.

    I thought about such a solution, but I think it is wrong.

    I am actually rather convinced that it is correct. I coded it up and it passes all the test cases used during CEOI, which should be good empirical validation.


    In your solution you define subproblem (i,j,k) of covering points m such that x_i ≤ x_m ≤ x_j and y_m ≥ H[k] where H stores the sorted y's.

    Correct.


    But you start the recurrence with (i,j,0) which is strange because those are the subcases that cover most points.

    I agree that it's counter-intuitive. That's why I like the algorithm :)


    In particular, is it trivial to test if the cost of (i,j,k+1) is equal to the cost of (i,j,k)?

    Yes. I know that T(i,j,k+1) ≤ T(i,j,k), because any solution covering the point for (i,j,k) also covers all points for (i,j,k+1).

    What I have to detect is the case when T(i,j,k+1) is strictly smaller
    by 1. If it is, it means I have a new solution for (i,j,k+1) which was invalid for (i,j,k) -- otherwise, I would've used it there. How could this new solution be invalid then, but valid now? Only if it does not cover the point at height H[k]! So this new solution, if it exists, is precisely T(i,P-1,k+1) + T(P+1, j, k+1).

    To compute T(i,j,k), I keep the minimum of the two options: T(i,j,k) and T(i,P-1,k+1) + T(P+1, j, k+1).


    And the subproblem (0,n-1,0) is THE problem! So I do not see how to progress from smaller k to higher -- and why should I?

    You need to know (i,j,k) for k ≥ 1, because you are using that in the base case when drawing new rectangles. If I want to draw a rectangle from x_i to x_j, I need to know T(i+1, j-1, k) for the appropriate k (which is nonzero).

    Hope this clears things up.

    ReplyDelete
  2. #1 CarpathianBlackMetal

    Just to clarify, are you assuming that all x's and all y's are different?

    To compute T(i,j,0), you suggest to "proceed exactly as above", but the "above" tells us we need to look at T(i,j,new height). I am assuming you could simply use T(i+1,j-1,height), right?

    ReplyDelete
  3. #2 CarpathianBlackMetal

    For simplification I consider in the following that points have distinct coordinates.

    A good first step is to lower again memory usage from n^3 to n^2 while preserving the time complexity :D

    Therefore, we can compute T'(i,j) = T(i,j,0) directly as follows:

    You fill in T'(i,j) by increasing j-i.

    Assume you are trying to compute T'(i,j). There are two options:

    a) either there is a split point k
    such that T'(i,j) = T'(i,k) + T'(k+1,j)

    b) or you have to place a rectangle R between i and j. Here's where the tricky part starts. Compute i' to be the first point after i which is outside R and j' the first point before j which is outside R.

    Now either the next step is to place a rectangle between i' and j', in which case T'(i,j) = T'(i',j') or there is some split directly above R.

    So we have to find a split "point" k such that T'(i,j) = T'(i,k) + T'(k',j). k' is the first point after k which is not contained in R.

    The intuitive reason the above works is that T'(i',j') and respectively T'(i,k), T'(k',j) do not have to cover points within R which they wouldn't cover anyway.

    ReplyDelete
  4. #1 Hrushikesh (or is it #3?)

    Should we also try to prove lower bounds while we are at it?

    Here is one obvious way in which one might proceed:
    Consider the related family of problems {DECIDE_k} for each natural number k. The input to DECIDE_k is the same as input for your problem, but the algorithm has to only decide whether the answer is <= k or > k.

    Now, DECIDE_1 can be done in linear time.

    DECIDE_2 can be done in time O(n log n), the time taken for sorting. Can it be done faster or can one prove a matching lower bound?

    What are the best lower and upper bounds for DECIDE_3?

    Can we prove quadratic/super-quadratic lower bounds for any DECIDE_k?

    ReplyDelete
  5. #3 CarpathianBlackMetal

    My second comment was wrong. You would need to find many break points k1, k1', k2, k'2, etc which I don't know how to do in linear time which sucks.

    @Hrushikesh

    You can do Decide_2 in linear time even if the points are not sorted. You simply enumerate the two possible "shapes" (either there is a bottom rectangle or a horizontal separator) and you can decide any one of them in linear time.

    Same trick with the enumeration of "shapes" works for Decide_3, but I think in this case you need the points to be sorted.

    For Decide_4, it seems to be trickier, but I wouldn't be surprised to see a linear time algorithm for constant number of rectangles.

    ReplyDelete
  6. This is Mihai, comment #2.

    Just to clarify, are you assuming that all x's and all y's are different?

    Indeed, I can assume this by arbitrary tie breaking.

    To compute T(i,j,0), you suggest to "proceed exactly as above"

    Here is the procedure that I was referring to (copy-paste from the post, slightly adapted):

    First, we draw a rectangle from x[i] to x[j] of maximal height. I get the optimum for covering the remaining points from T(i,j,new height).

    The other option is that there is no rectangle spanning x[i] to x[j]. Thus, there must be a break point m, such that T(i,j,0) = T(i,m,0) + T(m+1,j,0). Iterate to find the optimal such m.


    A good first step is to lower again memory usage from n^3 to n^2 while preserving the time complexity

    I agree. Let us first try to achieve this goal. I do not yet have a clear idea of how to attack it.

    Now, about Hrushikesh #1: I conjecture that my O(n^3) algorithm can actually be turned into O(n^2*K), if you can cover everything by K rectangles.

    For each pair (i,j), I now hold the optimum for every height k. How about hold, for every X, the lowest height you can solve by using ≤ X rectangles? In other words, T(i,j,k) decreases from X to 0 as k increases. Can I just hold the points where T(i,j,k) strictly increases?

    If this attack works, it would give an O(n^2) solution for the entire family of Decide_k.

    ReplyDelete
  7. #4 CarpathianBlackMetal

    Mihai, I think there is a well-foundedness problem with your description of how to compute T(i,j,0).

    You say that you draw a rectangle from x(i) to x(j) and then compute the rest with T(i,j,new height), but this value has not yet been computed.

    I am missing something?

    ReplyDelete
  8. Mihai #3.

    Mihai, I think there is a well-foundedness problem... You say that you draw a rectangle from x(i) to x(j) and then compute the rest with T(i,j,new height), but this value has not yet been computed.

    No, I was careful about that. Note condition 3 in Observation 2. It says that you only need to consider this rectangle if it covers both i and j. This means that once you kick out the points covered by the rectangle, the minimum (by x coordinate) is strictly larger than i, and the maximum strictly smaller than j.

    You can look at T(this minimum, this maximum, new height)

    ReplyDelete
  9. Mihai #4.

    In retrospect, my suggestion in Mihai #2 for how to get O(n^2 * K) does not work.

    I now have a different suggestion, which seems more plausible: let T(i,k,x) be the maximum j such that you can cover all points between i and j, and above k, using at most x rectangles.

    Does this work for getting O(n^2 * K)?

    ReplyDelete
  10. #2 Hrushikesh
    "#3 CarpathianBlackMetal": You are indeed right, k=2 is linear, and k=3 is O(n log n). Probably we can get it down to O(n log n) for any k.

    "Mihai #4": Wow, that does work.
    Putting it down it here for sake of completeness:

    State is T[i][h][k] denoting the maximum j such that all points between i and j with height >= h can be covered using at most k rectangles.

    Answer to the instance is yes if T[0][0][K] = (n - 1)

    Observe that T[i][h+1][k+1] >= max(T[i][h][k+1],T[i][h+1][k]).

    So, in the manner similar to O(n^3) solution to original problem, the configuration for T[i][h+1][k+1] covers the point (x[t],h) or not. If yes, T[i][h+1][k] = T[i][h][k]. Otherwise find the smallest number of rectangles required to points in [i,t-1] with height >= h+1, let this number be l, which can take only constantly many values. We have that T[i][h+1][l] >= (t - 1), and then, T[i][h+1][k] = T[t+1][h+1][k-l].

    Handle T[i][0][k] differently, similar to the O(n^3) solution for original problem.

    We get a O(n^2 klogk) solution.

    ReplyDelete
  11. Mihai #4.

    Hrushikesh, it's good you verified the O~(n^2 k). Weird algorithm, indeed.

    What makes you optimistic about getting O(n lg n) for any constant k? I do not see it. Even if you get O(2^k * nlg n), I would find this quite interesting.

    ReplyDelete