Thursday, June 19, 2008

LSD Randomized Lower Bounds

Apparently a few people took issue with my (Data) Structures paper because it claimed a randomized lower bound for LSD without a proof, and this propagated to the FOCS PC. I recently received an email from the PC asking for a sketch of the proof, pronto.

As I explained before to anon commenters on this blog, I don't know what the big fuss is about. The paper is certainly not trying to claim credit for this proof (it is not even mentioned in the introduction). It's just trying to say "yeah, we know how to do it, and it's not interesting; don't waste your time on it." The results of the paper are interesting independently of any LSD proofs. If you don't like my claim, you only get deterministic lower bounds (see here for deterministic LSD lower bounds, which were known since STOC 95). No big loss.

But since I had already claimed that the randomized LSD lower bounds are straightforward (once you understand the literature), I had the moral obligation to support my claim. So here is my PDF response, in which I give a (sketch of a) proof for the randomized bounds. Don't be too harsh on this; due to the whole context I only had a few hours to write the thing, so some calculations of constants are left out. The writing is also bad, but hopefully if you're actually interested in the topic you already know enough to follow.

I think we need to grow up as a discipline, and understand that the ideas of a paper are not measured by the formula depth. If it's painfully complicated (which this proof is), it doesn't mean it's actually interesting or cool or worthy.

14 comments:

Michael Mitzenmacher said...

Hi Mihai. Having looked at your paper, I'll go on record as really enjoying the insight presented. I'm a believer that the "computer science" view of the world and the "information theory" view of the world should be brought closer together, and this paper does that nicely and shows some of the benefits of these connections.

I do think that it's reasonable, however, for a PC to be concerned when it sees unsubstantiated claims made in a paper. You suggest that if they don't like the fact that there's a claim about LSD randomized lower bounds that doesn't seem to appear elsewhere, the PC should just ignore it. On the other hand, you could have avoided the situation by just sticking entirely to the deterministic results, pointing out that any randomized lower bounds would carry over, and then suggest such lower bounds could/would be forthcoming. This would avoid confusion for some readers, and the PC (or any reviewer) has a legitimate right to question if your current writeup could confuse or possibly misread a reader who isn't, say, a regular reader of your blog.

Also, while I understand your point of view is that this LSD randomized lower bound is "boring", given that it's useful in this setting you've established, I'd suggest it does make sense to at some point write it down carefully.

All this shouldn't prevent the results from being published eventually. But given that you've announced them (so there's no issue of credit) and described them, I think it's well within a reviewer's purview to argue that the paper can and should be written better before publication. (And, of course, it's within your rights to argue otherwise.)

Best of luck.

MiP said...

Michael, thank you for the positive comments on the paper. I find it very cool also, but I'm understandably biased :)

Regardless, let me disagree with the rest :) Conferences are not created equal. If I stay up for 3 nights in a row to submit 4 papers to a conference it is because I want them there. Rejecting a good paper because you don't like the light in which some results are presented (which essentially boils down to PC whim) is unacceptable. It is tantamount to disconsidering the author's work, and it destroys all the motivation quickly.

Also, conferences are not created equal because this is the last one when I'm a student. :)

Anonymous said...

Mihai, I understand your argument from the point of view of the author. However, from the point of view of a PC member, would you allow a paper with unsubstantiated claims to be published, even if these claims are not the main result of the paper? I would not, because this will lead to confusion later, and make life hard for people who wish to do follow-up work in this area. There are already too many papers like this, in FOCS and STOC -- where results are claimed, but full proofs cannot be found, even when one contacts the authors. In addition, I know of several papers where full proofs of some results were not presented, and later a bug was found. To prevent such contingencies, I would do what the FOCS committee did. Of course, another alternative could be a conditional acceptance of the paper, provided the author either removes the claims about the randomized lower bounds, or provides full proofs of such bounds.

Anonymous said...

Mihai,

Forget the PC. Why not write the randomized proof and add it to your paper, even as an appendix if you want. The point of a paper is not to prove how smart you are to a PC, but to communicate ideas and insights to other researchers. Also, the proof is not that complicated.

I do not know how to do the randomized proof. If, indeed, randomized lower bounds for asymmetric LSD yield all these other lower bounds, isn't the proof for LSD the one thing that I *really* want to know if I'm thinking about randomized data structure lower bounds for some other problem?

Even in the deterministic case, if the proof is just a "one paragraph counting argument," why not include it for the sake of completeness so that I can read the damn thing without having to go through all the overhead in the paper you reference?

MiP said...

1. To recast my earlier arguments in your terms, an acceptance with the comment "I really hate the claim in those two paragraphs, please remove" is one thing, rejecting a paper is quite another. I disagree with Michael that you have the right to reject papers for trivialities.

2. I did just write the randomized proof, and I explained the deterministic one carefully in my original blog post. But no, the randomized proof will not fit in any reasonable size appendix (within 10 pages).

3. Hastad's PCP or Raz's parallel repetition are key tools in hardness of approximation, but few people have any clear sense of how they're proved. Here, I would expect that most people in data structures will have little interest in the randomized LSD proof, except to know that it exists.

MiP said...

On a second thought, I'm really disturbed by y'all saying that I should be conservative and not tell people a straight story about a result (I know how to do it, it's not interesting). Would you rather have the facts misrepresented? Perhaps then I should write another paper on LSD? (since it's now ointed as an important problem...)

If we want to do serious research we can't allow ourselves to slip slowly towards becoming a bunch of gutless conservatives, who put formalism above anything else.

James said...

I disagree with Michael that you have the right to reject papers for trivialities.

I doubt they will reject the paper. I suspect it would be accepted even if you had replied "Fine, just pay attention to the deterministic lower bounds." But if, for some reason, the reviewers expressed concerns about incompleteness, I don't think it's out of line for the PC to ask you. (In fact, it's nice of them to give you the chance to counter a possibly questionable reviewer.)

I did just write the randomized proof, and I explained the deterministic one carefully in my original blog post. But no, the randomized proof will not fit in any reasonable size appendix (within 10 pages).

Your paper does not contain a link to your blog post. We cannot all track the entire arc of your research career and electronic persona in order to have context for every result. As the reader, I appreciate a (central!) one paragraph argument being included for the sake of completeness.

Even a sketch is helpful, and for reasons that you don't envision at the time. Again, the fact that so many problems are LSD-hard means that there must be something important going on in that proof, even if no new ideas are required to write it down. I would at least appreciate your paper having a guide to figuring out what that something is.

Also, on principle, no one can claim a randomized lower bound via a reduction from LSD unless the hardness of LSD is established somewhere (and that place cannot be "Mihai's head").

Hastad's PCP or Raz's parallel repetition are key tools in hardness of approximation, but few people have any clear sense of how they're proved.

So? Neither author suggested that we should "trust them" about the proof, because knowing the details is not important to use the result. Moreover, it inevitably arises (and happened in both these cases), that at some point one has to dig into the details to make progress, and they have to be there in that 1% of cases where someone needs them.

On a second thought, I'm really disturbed by y'all saying that I should be conservative and not tell people a straight story about a result (I know how to do it, it's not interesting).

Dar? I didn't see anyone assert that you should claim the randomized LSD lower bound is important or hard or new. Just that you should prove it if you want to write "Theorem: ..." in your paper. I often write "The following argument is essentially known, and follows by combining [31] with [12], using the framework of ..." followed by 5 pages of technical arguments. If you don't want to interrupt the flow of the exposition, you can always put it in an appendix.

MiP said...

James, I agree with you on essentially all counts. (I think) they can't reject it regardless of all this randomized nonsense. The randomized lower bound should be written (and will be in the fall). And PCs should talk to the authors freely (and I have on the SWAT PC, as detailed in a memorable post on this blog).

The best temporary solution that I found was to claim the result in the paper so people can cite it. That's not giving me extra credit, but I'm accepting the burden of writing it without claiming credit -- and in the mean time people can cite the conference version. So again, what's the big fuss about? There's zero reason why anybody would want to rebut my claim.

This all feels like an automatic reaction from people that have been preconditioned by the formalist creed: "Aha, no proof! No proof! All hell will break loose! Let's use the power of the PC to avert the disaster!"

I agree that my reaction is too strong for the present case, and I apologize for that. It's just that I've been getting annoyed by hearing these kind of things over the years.

Anonymous said...

I agree with Mihai (and James).

The important thing is to have an actual proof for the main contribution of a paper.

The "peripheral" results -- even if they are technical -- are of minor importance.

In particular, this paper should (and would) get into FOCS, even if the randomized LSD lower bound claim was not there in the first place.

Michael Mitzenmacher said...

I disagree with Michael that you have the right to reject papers for trivialities.

Mihai, can we avoid putting words in my mouth? (Now, and in the future.) I suppose that's your interpretation of what I said, but it's not what I said.

If there's a fundamental point of disagreement, it seems to be that, in my opinion, a reviewer has the right to recommend a paper be rejected if the writing is substandard beyond some threshold. (If you re-read my last paragraph, that seems pretty clearly to be my point.) I'm not sure if you disagree with that, since you seem intent on arguing with something I never said. But perhaps you do. If you do, then we indeed have a point of disagreement we could discuss.

I am not arguing that your paper is below that threshold. But I can understand why an unsubstantiated claim that could be significant to the paper would be of concern to the PC. You are arguing that the claim is substantiated (in this blog and your reply, if not in the paper itself), and that's it's actually not significant. I'd imagine if the PC agrees with that assessment, these concerns will disappear. But their opinion may differ in the end. In short, they may find it less trivial than you. Please note, I am not offering an opinion one way or another, I'm just explaining the framework you're working in as I understand it, and if you read my original post, I'm offering suggestions as to how such issues could be avoided in the future.

MiP said...

Hi Michael! I'm not really trying to put words into your mouth. Sorry if it feels like that.

It's just that in my world view, we have "big ideas" and "trivialities." (yes, I get complaints about my world view all the time)

If you don't get the idea because of writing quality, fine. If you get it and you don't think it's "big enough," fine. But if you get the idea, and it's big, the rest are trivialities.

You can strongly advise the author to explain this connection better, cite this paper, clarify this proof etc etc. But that's not valid reason for even thinking about rejection. PCs should not be adversarial. It would be a great contribution to this community if the image of the PC were improved to a "friendly counselor" from that of an institution giving mean feedback and looking for any reasons to reject.

I guess this is our point of disagreement. Maybe I can convince you to switch to this view before STOC'09 :)

Anonymous said...

Mihai,

Even though you are completely right about the main point of the paper being independent of the randomized LB (and so did the PC think, it seems), your treatment of it was very un-professional:

(1) You mention it as just another theorem. To the reader it certainly seems like you are claiming credit for it.

(2) Due to the importence of this problem (even before this paper, and doubly so now), an improved LB is of significant interest -- it has to be written clearly and carefully.

(3) The fact that you claim a LB BEFORE a full proof is WRITTEN is very problematic: lower bounds are known to often "fall apart" once being written down. In fact, finding the exact notions, definitions, and lemmata for a FORMAL proof is a very large part of the LB work (this is somewhat different than for, say, algorithms.)

In short, this was "your bad". Suggestion: just get over it and publish a full proof of the lower bound in ECCC asap.

MiP said...

Alright, alright. But I must disagree with (1). Claiming credit in CS means writing an introduction meant to bring the readers to tears, which explains how your result is a fulfilment of an important Biblical prophecy.

My theorem is not even mentioned in the introduction, and I never once say that the problem is important or interesting.

ryanw said...

I suggest renaming the
"(Data) STRUCTURES" paper to
"On the Power of LSD". =)