WebDiarios de Motocicleta: August 2008

Friday, August 29, 2008

SODA

I am happy to report that our SODA submission on The Geometry of Binary Search Trees was accepted.

This paper works on the old question of competitiveness in the binary search tree model (some old posts of mine: 1, 2, 3), and makes some progress that is not easy to quantify. We propose a geometric interpretation of the problem, which seems to eliminate all the messiness attached to trees, and makes previous lower bounds (the Wilber bounds) very clear. The geometric view is somehow very appealing; a referee called this "an amusing paper" (in a very positive comment), and I will happily agree to this description.

Unfortunately, our work does not lead to a new upper bound, though it leads to a reformulation of an old conjecture. Joan Lucas and Ian Munro independently introduced a natural offline greedy algorithm, which they conjectured to be an O(1) approximation. Speaking informally, the prevailing opinion seemed to be that this had to be an O(1) approximation (modulo technical details in proving it!), but the really interesting question was to show an online O(1)-competitive algorithm. Well, our work shows that this view was wrong: we can make the Lucas-Munro algorithm online, with a constant slow-down.

Thus, if the conjectures of Lucas and Munro are correct, this tree achieves dynamic optimality. This is the first proposal for dynamic optimality after the famous splay trees that started the whole field.

Our geometric view also has consequences for the lower bounds (which are crucial if we want constant approximation). The only known lower bounds were Wilber's 2 bounds from FOCS'06. Wilber's first bound was used in our O(lg lg n)-competitive algorithm. Though Wilber conjectured that his second bound is stronger, we still have no idea how the two compare.

In our framework, Wilber's bounds are special cases of a very natural class of "cut bounds," and they get very simple descriptions. Instead of comparing the two Wilber bounds, we now face a broader question of finding the best possible cut bound. We solve this by proposing a new lower bound that is shown to be an O(1)-approximation to the best cut bound. This supersedes Wilber's results, though we do not know that it is strictly stronger than Wilber's bounds.

Monday, August 25, 2008

FOCS Best Student Paper

It seems I won the best student paper at FOCS. Yay. I suppose having 3 of the 5 student papers finally did the trick. :)

The prize goes to Succincter, which seems to make it the first "best student paper" at STOC or FOCS given for a data structure. I can't remember any "best paper" given for a data structure either, though I cannot find these lists. My special thanks for people in approximation algorithms, hardness of approximation, game theory, coding theory, complexity theory, quantum computing, and cryptography for making this possible :)

All in all, I think "Succincter" is a decent pick. It's a paper that might have a big impact on how we think about succinct data structures in the future. It's also a problem that's been pestering me since June'05, when Seth described it to me during my visit at Max Planck, Saarbrücken. (Thanks, Seth!) Finally, it got some brilliant reviews, including one that only said "Yes" and one calling the question a "target known for at least 2 decades".

On this last point, the key lesson seems to be that you should write your introduction in the last 45 minutes before the deadline, after sleeping 2 hours/night for 3 days in a row. After an incoherent introduction, the reviewers will feel the need to sing their praise of the paper. By contrast, as I learned from the reviews of another student paper, it is "incredibly poor taste" to say in your introduction that your results are clean, or, god forbid, explain how they simplify a proof by some other people.

Friday, August 22, 2008

Olympics

It's been a bad olympics year...

In rowing/women's eights, Romania got a scandalous Bronze, breaking a streak of Golds lasting since '92. It was also the first time since 1976 when they didn't get Gold or Silver.

Gymnastics disappointed badly, with nothing more than a Bronze in women's team and a Gold in women's floor (compared, say, to 4 Golds + 3 Silvers + 3 Bronze in Athens 2004).

Handball is bound to finish on the 7th or 8th place. As a Romanian editorialist aptly noted, the intensity of online swearing that this has produced was previously reserved for soccer.

But above all, Romania got 1 Gold, 1 Silver, 1 Bronze (and 1 nothing) in the IOI happening this week. While the team deserve our warm congratulations, we must be honest and say that this is a disappointing result. In my years, we got 2 Golds + 1 Silver + 1 Bronze several times, and we kept thinking that we should do better...

Nevertheless, our IOI performance was stellar compared to what Romania did in the IMO. For a country used to perfect scores and top participants, getting no Gold in IMO'08 is outright abysmal.

Romania has to get its act together before it's too late. The one thing that we cannot afford to learn from Europe is their acceptance of mediocrity. It is not particularly bad being a mediocre Westerner, in no small part due to a pervasive feeling of entitlement. But Romanians share no feeling of this sort. In the global conscience, the mediocre Romanian is the strawberry harvester of Spain, the construction worked of Israel, and the purse snatcher of Italy. Without a constant flow of outstanding results, these images will define how the nation thinks of itself.

Tuesday, August 12, 2008

Puzzles

I have a deep aversion of the "puzzle" concept. Delving into a problem is a very taxing process for me, and I just refuse to do it for the gratification of solving something that people actually expected me to solve fairly quickly. My brain can only internalize so many concepts in one day (like, maybe 2), and I can't waste this resource just so I can feel good about solving a small little problem.

Thus, when I read a recent blog post about how a theorist and two mathematicians spent their time after meeting randomly on a bus tour, I felt as if I was reading about an entirely outlandish experience, and I couldn't stop thinking "Why the heck... I can't believe they actually did this..."

Of course, I reminded myself that many people seem to love puzzles. An old friend used to say that "the brain doesn't rest, it atrophies." My adviser's office is the nightmare of any claustrophobic person: it is overflowing with small physical puzzles. I often fidget with them when I'm talking to Erik. The fact that I usually "solve" these problems by 5 minutes of random hand movement only confirms that I should never think about such things :)

My brain applies the same self-defense strategies when it comes to olympiad problems, which is why you don't see too many olympiad problems on this blog. When I originally read the Dwarves problem, I said "DP" (dynamic programming) and stopped. Only the next day, when 3 IOI guys claimed they really couldn't solve it, did I persuade myself to think about it...

Thursday, August 7, 2008

Thesis Defense

Yesterday, I defended my non-existent thesis, apparently with success. (The thesis itself will be done on August 29, at or around 2:58pm. There are plans to make it readable.)

Here are the slides from the defense in PPTX (also saved as PPT with minor screw-ups). Sorry, no reasonable PDF. Due to lack of time in making the slides, a lot of the presentation was verbal commentary, but hopefully the slides also have some value.

In other news, I am once again fed up with Powerpoint, and I will be switching back to LaTeX/Beamer. Last time I quit Beamer because xfig sucked, but in the mean time I discovered TikZ and it is impressive.

By the way, when discussing the history of cell-probe lower bounds, it occurred to me that the two major foundational papers that really started the whole field were:

[Ajtai 88], published in Combinatorica, but proving a false result.
[Fredman, Saks 89], published in STOC, containing tons of claims without proofs, and not followed by any journal version.

Idiots. It's not as anybody will remember them 20 years later for starting an entire research field. Research is about proving immortal theorems that represent absolute truth. This requires triple-checking of all proofs, and inclusion of absolutely all details (even in SODA submissions). The proofs of such absolute truth must be deposited in journals for perpetual archiving*, thus preserving the name of the author forever.

[[ *Though paper is an almost perfect solution to preserving your ideas into cold immortality, some care must be exercised. To prevent ware and tare, it is recommended that your piece of absolute truth does not motivate any reckless readers to actually open the journal at the pages of your paper. ]]

Incidentally, these two papers are also the "second paper" in the field (after Andy Yao defined the model a decade earlier). This reminded me of Luca's very sensible "second paper" observation.

Sunday, August 3, 2008

Algorithms for Farey Sequences

I recently uploaded the final version of my paper with Jakub Pawlewicz, due to appear in Algorithmica. The paper has a sparkling story (from my own perspective, at least) that should be told.

Starting research. In my glorious freshman year at MIT, when I was working on P vs NP and convincing random professors to pay me, I was trying to understand what research meant. This was not an easy task, since the closest thing I had to a mentor was Alex, who had just started research the previous summer. (Erik didn't quite qualify, since my second meeting with him was 3 or 4 months after he had started paying me.)

In any case, the lesson that I managed to absorb from the environment was that you're supposed to ignore all the cool stuff that's in books, and come up with new algorithms that were not known before Earth was graced with your existence. Then, you write papers describing these new algorithms, and you join the cool club of people who are in the business.

Judging that I might not have a paper on P vs NP soon, I decided to start looking for other problems, and I remembered a book on algorithms that I had read in middle school. This was the kind of book that Romanian associate professors write (because you need to have a number of books before becoming full professor): it contained all sorts of unrelated random topics that the author had come across since writing his previous book. Next to planar closest pair in O(n lg n) was a discussion of Farey sequences and an optimal algorithm for generating them. (See my older post for a definition of Farey sequences and a cool application.)

As soon as I remembered this algorithm, I saw potential for a "result": if we ask for just the kth element of the sequence, can we obtain a sublinear algorithm that doesn't generate the whole sequence? Here "sublinear" means better than O(n²), because the Farey sequence of order n has O(n²) elements. Of course, there is no real motivation to this problem, but this didn't bother me at the time.

A result. Soon enough, I had an algorithm running in O(n lg n) time, and summer was coming up. I was a member of the Scientific Committee of the Balkan Olympiad in Informatics happening that summer, so I assigned the problem in the competition. (Don't worry, I'm not a big fan of assigning "unrealistic" problems, so my time bounds allowed for an easier and less efficient algorithm that I had discovered at first. Even so, it turned out to be the perfect "hard problem," that required some unusual ideas and was solved only by a couple of people.)

Returning to "research mode," I used Google to check if my result was known (seemed not), and discovered the Algorithmic Number Theory Symposium (ANTS). I then wrote a paper with my ex and sent it to the conference. Looking at it in retrospect, it looks like a very childish paper. Nonetheless, Math conferences are not competitive, and it got accepted. The referee report contained the best polite equivalent of "the authors don't know what the heck they're talking about" that I have yet seen. It read: "The authors unintentionally hide the fact that the quantities A(.) that they define in the course of the solution are actually the Mertens function."

At the conference, people were entertained by the paper. They asked if we could get an O(poly(lg n)) algorithm on a quantum computer (I maintain my original response: What??), and asked whether we could count primitive lattice points inside polygons efficiently. A primitive lattice point is a point (x,y) where x and y are coprime; the Farey problem can be reduced to counting primitive lattice points in a special class of triangles.

I kept thinking at odd times about getting a faster algorithm, and about counting primitive lattice points in larger classes of polygons. I had a complicated algorithm running in O~(n^3/4) time, and some results for primitive lattice points. However, by then I was engaged in doing research on more high profile problems, and I could never get myself to write another paper on this topic.

Shock and awe. As it turns out, not publishing a follow-up was a brilliant decision. Last year, Krzysztof flew to California for my birthday party. Before the discussion got elevated by consumption of alcoholic beverages, he managed to tell me that Jakub Pawlewicz, a student at his alma mater in Poland, had written a follow-up paper about algorithms for Farey sequences. As it turned out, Jakub had a very nice algorithm achieving O(n^3/4) time.

At first, I was amused and surprised by the fact that someone had actually picked up the problem! Then, I found out that the paper was accepted to ESA'07, and I was even more surprised (I had never considered mainstream CS conferences for such a problem). But only when I found out that it got the Best Student Paper Award at ESA, was I totally blown away!

Now, had I published my follow-up paper, then almost certainly history would have been different. I would have never found out that the best student paper at ESA can be won for a follow-up paper to something I did in freshman year! As you can imagine, this made for some entertaining pub discussions, like:

Admired top-notch researchers get research awards. Also, there are some godly figures, so established, so influential, that research awards are being handed out for works in the fields that they created. But who else can claim that research awards are being given for work in a field that they introduced in freshman year? :)

The aftermath. Despite the entertainment value, getting scooped was not entirely a nice feeling, since I am, like any researcher, a bit attached to my results. My consolation was that Jakub is also a former medal winner at Informatics Olympiads (IOI and CEOI), so at least I had been scooped by a very worthy person.

Fortunately, I lucked out for the second time. I asked Jakub for the paper, and his solution was so clear and elegant, that I found it easy to plug my old ideas into it. Within the hour, I saw how to improve his algorithm to O(n^2/3) and was writing him an email about it. Yes, I imagine he hates me for this episode :)...

In the end, we merged the two results into the paper I mentioned in the beginning, which appears in the Algorithmica special issue for ESA'07. This was the ideal outcome for me: I got my ideas published with minimal effort, and with a bonus story.

Friday, August 1, 2008

Open Access?

I recently received an invitation to join the editorial board of a new journal called Algorithms, which claims to be an open access journal. Circumstantial evidence suggests the invitation was distributed to more than a handful of people. The list of people who have already accepted contains some established members of the community, as well as many names I have never heard before.

The main idea of this particular publisher is to have a publication model in which authors pay to get published (in the neighborhood of 1000 dollars!), and readers can access the papers for free on the Internet. Though you may choose to call this "open access," I profoundly dislike the name. As soon as you attach a big cost at either of the two ends of the business (publishing or reading), it stops being "open." It seems to me that we have another case of a creative publisher trying to scheme us into paying them large amounts of money for no value added.

You will not find my name on this editorial board.

While on the topic of open access, I must say that all my support is behind Theory of Computing. This wonderful initiative gives us a truly open access journal (zero cost for everyone), and a journal of superb quality. Based on articles published so far, I view ToC as being in the same constellation with SICOMP.

In the unlikely future universe in which I will not be overcommitted by several notches above the sanity level, I will be able to do things without deadlines. Then, I may write some journal papers without the deadlines/incentives of special issue, and I will try to send them to ToC.

WebDiarios de Motocicleta