WebDiarios de Motocicleta: November 2007

Monday, November 26, 2007

Theory Channel Line-up

Via Muthu, I hear about the Aggregator. Here's the deal: most people use RSS readers (which they should be, for example reader.google.com is wonderfully easy to use). But if you're among those people who deliberately avoid readers, you find yourself navigating to all the webpages for various blogs. The aggregator can save you some clicks, by merging all Theory CS blogs into one page. So if you're a theorist and reader-hater, bookmark this page.

PS: Yes, the picture of me on this site (left side below) has great nostalgic value. But a few restless years later, I look more like on the right side...

May'04 ---??--> Oct'07

Sunday, November 25, 2007

Death to intellectuals

Ion Iliescu, former president of Romania (1990-1996, 2000-2004), now has a blog. Of course, after the ever ridiculous Mahmoud Ahmadinejad got his blog, this may not sound so shocking. But, trust me, this 77-year old Romanian blogger really is special.

To my international readers: if some stories (in particular, the miner story) seem just too much to be true, let me assure you -- they are entirely true, and accepted history.

Communist years. Iliescu got his political training in Moscow during the first years of Romanian communism, and compensated for being homesick by contributing editorials to the Romanian state-run newspaper (Scânteia) . The topic was of course the contrast between the primitive capitalist imperialist cannibals, and the new glorious communist society.

Back to Romania, he quickly advanced in the nomenclatura to be a minister. In the 70s he was seen by everybody as Ceauşescu's heir, which began to freak out the dictator. As a result, Iliescu was sent to be prime secretary in various remote counties.

The revolution. The Romanian revolution is a unique event in European history. Unlike the velvet revolutions in other communist countries, this one was not about throwing three stones and breaking two windows: at the end, more than 1100 people were dead, and 3300 wounded. We know who these people are: demonstrators who were assailing government buildings. But unlike other bloody conflicts in recent memory, we have no idea who shot these people and why. The official reports indicated that "unidentified terrorists shot the demonstrators." (rather sensitive choice of a word, isn't it?)

One must find it truly remarkable that to this day we have no plausible explanation. In similar cases, we always have a very plausible official version (say, Bin Laden), which conspiracy theorists choose not to believe. Personally, I seldom count myself among conspiracy theorists, but faced with the lack of an explanation, I have to become one.

Many (most?) Romanians have suspected that Iliescu and other leading party members seized the opportunity of public unrest to stage a revolution, and seize power. The secret service was ordered to shoot enough people to make the revolution credible, without actually stopping the demonstrators. NB: it seems clear that a militaristic communist state had enough bullets, grenades and tanks to keep demonstrators out of the presidential palace, especially with several days' notice.

Conspiracy theory aside, the fact is that high-ranking communist officials, led by Iliescu, emerged as leaders of the anti-communist struggle (!!), and obtained control of "revolutionary party." In the days after the revolution, Iliescu was an inspiration to democracy fighters with statements like:

Let's build a Communism with a humane face! (eternul "comunism cu o faţă umană")
Ceauşescu tarnished the noble ideals of Communism ("a întinat nobilele idealuri")
The multi-party system is a backwards concept ("sistem retrograd")

Hooligans. When the communist turned-revolutionary party (FSN) announced that it would participate in the elections it was organizing, this was too much for politically-aware Romanians. Bucharest intelligentsia, led by professors and students, took the streets, and occupied one of the main squares in Bucharest (hence, the movement was soon known as Piaţa, "The Square"). They staged a sit-in that lasted several months, featuring some hunger strikers, and, rather uniquely, lots of singing. The demands were that former communists not run in an election they were organizing, and that national media be granted political independence.

The authorities (aka the revolutionary party) turned the propaganda machine against the demonstrators, aided of course by control of the media. On some days, Iliescu would call the demonstrators worthless hooligans, who might as well remain on the streets. On other days, the demonstrators were fascists, probably infiltrators from the West. A remarkable quote by another leading party member (Brucan) was "Of course we will not talk to them; how can you talk to someone who doesn't eat? My only advice is that they should have a steak."

The response of the demonstrators was to proudly adopt the name hooligans (esp. in the Romanian form golani). A song of those days, which over the years has gained moving overtones, proclaimed that "I'd rather be a hooligan, than a party activist/ I'd rather be dead, than again communist." Song on YouTube.

Miners. Unable to address the Piaţa problem, Iliescu decided to take drastic measures: he called the miners from the Petroşani region to "protect the democracy." Armed with identical fighting clubs and a mob mentality, some 12000 miners boarded special trains for Bucharest. Stopping briefly in Craiova (where I lived), they managed to trash a neighborhood in half an hour, with apparently no good reason.

Reaching Bucharest, they immediately attacked the demonstrators, shouting two infamous slogans:

Death to intellectuals! (Moarte intelectualilor)
We work, we don't think! (Noi muncim, nu gândim) -- in line with the standard communist doctrine that the working class is the only legitimate ruler of a country.

The attack was extremely violent, despite the fact that the demonstrators were defenseless. Official counts of the Iliescu regime put the number of victims at 8 dead. Since then, the numbers have been revised to 200-300 dead and thousands wounded. Several mass graves have been discovered, in which a crushed skull is the common cause of death.

That night, students wrote on the university wall adjacent to the square "TianAnMen II."

In the following days, the miners roamed the streets looking for potential revolutionaries (i.e. beating up people with a beard or long hair). To encourage democracy, they also trashed all opposition parties, making a famous report that they had found intoxicating drugs that the opposition was going to use to influence voters, capitalist propaganda materials, "and a typewriter."

At the end, Iliescu came down among the miners and (in a live broadcast on national TV) thanked the miners "for being a strong force, one with an attitude of high civic conscience."

Years later, Iliescu denied that he had anything to do with the violent actions of the miners. In another remarkable quote, he said "I had called the miners to plant pansies in university square" (să planteze panseluţe) after the demonstrators retreated.

Here is a video capturing key moments of those days. It should make sense even if you don't speak Romanian.

Then, Mr. Former President, I can only suggest that you decorate your new blog with pansies. Welcome to the blogosphere, where I can assure you that your feelings towards the intellectuals are entirely reciprocated.

Friday, November 16, 2007

Good Friday


~/work% mkdir 4D
~/work$ cp ~/bs/preamble.tex 4D/paper.tex
~/work$ cp ~/bs/hacked-article.cls 4D/article.cls
~/work$ cp ~/bs/hacked-size11.clo 4D/size11.clo

To: coauthor-1. Sorry, I am a bit delayed due to another paper, and I still don't have a complete draft, but the paper is falling nicely into place. I promise to have a first draft out by Monday morning.

TODO:

title?
abstract
intro. Track down fractional cascading refs!
figure out complete list of coauthors
figure out how to analyze Bob's side of the rectangle (NB: we're losing smthg like a log or log^2 in the density of queries, is the intuition still true?)
technical section
make nice drawing with 2 coupled trees
does a 2-sided lb seem doable? Figure out whether to claim it "in the full version".
announce social event Monday 4am, G575. Buy cookies for event.
confirm postdeadline beer @CBC, 8:30pm.

Tuesday, November 13, 2007

Is Google paying you?

Kamal Jain gave a talk today on "atomic economics". Here is my own description of what he meant, though it seemed Kamal disagrees with at least some of my description :)

Think of credit cards. Whenever I pay with one, the store only gets 1-x% of the posted price, i.e. the the credit card company makes x% of the money. That is nice profit for the CC company, but as always, competition is driving all profits towards zero. Since there are multiple banks willing to give me a CC, and they all retain the same percentage from a merchant, they need to compete on the customer side --- they have to fight for the honor of counting me as their customer. So they start giving me points / miles / cash back / etc. Getting 1% of your money back is already quite standard, and customers need more to be impressed. To some extent, this incentivizing actually works (personally, I tend to use my AmEx card due to the better reward system).

Now, Google and Yahoo are making money because of me (merchants pay them when I click on an ad). In principle, this means we are not at an equilibrium: since these companies are competing for customers (equal ad revenue), one of them should start offering me a share of their profits whenever I click, and the other will have to follow suit.

Kamal sees several problems (market inefficiencies) that will prevent this welfare-maximizing scenario. I also see them as problems, but I have a different opinion on the degree of surmontability. Problems:

Customer abuse: profit sharing has to be tied to some actual buying event, or otherwise I will have an incentive to spend my day clicking on ads (and not buying anything), driving the merchant out of business. This is a problem for branding ads (e.g. eBay likes to display worthless ads all the time to build its name into collective conscience), and for ads that are separated from the actual purchase (e.g. an ad to a restaurant). But for e-commerce, which seems to be the raison d'etre of search advertising, this detail can be worked out by sharing information between Google and the merchant.
Merchant fraud: the merchant marks the price up by $5, adds $5 to the advertising budget (getting a top slot), and tells me that I will get $5 cash-back from Google (which means I don't care their price was $5 higher than the competition). Realistically, this won't happen, for several well-understood reasons. Common sense dictates that Google will only share a percentage of their profit, so I'll only get, say, $2.5 back, which means I care about the price markup. Like credit-card companies, Google and Yahoo have a weapon to enforce this common sense: customer differentiation. This is oligopolistic behavior, but one that is remarkably stable. What happens is that frequent buyers (resp. people with good credit histories) get better cash-bask deals, because they are presumed to have more authority to "negotiate". Then, the merchant gets a mix of people with different cash-back percentages, but who look the same to the merchant.
Bounded rationality, a heavily used buzzword, and the really exciting consideration. Unlike traditional advertising, Google and Yahoo have really tiny costs. This means they can still make a ton of money with a razor-thin cut of the merchant's profit. For instance, if Google is paid 0.1% percent of the transaction value, what could they offer me as an incentive? Maybe 0.05% of the price. But that will be irrelevant to me -- the mental cost of remembering that Google gives me more cash back on lingerie gifts vastly outweighs the 0.05% payoff.

Kamal's point is that the last phenomenon is happening more and more often, as the economy moves to an atomic scale, making the relative cost of rationality in each decision increase. He draws a funny parallel to the Heisenberg uncertainty principle, which IMHO only goes as far as "on a small enough scale, things become funky".

The market inefficiency generated by the small scale of Web transactions is great for Web-based companies like Google, since it means they can make huge profits below the radar of competition forces. The sad news (for a scientist) is that it all becomes a social competition, instead of a scientific/economic one. It's much more important if I, the customer, like Google's colors than if I'm getting my fair share of cash back.

On a larger scale, is this a social problem? I guess so, but not a very big one. If our threshold for irrationality is below 1%, the total income of such companies cannot grow above 1% of the money we all spend. But we're already paying the Federal government some 30% of what we make, and quite possibly many of us are happier with the success of Gmail than the hearts-and-minds campaign in Iraq.

PS: Given the topic, the number of jokes and anecdotes in Kamal's talk was too small by an order of magnitude. This reinforces my belief that we shouldn't be doing this kind of thing (i.e. study pure economics, because it is has something to do with networks or computers). If such pursuits will be more than wasted theoretical brain-cycles, they need to impact managers and policy-makers. But for that to happen, we need to give talks at the level of political speeches, and no self-respecting theorist is willing to do that. People in economics (who incidentally serve on policy-making boards every now and then) have already developed the art of communicating to politicians.

Friday, November 9, 2007

Branding and STOC/FOCS

Kamal Jain gives some advice on resubmitting papers that got rejected from STOC/FOCS. That reminds me of a question I've had for while: can you build up a name by only submitting papers in the right places (i.e. where they get accepted)?

Personally, I feel very strongly about self-selection. In my (short) career, I have had a total of 2 rejects:

a paper rejected from ICALP, which got accepted to the subsequent SODA ;)
a "short paper" rejected from SODA, which essentially disappeared (we came up with new ideas and obtained much better results). Incidentally, I'm glad the short track at SODA was stopped. It was a temptation to do mediocre work -- if you write a really good result in 2 pages, you will always get accepted even without the short track.

I self-select due to my own ethical beliefs, independent of the environment in the community. However, I'm often told that this is not career-optimal from a selfish perspective: if you shoot for a reasonable reject rate, say under 50%, you can often push papers up a notch, making for a better CV.

Is this advice short-sighted? Does having a low reject rate help you in getting more attention from reviewers? If a reviewer knows you're not a "deceiver", he will feel uneasy about rejecting, and will think about it for another minute -- at the very least, he knows one person in the world (you, the author) finds the paper interesting. If a reveiwer has already rejected a couple of your papers, he won't have so many moral problems -- after all, it's possible that you yourself don't think the paper is good enough.

What do you think? Can you get reviewer sympathy by active self-selection?

Tuesday, November 6, 2007

Cute Problem: Wooden Convex Hull

I didn't manage to keep an earlier promise to post olympiad-style problems... But here is a cute one.

You have n trees at given coordinates in the plane. You want to build a polygonal fence around the trees. However, the only way to get wood for the fence is to cut down some trees in the first place! Let's say each tree gives you enough wood for M meters of fence. The goal is to find the minimum k, such that you can cut down some k trees, and surround the rest by a fence of M*k meters.

What is the best algorithm you can find? Express your answer in terms on n and k.

As a bonus challenge, find a fast algorithm for the case when trees give you a different amount of wood M₁, ..., M_i.

Monday, November 5, 2007

Napkins

The inexhaustible Muthu came up with the fantastic idea of assembling napkins used for research-related work. This could turn out to be very interesting.

Here is my own contribution: a Starbucks napkin filled during a 2 hour caffeinated discussion with Alex. It contains the proof of a communication-complexity lower bound for near-neighbor search in L_∞, matching the current upper bound of Piotr from FOCS'98. We were working on this problem during our summer internship at IBM Almaden (yes, yes, we're officemates but we had to go to California to write a paper together). As you might expect, the fact that we're now trying to finish the details has some correlation with the Nov 19 fireworks party :)

Friday, November 2, 2007

A Numbers Game

I have said before that I believe theory/CS is a numbers game. During a few discussions sub vino, I was challenged on this issue. Two main points seemed to emerge:

What matters is your best result (L_∞).
STOC/FOCS get 10-15 obvious accepts, and a heavy tail of competent papers, among which it's hard to choose. If we care about making STOC/FOCS a quality measure, why not accept just those 10-15 papers?

One of my main points is that I don't actually trust ourselves to get either of these 2 schemes to work. The first degenerates into the power of public hype and recommendation letters, since none of us will actually solve P vs NP (or any other result that is universally considered important). The second degenerates into a bloody political struggle, since the top few places depend too much on randomness.

I have seen enough examples arguing for pessimism, mostly while I was acting as a reviewer. But since I can't actually talk about those, let me give 2 personal examples.

Predecessor. In STOC'06, Mikkel and I had our predecessor lower bound. The main claim to fame of this paper is that it was the first lower bound to separate linear and polynomial space.

When it comes to such a claim, in my opinion, it doesn't actually matter if you remember anything from the data structures class you took in college. It doesn't matter if you know that the bound is showing optimality of a very influential data structure from FOCS'75. It doesn't matter if you know this was one of the most well-studied problems in data-structure lower bounds, and people were actually betting the previous lower bound was tight.

No. Somebody comes and tells you that we've been toiling at this lower bound game for more than 20 years, and we haven't yet managed to show that a data structure with space n¹⁰ is sometimes better than one with space O(n). Now, they tell you, such a bound is finally here, and a vast array of problems from Algorithms 101 have switched from unapproachable to hopefully solvable. That should be a worthy result.

How did it feel from the inside? This was a paper we had worked on for 2 years. Just the conceptual understanding that we needed to develop felt like one of the longest journeys I have ever embarked on. And in the end, it was actually a clean result, which indeed needed conceptual developments, but not so much technical pain! Mikkel, who never ceases to amaze me with his youthful optimism, was betting on a best paper award. I was of course at a university, and had a better estimator of the community hype factor. I knew that Irit's PCP was unbeatable for an award, but I was still convinced our paper would go down with a splash.

The actual outcome? Moderately positive reviews. Nobody took notice. And we weren't even invited to the special issue, which was a genuine shock for us.

Range Counting. In STOC'07, I had a lower bound for 2D range counting. After the predecessor paper, I quickly decided this was the next big goal in the field. Range queries are very, very well-studied, play a crucial role in our understanding of data structures and geometry, and are a major bridge to worlds outside theory, such as databases (see this irrefutable argument). Armed with some appreciation of the techniques in the field, this also felt like a make-or-break problem: we were either going to solve it, or get stuck again for 20 years.

Though I like to look at cell-probe lower bounds, range counting has a life of its own. This class of problems was studied from the dawn of computer science, way before our understanding of models crystallized. People have been proving many excellent lower bounds for range queries in the so-called semigroup model: each input point is associated with weights from a semigroup; the goal is to compute the sum in a "range" (rectangle, circle, wizard's hat etc), using only the semigroup addition. In this model, we obtain a good understanding of geometry: it is enough to characterize a partial sum of a subset of points by the extreme points in a few directions. Addition only allows us to "build up" subsets, so we need to understand composability of constant-dimensional shapes, dictated by such extreme points.

While the bounds for semigroups were making very nice progress, we remained clueless about even the slightly stronger group model, where subtractions are allowed. This is the gap between understanding low-dimensional geometry and understanding computation (a very high-dimensional, unstructured space, such as the n-dimensional space of linear combinations). Proving bounds in the group model became the first lower-bound question in the surveys of Pankaj Agarwal and Jeff Erickson. Thus, the group model seemed hard enough; never mind the cell-probe model, which is significantly stronger than a model with additions and subtractions.

About a year after predecessor (I had the final critical insight one week before STOC), I managed to show an optimal lower bound for 2D range counting in the cell-probe and (of course) the group model. Again, it does not seem to me like a paper where you need to be an expert to understand that something interesting is happening. Citing from the abstract:

Proving such bounds in the group model has been regarded as an important challenge at least since [Fredman, JACM 1982] and [Chazelle, FOCS 1986].

It seems to me that whenever somebody claims to address a challenge from JACM'82 and FOCS'86, the only possible answers are:

You are lying.
The paper is wrong.
It seemed like an interesting problem back then, but things have changed and we no longer care. (... and I don't see this applying to range queries.)
Yes, this is a very good paper.

The actual outcome? Lukewarm reviews: "seems like an interesting result, and there is a comma missing on page 7". No student paper award, which again I was expecting because my hype-meter was well-aware of Sergey's paper. (In another great proof of our intellectual maturity, this otherwise nice paper might go down in history for using Mersenne primes, rather then understanding something about locally decodable codes. Oh well, those who can't put primes in their abstracts might as well stick to cannabis.)

Finally, the paper was not invited to the special issue. By this time, it was not something that had never crossed my mind; it was just a disappointment.

Conclusion. You should work on major open problems. If you solve them, it makes for papers which are reject-proof (well, almost). But don't count on that "one big result" plan, as you may be in for a nasty surprise. As a community, we are reliable enough to get some kind of average right, but we're not doing so well on L_∞.

So when you get your first STOC/FOCS paper, go open the champagne or light up the cigar. You are officially part of the game. Now, it's time to play.

Next round: November 19.

WebDiarios de Motocicleta