Thursday, August 30, 2007

Love thy predecessor (II): The comparison model

As mentioned already, the super-standard way to solve the predecessor problem is by binary search. This has led people to define the comparison model: you have a bunch of abstract values, and all you can do with them is compare two values; the cost of your algorithm is equal to the number of comparisons performed.

Needless to say, actual computers store values with finite precision, so there are a lot more things that you can do besides compare two values (for example, computing a hash function). The most famous example of a faster algorithm that uses more than comparisons is the beautiful data structure of [van Emde Boas, FOCS 1975], which I will talk about some other time. Until then, suffice it to mention that Don Knuth finds this data structure very exciting in his forthcoming volume 4.

Doubting Thomases. Now, a rather large section of the theory community has developed the belief that we should stick to the comparison model in designing algorithms, ignoring faster algorithms that do more than compare elements. The most vocal adherent to this doctrine that I've met is David. (oh yes, I'm forgetting an anonymous NSF panel...)

Their argument is roughly as following: In real life, algorithms that get implemented only use comparison search. The finite-precision algorithms aren't actually worth it in practice. So we should study what actually matters in practice, which is comparison algorithms.

We are doing the theory of the hard case, not the easy case. Say a programmer comes to you: "I have this array of 100 values, and I want to sort them before I run this exponential backtracking algorithm on them. What algorithm do you recommend?"... At this point, I hope you say "Buble-sort, dude!"... But after that, are you going to go to your office and study the beautiful properties of bubble sort?

Quite likely, all the smart algorithms we ever develop are not really needed in 99,99% of the cases, because that's not where the bottleneck is. If you want to do the theory of the easy case, we should stick to teaching intro to programming. We can include binary search in the course, and make 99,99% of the programmers happy.

Fortunately, there are hard problems in this world, and there is a place for us. And where there are hard problems, there are no rules. If somebody wants something faster than binary search, he wants a fast algorithm, he doesn't want a fast comparison-based algorithm.

Case study. Now, there is a highly compelling motivation for designing fast predecessor search: IP lookup. To route a packet on the Internet, the router has to figure out where the destination IP address fits in its routing table. This is precisely predecessor search -- details later. Thus, a predecessor search is required for every packet going through every router on the whole Internet. It is quite safe to say that: predecessor search is the most executed problem on the planet.

With Gigabit speeds, this is actually a hard problem (you have on the order of 100 machine cycles for each search). Comparison search is definetely out of the question, to the extent that it's not even a footnote in the papers, or a column in the table with experimental results.

A bunch of implementations used Patricia tries, and as you might expect adding heuristics to Patricia tries became a popular sport. Unfortunately, this was still not fast enough, and for serious routers, people threw in a lot of expensive hardware to tackle the problem.

The paper finally providing the theory sanity-check was [Mikael Degermark, Andrej Brodnik, Svante Carlsson, Stephen Pink: Small Forwarding Tables for Fast Routing Lookups]. The paper appeared in SIGCOMM, the authoritative forum of the networking community (yes, it is harder to get into SIGCOMM that into STOC/FOCS, though maybe for the wrong reasons).

Essentially, they are saying: take all that expensive hardware out, and start using a smarter algorithm. Their algorithm starts exactly with a van Emde Boas step (surprise), of course with the proper amount of engineering. In theory, van Emde Boas search is exponentially faster than Patricia tries, which we knew since 1975.


MiP said...

Appologies for the broken "recent comments" section. It is Blogger's problem, and they are slow in fixing it:

Anonymous said...

actual computers store values with finite precision...

People normally don't write programs in assembly for actual computers but this is almost what it takes to make many of the algorithms you're talking about sufficiently fast. (Is real world speed your real argument?) With the exception of a couple super-dooper important problem (like predecessor search or hashing) Karger is totally right. The fastest algorithm usually works in the comparison model (and usually requires fewer lines of code.)

MiP said...

1) I have spent the first 10 years of my computer scientist career doing Olympiads (where you only get points if you code up your idea correctly in a few hours). By virtue of this, I am much more aware of the simple-is-practical paradigm than your average theorist. So trust me, I am not fogetting that.

2) Can you name a few non-super-dooper important problems? For any data structure with running time about lg n, and for any algorithm with running time above nlg n, there is no distinction between comparison and non-comparison algorithms. (Let's not talk just yet about the algebraic models with arithmetic operations plus comparisons).

3) Yes, my real argument is practical efficiency. If a particular problem is a bottleneck in some application, it is not our job as theorists to tell people: "No, I can't really help you solve it faster, but here, I have a very simple algorithm that I really like".

Anonymous said...

Mihai, you say "predecessor search is the most executed problem on the planet" because of IP lookup. I see how trie-based data structures give you IP lookup (longest common prefix) for free, but the longest common prefix need not be a predecessor or successor. What's the simplest reduction from longest common prefix to predecessor search?

Thanks for any clarification you can provide.