Thursday, September 3, 2009

What is efficient?

In my previous post, I wrote about how calling P "efficient computation" is nothing more than bad philosophy. I think a lot more examples are in order.

Are problems constant size? Opinions that P is efficient often stem from the following understanding of what computers are used for. Let's say that I have to optimize the bus schedules in NYC, and there are some 200 bus routes to consider. Certainly, an O(n^3) algorithm is worse than an O(n lg n) algorithm, but there is a really large gap between both of these and an O(n!) algorithm. Indeed, with an O(n^3) running time I will have to buy more/bigger computers, and let them run for a week instead of 30 seconds. But with an O(n!) algorithm, throwing more resources at the problem is simply not possible (I would need a computer the size of Earth). In general, as machines grow in power, more and more polynomial algorithms become usable, whereas exponential algorithms remain similarly useless.
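To make the gap concrete, here is a back-of-the-envelope sketch at n = 200 (the bus-route figure above), counting decimal digits rather than dividing, since 200! is far too large to fit in a floating-point number:

```python
import math

# Rough operation counts for the three running times at n = 200.
n = 200
counts = {
    "n lg n": int(n * math.log2(n)),  # ~1.5 thousand operations
    "n^3": n ** 3,                    # 8 million operations
    "n!": math.factorial(n),          # a 375-digit number of operations
}

for name, c in counts.items():
    digits = len(str(c))
    print(f"{name:>6}: about 10^{digits - 1} operations")
```

At a billion operations per second, the O(n^3) run finishes in milliseconds and the O(n lg n) run is instantaneous, but the O(n!) run needs on the order of 10^358 years; no amount of hardware closes that gap.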

The big fallacy of this argument is assuming that we know what we're doing. We don't! The problem, and the size of the problem, change, often faster than the speed of our computers.

If you give me an O(n^3) algorithm for the bus scheduling problem, all of a sudden I will realize that I can ask for more. Before, I was assuming that each bus route should have a fixed frequency, but certainly I can get better results (in real life) if I allow variable gaps between consecutive buses. All of a sudden, the number of variables in your problem will explode (e.g., there might be 30 cycles on the same bus route during the day). Why did it explode? Only because you told me that you had a faster algorithm.

Twenty years ago, nobody could envision problems of the scale that Google solves today. Now we can, and it's all the fault of the hardware guys. They managed to push storage capacity (disk and RAM) so much that it makes sense to ask questions about billions of web pages.

For as long as Moore's law is (was) active, we can at best expect the memory size and the processor "power" to grow at the same rate. Even worse, processor power will probably lag behind: doubling the density will roughly double my RAM capacity, but algorithmically I will probably not be able to do double the computation on the CPU (it's hard to exploit parallelism perfectly). With an exponential gap between memory and CPU, it's only linear time algorithms that survive.

We don't know what we're doing. A related issue, highlighted in the buses example, is that we don't know what we ultimately want to solve. It depends on what we can solve. If I give you a faster algorithm, you will want to model your problem with more details, making my algorithm too slow.

The core problem that Internet routers have to solve comes down to nothing more than a binary search (each subnetwork has an IP range, and I have to figure out in which of these ranges my destination IP lies). Could we solve it in O(lg n)? With current data rates and traffic, we probably could (barely). But if we speed it up, we can now maintain that interesting statistic, which allows us to bill customers more accurately. And if we could speed it up some more, we would have time per packet to fit that extra component, which would allow us to track virus infections ex post facto.
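The binary-search core of that lookup fits in a few lines. This is only a sketch with made-up ranges and interface names (real routers do longest-prefix matching in specialized hardware), but it shows the O(lg n) step in question:

```python
import bisect

# Hypothetical routing table: (range start, range end, next hop),
# with disjoint IP ranges stored as integers, sorted by range start.
routes = [
    (0x0A000000, 0x0AFFFFFF, "if0"),  # 10.0.0.0    - 10.255.255.255
    (0xC0A80000, 0xC0A8FFFF, "if1"),  # 192.168.0.0 - 192.168.255.255
    (0xC0A90000, 0xC0A9FFFF, "if2"),  # 192.169.0.0 - 192.169.255.255
]
starts = [r[0] for r in routes]

def next_hop(ip):
    """Binary-search for the range containing ip: O(lg n) per packet."""
    i = bisect.bisect_right(starts, ip) - 1
    if i >= 0 and ip <= routes[i][1]:
        return routes[i][2]
    return None  # no matching route

print(next_hop(0xC0A80101))  # 192.168.1.1 falls in the "if1" range
```

Everything saved below the O(lg n) budget is exactly the per-packet slack the post is talking about: room for billing counters, statistics, or forensic logging.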

Functionality is never fixed. The more you can do, the more I will ask you to do. It makes equal sense to improve an O(n!) algorithm to O(2^n), or to improve an O(n^3) algorithm to O(n^2), or to improve hash tables by 20% (they're already constant time!). In all cases, you will notice that the problem size was just at the limit of what you could handle, and the new algorithm will make somebody very happy.

Apart from the many examples that I heard about at AT&T, I can give another. I am working with an OR guy, the kind of person who teaches at Business School and spent significant time optimizing truck schedules for Amazon (not in theory, but actually chasing 2% improvements that they implemented). Do you know what problem he brought to me? He wants to improve an O(n lg n) algorithm.

I remember a quote from David Karger in Advanced Algorithms. He said that one week after you improve a running time from O(n^2) to O(n lg^2 n), you should celebrate. But two years later, you should work on improving it to O(n); most likely, it is a very interesting problem by now.

Intellectual curiosity. Ultimately, all arguments about reality cannot compete with our (very developed) sense of the aesthetic. What lives and dies in a branch of mathematics is determined by what triggers our intellectual curiosity.

My firm belief is that whenever you have a clean and elegant problem, there will be some fascinating behavior, in the eyes of somebody who loves to understand computation, at any one of the levels of the running-time hierarchy.

I have heard the claim that P-vs-NP is the most interesting question, because it's the root node in our binary search tree for understanding problems --- the first thing we ask is whether this problem is in P or not, and then proceed with two different sets of techniques depending on the answer. The root node of the classification tree has to be the most interesting.

But this argument pales when you remember that we're really chasing the intellectually challenging cases. For many problems, a 5th grader who just learned programming can give a polynomial algorithm. This doesn't tell me anything interesting from a scientific perspective. The most interesting behavior happens right at the phase transition where the problem becomes hard. For some problems, it's at O(lg lg n), while for others it's at O(2^n).

A problem in current TCS. Of course, I cannot write a post without being provocative :) So let me state what I view as a significant problem in current TCS.

There is a field of study, namely approximation algorithms (which includes such newer branches as mechanism design), in which you can do fine without much understanding or appreciation of computation. The name of the game in this subfield is rounding LPs and SDPs, a task which requires geometric skills and discrete-math skills, but not much algorithmic skill. Heck, many papers in the area don't even have a for loop in them! The only interesting algorithmic content is solving the LP or SDP, which few people really work on.

This has led to a horde of TCS students with very little appreciation of algorithmic phenomena, who basically learned how to hide them all away under the "polynomial" hood. (I'm sure you've met people who are surprised at the idea of storing some forward and backward pointers between two pieces of information for navigation throughout the algorithm...)

I hope this doesn't ruin the future of our field. Once there's a significant group of such people, we could easily diverge into creating useless problems that can be solved by these techniques, and lose track of the fact that we were supposed to do algorithms and complexity. (As I said before, the field evolves based on what the field finds exciting. Another way to put it is that researchers are often in the business of drawing targets around the arrows that they manage to shoot, and I can see these new targets drifting away from the original big one.)

The encouraging bit is that the standards for success in approximation algorithms (for old researchers) still involve crystal-clear algorithmic thinking. Consider, for instance, the two recent algorithms researchers who won the Gödel Prize: Dan Spielman and Shang-Hua Teng. They moved from a major polynomial-vs-exponential result with mostly mathematical content (smoothed complexity of LP) to a much more algorithmic question, approximating the sparsest cut in linear time. This latter problem has picked up quite a bit of steam in recent years.

Anonymous said...

"I am working with an OR guy, the kind of person who teaches at Business School and spent significant time optimizing truck schedules for Amazon"

"The only interesting algorithmic content is solving the LP or SDP, which few people really work on."

Maybe I am picking the statements a bit out of context but I hope you didn't tell the OR guy about your opinion on LP or SDP.

Anonymous said...

You can add that in the context of crypto, going from, say, a 2^n-time algorithm to a 2^{\sqrt{n}}-time algorithm for breaking a hard problem is a big deal...

Noam said...

(1) Most people working on the P=NP problem don't take it literally as you seem to. The common interpretation is: "Either prove a super-linear lower bound or prove a sub-exponential upper bound (for circuits OR multi-tape TMs OR RAMs, in deterministic OR randomized OR quantum variants)". The very, very wide range that is open here is the point.

(2) The most important *algorithmic* technique that I teach my students is "understand the problem structure better". This is what all these LP/SDP/rounding guys are doing. I don't see a reason to define this away from being algorithms.

(3) The point with approximation is to fix the *wrong* modeling of the word "solution" in the problem definition for algorithms: finding the "best" is usually the wrong formulation of what is desired -- "close to best" is more like it. (The 2%-chasing OR friend should be an example.)

(4) It's a bit strange to see Mechanism Design mentioned as a sub-branch of approximation algorithms. Some MD papers use approximation, as does the rest of TCS, but beyond that there is no connection really.

Obelix said...

...in which you can do fine without much understanding or appreciation of computation

This is not just an issue with approximation algorithms but is the case in many other areas of TCS.

Are we at a point where we could/should distinguish between "theoretical computer science" and "mathematical computer science"? Just like physicists distinguish between "theoretical physics" and "mathematical physics".

Anonymous said...

If it's in P, it's good enough for me!

Here's part of my view on LP-based approximation algorithms, where I work mostly. A significant fraction of the best papers come out of carefully applying one simple new technique or a certain ordering of known techniques. Whether you think that is awesome or terrible is a matter of opinion. But from my perspective as a learner it is ideal: you are able to closely keep track of major developments, and build skills that complement each other. Maybe other sub-fields are the same way - I have little experience to speak of for any other one.

Also, approximation algorithms is reasonably close to real-life problems overall, but that doesn't force the entire theoretical field to have practical running times.

Also, randomly, Mihai, are you doing Code Jam this year? Have you found you are able to find enough time to do that stuff now that you have "a job" as they say?

Mihai said...

Maybe I am picking the statements a bit out of context but I hope you didn't tell the OR guy about your opinion on LP or SDP.

Of course I did! Approximation people I've met do agree with this, though usually in a positive light ("that's exactly why algorithmic developments like those in the X paper are so exciting").

Just to clarify, I think solving LP is a really cool problem. I'm just complaining about the unexciting algorithmic contents of many *applications* of LP.

Mihai said...

Noam:

(1) Most people working on the P=NP problem don't take it literally as you seem to.

I am probably missing the connection to what I wrote. No matter how liberally you interpret P vs NP, it's not talking about computing max flows, or the edit distance, or reachability in dynamic graphs.

(2) The most important *algorithmic* technique that I teach my students is "understand the problem structure better".

It's a great technique in geometry and graph theory, too :), and much work in approximation algorithms certainly uses it. Understanding the problem structure does not make an algorithm.

(3) The point with approximation is to fix the *wrong* modeling of the word "solution" in the problem definition for algorithms: finding the "best" is usually the wrong formulation of what is desired -- "close to best" is more like it.

I agree, but I think TCS is wrong in pushing this idea too much. Certainly, there's imprecision in the data and the modeling, so imprecision in the output is not a killer. But an exact solution never hurts anybody, if you can get it (it won't be an exact solution for your unmodelable real world problem, but it will be closer).

And imprecision in the modeling is fine enough that Amazon is still happy about a 2% improvement (this is not wiped out by noise, presumably). On the other hand, you hear TCS people use this idea to justify a 2-approximation!

4) It's a bit strange to see Mechanism Design mentioned as a sub-branch of approximation algorithms.

I yield to your opinion. I'm only repeating something that I heard from people in the area (that it often "feels" like approximation algorithms), though I perhaps misunderstood the connection.

Anonymous said...

No matter how liberally you interpret P vs NP, it's not talking about computing max flows, or the edit distance, or reachability in dynamic graphs.

Not that clear; there are results showing that strong (i.e. exponential) hardness of SAT implies some hardness for problems that are in P. (In particular, I think people showed that then k-clique cannot be done in time n^{o(k)}.)

Mihai said...

Not that clear, there are results showing that strong hardness of SAT implies some hardness for problems that are in P.

If you look at the very recent list of SODA accepted papers, you will notice that I am quite familiar with this direction :)

Mihai said...

Obelix, I love the term "mathematical computer science." I would be much happier to classify, say, property testing as mathematical computer science than theoretical computer science.

Dave, it is exactly people like you that I'm trying to change :) As you say yourself, you have little experience with the other fields, which leads to little appreciation of algorithmic work in those fields.

Membership in P is neither interesting nor useful for very many problems.

Also, approximation algorithms is reasonably close to real-life problems overall,

It is much more removed from real life than exact algorithms. There are quite a few success stories of exact algorithms in real life. Most success stories of approximation algorithms come from the OR community, not us --- and they do a different flavor of work.

Also, randomly, Mihai, are you doing Code Jam this year?

By doing code jam, do you mean compete? I haven't really taken part in any contests after high school, and I intend to keep it that way :) First of all, I am too old :) More importantly, contests beyond high school test coding too much --- I am a reasonable coder, but not a fast one. At IOI this was never a problem.

I have participated, and still do, in scientific committees though.

Anonymous said...

Mihai, I presume you grant that time is not the only interesting algorithmic resource or objective? For example, space bounds are also interesting, right?

If you like, as far as approximation goes, think of "solution quality" as just another objective function. One can ask for algorithms that run quickly, use small amounts of space, only a little randomness, a few rounds of communication, etc. People studying approximation are asking for algorithms that maximize solution quality, instead of minimizing time.

And then there are tradeoffs between these objectives, such as between time and space. Does it not require an "understanding and appreciation of computation" to say that in quasi-polynomial time, I can give you an approximation scheme, in n^3 time, you can get a constant-factor approximation, and a linear-time greedy algorithm gives an O(log n) approximation?

To second Noam's point, understanding problem structure is certainly algorithmically important. You said in the previous post that you found the question "Can you beat backtracking for the following list of problems?" fascinating. Well, the reason one can beat backtracking for some other problems (pick your favorite dynamic programming example) is that they do exhibit a lot of structure. Without understanding that, as you probably agree, one wouldn't be able to design good algorithms.

And even the NP-complete problems have very different structures. Knapsack can be solved arbitrarily well, and Clique is basically hopeless in general graphs. A simple greedy algorithm gives a ln n approximation for Set Cover, and no polynomial time algorithm can do any better! Do you not think it interesting from an algorithmic perspective to know that some problems can be solved nearly optimally in close to linear time, while others can't be solved at all in n^10 time?
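(For concreteness: the Set Cover greedy algorithm this comment refers to is a one-liner to state: repeatedly pick the set covering the most uncovered elements. A minimal sketch, with a made-up toy instance:)

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most uncovered elements.
    This gives a (ln n + 1)-approximation for Set Cover, which is
    essentially the best possible in polynomial time."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("sets do not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen

# Toy instance over the universe {1, ..., 6}.
sets = [frozenset({1, 2, 3, 4}), frozenset({4, 5, 6}),
        frozenset({1, 2}), frozenset({5, 6})]
print(len(greedy_set_cover(range(1, 7), sets)))  # -> 2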

In fact, I agree with your comment about interesting behavior occurring at phase transitions; that's exactly right! Phase transitions of this type occur all the time in approximation algorithms; it's just that the relevant variable is not always time. For instance, Set Cover has a very interesting transition around the ratio of ln n. A very simple algorithm gives a 2-approximation to Vertex Cover, but doing any better seems remarkably difficult. Knapsack, by contrast, exhibits no such transition, with a smooth tradeoff between time and solution quality. Isn't understanding these problems, their phase transitions and how one can solve them, part of the algorithms and complexity that we're "supposed to do"?

Anonymous said...

Mihai, sure it is important to have diverse taste. Back in my frat days I did distributed algorithms, so I try to practice this in addition to preaching. I don't currently spend time following anything very closely outside approximation/LPs/combinatorial optimization though (depth/breadth tradeoff).

I would be really interested to hear your perspective on whether what I have seen also occurs in your area of expertise, whether a significant fraction of the best papers come out of carefully applying one simple new technique or a certain ordering of known techniques.

Anonymous said...

Also, let me indulge my inner troll and say that making claims about which theory subfield is closest to applications is a little like millionaire rappers debating which one's the gangsta-est.

Anonymous said...

(2) The most important *algorithmic* technique that I teach my students is "understand the problem structure better".

It's a great technique in geometry and graph theory, too :), and much work in approximation algorithm certainly uses it. Understanding the problem structure does not make an algorithm.

Actually, it does in many cases. In fact, it is, imho, the only way to get a good algorithm. Think back to your favorite algorithms (matching? shortest paths? max flows?) and you will realize that understanding the structure is what leads to an interesting algorithm.

Your remark is like saying that spinning cotton to make fabric does not make your boxers. It may be true in the literal sense: someone has to design those boxers, cut the garments, stitch them, etc.

But then one can go a step further and say that designing a fast algorithm does not solve a problem either. Someone has to code it up properly, and Intel has to make the chips that run that code without interference from cosmic rays, and Samsung has to make monitors to display the results, and I could go on and on.

Maybe I am a mathematical computer scientist, but I think a work that reveals the structure of the problem (say ARV) is better than a paper containing an improvement from O(n log n) to O(n) for the problem of finding the longest multipermutation. My guess is that in the long run, the former would have much more impact on the real world than the latter.

BTW, I am sympathetic to the argument that if everyone worked on making the fabric and no one knew how to make garments, it wouldn't be so good for the field. Maybe we are swinging too far to one side. But in approximation algorithms, the current time seems like a good time to be understanding structure and we may be right in concentrating our energies on that.

Anonymous said...

Mihai,

No offense meant, but I find arguments in your post somewhat shallow. In fact, I gave up reading the post after I was done reading about 70-75% of it...

It seemed to me that you ended up writing more text about a pre-formed opinion rather than a well-thought-out and conclusive argument about why thinking of P as the class of "efficient" algorithms is bad philosophy. Once again, no offense meant; I do find quite a good fraction of your posts interesting :)

Mihai said...

Mihai, I presume you grant that time is not the only interesting algorithmic resource or objective?

Sure. But you grant that some are less interesting and meaningful than others. (Eg, running time on a Turing machine is certainly an objective, but not one I will work on.)

Judging based on my own understanding of, and taste in, computation, focusing on approximation often (i.e. certainly not always) generates uninteresting computational questions.

Look, half the questions there are "how much do we need to relax this before the space of solutions becomes convex?"

Well, the reason one can beat backtracking for some other problems (pick your favorite dynamic programming example) is that they do exhibit a lot of structure.

For the record, I am a big fan of dynamic programming, and I'm disappointed that our community didn't have too much to say about it.

Isn't understanding these problems, their phase transitions and how one can solve them, part of the algorithms and complexity that we're "supposed to do"?

I promise to return to this issue, but for now let me say that you can find algorithms and phase transitions everywhere. This doesn't make them all interesting or important.

Mihai said...

@daveagp:

I would be really interested to hear your perspective on whether what I have seen also occurs in your area of expertise, whether a significant fraction of the best papers come out of carefully applying one simple new technique or a certain ordering of known techniques.

The best papers are certainly the ones that have one crisp idea that makes everything possible. Other good papers rely on a combination of known ideas. This is often much harder to quantify, but in time (with examples) the way of putting things together becomes a technique in itself.

Also let indulge my inner troll and say that making claims on which theory subfield is the closest to applications is a little like millionaire rappers debating which one's the gangsta-est

Your opinion is certainly common among young TCS folk. I've heard things like "this is a sport", "we're really mathematics with a better excuse for getting money" etc etc.

The reason I prefer TCS over mathematics is that it is a deep theoretical study of something real. If we stopped caring about the connection to reality, (for me) it would be a disaster.

Mihai said...

Maybe I am mathematical computer scientist, but I think a work that reveals the structure of the problem (say ARV) is better than a paper containing an improvement from O(n log n) to O(n) for the problem of finding the longest multipermutation. My guess is that in the long run, the former would have much more impact on the real world than the former.

The multipermutation problem is an algorithmic exercise, something that you may solve to keep your mind fresh. Nobody is promoting that as serious science.

Sparsest cut is certainly an important problem. Since you bring that up: (1) approximating this problem makes much more sense than for your average problem; (2) we still don't know an algorithm with theoretical guarantees that matches things like METIS; (3) if you are going to lay out chips at Intel, you are never going to afford a quadratic algorithm, making my point that you always want faster running times.

Mihai said...

No offense meant, but I find arguments in your post somewhat shallow. In fact, I gave up reading the post after I was done reading about 70-75% of it...

No offense meant, but writing that something is shallow when you didn't even read it and have nothing in particular to say about it --- that's quite the definition of shallow :)

Anonymous said...

While one may agree or disagree with some of your observations, the discussion about where TCS is going is valuable in itself. Arguably, at certain points in time, AI and DB worked themselves into tight corners, to the detriment of progress, jobs, and funds there.

We should routinely ask questions such as:

why are we doing complexity?
why algorithms?
why data structures?
why approximation algorithms?
why differential privacy?

What is their connection to the real world: direct/indirect/none?

Is the sub-field justified? If so, are we asking the right questions, i.e., are we setting out a program which, if successful, will lead to big gains, or are we mostly running around in circles proving irrelevant variations on results?