I know that whenever I write about TCS politics on this blog, it ends up bad. For instance, I get a comment such as the following one (left by an anonymous to my last post):
What makes it tough for some of the papers you cite is the view that shaving off log factors is often viewed as much less interesting than larger improvements.This, of course, makes my latin blood run even hotter, and I cannot help writing follow-up posts (this is the first). If only I could keep my promise of not writing about politics, my life would be so much simpler. (If only I could learn from history... I got to observe my father become a leading figure in Romanian Dermatology a decade before he could get a faculty position — mainly due to his latin blood. He got a faculty position well into his 50s, essentially going straight to department chair after the previous chair retired.)
So, let's talk about shaving off log factors (a long overdue topic on this blog). As one of my friends once said:
All this talk about shaving off log factors from complexity people, who aren't even capable of shaving on a log factor into those circuit lower bounds...There is something very deep in this quote. Complexity theorists have gone way too long without making progress on proving hardness, their raison d'être. During this time, drawing targets around the few accidental arrows that hit walls became the accepted methodology. For instance, this led to an obsession about the polynomial / non-polynomial difference, where at least we had an accepted conjecture and some techniques for proving something.
Complexity theory is not about polynomial versus non-polynomial running times. Complexity theory is about looking at computational problems and classifying then "structurally" by their hardness. There are beautiful structures in data structures:
- dictionaries take constant time, randomized. (But if we could prove that deterministically, dynamic dictionaries need superconstant time per operation, it would be a very powerful message about the power of randomness — one that computer scientists could understand better than "any randomized algorithm in time nc can be simulated deterministically in time n10c if E requires exponential size circuits.")
- predecessor search requires log-log time. The lower bound uses direct sum arguments for round elimination in communication complexity, a very "complexity topic." A large class of problems are equivalent to predecessor search, by reductions.
- the hardness of many problems is related to the structure of a binary hierarchy. These have bound of Θ(lg n) or Θ(lg n / lglg n) depending on interesting information-theoretic issues (roughly, can you sketch a subproblem with low entropy?). There are many nonobvious reductions between such problems.
- we have a less sharp understanding of problems above the logarithmic barrier, but knowledge is slowly developing. For instance, I have a conjecture about 3-player number-on-forehead games that would imply nΩ(1) for a large class of problems (reductions, again!). [This was in my Dagstuhl 2008 talk; I guess I should write it down at some point.]
- the last class of problems are the "really hard" ones: high-dimensional problems for which there is a sharp transition between "exponential space and really fast query time" and "linear space and really slow query time." Whether or not there are reductions among these is a question that has preoccupied people for quite a while (you need some gap amplification, a la PCP). Right now, we can only prove optimal bounds for decision trees (via communication complexity), and some weak connections to NP (if SAT requires strongly exponential time, partial match requires weakly exponential space).
Let's look at algorithms:
- Some problems take linear time (often in very non-obvious ways).
- Sorting seems to take super-linear time, and some problems seem to be as fast as sorting. My favorite example: undirected shortest paths takes linear time, but for directed graphs it seems you need sorting. Why?
- FFT seems to require Θ(n lg n) time. I cannot over-emphasize how powerful an interdisciplinary message it would be, if we could prove this. There are related problems: if you can beat the permutation bound in external memory, you can solve FFT in o(n lg n). The permutation bound in external memory is, to me, the most promissing attack to circuit lower bounds.
- some problems circle around the Θ(n sqrt(n)) bound, for reasons unclear. Examples: flow, shortest paths with negative lengths, min convolution with a mask. But we do have some reductions (bipartite matching is as hard as flow, bidirectionally).
- some problems circle around the n2 bound. Here we do have the beginning of a classification: 3SUM-hard problems. But there are many more things that we cannot classify: edit distance and many other dynamic programs, min convolution (signal processing people thought hard about it), etc.
- some problems have an n*sort(n) upper bound, and are shown to be X+Y-hard. Though the time distinction between n2 and n*sort(n) is tiny, the X+Y question is as tantalizing as they get.
- some problems can be solved in nω by fast matrix multiplication, while others seem to be stuck at n3 (all pairs shortest paths, given-weight triangle). But interestingly, this class is related to the n2 problems: if 3SUM needs quadratic time, given-weight triangle requires cubic time; and if min-convolution requires quadratic time, APSP requires cubic time.
- what can we say about all those dynamic programs that run in time n5 or something like that? To this party, TCS comes empty-handed.
- how about problems in super-polynomial sub-exponential running time? Ignoring this regime is why the misguided "polynomial / non-polynomial" distinction is often confused with the very meaningful "exponential hardness." There is much recent work here in fixed-parameter tractability. One can show, for instance, that k-clique requires nΩ(k) time, or that some problems require 2Ω(tree-width) time.
And what can we say about k-SUM and all the k-SUM-hard problems (computational geometry in k dimensions)? This is an important illustration of the "curse of dimensionality" in geometry. I can show that if 3SAT takes exponential time, k-SUM takes nΩ(k) time.
Finally, what can we say about PTAS running times? In my paper with Piotr and Alex, we showed that some geometric problems requires nΩ~(1/ε^2) running time. This has a powerful structural message: the best thing to do is to exhaustive search after a Johnson-Lindenstrauss projection.
- inside exponential running time, there is the little-known work of [Impagliazzo-Paturi] showing, for instance, that sparse-3SAT is as hard as general 3SAT. Much more can be done here.