Thursday, September 10, 2009

Labs vs Academia

The topic of the day seems to be labs vs academia (sparked by this, and indirectly by Muthu, picked up by Michael Mitzenmacher and Jon Katz). I thought I would contribute my own perspective, since I've been in labs for about a year, and went through university/lab and lab/lab transitions recently.

Location. For those of us with families, location is one of the main factors. It is not easy to get a job in a top lab, but it is much easier than in a top university, simply because a lot more people go through the lab system (see below). Consider the three prime markets: NY, Bay Area, and Boston.
  • In the Bay Area, Berkeley / Stanford weren't interested in interviewing me, but IBM gave me jobs (both a temporary and a full-time one, which was great).
  • In the NY area, Princeton / NYU / Columbia weren't interested, but, again, ATT gave me a job.
  • In Boston, the problem remains unsolved since MSR New England is not in the mood to hire young people. But arguably there are many more non-top-but-good universities, so this compensates (around San Francisco, for instance, this alternative is entirely lacking).
For me, location made the decision between IBM and ATT (since California tries quite hard to keep foreign medical doctors away).

To see the difference between the chance of getting a job in labs vs universities, think about the numbers: IBM, ATT, and MSR Silicon Valley were willing to hire in theory (and they did) on my particular year, but among the top universities only MIT did.

As for the non-prime locations, there is a wide range between almost-prime and really-avoid. I had offers from some almost-prime ones (Atlanta and San Diego), and I think I would've been happy with them.

However, I want to add that I don't buy Jon Katz's argument that non-prime locations are cheaper. Think about what you want; the cost depends very much on your utility function. For me, things that matter include flight time (and cost) to Europe, availability of Romanian/Eastern European wine and food, and closeness to mountains (the worst thing about NY). Things that don't matter include the price of houses in suburbs.

Career prospects. Let me make this explicit: the chance that you will grow old in a lab is slim. Think of your lab career as a 5-year postdoc, and be prepared to reapply to faculty jobs.

Sure, labs will go to great lengths to tell you that this is not true. But look around: the vast majority of old people in labs are either the big founder-of-the-group types, or people who are post-faculty (they had a faculty position and left). The number of people who pass through the lab system is significantly higher than the number who stay (which is great for young job seekers, as I mentioned above).

I am not sure what causes this empirical fact, but I can offer some hypotheses:
  • Tradition and culture encourage you to leave for universities at some point;
  • Labs reliably go through periods of instability, when things turn really bad and most of the group leaves. Some labs never recover, while others recover quite successfully, but with a significantly changed group (see IBM Almaden and ATT).
  • If you spend too many years in a lab, there is a danger that you fade into TCS oblivion. (You get integrated more into practical groups, and stop being a theorist; you lose contact with the recent trends; etc.)
  • The prospect of having students seems much more attractive once you get tired of the nitty-gritty details and would like to delegate some of them.
Thus, my recommendation to job seekers is to treat labs as a very high-risk venture, compared to a job in a university. On the one hand, if you choose the lab option, you will most likely seek to move to a university later, so make sure that your track record stays very competitive. On the other hand, if you choose a university, CS departments are exceedingly friendly, and rumour has it that tenure is easy to come by (even if your record as Assistant Professor is one notch lower than it was as a PhD student, when you were hired --- as someone put it, "people who get jobs at top places are brilliant, while people who get tenure at top places are very good").

Freedom for in-depth work. If you're reading the discussion above as saying that a university job is always better, you're missing the fact that "risk" is always two-sided. The one aspect where labs excel is the ability to do in-depth work.

As faculty, you have to worry about your classes, your students, and your grants. These will slowly push you to be more stable, reliable, and risk-averse (turning you from brilliant to very good). No matter how insensitive you are, chances are you will care about your students at least a little bit, and you will think twice before staking their career paths on some super-risky venture. (Not to mention that you have to train them before they can join you on said super-risky venture.) Ignoring the students is also hard to do in practice; they drain a lot of mental energy from you.

In a lab, you have to worry about not getting fired, and nothing else. Theory-wise, you can take large risks if you care to (you're not affecting anybody else), and you don't owe anybody any significant share of your brain.

In-breadth work. As I argued above, labs are much better for in-depth, risky, theoretical work. But what if you fancy doing non-theoretical work? The situation is exactly the opposite.

Labs are the ideal place to collaborate on standard problems with people across Computer Science. There are many people in the lab doing practical things. You get brownie points for working with them, and they are very interested in working with you (I think they also get brownie points). They don't have an empire of students that they can't really leave; they have time for you. Their problems make sense, i.e. you won't dismiss them as "practice" (as someone put it, in universities most of "practice" doesn't apply to practice, which is a big turn-off for theoreticians who try to stick their heads outside theory). You can question their model as much as you like, since lab people have the real data, and they are open to revising their understanding of the world if you can show them that data in a different light.

But if you want to go broader and riskier, you're out of luck. You simply cannot work in theoretical linguistics, both because there's nobody around you whom you can talk to, and because nobody in the management wants to hear about you doing this. And you cannot even work on anything too non-standard in computer science -- companies are surprisingly risk averse and slow. (You may remember the story of a theory guy who came up with the Google page-rank scheme around the same time; he was in an industrial lab, while the Google founders were at a university. Which of them turned this into a billion-dollar product?)

Thus, if you fancy yourself as revolutionary outside theory, go for a university (and start by getting tenure). If you want to solve really cool problems inside a bigger system, a lab is the place to be. In a university, you will simply never hear about those problems, and in a start-up you will not face them. (A start-up has to do one thing very well, and other things less than horribly; they don't have much variety for deep thinkers.)

Salary. If that matters a lot to you, I have bad news. Salaries in labs are really not what they're rumoured to be. They are, indeed, higher than in universities, but only marginally so. From the data points I have, this seems to hold at any level (e.g. comparing middle-aged lab people with Associate Professors).

In fact, be prepared to negotiate for any salary advantage. My initial offer was equal to the initial 9-month salary offer I got from a university! It got increased significantly after I laughed it off. (I heard a very good phrase that I recommend: "I certainly prefer to come to your place on intellectual grounds alone; but my wife would find it very strange that I am refusing this other job with such nice benefits." -- somehow HR people never argue with the stereotype of a wife interested in money :)

Intellectual honesty. In some sense, the lab offers you a great feeling of intellectual honesty: you have seen the real problems (even if you don't work on them). You have informed opinions on what is really important outside theory, and you may get to do such work yourself.

But the labs can also be awful on the same grounds of intellectual honesty. You have to ask for permission and IP-clearance before you publish anything. If you do anything that might be interpreted as important, there is a risk that you are forced to patent it (a horrible thing from my own ethical perspective, due to the way patents are abused and misused in the US). And if you want to write a library for the world to use, chances are it will be very hard to release it.

Personal freedom and appreciation of your work. People complain about boring faculty meetings and other such annoyances in the life of a professor. Well, labs are worse. There are, for instance, the unbelievably ridiculous things, like hour-long mandatory courses on the effectiveness of sending polite emails to coworkers. Unlike a professor, whose relation to the university is more like that of a member in a co-op, in a lab there are people above you who have no respect for you or your time.

Then there are the moderately ridiculous things, like cutting coffee to save money. Again, it comes down to the fact that the decisions are made by people who don't know you or have any respect for you.

Finally, there is the issue of travel. Faculty may complain (rightly) about the horrors of periodically begging for grants, but it is nowhere near as demeaning as begging for every single trip you want to make (whether your lab should pay, or some external source is paying). I'm not saying travel approvals are harder to write -- they are much easier than grant proposals; but their periodic, insignificant nature makes them demeaning. (This issue is especially bad during times of economic hardship. Muthu once mentioned a deal of the form "add $15K to my salary and I'll never file a reimbursement form." In certain times, the deal you get at labs is really like "we'll add $0 to your salary, and you never file a reimbursement form.")

Of course, objectively things are not bad. We have high enough salaries to afford the coffee, and even all the scientific travel if it comes to that (it's tax deductible, so the dollar cost to you is halved). It's all about the fact that somebody far above you sees no difference between your job and a dishwasher's.

There are probably many aspects of the lab vs. university comparison that I missed, so feel free to comment or ask.

Wednesday, September 9, 2009

SODA / Data Structures

The SODA list of accepted papers, with abstracts, is here.

Here are the clear data structures papers:
  • Fully-Functional Succinct Trees (Kunihiko Sadakane, Gonzalo Navarro)

  • On the Cell Probe Complexity of Dynamic Membership (Ke Yi, Qin Zhang)

  • Data Structures for Range Minimum Queries in Multidimensional Arrays (Hao Yuan, Mikhail J. Atallah)

  • A locality-sensitive hash for real vectors (Tyler Neylon)

  • Deletion Without Rebalancing in Balanced Binary Trees (Siddhartha Sen, Robert E. Tarjan)

  • Cache-Oblivious Dynamic Dictionaries with Optimal Update/Query Tradeoff (Gerth Stølting Brodal, Erik D. Demaine, Jeremy T. Fineman, John Iacono, Stefan Langerman, J. Ian Munro)

  • Cell-probe lower bounds for prefix sums and matching brackets (Emanuele Viola)
    + A Lower Bound for Succinct Rank Queries (Mihai Patrascu)

  • Applications of Forbidden 0-1 Matrices to Search Trees and Path Compression-Based Data Structures (Seth Pettie)

The following are also data structures papers, though they can be seen as straddling two areas:
  • A Nearly Optimal Algorithm for Approximating Replacement Paths and k Shortest Simple Paths in General Graphs (Aaron Bernstein) -- the main contribution is better preprocessing time of a data structure

  • Counting Inversions, Offline Orthogonal Range Counting, and Related Problems (Timothy M. Chan, Mihai Patrascu) -- improves the preprocessing and offline times of some data structures

  • Highway Dimension, Shortest Paths, and Provably Efficient Algorithms (Ittai Abraham, Amos Fiat, Andrew Goldberg, Renato Werneck) -- distance oracles are a very important topic in data structures; this paper is more about modelling, though.

  • On the Exact Space Complexity of Sketching and Streaming Small Norms (Daniel M. Kane, Jelani Nelson, David P. Woodruff) -- whenever you talk about streaming and care about time complexity, this is clearly a data structure; if you solve the problem with k-wise independent hash functions, even more so.

  • The 1+β Choice Process and Weighted Balls-into-Bins (Yuval Peres, Kunal Talwar, Udi Wieder) -- more on the power of two choices, though the process is probably unrealistic in a data structures context.

  • Regular Expression Matching with Multi-Strings (Philip Bille, Mikkel Thorup) -- stringology is close enough :)

  • Fast distance multiplication of unit-Monge matrices (Alexander Tiskin) -- also stringology.

I find the number of papers to be on the low side. There are a couple of rejected papers where I may disagree with the wisdom of the PC, but overall I think the numbers just reflect the fact that the field is old, and progress comes hard.

Disclaimer: Note that this list is not so closely correlated with my interests. There are data structure papers that I am not interested in, and many non-data-structures papers that I am interested in (even among my own papers, only half can be found above).

Thursday, September 3, 2009

What is efficient?

In my previous post, I wrote about how calling P "efficient computation" is nothing more than bad philosophy. I think a lot more examples are in order.

Are problems constant size? Opinions that P is efficient often stem from the following understanding of what computers are used for. Let's say that I have to optimize the bus schedules in NYC, and there are some 200 bus routes to consider. Certainly, an O(n^3) algorithm is worse than an O(n lg n) algorithm, but there is a really large gap between both of these and an O(n!) algorithm. Indeed, with an O(n^3) running time I will have to buy more/bigger computers, and let them run for a week instead of 30 seconds. But with an O(n!) algorithm, throwing more resources at the problem is simply not possible (I would need a computer the size of Earth). In general, as machines grow in power, more and more polynomial algorithms become usable, whereas exponential algorithms remain similarly useless.
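To make that gap concrete, here is a back-of-the-envelope check. The machine speed (10^9 basic operations per second) is a rough assumption for the sake of arithmetic, not a measured figure:

```python
import math

# Feasibility sketch for n = 200 bus routes, assuming (hypothetically)
# a machine that performs ~10^9 basic operations per second.
n = 200
ops_per_sec = 10**9

poly = n**3                  # O(n^3) work: 8,000,000 operations
expo = math.factorial(n)     # O(n!) work: a 375-digit number of operations

print(poly / ops_per_sec, "seconds for the cubic algorithm")      # 0.008
print(len(str(expo // ops_per_sec)),
      "digits in the number of seconds for the factorial algorithm")  # 366
```

The cubic algorithm finishes in under a second; the factorial one would need a number of seconds with hundreds of digits, so no amount of extra hardware helps.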

The big fallacy of this argument is assuming that we know what we're doing. We don't! The problem, and the size of the problem, change, often faster than the speed of our computers.

If you give me an O(n^3) algorithm for the bus scheduling problem, all of a sudden I will realize that I can ask for more. Before, I was assuming that each bus route should have a fixed frequency, but certainly I can get better results (in real life) if I allow variable gaps between consecutive buses. All of a sudden, the number of variables in your problem will explode (e.g., there might be 30 cycles on the same bus route during the day). Why did it explode? Only because you told me that you had a faster algorithm.

Twenty years ago, nobody could envision problems of the scale that Google solves today. Now we can, and it's all the fault of the hardware guys. They managed to push storage capacity (disk and RAM) so far that it makes sense to ask questions about billions of web pages.

For as long as Moore's law is (was) active, we can at best expect the memory size and the processor "power" to grow at the same rate. Even worse, processor power will probably lag behind: doubling the density will roughly double my RAM capacity, but algorithmically I will probably not be able to do double the computation on the CPU (it's hard to exploit parallelism perfectly). With an exponential gap between memory and CPU, it's only linear time algorithms that survive.

We don't know what we're doing. A related issue, highlighted in the buses example, is that we don't know what we ultimately want to solve. It depends on what we can solve. If I give you a faster algorithm, you will want to model your problem with more details, making my algorithm too slow.

The core problem that Internet routers have to solve comes down to nothing more than a binary search (each subnetwork has an IP range, and I have to figure out in which of these ranges my destination IP lies). Could we solve it in O(lg n)? With current data rates and traffic, we probably could (barely). But if we speed it up, we can now maintain that interesting statistic, which allows us to bill customers more accurately. And if we could speed it up some more, we have time-per-packet to fit that extra component, which will allow us to track virus infections ex post facto.
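As a toy sketch of that binary search, with made-up integer addresses and interface names (a real forwarding table works on 32-bit IPv4 prefixes and involves far more engineering):

```python
import bisect

# Each subnetwork owns a disjoint address range; routing a packet means
# finding which range the destination address falls in. With the ranges
# sorted by start address, this is a single O(lg n) binary search.
ranges = [                       # (start, end, next_hop) -- sorted, disjoint
    (0,   99,  "if0"),
    (100, 199, "if1"),
    (200, 349, "if2"),
    (350, 999, "if3"),
]
starts = [r[0] for r in ranges]

def route(dest: int) -> str:
    i = bisect.bisect_right(starts, dest) - 1   # last range starting <= dest
    start, end, hop = ranges[i]
    assert start <= dest <= end, "destination outside known ranges"
    return hop

print(route(250))   # if2
```

The point of the example is only how small the per-packet work is: a handful of comparisons, which is exactly why every extra feature (billing statistics, infection tracking) competes for the same tiny time budget.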

Functionality is never fixed. The more you can do, the more I will ask you to do. It makes equal sense to improve an O(n!) algorithm to O(2^n), or to improve an O(n^3) algorithm to O(n^2), or to improve hash tables by 20% (they're already constant time!). In all cases, you will notice that the problem size was just at the limit of what you could handle, and the new algorithm will make somebody very happy.

Apart from the many examples that I heard about at ATT, I can give another. I am working with an OR guy, the kind of person who teaches at Business School and spent significant time optimizing truck schedules for Amazon (not in theory, but actually chasing 2% improvements that they implemented). Do you know what problem he brought to me? He wants to improve an O(n lg n) algorithm.

I remember a quote from David Karger in Advanced Algorithms. He said that one week after you improve a running time from O(n^2) to O(n lg^2 n), you should celebrate. But two years later, you should work on improving it to O(n); most likely, it is a very interesting problem by now.

Intellectual curiosity. Ultimately, all arguments about reality cannot compete with our (very developed) sense of the aesthetic. What lives and dies in a branch of mathematics is determined by what triggers our intellectual curiosity.

My firm belief is that whenever you have a clean and elegant problem, there will be some fascinating behavior, in the eyes of somebody who loves to understand computation, at any level of the running-time hierarchy.

I have heard the claim that P-vs-NP is the most interesting question, because it's the root node in our binary search tree for understanding problems --- the first thing we ask is whether this problem is in P or not, and then proceed with two different sets of techniques depending on the answer. The root node of the classification tree has to be the most interesting.

But this argument pales when you remember that we're really chasing the intellectually challenging cases. For many problems, a 5th grader who just learned programming can give a polynomial algorithm. That doesn't tell me anything interesting from a scientific perspective. The most interesting behavior happens right at the phase transition where the problem becomes hard. For some problems, that's at O(lg lg n), while for others it's at O(2^n).

A problem in current TCS. Of course, I cannot write a post without being provocative :) So let me state what I view as a significant problem in current TCS.

There is a field of study, namely approximation algorithms (which includes such newer branches as mechanism design), in which you can do fine without much understanding or appreciation of computation. The name of the game in this subfield is rounding LPs and SDPs, a task which requires geometric skills and discrete Maths skills, but not much algorithmic skills. Heck, many papers in the area don't even have a for loop in them! The only interesting algorithmic content is solving the LP or SDP, which few people really work on.

This has led to a horde of TCS students with very little appreciation of algorithmic phenomena, who basically learned how to hide them all away under the "polynomial" hood. (I'm sure you've met people who are surprised at the idea of storing some forward and backward pointers between two pieces of information for navigation throughout the algorithm...)

I hope this doesn't ruin the future of our field. Once there's a significant group of such people, we could easily diverge into creating useless problems that can be solved by these techniques, and lose track of the fact that we were supposed to do algorithms and complexity. (As I said before, the field evolves based on what the field finds exciting. Another way to put it is that researchers are often in the business of drawing targets around the arrows that they manage to shoot, and I can see these new targets drifting away from the original big one.)

The encouraging bit is that the standards for success in approximation algorithms (for old researchers) still involve crystal-clear algorithmic thinking. Consider, for instance, the two recent algorithms researchers who won the Gödel Prize: Dan Spielman and Shanghua Teng. They moved from a major polynomial-vs-exponential result with mostly mathematical content (smoothed complexity of LP) to a much more algorithmic question, approximating the sparsest cut in linear time. This latter problem has picked up quite a bit of steam in recent years.

Wednesday, September 2, 2009

Food for thought

Here is a short movie (with subtitles) on Romanian parents whose kids emigrated to the US. It is hilarious, at least for people who have gone through this.

Funnily enough, my own parents went through an abridged version of this, even though my mother has a Bachelor's in Computer Science. She did her university studies just as Computer Science was introduced, and she got a degree after writing FORTRAN programs on punch cards. After graduating, her parents convinced her that there was no future in this discipline, so she became a mathematician.

Here is a funny article on religion published in the Cosmopolitan in the 1920s. The author was scared by religious fundamentalists who tried to introduce bills limiting the teaching of evolution. Sound familiar?

When it comes to social issues, I am always more comfortable in Romanian circles than in the US. I never feel the need to engage in a conversation on the existence of God, since I genuinely consider any person who believes such a thing to be intellectually inferior --- he prefers to rationalize a feeling rather than "truly feel" a logical conclusion. (Of course, my position is not politically correct in the US.) The truly interesting question, to me, is whether we should allow discrimination on religious grounds --- after all, would you want a professor who has proven himself to be intellectually inferior? While I cannot give a definite answer, I think it is OK to accept religious people as scientists, since there is a lot of technical work in science --- i.e. there are many useful skills, and they seem to be distributed independently of the particular feature on which religious people show inferiority. Thus, rejecting religious people would reduce the pool of talent, rather than concentrate it.

Via Luca, here is a bad philosophical piece on how efficient algorithms show that God exists. There are, of course, many cringe-inducing examples of bad philosophy that abuses concepts in computer science. This is just one that I saw today.

To quote some from last week, consider some talks from the Barriers workshop at Princeton. "P vs NP is really the deepest question in Mathematics, since it talks about the essence of Mathematical proofs." "The decades since P-vs-NP was posed really convinced us that this is the right question to ask about computational efficiency." "P!=NP is a law of the universe."

Now, "computational efficiency" means many things in many contexts. If you're sitting inside ATT routers, O(lg n) may be trivial to achieve, but a prohibitive cost. If you're aggregating over Google's data, O(n^2) may be trivial, but prohibitive. And if you're optimizing the bus schedules in NY, O(n!) may be trivial, but prohibitive.

P is not the universal definition of "efficient." The P vs NP question is a misnomer for the question "Can you beat backtracking for the following list [...] of problems?", which is, by all accounts, a fascinating question. But substituting P with "efficient" and talking about the universe is just bad philosophy abusing sound concepts in Computer Science.

Any attempt to ascribe metaphysical (or "metamathematical") meaning to this question is, like religion, a failure to accept the most plausible interpretation of reality, leading to an urge to replace it with a feel-good, half-logical explanation.