Wednesday, December 10, 2008

Succincter

'Tis the season to be doing succinct data structures.


[P] My FOCS'08 paper demonstrated how to use recursion to achieve better redundancy. For many fundamental problems, it achieved a redundancy of n / (lg n / t)t bits with query time O(t). The toy problem that people remember from my paper is storing trits: store A[1..n], where A[i] in {1,2,3}, using close to n log23 bits and fast access to A[i].


[Golynski] In SODA'09, Alexander Golynski (PhD from Waterloo, currently at Google NY) showed a remarkable lower bound for succinct data structures. Suppose you want to represent a permutation π such that you can access π(x) and π-1(x) in time t. Then, your data structure needs redundancy Ω(n lg n / t2) bits. In particular, for constant access time, your data structure needs a constant factor more space than the minimum!

This is essentially the first "meaningful" lower bound for succinct data structures. [Gal-Miltersen] did prove a previous lower bound, but for a much harder problem: polynomial evaluation. (For this problem, we believe sublinear query time requires superpolynomial space, so never mind succinct data structures.)

Notice that Golynski's lower bound is much higher than my upper bound for storing trits and other problems. His bound is in fact tight for storing a permutation.


[Viola] Emanuele Viola looks at the trits problem in the bit-probe model and shows that, with bit probe complexity t, the redundancy needs to be at least n / 2O(t). My result implies a redundancy of n / 2O(sqrt{t}) if t is the bit-probe complexity. 

As I always want to point out, the true model is the word RAM, but the bit-probe model is sometimes interesting as a mathematical curiosity.


[Mitzenmacher-Wiener] Michael has a student working on an implementation of the results in my paper. You could've guessed it from here, I suppose.


[P.-Thorup] In recent ongoing work, we gave a very strong result for the trits problem: we can achieve a redundancy of exactly 2 bits, with O(1) query time. This completely kills the problem.

In the bit-probe model, it immediately implies an upper bound matching [Viola].

It is not yet clear whether such strong upper bounds can be extended beyond the trits problem (e.g., for dictionaries and arithmetic coding). But we have some ideas and we are working hard on it (i.e. drinking and hiking).


Monday, December 1, 2008

B.A.M. 2

The Berkeley Algorithmists' Meeting will continue on Tuesday at 3:30pm.

Scheduling. There have been enough calls to change the time that I will do a recount of the votes and probably change it starting next week. If you would like to come and have not sent me your time constraints, please send them ASAP to mip@alum-mit-edu

Lecture notes. There are definite plans to post lecture notes here, though scheduling constraints will delay this for another week or so.

New topic: Orthogonal Range Queries

I want to spend a couple of meeting talking about range queries, a hugely important topic (in my opinion) that we don't teach too much of. The theory that has developed around this topic is one of the deepest examples of a coherent theory in TCS (maybe close to PCPs, depending on your subjective perception of depth).

Some ideas we will discuss, in an order TBD:

  • persistence
  • predecessor search (especially van Emde Boas)
  • buffer trees
  • RMQ
  • the 3D structure of Nekrich
  • rank/select
  • lower bounds

Saturday, November 29, 2008

Changes in TCS

At the first BAM, I got an idea of how little complexity theory has changed in three decades: after I described Morgenstern's lower bound for arithmetic circuits with bounded constants, Christos noted that he gave a talk about the same paper in 1978, when he was a postdoc at Berkeley. We still have no clue how to prove a superlinear lower bound in the unrestricted model.

At the same meeting, I got to observe a change in the people that define our community, a change that fed right into my anxiety that "Theory of Computer Science" is becoming "Theory." After I defined and drew a butterfly graph, I asked how many people have seen this before; some 3 or 4 people raised their hands (which drops to a very small number once you subtract Christos and Dick Karp). But when I asked who knew that "Fourier characters are orthogonal," all but one had their hands up!

I, for one, grew up thinking supercomputers were cool, and knew all about routing messages, before I learned about those "linear systems" that people apparently wanted to solve on big machines. And, yes, I am suspicious of people who had a "theory is cool!" moment before 100 "CS is cool!" moments.

Sunday, November 23, 2008

Berkeley Algorithmists' Meetings

I am starting a new reading group at Berkeley. (Actually, I prefer the term "discussion group," since I was never too passionate about reading papers anyway; instead, I very much enjoy cafe-style discussion of open problems.)

For lack of a better name, this series will go by BAM (Berkeley Algorithmists' Meetings). Regarding coverage, I did not promise anything more specific than:

Since nobody can decide whether I'm doing complexity or algorithms, I will aim to keep this discussion group at the same level of ambiguity.

Presentations will transcend the boundaries of any single paper, as we will try to get a more comprehensive view of each topic, without getting bogged down in technicalities. We will talk amply about open problems.
I am hoping that all presentations will be transcribed as posts on this blog. Thus, feel free to request topics that you want to read about. You will get slightly less weight than the physical audience, but your comments will be heard.

The pilot meeting is this Tuesday, at 3:30pm. If you are in the area and would like to attend, check back tomorrow for the room details.

The topic for the first meeting is convolution and the FFT:
  • how to think combinatorially about the FFT
  • an Ω(nlg n) lower bound for arithmetic circuits with bounded coefficients [Morgenstern, JACM'73]
  • an Ω(nlglg n) lower bound for offline range counting [Chazelle, STOC'95]
  • discussion of open problems

Thursday, November 20, 2008

Three Days in San Francisco

I started the week with a Misfits concert, those 31-year-old punk classics that have influenced all-time favorites like Slayer and Metallica.

Historical note: If the following are familiar to you...
  • "Die, die, die, my darling..."
  • "Here in this place lies the genie of death"
  • "I got something to say: I killed your baby today"
... chances are it is the Metallica inflections that are now ringing in your head. However, it is worth remembering that these are originally Misfits songs.

I continued with Alistair Sinclair's "Mixing times for the solid-on-solid model". As usual, I think this is great theory, but not theory of computer science. (Here, Mikkel tries to make a similar point, cutting even deeper to define theory of computer science.)
Personal note: Attending such talks is always very interesting to me. In a short previous life, I was a Physicist (I even won 1st and 2nd prizes in the Romanian Physics Olympiad in '96 - '97). This left me with great physics intuition, but not that much mathematical intuition on how to formalize "physics proofs" (at that age, my appreciation of Math was very limited). It is very rewarding to catch up on the mathematical intuition, whenever I get a chance.

Finally, I had a chance to listen to some great music from Nels Cline and his band. This led to the interesting exercise of trying to triangulate Cline's heavy, academic jazz relative to black jazz and the Béla Bartók classical music.

The U.S. is a big country, but there are not many places where you can hear the Misfits, Sinclair, and Cline in a span of three days.

Monday, November 17, 2008

STOC

I decided yesterday not to submit anything to STOC, so I spent my most relaxing deadline day yet. For one paper, I found a deadly bug, while for another I found an improvement (moving it back to the column of things I need to think more about).

The main lesson for me was that there are limits to everything, and that dealing with an endless stream of stuff (from my defense, to writing a thesis, to visa issues, to searching for an apartment, to IBM sign on stuff, to paperwork for buying a motorcycle...) can really kill research.

Now I look forward to a year without deadlines (I am on the FOCS program committee). I hope to spend the first month catching up with things, and the rest doing what I actually enjoy doing, i.e. thinking about my problems. If this plan fails, I might have to elope to Africa again for a few months :)

As for this blog, I will probably get back to writing interesting technical stuff soon. I expect the non-technical stuff to be kept to a minimum. (For one, we've got too much blogging in the community, and I'm bored. Plus, I've felt disappointed with the TCS community for quite a while, and there's not much I can say in between lying and insulting some folks.)

Tuesday, November 11, 2008

Internet Access

Some 18 days ago, I set up a contract with ATT to get internet access at home. Now, finally, it works. The level of incompetence that ATT demonstrated during this process makes me very happy --- if they ever decide to cut jobs starting with the least competent people in the company, it will be a long while before my research job is threatened.

The main issue was that they sent me a non-working modem, in the most basic sense: the power led didn't come on when I plugged it in (you get an idea of how much quality control is being done). To fix this, I had to assure a lady that I did check several power plugs before waiting on the phone for half an hour to talk to somebody, and then we counted together 30 seconds while I was holding the reset button pressed. When she was finally convinced the thing was broken, she told me the "replacements department" was closed at that hour, and I had to call the next day. (Some operators need to be specially trained to fill out the replacement form?)

After I went through the same routine the second day, the replacement department assured me that they would mail a replacement with overnight delivery. And that's exactly what they did, 3 days later. (I eventually got a tracking number from them, which tells me exactly when the mailing happened.) Somebody should explain that they could probably save money and time by sending things via regular mail, just promptly.

And then comes the most unbelievable part: the package had the wrong address! Yes, I mean the package from the phone company, which should know my address pretty damn well in order to activate my line and bill me. And yes, the package from the same company that had already sent me a modem at the correct address. I just can't believe that the guy at the "replacement department" has to type in my address manually, again!

Of course, the original delivery date had to be on a Friday, so I couldn't pick up the package from UPS until Monday... Hence the whole issue was only resolved today.