Saturday, February 23, 2008

End of Trip

After almost 3 months, I'm back to MIT.

I flew out of Luxemburg this morning, biked to Hague for lunch with a high-school friend, and now I have a few hours before attending a big Romanian party at MIT. Life is good.

Since I mention Luxemburg, I must add that I found it a remarkably beautiful city, even compared to the big European classics. Definetely worth visiting, especially for those of you who occasionally go to Dagstuhl (also note that LUX is significantly closer than FRA!)

At Dagstuhl, I gave two talks (Onlinifying Ian and Hard Data-Structure Problems), had enlightening conversations, but above all, I presented my new Romanian outfit:

(Picture courtesy of Sariel.)

The border-control dog seemed horrified by the smell of my baggage and disappeared as quickly as it could, dragging the attendant with it. I am left guessing whether it felt some faint smell of wolf or sheep blood that we cannot sense.

Tuesday, February 12, 2008

Talk, Univ. de Vest, Timisoara

This Friday, February 15, I am giving a talk at Timisoara, hosted by Gabriel Istrate. The talk is roughly an overview of the world of dynamic graph algorithms (no background assumed) --- what are the milestone results, the current research, and the big challenges for the future.

When it comes to current research, I will understandably be biased towards my own work, and will talk primarily about:
  • how to generalize graph bridges to understand multiple edges going down (this paper with Mikkel Thorup). Big open problem: worst-case bounds.
  • how to handle node failures, not just edge failures (this paper with Timothy Chan and Liam Roditty). Big open problem: handling node failures better and for more queries.
Formal details for those who want to attend:
Locatia: Universitatea de Vest. Sala 045C (Institutul e-Austria)
Ora: 10:00 am (vineri, 10 februarie)
Titlu: Algoritmi pentru grafuri dinamice
---
In multe aplicatii naturale, grafurile cu care avem de-a face au o structura dinamica: conexiunile (muchiile) dintr-o retea de calculatoare se pot strica sau repara; o configurare gresita sau un reboot poate scoate din folosinta un router (nod in graf); etc. Aceste cerinte au dus la un studiu intens al algoritmilor pe grafuri dinamice, care dureaza de peste 20 de ani si nu pare sa se apropie de sfarsit.

In aceasta prezentare, nu vom presupune cunostiinte anterioare despre grafuri dinamice, si vom incepe cu o introducere a domeniului. Apoi ne vom concentra pe cateva rezultate recente, si vom discuta multitudinea de probleme care raman inca nerezolvate.

Monday, February 4, 2008

Introducing TCS?

What do you present in a 2-day course with an audience who is:
  • really smart
  • generally unaware of TCS, but willing to spend time listening to you
  • half (non-theory) computer scientists and half hard-core theoretical mathematicians?
With a few hours to decide and 2 mornings to prepare, I designed the course detailed below. I hope this contained enough brain-teasers, cool ideas, and real applications to be an effective introduction to TCS. But I am very curious to hear about some "must-teach" topics that I forgot --- leave a comment.

----- Day 1 -----

Nearest Neighbor in the Plane.
Voronoi diagrams, point location with O(n) space and O(lg n) time via persistent data structures.
  • historical references: the autoritative paper is [Driscoll, Sarnak, Sleator, Tarjan: Making Data Structures Persistent. JCSS 1989; STOC 1986]. The initial papers were [Sarnak, Tarjan: Planar Point Location Using Persistent Search Trees. CACM 1986]; [Cole: Searching and Storing Similar Lists. JALG 1986]
  • today these results are described in many places. See for example these lecture notes from David Karger's Advanced Algorithms.
Going below logarithmic time. Talk briefly about the field of TCS exploring sublogarithmic running times, which is also one of the few areas where we have good lower bounds. I only taught van Emde Boas as an example, but mentioned a lot of things:
  • my paper with Timothy Chan on sublogarithmic point location. Open problem: lower bound.
  • my paper with Mikkel Thorup giving optimal lower bounds for predecessor search. (In particular, van Emde Boas is optimal for near-linear space.)
  • my favorite algorithm for sorting in o(nlg n) time is signature sort, which sorts in O(n) time when the word is w = Ω(lg2+ε n). See for example these lecture notes from our Advanced Data Structures course.
  • if my presentation of van Emde Boas managed to confuse you, check out one of the many expositions, such as these lecture notes from our course.
Nearest Neighbor in Constant Dimension. Curse of dimensionality: Voronoi diagram grows like nO(d). Space-filling curves: how to get (1+ε)-approximation in any constant dimension, with linear space and predecessor query time.
  • this is mostly from [Chan: Closest-point problems simplified on the RAM. SODA 2002]. Paper here.
  • I blogged about this a while ago.
High dimensions. Brief motivation of high dimensions (machine learning "features" and such). Stories and intuition about Lp norms; how God seems to love L2. Johnson-Lindenstraus, with proof. Brief story about approximate nearest neighbor (LSH and n1/ε^2); lower bounds.
  • there are many expositions of JL. For some very complete proofs, see these lecture notes from Michel Goemans' class.
  • the latest LSH paper, with a good bibliography, is [Andoni, Indyk: Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions, CACM 2008]. You can see it on the new and nifty CACM webpage (click the Contents button, then it's on page 117).
  • our lower bounds on JL-based algorithms.
Norm estimation in the streaming model. Why we care: query plans, and optimization in SQL engines. Estimating the L2 norm via Johnson-Lindenstraus. Overview of other norms.
  • the original paper, now with a Goedel award: [Alon, Matias, Szegedy: The Space Complexity of Approximating the Frequency Moments. JCSS 1999, STOC 1996]
  • I believe the latest paper is [Indyk, Woodruff: Optimal Approximations of the Frequency Moments of Data Streams, STOC2005]
  • for a nice description of norm estimation, and a whole lot more about streaming algorithms, see Piotr's class

----- Day 2 -----

Define graph sparsity. Examples; handwave about planar separators and algorithmic applications. [What is a good citation for this topic?]

Expander graphs. Handwave about explicit constructions. 2-minute overview of pseudorandomness. Randomness-efficient amplification via random walks on expanders.
  • An excellent introduction to pseudorandomness is Salil Vadhan's course. If I got you confused, lecture 8 concerns error reduction by expander walks.
Metrics. Embeddability, any metric embeds into L1 with distortion O(lg n).
  • a whole course on metric embeddings, by Michel Goemans.
  • CRC bookchapter by Piotr Indyk and Jiri Matousek.
  • there is a ton of recent work (particularly lower bounds = nonembeddability) that I cannot hope to link to here.
Sparsest-cut problem. Why we care out graph partitioning, and bit about practice. Theory: cut metrics; L1 = cone of cut metrics; use LP to optimize over all metrics and then embed into L1. Handwave about negative-type metrics, SDPs, and O(sqrt(lg n)) approximation.
  • the famous paper that first used embeddings for sparsest cut is [Linial, London, Rabinovich: The Geometry of Graphs and Some of its Algorithmic Applications. Combinatorica 1995, FOCS 1994]
  • the sqrt(lg n) paper is [Arora, Rao, Vazirani: Expander Flows, Geometric Embeddings, and Graph Partitionings. STOC 2004]. See here.
  • a lot of recent (and not so recent) research has been done about this problem. A personal favorite here is [Spielman, Teng: Nearly linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, STOC 2004]