What do you present in a 2-day course with an audience who is:

- really smart
- generally unaware of TCS, but willing to spend time listening to you
- half (non-theory) computer scientists and half hard-core theoretical mathematicians?

----- Day 1 -----

Nearest Neighbor in the Plane. Voronoi diagrams, point location with O(n) space and O(lg n) time via persistent data structures.

- historical references: the autoritative paper is [Driscoll, Sarnak, Sleator, Tarjan: Making Data Structures Persistent. JCSS 1989; STOC 1986]. The initial papers were [Sarnak, Tarjan: Planar Point Location Using Persistent Search Trees. CACM 1986]; [Cole: Searching and Storing Similar Lists. JALG 1986]
- today these results are described in many places. See for example these lecture notes from David Karger's Advanced Algorithms.

- my paper with Timothy Chan on sublogarithmic point location. Open problem: lower bound.
- my paper with Mikkel Thorup giving optimal lower bounds for predecessor search. (In particular, van Emde Boas is optimal for near-linear space.)
- my favorite algorithm for sorting in o(nlg n) time is signature sort, which sorts in O(n) time when the word is
*w*= Ω(lg^{2+ε}*n*). See for example these lecture notes from our Advanced Data Structures course. - if my presentation of van Emde Boas managed to confuse you, check out one of the many expositions, such as these lecture notes from our course.

^{O(d)}. Space-filling curves: how to get (1+ε)-approximation in any constant dimension, with linear space and predecessor query time.

- this is mostly from [Chan: Closest-point problems simplified on the RAM. SODA 2002]. Paper here.
- I blogged about this a while ago.

_{p}norms; how God seems to love L

_{2}. Johnson-Lindenstraus, with proof. Brief story about approximate nearest neighbor (LSH and n

^{1/ε^2}); lower bounds.

- there are many expositions of JL. For some very complete proofs, see these lecture notes from Michel Goemans' class.
- the latest LSH paper, with a good bibliography, is [Andoni, Indyk: Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions, CACM 2008]. You can see it on the new and nifty CACM webpage (click the Contents button, then it's on page 117).
- our lower bounds on JL-based algorithms.

- the original paper, now with a Goedel award: [Alon, Matias, Szegedy: The Space Complexity of Approximating the Frequency Moments. JCSS 1999, STOC 1996]
- I believe the latest paper is [Indyk, Woodruff: Optimal Approximations of the Frequency Moments of Data Streams, STOC2005]
- for a nice description of norm estimation, and a whole lot more about streaming algorithms, see Piotr's class

----- Day 2 -----

Define graph sparsity. Examples; handwave about planar separators and algorithmic applications. [What is a good citation for this topic?]

Expander graphs. Handwave about explicit constructions. 2-minute overview of pseudorandomness. Randomness-efficient amplification via random walks on expanders.

- An excellent introduction to pseudorandomness is Salil Vadhan's course. If I got you confused, lecture 8 concerns error reduction by expander walks.

- a whole course on metric embeddings, by Michel Goemans.
- CRC bookchapter by Piotr Indyk and Jiri Matousek.
- there is a ton of recent work (particularly lower bounds = nonembeddability) that I cannot hope to link to here.

- the famous paper that first used embeddings for sparsest cut is [Linial, London, Rabinovich: The Geometry of Graphs and Some of its Algorithmic Applications. Combinatorica 1995, FOCS 1994]
- the sqrt(lg n) paper is [Arora, Rao, Vazirani:
**Expander Flows, Geometric Embeddings, and Graph Partitionings.**STOC 2004]. See here. - a lot of recent (and not so recent) research has been done about this problem. A personal favorite here is [Spielman, Teng: Nearly linear time algorithms for graph partitioning, graph sparsification, and solving linear systems, STOC 2004]

## 11 comments:

(Warning: selfish plug.)

If you have database folks, they may enjoy this paper:

Kamel Aouiche and Daniel Lemire, A Comparison of Five Probabilistic View-Size Estimation Techniques in OLAP, DOLAP 2007, 2007.

http://arxiv.org/abs/cs.DB/0703058

I think there is a better and simpler result than the Speilman-Tang STOC 2004 paper now: check out:

Local graph partitioning using pagerank vectors,

FOCS 2006, R. Andersen, F. Chung, K. Lang

Optimization. One nice paper to talk about is the Edmonds paper giving the Blossom algorithm for finding maximum matching. The algorithm might be too hard to explain and you could stick to Hungarian method on bipartite graphs. It is easy to "digress" from here and mention that it is one of the first papers mentioning that there should be a difference between polytime algorithms and exponential time algorithms in any hierarchy. This later formally developed in to P vs NP question.

You can digress further and talk about approximation algorithms, both from positive and negative sides.

Seems a little esoteric for an *introduction* to TCS. How about more basic things like NP-completeness and the power of randomization? Or more basic algorithms and data structures? Or had the audience already seen stuff like that elsewhere? (I would assume the average mathematician had not.)

Indeed, I was not too clear on the level of the participants. The all know CS kind of stuff, like NP-completeness, shortest paths etc. The goal was to teach them that TCS did something nontrivial in the last 20 years :)

Maybe you could have tried something more colorful like zero knowledge proofs for NP.

Maybe you could have tried something more colorful like zero knowledge proofs for NP.To the contrary, this would've been a disaster. You suggest telling CS people something like "we have this really nifty tricks which are really cool if you love theory, but there's not particular way in which they will affect what you do."

I was looking for stuff with real applications. Everybody was enthusiastic about the relation between SQL enginges and norm estimation. It is a clean theoretical problem which does something for the real world.

Primes in P might have exciting and new.

Hi Mihai,

Just a note: in the norm estimation stream, the latest paper actually is --

"Smooth Histograms"

Vladimir Braverman, Rafi Ostrovsky.

FOCS 2007

It has a pretty interesting approach, and also improves the Indyk-Woodruff result.

PCPs could have been another interesting topic, but selecting what to talk about from that area would have been a challenging task, I reckon.

Given that I work in Cryptography, I can't stop myself from commening on the ZK stuff: You need to know a bit more about ZK. Recently Avi Wigderson presented a funtastic sereis of 3 lectures at UCLA, one of which included a significant part on ZK. The audience was simply amazed, and I got the impression that they truly enjoyed it (there was so much curiosity -- and there were no nifty tricks to explain!).

The audience was very much comparable to the one you had -- a lot of UCLA Math department faculty (inclduing Terrence Tao :) ), a good amount of CS Faculty from other areas, and a several students.

Omkant

Read "nifty" as "nasty" in my last post.

Omkant

Hi folks,

Just a clarification, in case someone will still read the last few comments: the (very neat) FOCS'07 "Smooth Histograms" paper addresses the "sliding window" model, where at any time, the output depends only on the last n elements in the stream. In contrast, our STOC'05 paper addresses the model where the output depends on the whole stream, and the stream is allowed to contain deletions. The two models are incomparable in general.

Cheers,

Post a Comment