tag:blogger.com,1999:blog-786333285568106173.post3709329180331169093..comments2024-01-25T01:05:59.968-05:00Comments on WebDiarios de Motocicleta: Basic HashtablesMihaihttp://www.blogger.com/profile/11599372864611039927noreply@blogger.comBlogger12125tag:blogger.com,1999:blog-786333285568106173.post-30422793236454626822010-08-23T16:03:55.931-04:002010-08-23T16:03:55.931-04:00Those textbooks are obsolete. If you're going ...Those textbooks are obsolete. If you're going to do something that is not cache friendly, you should just do cuckoo hashing which has O(1) query time in the worst-case. There's no reason to do quadratic probing, etc. Those date back to the days when we didn't know any better :)Mihaihttps://www.blogger.com/profile/11599372864611039927noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-4277805097440221972010-02-25T06:35:44.228-05:002010-02-25T06:35:44.228-05:00Dear Mihai,
in the textbooks they say linear pro...Dear Mihai, <br /><br />in the textbooks they say linear probing might not be so good due to clustering and recommend quadratic probing and "double hashing". Can you comment on that? I'm asking this because I regularly teach these matters and don't want to be obsolete.<br /><br />Thanks, MartinAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-86724023642026747532010-02-02T14:01:26.738-05:002010-02-02T14:01:26.738-05:00Hi Rasmus!
I slightly disagree on large items and...Hi Rasmus!<br /><br />I slightly disagree on large items and linear probing. You are still getting the advantage of the prefetcher. In our tests, linear probing with in-table record is still the fastest for items of 256 bytes (this was quite surprising to us). After that, you can switch to pointers.<br /><br />If you have large, variable-sized records, then I don't think there is any real difference between linear probing and chaining. You are traversing the same number of pointers.Mihaihttps://www.blogger.com/profile/11599372864611039927noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-15986438778848323902010-02-02T10:10:40.862-05:002010-02-02T10:10:40.862-05:00Thanks for this nice expository post, Mihai! It co...Thanks for this nice expository post, Mihai! It compensates for my own lack of doing more on that front :-) A couple of comments on what to do when speed really matters:<br /><br />- If the items in your hash table are big, don't use linear probing, as the cache won't help you anyway.<br /><br />- A proper implementation of cuckoo hashing can have essentially the same cache-efficiency as linear probing for lookups, as the memory probes can be pipelined. But insertions are cache-inefficient, so don't use when there are many updates.<br /><br />- There are cases where chaining can be useful, e.g. if you are storing (large) variable-size data. It seems that only linear probing can accommodate variable-size data, among open addressing schemes, but the performance degenerates for large items.Rasmus Paghhttps://www.blogger.com/profile/05722627403861422993noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-60991364608063826792010-02-01T12:23:04.747-05:002010-02-01T12:23:04.747-05:00You are absolutely right, I can't calculate :)...You are absolutely right, I can't calculate :) The mistake is:<br /><br />E[#elements colliding with x] = n/b.<br /><br />It is in fact (n-1)/b, since there are only n-1 other elements...Mihaihttps://www.blogger.com/profile/11599372864611039927noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-8266568203302834452010-01-31T20:07:38.248-05:002010-01-31T20:07:38.248-05:00Most probably I miss something but please allow me...Most probably I miss something but please allow me to continue.<br />If we work without O(), from the first post we have that<br />E[Y_i^2] = 1/b - 1/b^2<br />and Σ E[Y_i^2] = Var[X] = n/b - n/b^2<br /><br />It is O(n/b) but not n/b as you proved for BAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-62837969296596810082010-01-31T15:11:56.941-05:002010-01-31T15:11:56.941-05:00B_i is what you call X in the first post, right?
...<i>B_i is what you call X in the first post, right? </i><br /><br />Yes.<br /><br /><i>However, the variance of X is somewhat different. </i><br /><br />Hopefully not :) I proved in the first post that Var[X] = O(n/b). Here I showed it without a constant, Var[B_i] = n/b.Mihaihttps://www.blogger.com/profile/11599372864611039927noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-49014512339360364662010-01-31T15:09:43.883-05:002010-01-31T15:09:43.883-05:00Thanks a lot for the clarifications.
I have one m...Thanks a lot for the clarifications.<br /><br />I have one more question, though.<br />B_i is what you call X in the first post, right? However, the variance of X is somewhat different. Am I missing something?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-1961006218558387822010-01-30T17:08:20.460-05:002010-01-30T17:08:20.460-05:00Sure, requests for clarifications are always welco...Sure, requests for clarifications are always welcome.<br /><br />If Bi is the number of elements in bin i, then the number of pairs (i,j) that collide, in the sense that h(i)=h(j), can be written as Σ(Bi)^2 - n. If some bin has 7 elements, for instance, it account for 7*6 = 7^2 - 7 collisions. I have thus shown E[Σ(Bi)^2] = n + E[#colliding pairs] <br /><br />Now E[#colliding pairs] = E[#colliding pairs of the form (1,*)] + E[#colliding pairs of the form (2,*)] + ...<br /><br />So E[#colliding pairs] = n · E[#elements colliding with some fixed element]<br /><br />But the number of elements colliding with some fixed element is just what I calculated above, i.e. n/b.<br /><br />Does this make sense now?Mihaihttps://www.blogger.com/profile/11599372864611039927noreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-92135140130579772042010-01-30T16:33:42.093-05:002010-01-30T16:33:42.093-05:00I meant
"Can you please GIVE more details on ...I meant<br />"Can you please GIVE more details on the following"<br /><br />sorry about that.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-44297501669749911732010-01-30T16:32:54.297-05:002010-01-30T16:32:54.297-05:00Can you please more details on the following
E[Σ(...Can you please more details on the following<br /><br />E[Σ(Bi)^2] = n + E[#colliding pairs] = n + n · E[#elements colliding with x] = n + n^2/bAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-786333285568106173.post-84948556538074015212010-01-30T09:44:17.739-05:002010-01-30T09:44:17.739-05:00Very nice. I can't wait for the next installm...Very nice. I can't wait for the next installment.Anonymoushttps://www.blogger.com/profile/06328185728688465505noreply@blogger.com