[or-cvs] r17785: {tor} Document our Bloom filter parameter choices. (tor/trunk/src/common)
nickm at seul.org
Fri Dec 26 17:35:19 UTC 2008
Author: nickm
Date: 2008-12-26 12:35:18 -0500 (Fri, 26 Dec 2008)
New Revision: 17785
Modified:
tor/trunk/src/common/container.c
Log:
Document our Bloom filter parameter choices.
Modified: tor/trunk/src/common/container.c
===================================================================
--- tor/trunk/src/common/container.c 2008-12-26 17:35:12 UTC (rev 17784)
+++ tor/trunk/src/common/container.c 2008-12-26 17:35:18 UTC (rev 17785)
@@ -1233,6 +1233,16 @@
digestset_t *
digestset_new(int max_elements)
{
+ /* The probability of false positives is about P=(1 - exp(-kn/m))^k, where k
+ * is the number of hash functions per entry, m is the number of bits in the
+ * array, and n is the number of elements inserted. For us, k==4,
+ * n<=max_elements, and m==n_bits, which is approximately max_elements*32.
+ * This gives P<(1-exp(-4*n/(32*n)))^4 == (1-exp(-1/8))^4 ~= .00019
+ *
+ * It would be more space-efficient to get this false positive rate by
+ * going for k==13 and m==18.5n, but we also want to conserve CPU, and
+ * k==13 is pretty big.
+ */
int n_bits = 1u << (tor_log2(max_elements)+5);
digestset_t *r = tor_malloc(sizeof(digestset_t));
r->mask = n_bits - 1;