[metrics-bugs] #23367 [Metrics/Metrics website]: Onion address counts ignore descriptor upload overlap
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Sep 6 01:27:49 UTC 2017
#23367: Onion address counts ignore descriptor upload overlap
-------------------------------------+------------------------------
Reporter: teor | Owner: metrics-team
Type: defect | Status: new
Priority: Medium | Milestone:
Component: Metrics/Metrics website | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------+------------------------------
Comment (by teor):
Replying to [comment:1 karsten]:
> I'll have to dive deeper into this topic, but here are some quick
> thoughts:
>
> - I don't think we're including anything from v3 in these statistics,
> but we'd have to ask asn and dgoulet to be certain.
No, we're not. And perhaps we will end up collecting them using PrivCount
in Tor.
> - I believe we're taking descriptor overlap periods into account for
> v2. See Section 5, "Extrapolating network totals" of the linked report:
> "As an approximation, we assume that a hidden service publishes its
> descriptor to ''twelve'' directories over a 24-hour period: the service
> stores ''two'' replicas per descriptor using different descriptor
> identifiers, both descriptor replicas get stored to ''three'' different
> hidden-service directories each, and the service changes descriptor
> identifiers once every 24 hours which leads to ''two'' different
> descriptor identifiers per replica." And later in that section we say how
> this is just an approximation.
>
> Do you think there's a defect in the v2 code?
Yes. In each 24-hour period, there is a 1-hour overlap where descriptors
are posted to the current and next HSDirs. So services with addresses that
correspond to the first or last hour (initial bytes 00-0B and F4-FF) can
be seen at 6 or 18 directories, not 12. But this probably balances out
over time.
This is how I fixed it in experimental PrivCount (there might be bugs):
https://github.com/privcount/privcount/pull/423/commits/4f1fb9191c9f3c5dc0ccbfe43c2b021a213a0c78
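The directory counts discussed above can be reproduced with a small sketch. This is not the PrivCount fix linked above. The function names are hypothetical, and the time-period formula is the one from rend-spec-v2 as commonly stated, so it should be checked against the spec before relying on it.

```python
# Constants taken from the quoted v2 approximation.
REPLICAS = 2             # descriptor replicas per service
HSDIRS_PER_REPLICA = 3   # directories that store each replica
SECONDS_PER_DAY = 86400


def directories_seen(descriptor_id_periods):
    """Directories that receive a service's descriptors when this many
    descriptor-identifier periods fall inside the measurement window."""
    return REPLICAS * HSDIRS_PER_REPLICA * descriptor_id_periods


def v2_time_period(now, first_id_byte):
    """Assumed rend-spec-v2 formula: the first byte of the permanent id
    shifts the daily rotation time, so services rotate at different hours."""
    return (now + first_id_byte * SECONDS_PER_DAY // 256) // SECONDS_PER_DAY


# One, two, or three identifier periods in the window.
print([directories_seen(n) for n in (1, 2, 3)])  # [6, 12, 18]

# A 1-hour slice of the day covers roughly this many first-byte values,
# which is why only a narrow range of addresses is affected.
print(round(256 / 24, 2))  # 10.67
```

Two periods per window gives the report's figure of twelve, and one or three periods give the 6 and 18 mentioned above.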
I also wonder if you need to account for the 1-2 hour delay between a
consensus being produced, and clients downloading and using it. But the
variance is probably small.
> And, independent of that question, is there anything in particular that
> we should keep in mind when extending this code to v3?
* There is an overlap for 12 hours per day, from when the client receives
the 0000 consensus, for 36 hours (that is, approximately 0100-0200 for 36
hours)
* The hash ring changes every 24 hours based on the SRV
* You need the ed25519 relay ids from descriptors to calculate the hash
ring (they're not in the consensus)
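A rough sketch of why the last two points matter: each relay's position on the v3 hash ring is derived from its ed25519 identity, the shared random value (SRV), and the time period. The construction below follows the `hsdir_index` definition in rend-spec-v3 as recalled from the spec. The function names and input values are hypothetical, and the exact field encoding is an assumption to verify against the spec.

```python
import hashlib
import struct


def hsdir_index(ed25519_id, srv, period_num, period_length=1440):
    """Assumed form: SHA3-256("node-idx" | id | SRV | period_num | period_length),
    with both integers encoded as 8-byte big-endian values."""
    h = hashlib.sha3_256()
    h.update(b"node-idx")
    h.update(ed25519_id)
    h.update(srv)
    h.update(struct.pack(">Q", period_num))
    h.update(struct.pack(">Q", period_length))
    return h.digest()


def hash_ring(relay_ids, srv, period_num):
    """Order relays by their index; a new SRV or period reorders the ring."""
    return sorted(relay_ids, key=lambda rid: hsdir_index(rid, srv, period_num))


# Placeholder 32-byte identities and SRVs, for illustration only.
relays = [bytes([i]) * 32 for i in range(4)]
ring_today = hash_ring(relays, b"\x00" * 32, period_num=1)
ring_tomorrow = hash_ring(relays, b"\x01" * 32, period_num=2)
```

Because the ed25519 identity is an input to the hash, the ring cannot be rebuilt from the consensus alone, and because the SRV is an input, every relay's position moves when the SRV changes.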
There are a few more minor things that affect v2 and v3. I added a list to
experimental PrivCount's position weights script:
https://github.com/privcount/privcount/pull/423/commits/e4d5786469b12781a10b1c875d9228d65a17b2d9#diff-a5cebcf3ce45960e58426e68588e82e1R41
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23367#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online