[metrics-bugs] #28305 [Metrics/Statistics]: Include client numbers even if we think we got reports from more than 100% of all relays
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Nov 28 15:51:34 UTC 2018
#28305: Include client numbers even if we think we got reports from more than 100%
of all relays
--------------------------------+------------------------------
Reporter: karsten | Owner: karsten
Type: defect | Status: accepted
Priority: High | Milestone:
Component: Metrics/Statistics | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor: SponsorV-can
--------------------------------+------------------------------
Changes (by karsten):
* owner: metrics-team => karsten
* status: new => accepted
Comment:
I think I now know what's going on: some relays report written directory
byte statistics for times when they were barely listed in consensuses.
Here's a graph with all variables going into the `frac` formula, plus
intermediate products, and finally the `frac` value:
[[Image(frac-raw-2018-11-28.png, 500px)]]
Note the red arrow. At this point `n(H)` grows larger than `n(N)`. That's
an issue. By definition, a relay cannot report written directory byte
statistics for a longer time than it's online.
I also looked at a random relay, `002B024E24A30F113982FCB17DFE05B6F38C0C79`,
that had a larger `n(H)` value than `n(N)` value on 2018-10-28:
- This relay was listed in 3 out of 24 consensuses on 2018-10-28 (19:00,
20:00, and 21:00). As a result, we count this relay with `n(N) = 10800`
(we're using seconds internally, not hours).
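- The same relay published an extra-info descriptor on 2018-10-31 at
09:28:04 with the following line:
`dirreq-write-history 2018-10-30 08:04:04 (86400 s) 0,0`.
The two 86400-second intervals start on 2018-10-28 at 08:04:04, so the
remaining 15:55:56 of that day is what we count as `n(H) = 57356` on
2018-10-28.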
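The two numbers for this relay can be reproduced with a short Python
sketch. This is only an illustration, not the code running in the metrics
database: the helper `overlap_seconds` and all variable names are
hypothetical, and it assumes that each consensus listing counts for 3600
seconds and that the history line covers one interval per reported value,
ending at the given timestamp.

```python
from datetime import datetime, timedelta


def overlap_seconds(start_a, end_a, start_b, end_b):
    """Return the overlap of two time ranges in whole seconds (0 if none)."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max(0, int((earliest_end - latest_start).total_seconds()))


# The day we are looking at: 2018-10-28, 00:00:00 to 24:00:00 UTC.
day_start = datetime(2018, 10, 28)
day_end = day_start + timedelta(days=1)

# n(N): the relay was listed in 3 consensuses (19:00, 20:00, 21:00).
# Assumption: each listing counts for one hour, i.e. 3600 seconds.
n_N = 3 * 3600

# n(H): from "dirreq-write-history 2018-10-30 08:04:04 (86400 s) 0,0".
history_end = datetime(2018, 10, 30, 8, 4, 4)
interval = timedelta(seconds=86400)
values = [0, 0]  # two intervals, both reporting 0 bytes
history_start = history_end - interval * len(values)
n_H = overlap_seconds(history_start, history_end, day_start, day_end)

print(n_N, n_H)  # 10800 57356
```

Under these assumptions the relay contributes more than five times as many
seconds to `n(H)` as to `n(N)` on that day.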
A possible mitigation (other than the one I suggested above) could be to
replace `n(H)` with `n(N^H)` in the `frac` formula. This would mean that
we'd cap the amount of time for which a relay reported written directory
bytes to the amount of time it was listed in the consensus.
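A minimal Python sketch of what computing `n(N^H)` for a single relay
might look like. The function name `seconds_listed_and_reporting` and its
parameters are hypothetical, and the sketch assumes that each consensus
covers the hour starting at its valid-after time; the actual database
query may well differ.

```python
from datetime import datetime, timedelta


def seconds_listed_and_reporting(consensus_times, history_start, history_end,
                                 consensus_lifetime=timedelta(hours=1)):
    """Seconds during which a relay was both listed and reporting.

    consensus_times: valid-after times of consensuses listing the relay.
    history_start, history_end: time range covered by the reported
    written directory byte history.
    """
    total = 0
    for valid_after in consensus_times:
        # Clip the hour covered by this consensus to the reported history.
        start = max(valid_after, history_start)
        end = min(valid_after + consensus_lifetime, history_end)
        total += max(0, int((end - start).total_seconds()))
    return total


# Example values taken from the relay discussed above.
listed = [datetime(2018, 10, 28, hour) for hour in (19, 20, 21)]
n_N_and_H = seconds_listed_and_reporting(
    listed,
    history_start=datetime(2018, 10, 28, 8, 4, 4),
    history_end=datetime(2018, 10, 30, 8, 4, 4),
)
print(n_N_and_H)  # 10800
```

For the relay above this yields 10800 seconds, the same as its `n(N)`
value, rather than the 57356 seconds currently counted as `n(H)`.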
I'm currently dumping and downloading the database to try this out at
home. However, I'm afraid that deploying this fix is going to be much more
expensive than making the simple fix suggested above. I'll report here
what I find out.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28305#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online