[metrics-bugs] #28305 [Metrics/Statistics]: Include client numbers even if we think we got reports from more than 100% of all relays
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sat Nov 3 20:38:47 UTC 2018
#28305: Include client numbers even if we think we got reports from more than 100%
of all relays
------------------------------------+--------------------------
Reporter: karsten                   | Owner: metrics-team
Type: defect                        | Status: new
Priority: High                      | Milestone:
Component: Metrics/Statistics       | Version:
Severity: Normal                    | Keywords:
Actual Points:                      | Parent ID:
Points:                             | Reviewer:
Sponsor:                            |
------------------------------------+--------------------------
The estimated fraction of reported user statistics from relays has reached
100% and even went slightly beyond that number to 100.294% on 2018-10-27
and 100.046% on 2018-10-28.
The effect is that we're excluding the days on which this happened from
the statistics, because we never thought this was possible:
{{{
WHERE a.frac BETWEEN 0.1 AND 1.0
}}}
However, I think this is most likely a rounding error somewhere, not a
general issue with the approach. Stated differently, it seems wrong to
include a number with a fraction of reported statistics of 99.9% but not
one where that fraction is 100.1%.
I suggest that we drop the upper limit and change the line above to:
{{{
WHERE a.frac >= 0.1
}}}
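To illustrate the difference between the two filters, here is a minimal
sketch using Python's sqlite3 module. The table name `estimates` and its
columns are assumptions made for this example, not the actual metrics
schema; only the two fractions above 1.0 are taken from this ticket, the
other rows are made up.

```python
import sqlite3

# Hypothetical rows of (date, frac). Only the fractions for 2018-10-27 and
# 2018-10-28 come from the ticket; the other two rows are invented.
ROWS = [
    ("2018-10-25", 0.05),     # too few reports, excluded by both filters
    ("2018-10-26", 0.999),    # just below 100%
    ("2018-10-27", 1.00294),  # 100.294%
    ("2018-10-28", 1.00046),  # 100.046%
]

def dates_kept(where_clause):
    """Return the dates that survive the given WHERE clause."""
    conn = sqlite3.connect(":memory:")
    # "estimates" is an assumed table name, aliased to "a" as in the ticket.
    conn.execute("CREATE TABLE estimates (date TEXT, frac REAL)")
    conn.executemany("INSERT INTO estimates VALUES (?, ?)", ROWS)
    cur = conn.execute(
        "SELECT a.date FROM estimates a WHERE " + where_clause
        + " ORDER BY a.date"
    )
    result = [row[0] for row in cur]
    conn.close()
    return result

current = dates_kept("a.frac BETWEEN 0.1 AND 1.0")
proposed = dates_kept("a.frac >= 0.1")

print("current filter keeps: ", current)
print("proposed filter keeps:", proposed)
```

With the current filter only the 99.9% day is kept; with the proposed
filter the two days slightly above 100% are kept as well, while the day
below the 10% lower bound is still excluded.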
We'll be replacing these statistics with PrivCount in the medium term
anyway.
Until then, however, simply excluding data points doesn't seem like an
intuitive solution.
Thoughts?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28305>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online