[metrics-bugs] #28305 [Metrics/Statistics]: Include client numbers even if we think we got reports from more than 100% of all relays
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Nov 29 14:22:39 UTC 2018
#28305: Include client numbers even if we think we got reports from more than 100%
of all relays
--------------------------------+------------------------------
Reporter: karsten | Owner: karsten
Type: defect | Status: accepted
Priority: High | Milestone:
Component: Metrics/Statistics | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor: SponsorV-can
--------------------------------+------------------------------
Comment (by karsten):
Replying to [comment:6 teor]:
> Replying to [comment:5 karsten]:
> > Note the red arrow. At this point `n(H)` grows larger than `n(N)`.
> > That's an issue. By definition, a relay cannot report written directory
> > bytes statistics for a longer time than it's online.
>
> But relays that aren't listed in the consensus can still be acting as
> relays.
You're right, this can happen: a relay that stays online and keeps serving
directory data while it temporarily drops out of the consensus still counts
toward `n(H)` but not toward `n(N)`. We just didn't consider such cases in
the original design of the `frac` formula.
> > A possible mitigation (other than the one I suggested above) could be
> > to replace `n(H)` with `n(N^H)` in the `frac` formula. This would mean
> > that we'd cap the amount of time for which a relay reported written
> > directory bytes to the amount of time it was listed in the consensus.
>
> This seems like a reasonable approach: if the relay is listed in the
> consensus for `n(N^H)` seconds, then we should weight its bandwidth using
> that number of seconds.
Oh, you're raising another important point here: speaking in formula
terms, if we replace `n(H)` with `n(N^H)` we'll also have to replace
`h(H)` with `h(N^H)`.
Similarly, we'll have to replace `h(R^H)` with `h(R^H^N)` and `n(R\H)`
with `n(R^N\H)`.
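For reference, spelling out the full substitution (assuming I'm remembering
the currently deployed `frac` formula correctly; if not, the substitution
pattern is still the same):
{{{
current:   frac = ( h(R^H)   * n(H)   + h(H)   * n(R\H)   ) / ( h(H)   * n(N) )
proposed:  frac = ( h(R^H^N) * n(N^H) + h(N^H) * n(R^N\H) ) / ( h(N^H) * n(N) )
}}}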
Hmmmm. I'm less optimistic now that changing the `frac` formula is a good
idea. It seems like too big a change to make, and we're not even sure that
the result would be more accurate.
> > I'm currently dumping and downloading the database to try this out at
> > home. However, I'm afraid that deploying this fix is going to be much more
> > expensive than making the simple fix suggested above. I'll report here
> > what I find out.
>
> I'm not sure if it will make much of a difference long-term: relays that
> drop out of the consensus should have low bandwidth weights, and therefore
> low bandwidths. (Except when the network is unstable, or there are less
> than 3 bandwidth authorities.)
Agreed.
Let's make the change I suggested above, in a slightly modified way:
{{{
-WHERE a.frac BETWEEN 0.1 AND 1.0
+WHERE a.frac BETWEEN 0.1 AND 1.1
}}}
The reason for accepting `frac` values between `1.0` and `1.1` is that, as
discussed here, there can be relays reporting statistics that temporarily
didn't make it into the consensus.
The reason for not giving up on the upper bound entirely is that, as the
graph above shows, there are still isolated days over the years when `frac`
suddenly went up to `1.2`, `1.5`, or even `2.0`. We should continue
excluding these data points. Therefore we should use `1.1` as the new upper
bound.
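To make the effect of the widened bound concrete, here's a quick sanity
check against a few sample `frac` values (plain PostgreSQL, illustrative
only, not tied to the actual userstats tables):
{{{
-- Compare old and new bounds: 1.05 would now be kept, while the outliers
-- 1.2, 1.5, and 2.0 (and anything below 0.1) stay excluded.
SELECT frac,
       frac BETWEEN 0.1 AND 1.0 AS kept_old,
       frac BETWEEN 0.1 AND 1.1 AS kept_new
FROM (VALUES (0.05), (0.95), (1.05), (1.2), (1.5), (2.0)) AS samples(frac);
}}}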
How does this sound?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28305#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online