[metrics-bugs] #26868 [Metrics/Statistics]: How does metrics get bridge statistics at a granularity of 1 user?
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Jul 24 07:55:51 UTC 2018
#26868: How does metrics get bridge statistics at a granularity of 1 user?
--------------------------------+------------------------------
Reporter: teor | Owner: metrics-team
Type: defect | Status: new
Priority: Medium | Milestone:
Component: Metrics/Statistics | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
--------------------------------+------------------------------
Comment (by karsten):
Replying to [comment:7 teor]:
> So, I believe the answer to my question is:
>
> "We approximate directory request numbers by multiplying the fraction of
> unique IP addresses from a given country, transport, or IP version with
> the total number of successful requests."
That would produce smaller numbers than 8, too.
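
A minimal sketch of that approximation, with made-up numbers and a hypothetical function name (this is not the actual metrics code, just the arithmetic described in the quoted sentence):

```python
def approximate_requests(unique_ips, total_unique_ips, total_requests):
    """Scale the total successful requests by a group's share of unique IPs.

    All parameter names are illustrative assumptions; a "group" stands for
    one country, transport, or IP version.
    """
    if total_unique_ips == 0:
        return 0.0
    return total_requests * unique_ips / total_unique_ips


# Illustrative values only: 8 successful requests in total,
# and 1 out of 4 unique IP addresses belongs to the group.
estimate = approximate_requests(unique_ips=1, total_unique_ips=4, total_requests=8)
print(estimate)
```

With these inputs the group is assigned a quarter of the 8 requests, which shows how a reported total of 8 can turn into a smaller per-group number.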
Another answer is this part: "Split observations to the covered UTC dates
by assuming a linear distribution of observations."
We'd have to look at the raw data to say which one is the better answer.
But I assume your question is mostly answered by knowing that the original
data does not contain such a small number.
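
To illustrate the "linear distribution" step, here is a rough sketch that assumes an observation is a single value covering a time interval given in UTC. The function name and the sample interval are hypothetical; the real processing may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def split_by_utc_date(start, end, value):
    """Distribute value over UTC dates in proportion to the seconds covered.

    Assumes start and end are timezone-aware datetimes in UTC.
    """
    total_seconds = (end - start).total_seconds()
    if total_seconds <= 0:
        raise ValueError("end must be after start")
    shares = {}
    cursor = start
    while cursor < end:
        # Midnight (UTC) that ends the date the cursor is currently in.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1),
            datetime.min.time(),
            tzinfo=timezone.utc,
        )
        segment_end = min(next_midnight, end)
        seconds = (segment_end - cursor).total_seconds()
        shares[cursor.date()] = value * seconds / total_seconds
        cursor = segment_end
    return shares


# Made-up observation: a value of 8 covering 24 hours that start at 18:00 UTC.
start = datetime(2018, 7, 23, 18, 0, tzinfo=timezone.utc)
end = datetime(2018, 7, 24, 18, 0, tzinfo=timezone.utc)
shares = split_by_utc_date(start, end, 8)
for day, share in sorted(shares.items()):
    print(day, share)
```

In this example a quarter of the interval falls on the first date and three quarters on the second, so each date receives less than the original 8.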
> But I think there are two missing steps:
> * Metrics appears to round/truncate/ceiling client numbers to the
> nearest integer
Right, we're using integer truncation here. We should probably document
that under Step 4 of the
[https://metrics.torproject.org/reproducible-metrics.html#relay-users Relay users]
section.
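
For comparison, a small sketch of how truncation differs from rounding and ceiling on a made-up estimate (the value 7.9 is purely illustrative):

```python
import math

estimate = 7.9  # made-up client estimate, not taken from real data

truncated = int(estimate)    # integer truncation drops the fractional part
rounded = round(estimate)    # rounding to the nearest integer
ceiled = math.ceil(estimate) # rounding up

print(truncated, rounded, ceiled)
```

Truncation is the only one of the three that can report a value below the nearest integer here, which matters most when the estimates are small.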
> * You say that you "Skip dates where frac is smaller than 10% and hence
> too low for a robust estimate"
> * are the snowflake bridges less than 10% of total bridge usage? That
> could be why their numbers vary so much.
> * how do you calculate 10% of bridge usage? (Bridges don't have
> bandwidth, so do you use unique IP addresses?)
Wait, no, ''frac'' is the "estimated fraction of reported
directory-request statistics". It is unrelated to snowflake in particular and
refers to all bridge usage. The formula for computing ''frac'' is specified in
Step 3 of the
[https://metrics.torproject.org/reproducible-metrics.html#relay-users Relay users]
section.
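
As a rough sketch of the "skip dates where frac is smaller than 10%" rule, assuming ''frac'' has already been computed per date as described in Step 3. The names and the sample values below are hypothetical:

```python
MIN_FRAC = 0.10  # threshold from the quoted step: skip dates with frac < 10%


def robust_dates(frac_by_date, min_frac=MIN_FRAC):
    """Keep only the dates whose frac is at least min_frac."""
    return {day: frac for day, frac in frac_by_date.items() if frac >= min_frac}


# Made-up frac values; frac covers all bridge usage, not a single transport.
frac_by_date = {
    "2018-07-21": 0.42,
    "2018-07-22": 0.08,
    "2018-07-23": 0.10,
}
kept = robust_dates(frac_by_date)
print(sorted(kept))
```

Because the filter only looks at ''frac'', a date is kept or skipped for all transports at once, regardless of how small snowflake's share is on that date.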
Please let me know if this makes more sense now, and if not, how we can
improve it. Thanks!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26868#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online