[metrics-bugs] #31422 [Circumvention/BridgeDB]: Make BridgeDB report internal metrics
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Jun 9 18:23:15 UTC 2020
#31422: Make BridgeDB report internal metrics
-------------------------------------------------+-------------------------
Reporter: phw | Owner: phw
Type: enhancement | Status:
| needs_information
Priority: Medium | Milestone:
Component: Circumvention/BridgeDB | Version:
Severity: Normal | Resolution:
Keywords: s30-o21a1, anti-censorship- | Actual Points:
roadmap-2020 |
Parent ID: #31274 | Points: 2
Reviewer: agix | Sponsor:
| Sponsor30-can
-------------------------------------------------+-------------------------
Changes (by phw):
* status: merge_ready => needs_information
Comment:
Replying to [comment:13 karsten]:
> I'm less sure about how useful they will be. The median will likely be
the most interesting statistic here, but the min and max will only tell
you about the smallest and largest outliers but not tell you much about
how the distribution looks like. Not sure how useful the standard
deviation will be.
>
> Would it be an option to add quantiles? Your comment suggests that you'd
have to require Python 3.8 in order to use the quantiles() function of the
built-in statistics module. But did you consider using SciPy/NumPy to
compute these? However, if neither of those is an option, I'd recommend
against computing quantiles yourself, because there are just too many ways
to screw up.
>
> If you have quantiles, you might want to include first and third
quartile as well as smallest and largest non-outliers within 1.5
inter-quartile ranges from the median. That's the five values you'd also
find in a boxplot. We're computing these five values in our
[https://metrics.torproject.org/onionperf-latencies.html OnionPerf latency
statistics].
[https://gitweb.torproject.org/metrics-web.git/tree/src/main/sql/onionperf/init-onionperf.sql#n187
Here]'s the SQL code that we use. (I don't think we have Python code
around for computing the high and low values.)
[[br]]
Thanks for the feedback! I removed the standard deviation and added the
four metrics you suggest: 1st and 3rd quartile, and the upper and lower
whiskers.
[https://github.com/NullHypothesis/bridgedb/commit/0beed8953e7a72a69b72045b2623d81b926012f1
Here's the patch]. I used numpy to determine the quartiles. I originally
hesitated to add yet another dependency – especially a bulky one like
numpy – but we can remove it again once Python 3.8 (which has built-in
support for quantiles) is available in Debian stable.
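For illustration, the five boxplot values discussed above can be computed
with numpy roughly as follows. This is a minimal sketch, not the code from
the patch: the function name `boxplot_stats` and the dictionary keys are
hypothetical, and it assumes the common convention of placing the whisker
fences 1.5 inter-quartile ranges beyond the first and third quartiles.

```python
import numpy as np


def boxplot_stats(values):
    """Return the five boxplot values for a non-empty sequence of numbers.

    Hypothetical helper for illustration; not taken from the BridgeDB patch.
    """
    data = np.asarray(values, dtype=float)
    if data.size == 0:
        raise ValueError("need at least one value")

    # np.percentile uses linear interpolation between data points by default.
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1

    # Assumption: fences sit 1.5 IQRs beyond the quartiles. The whiskers are
    # the most extreme observations that still fall inside those fences.
    lower_whisker = data[data >= q1 - 1.5 * iqr].min()
    upper_whisker = data[data <= q3 + 1.5 * iqr].max()

    return {
        "lower_whisker": float(lower_whisker),
        "q1": float(q1),
        "median": float(median),
        "q3": float(q3),
        "upper_whisker": float(upper_whisker),
    }


if __name__ == "__main__":
    # The value 100 lies outside the upper fence, so the upper whisker
    # falls back to the largest non-outlier.
    print(boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Once Python 3.8 is available, `statistics.quantiles(data, n=4,
method='inclusive')` should yield the same three quartiles as numpy's
default interpolation; the whiskers would still need the two filtering
steps shown here.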
On an unrelated note: Karsten, do we need to coordinate on when we deploy
this patch? Note that the patch bumps the key `bridgedb-metrics-version`
to 2 and adds several new fields for our internal metrics. Does this break
anything on the metrics side of things?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31422#comment:18>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online