[metrics-bugs] #34023 [Metrics/Onionperf]: Reduce the number of 50 KiB downloads (was: Kill the 50 KiB downloads)
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Apr 27 14:51:11 UTC 2020
#34023: Reduce the number of 50 KiB downloads
-------------------------------+------------------------------
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: needs_review
Priority: Medium | Milestone:
Component: Metrics/Onionperf | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+------------------------------
Comment (by karsten):
Replying to [comment:2 robgjansen]:
> **If we remove 50KiB, also remove 1MiB?**
>
> My main concern isn't the added load on the network, but rather that you
> are removing a metric that we have consistently used as a benchmark for
> the last ~decade. It's useful to be able to compare against a consistent
> benchmark over time.
>
> I notice from [https://trac.torproject.org/projects/tor/ticket/33076 #33076]
> the suggestion that we could use the 1MiB and/or 5MiB results to compute
> the 50KiB times (using the incremental DATAPERC timestamps). That seems
> reasonable to me. Following that logic though, why not remove the 1MiB
> file too? I think both the 50KiB and the 1MiB times could be computed
> from the 5MiB results, since we have incremental timestamps for every 10%
> of the download.
That's a good point. Here's the math for that suggestion:
- With only 5 MiB downloads we'd be downloading on average 5 MiB = 5120 KiB
  every 5 minutes, or 5120 * 8 * 1024 / (300 * 1000) ≈ 140 kbps.
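For anyone who wants to double-check that figure, here's the same arithmetic
as a tiny Python sketch (the 5-minute interval and 5 MiB size are the numbers
from above):
{{{#!python
# One 5 MiB download every 5 minutes, expressed in kilobits per second.
size_kib = 5 * 1024            # 5 MiB in KiB
interval_s = 5 * 60            # one download every 5 minutes
kbps = size_kib * 1024 * 8 / (interval_s * 1000)
print(round(kbps, 1))          # -> 139.8, i.e. roughly 140 kbps
}}}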
And yes, we do have some code somewhere to compute partial completion
timestamps from 5 MiB downloads.
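Roughly, that computation would look something like the following sketch (not
the actual code, just an illustration that assumes we have the elapsed seconds
at each 10% DATAPERC mark of a 5 MiB download and interpolates linearly,
treating 0% as time zero):
{{{#!python
def estimated_time_to_bytes(dataperc, total_bytes, target_bytes):
    """Estimate the elapsed time until `target_bytes` were received,
    given the elapsed seconds at each 10% mark of a larger download.

    dataperc: dict mapping percentage (10, 20, ..., 100) to seconds.
    Treating 0% as time zero is a simplification: it ignores that the
    first bytes only arrive after circuit setup.
    """
    target_perc = 100.0 * target_bytes / total_bytes
    prev_perc, prev_time = 0.0, 0.0
    for perc in sorted(dataperc):
        if perc >= target_perc:
            frac = (target_perc - prev_perc) / (perc - prev_perc)
            return prev_time + frac * (dataperc[perc] - prev_time)
        prev_perc, prev_time = perc, dataperc[perc]
    return None  # target lies beyond the last recorded mark

# Example: elapsed seconds at each 10% of a 5 MiB download.
dataperc = {10: 0.9, 20: 1.5, 30: 2.0, 40: 2.6, 50: 3.1,
            60: 3.7, 70: 4.2, 80: 4.8, 90: 5.3, 100: 5.9}
five_mib = 5 * 1024 * 1024
print(estimated_time_to_bytes(dataperc, five_mib, 50 * 1024))    # ~50 KiB
print(estimated_time_to_bytes(dataperc, five_mib, 1024 * 1024))  # ~1 MiB
}}}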
> **Reasons to keep all of the files.**
>
> I think there are two strong reasons to keep all 3 file sizes:
> 1. You can specify a different timeout for each of the 3 sizes. That
> lets you cancel the smaller files much sooner if they are hanging. And if
> the timeouts are set realistically, it helps you get a better sense of how
> often we fail to meet a target completion time.
In theory, we could retroactively apply timeouts by pretending that a
partial 50 KiB or 1 MiB download taking longer than `x` would have timed
out.
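Something along these lines (a sketch only; the timeout values are made-up
placeholders, not the timeouts we actually configure):
{{{#!python
# Retroactively mark a derived partial-download time as timed out if it
# exceeds the timeout we would have set for that file size. The values
# below are illustrative placeholders.
TIMEOUTS_S = {50 * 1024: 30.0, 1024 * 1024: 60.0}

def apply_retroactive_timeout(size_bytes, elapsed_s):
    timeout = TIMEOUTS_S.get(size_bytes)
    if timeout is not None and (elapsed_s is None or elapsed_s > timeout):
        return None  # count this as a failed/timed-out measurement
    return elapsed_s
}}}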
> 1. Diversity of circuits. If you follow the suggestion above and remove
> 50KiB and 1MiB and only keep 5MiB, and then you get a crappy circuit, data
> points for all 3 download times will be affected. Previously that only
> affected one data point.
We wouldn't change the frequency of making downloads. We would just
extract more than one time-to-last-byte timestamp from a given
measurement. But I see how we would want to document this very clearly to
help our users interpret our data.
> **Adjust download weights instead?**
>
> If you would like more data points for 1MiB and 5MiB files and fewer for
> 50KiB, have you considered adjusting the weights that are used in the TGen
> model file instead of completely removing a file size?
> [https://gitweb.torproject.org/onionperf.git/tree/onionperf/model.py#n90
> The weights are specified here.] For example, if you want all file sizes
> to have equal download probabilities, set the weight for each file size to
> `1.0`.
We did consider this in order to avoid increasing the load on the
measurement hosts and the network too much, but then figured we could kill
these downloads altogether. However, you raise some important points above
that need more attention before we kill the 50 KiB downloads entirely.
New plan: use a weight of 1.0 for all three download sizes until we figure
out how to kill 50 KiB and 1 MiB downloads.
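Assuming TGen picks a size in proportion to its weight (as the
equal-probability example above suggests), equal weights mean each size comes
up about one time in three, and the average load works out to roughly 56 kbps
on the same one-download-per-5-minutes schedule:
{{{#!python
# Expected load with equal weights of 1.0 for the three file sizes.
sizes_kib = [50, 1024, 5 * 1024]       # 50 KiB, 1 MiB, 5 MiB
weights = [1.0, 1.0, 1.0]
probs = [w / sum(weights) for w in weights]
avg_kib = sum(p * s for p, s in zip(probs, sizes_kib))
kbps = avg_kib * 1024 * 8 / (300 * 1000)
print(round(avg_kib), round(kbps, 1))  # -> 2065 KiB per download, 56.4 kbps
}}}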
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/34023#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online