[metrics-bugs] #34023 [Metrics/Onionperf]: Reduce the number of 50 KiB downloads (was: Kill the 50 KiB downloads)

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Apr 27 14:51:11 UTC 2020


#34023: Reduce the number of 50 KiB downloads
-------------------------------+------------------------------
 Reporter:  karsten            |          Owner:  metrics-team
     Type:  enhancement        |         Status:  needs_review
 Priority:  Medium             |      Milestone:
Component:  Metrics/Onionperf  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+------------------------------

Comment (by karsten):

 Replying to [comment:2 robgjansen]:
 > **If we remove 50KiB, also remove 1MiB?**
 >
 > My main concern isn't the added load on the network, but rather that you
 are removing a metric that we have consistently used as a benchmark for
 the last ~decade. It's useful to be able to compare against a consistent
 benchmark over time.
 >
 > I notice from [https://trac.torproject.org/projects/tor/ticket/33076
 #33076] the suggestion that we could use the 1MiB and/or 5MiB results to
 compute the 50KiB times (using the incremental DATAPERC timestamps). That
 seems reasonable to me. Following that logic though, why not remove the
 1MiB file too? I think both the 50KiB and the 1MiB times could be computed
 from the 5MiB results, since we have incremental timestamps for every 10%
 of the download.

 That's a good point. Here's the math for that suggestion:

  - With only 5 MiB downloads we'd be downloading on average 5 MiB = 5120
 KiB every 5 minutes, or 5120 * 8 * 1024 / (300 * 1000) = 140 kbps.
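 As a quick check of that arithmetic, here is a minimal sketch. The function
 name `average_kbps` is made up for illustration, and kbps is taken to mean
 1000 bits per second, as in the formula above.

```python
def average_kbps(size_kib: float, interval_s: float = 300.0) -> float:
    """Average bandwidth in kbps for one download of size_kib per interval."""
    bits = size_kib * 1024 * 8           # KiB -> bytes -> bits
    return bits / (interval_s * 1000)    # bits per second -> kbps


# One 5 MiB (5120 KiB) download every 5 minutes.
print(round(average_kbps(5120)))  # prints 140
```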

 And yes, we do have some code somewhere to compute partial completion
 timestamps from 5 MiB downloads.
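 This is not that code, just a rough sketch of the idea. It assumes we have
 a list of (percent complete, elapsed seconds) pairs for a 5 MiB download,
 including a 0% entry for the first byte. The function name and the sample
 numbers are hypothetical. Sizes between two marks are linearly
 interpolated, which is only an estimate.

```python
FIVE_MIB = 5 * 1024 * 1024


def time_to_bytes(marks, target_bytes, total_bytes=FIVE_MIB):
    """Estimate elapsed seconds until target_bytes of the download arrived.

    marks: iterable of (percent_complete, elapsed_seconds) pairs.
    """
    target_pct = 100.0 * target_bytes / total_bytes
    marks = sorted(marks)
    for (p0, t0), (p1, t1) in zip(marks, marks[1:]):
        if p0 <= target_pct <= p1:
            # Linear interpolation between the two surrounding marks.
            frac = (target_pct - p0) / (p1 - p0)
            return t0 + frac * (t1 - t0)
    raise ValueError("target size outside the recorded range")


# Hypothetical measurement: 0% at 0.5 s, then one mark every 10%.
marks = [(0, 0.5), (10, 1.0), (20, 1.6), (30, 2.1), (40, 2.7), (50, 3.2),
         (60, 3.8), (70, 4.3), (80, 4.9), (90, 5.4), (100, 6.0)]

print(time_to_bytes(marks, 1024 * 1024))  # 1 MiB is exactly the 20% mark
print(time_to_bytes(marks, 50 * 1024))    # 50 KiB lies below the 10% mark
```

 Note that 1 MiB coincides with the 20% mark of a 5 MiB download, whereas
 50 KiB is just under 1% of it, so with 10% marks alone the 50 KiB time is
 an interpolated value.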

 > **Reasons to keep all of the files.**
 >
 > I think there are two strong reasons to keep all 3 file sizes:
 > 1. You can specify a different timeout for each of the 3 sizes. That
 lets you cancel the smaller files much sooner if they are hanging. And if
 the timeouts are set realistically, it helps you get a better sense of how
 often we fail to meet a target completion time.

 In theory, we could retroactively apply timeouts by pretending that a
 partial 50 KiB or 1 MiB download taking longer than `x` would have timed
 out.
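 A minimal sketch of what that could look like, assuming we already have a
 partial completion time for each size. The function name and the timeout
 values are hypothetical and not taken from any existing configuration.

```python
def apply_timeout(partial_time_s, timeout_s):
    """Return (completed, time_s) after applying a retroactive timeout.

    A partial download that never reached the size (None) or that took
    longer than timeout_s is treated as timed out.
    """
    if partial_time_s is None or partial_time_s > timeout_s:
        return (False, None)
    return (True, partial_time_s)


# Hypothetical per-size timeouts in seconds, for illustration only.
timeouts = {"50 KiB": 15.0, "1 MiB": 60.0}

print(apply_timeout(3.2, timeouts["50 KiB"]))   # (True, 3.2)
print(apply_timeout(75.0, timeouts["1 MiB"]))   # (False, None)
```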

 > 2. Diversity of circuits. If you follow the suggestion above and remove
 50KiB and 1MiB and only keep 5MiB, and then you get a crappy circuit, data
 points for all 3 download times will be affected. Previously that only
 affected one data point.

 We wouldn't change the frequency of making downloads. We would just
 extract more than one time-to-last-byte timestamp from a given
 measurement. But I see how we would want to document this very clearly to
 help our users interpret our data.

 > **Adjust download weights instead?**
 >
 > If you would like more data points for 1MiB and 5MiB files and fewer for
 50KiB, have you considered adjusting the weights that are used in the TGen
 model file instead of completely removing a file size?
 [https://gitweb.torproject.org/onionperf.git/tree/onionperf/model.py#n90
 The weights are specified here.] For example, if you want all file sizes
 to have equal download probabilities, set the weight for each file size to
 `1.0`.

 We did consider this, to avoid increasing load on the measurement hosts
 and network too much, but then figured we could kill these downloads
 altogether. But you raise some important points above that require more
 attention before killing 50 KiB downloads entirely.

 New plan: use a weight of 1.0 for all three download sizes until we figure
 out how to kill 50 KiB and 1 MiB downloads.
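 For a rough idea of the resulting load, here is a sketch that assumes one
 download every 5 minutes with the file size picked in proportion to its
 weight. That selection model and the name `expected_kbps` are assumptions
 made for illustration, not a description of the TGen model.

```python
SIZES_KIB = {"50 KiB": 50, "1 MiB": 1024, "5 MiB": 5120}


def expected_kbps(weights, interval_s=300.0):
    """Expected average kbps for one weighted-random download per interval."""
    total = sum(weights.values())
    # Weighted mean download size in KiB.
    mean_kib = sum(SIZES_KIB[name] * w for name, w in weights.items()) / total
    return mean_kib * 1024 * 8 / (interval_s * 1000)


equal = {name: 1.0 for name in SIZES_KIB}
print(round(expected_kbps(equal), 1))  # prints 56.4
```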

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/34023#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

