[tor-scaling] Lessons Learned at Mozilla All-Hands
Mike Perry
mikeperry at torproject.org
Tue Jul 2 18:04:00 UTC 2019
Mike Perry:
> We arrived at five questions that were important to answer:
>
> 0. What performance metrics do we need, and what can we add easily?
> 1. How many users does Tor have?
> 2. How fast is Tor and how consistent is this performance?
> 3. How many more users can we add?
> 4. How fast will Tor be with more users and/or more capacity?
>
> (Spoiler: #4 is by far the most important point of this mail).
>
>
> 4. How fast will Tor be with more users and/or more capacity?
>
> Ok, buckle up. This is the most important thing we analyzed all week.
>
> We took this normalized utilization curve, and the performance curve,
> and examined our datapoints:
>
> https://people.torproject.org/~mikeperry/transient/Whistler2019/4-utilization-context.png
> https://people.torproject.org/~mikeperry/transient/Whistler2019/4-boxplots-context.png
> https://people.torproject.org/~mikeperry/transient/Whistler2019/4-boxplots-compare.png
>
> The hope was that we could use the utilization level to predict Tor
> performance, all other things being equal.
>
> But, while there is some correlation, there were obvious
> discontinuities. In particular, January 2015 seems to be some kind of
> magic turning point, before and after which Tor performance is
> incomparable, for the same levels of network utilization.
>
> Why is this? Possible explanations:
> A. Feature changes caused this.
> - No obviously responsible Tor features coincided with this time.
> B. Network usage change.
> - There was a botnet in 2013, but it was mostly gone by 2014.
> C. Different sizes of the Tor network are completely incomparable.
> - Utilization inside 2012-2015 and 2015-present is comparable.
> D. Total utilization does not (solely) indicate performance.
> - What about Exit utilization? Guard utilization?
During the meeting Friday, we came up with additional guesses to
investigate in the historical metrics archives:
E. Do the Balancing CDFs look any different before/after Jan 2015?
(i.e. https://trac.torproject.org/projects/tor/wiki/org/roadmaps/CoreTor/PerformanceMetrics#BalancingMetrics)
F. Did the siv onionperf instance change its ISP/internet connection
around January 2015 (give or take a couple months)?
G. Did Torperf/Onionperf have any major config changes or bugfixes
around January 2015?
H. Did we change any consensus parameters around January 2015
(again, give or take a couple months)?
D-H all seem plausible. We should spend some time in Stockholm to rule
out F, G, and H (or confirm one of them as the culprit), and also dig
deeper into the historical metrics data for D and E. H in particular can
be checked mechanically against the historical consensuses in CollecTor;
a rough sketch of that check follows.
I am also going to write another mail to this list to firm up the notion
of network performance "comparability" with a more formal definition
that can be verified with some Python and/or proper statistical methods.
This "comparability" concept will provide a way forward for analyzing D
and E, and will also help us evaluate the accuracy of our Tor simulation
models.
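As a strawman for that definition: restrict two time windows to days with
similar utilization, and ask whether their onionperf download-time
distributions could plausibly come from the same distribution, e.g. with
a two-sample Kolmogorov-Smirnov test. A minimal sketch, with a
hypothetical data layout:

    # Strawman comparability test: at similar utilization levels, are
    # two eras' onionperf download-time distributions distinguishable?
    # The data frame layout below is hypothetical.
    import pandas as pd
    from scipy.stats import ks_2samp

    df = pd.read_csv("perf-and-utilization.csv", parse_dates=["date"])

    def era_samples(start, end, util_lo=0.45, util_hi=0.55):
        mask = ((df.date >= start) & (df.date < end)
                & df.utilization.between(util_lo, util_hi))
        return df.loc[mask, "ttlb"]

    a = era_samples("2013-01-01", "2015-01-01")
    b = era_samples("2015-01-01", "2017-01-01")
    stat, p = ks_2samp(a, b)
    print(f"KS statistic={stat:.3f}, p={p:.3g} "
          "(small p => the eras are not comparable at this utilization)")

If a test like this rejects even for utilization-matched windows, then
total utilization alone cannot define comparability, which feeds back
into D and E.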
--
Mike Perry