[tor-dev] Metrics for evaluating sbws vs torflow? (was: Raising AuthDirMaxServersPerAddr to 4)

Tue Jun 4 02:54:00 UTC 2019

teor:
> Hi Mike,
> 
>> On 4 Jun 2019, at 06:20, Mike Perry <mikeperry at torproject.org> wrote:
>>
>> Mike Perry:
>>> teor:
>>>> I have an alternative proposal:
>>>>
>>>> Let's deploy sbws to half the bandwidth authorities, wait 2 weeks, and
>>>> see if exit bandwidths improve.
>>>>
>>>> We should measure the impact of this change using the tor-scaling
>>>> measurement criteria. (And we should make sure it doesn't conflict
>>>> with any other tor-scaling changes.)
>>>
>>> I like this plan. To tightly control for emergent effects of all-sbws vs
>>> all-torflow, ideally we'd switch back and forth between all-sbws and
>>> all-torflow on a synchronized schedule, but this requires getting enough
>>> measurement instances of sbws and torflow for authorities to choose
>>> either the sbws file, or the torflow file, on some schedule. May be
>>> tricky to coordinate, but it would be the most rigorous way to do this.
>>>
>>> We could do a version of this based on votes/bwfiles alone, without
>>> making dirauths toggle back and forth. However, this would not capture
>>> emergent effects (such as quicker bwadjustments in sbws due to decisions
>>> to pair relays with faster ones during measurement). Still, even
>>> comparing just votes would be better than nothing.
> 
> I don't know how possible this is: we would need two independent network
> connections per bandwidth scanner, one for sbws, and one for torflow.
> 
> (Running two scanners on the same connection means that they compete
> for bandwidth. Perhaps we could use Tor's BandwidthRate to share the
> bandwidth.)
> 
> I also don't know how many authority operators are able to run sbws:
> Roger might be stuck on Python 2.
> 
> And I don't know how often they will be able to switch configs.
> 
> Let's make some detailed plans with the dirauth list.

Ok. It looks like I am still on the dirauth list. Perhaps we can come up
with some way to use the dirauth-conf repo to switch things, but if we
lack the machines for separate sbws and torflow, I agree that we should
not try to have the same connections/machines running both.

In that case, we should just focus on tracking the metrics that are
important to us as we continue to add sbws and remove torflow instances.

>>> Do you like these metrics? Do you think we should be using different
>>> ones? Should we try a few different metrics and see what makes sense
>>> based on the results?
>> As additional metrics, we could do the CDFs of the ratio of measured bw
>> to advertised bw, and/or the metrics Karsten produced using just
>> measured bw. (I still can't find the ticket where those were graphed
>> during previous torflow updates, though).
>>
>> These metrics would be pretty unique to torflow/sbws experiments, but if
>> we have enough of those in the pipeline (such as changes to the scaling
>> factor), they may be worth tracking over time.
> 
> If we get funding for sbws experiments, we can definitely tweak the sbws
> scaling parameters, and do some experiments.
> 
> At the moment, I'd like to focus on fixing critical sbws issues, deploying
> sbws, and making sure it works at least as well as torflow.

Yes, that makes sense. A minimal version of this could be: don't do the
swapping back and forth, just add sbws and replace torflow scanners one
by one. As we do this, we could just keep a record of the metrics over
the votes and consensus during this time, and compare how the metrics
look for the sbws vs torflow votes vs the consensus, over time.
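
As a rough illustration of that record-keeping (a sketch only, not an
agreed-upon tool): assume each observation is a hypothetical
(date, source, value) tuple, where source is "sbws", "torflow", or
"consensus" and value is whatever per-relay metric we settle on. The
function name and input layout are assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def summarize_by_source(records):
    """Group metric values by source and date, and return the median per group.

    records: iterable of (date, source, value) tuples. This layout is a
    hypothetical one chosen for illustration, not an existing file format.
    Returns {source: {date: median_value}} with dates in sorted order.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for date, source, value in records:
        grouped[source][date].append(value)
    return {
        source: {date: median(values) for date, values in sorted(days.items())}
        for source, days in grouped.items()
    }
```

Plotting the per-source medians against date would show whether the sbws
and torflow votes drift apart as scanners are replaced. The median is an
arbitrary choice of summary here; any other statistic could be swapped in.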

I'll work on precise formulae for the "Per Relay Spare Capacity" metric
and the "Measured to Observed Ratio" metric, and think more about how we
want to graph them so they are easier to compare over time. I feel
like my previous mails were a little hand-wavy. Depending on how this
works out, I will either post that to tor-scaling with a complete list
of specific metrics equations, or write a separate post to tor-dev with
them just for sbws.

We won't finalize all of the performance experiment metrics until after
the Mozilla All Hands meeting (i.e., ~3 weeks), but the two above can be
retroactively computed using router descriptor and extrainfo archives.
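
Until the precise formulae are written down, here is a minimal sketch of
the CDF-of-ratios idea from the quoted suggestion, assuming each relay is
given as a hypothetical (measured_bw, advertised_bw) pair already
extracted from the archives. The ratio used is measured over advertised;
the final "Measured to Observed Ratio" definition may differ.

```python
from bisect import bisect_right


def ratio_cdf(relays):
    """Empirical CDF of measured/advertised bandwidth ratios.

    relays: iterable of (measured_bw, advertised_bw) pairs. This input
    layout is an assumption for illustration. Relays with a non-positive
    advertised bandwidth are skipped to avoid dividing by zero.
    Returns a list of (ratio, fraction of relays with ratio <= that value),
    one entry per distinct ratio, in increasing order.
    """
    ratios = sorted(m / a for m, a in relays if a > 0)
    n = len(ratios)
    return [(r, bisect_right(ratios, r) / n) for r in sorted(set(ratios))]
```

Computing this separately for the sbws votes, the torflow votes, and the
consensus would give three curves that can be overlaid on one graph.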

What were you thinking for the timeframe for the complete transition to
sbws?

-- 
Mike Perry
