[metrics-team] Measuring bwscanner changes
Tom Ritter
tom at ritter.vg
Wed Jun 7 15:25:25 UTC 2017
On 6 June 2017 at 18:02, micah anderson <micah at riseup.net> wrote:
>
> Hi all,
>
> We have been working on making some changes in bwscanning, and we
> *think* that the changes are going to improve measurements on the
> network as a whole, but this is only a guess and we would like to find
> out if there are any ways to determine with some actual metrics what
> kind of effects the changes actually have.
>
> As of right now, one of the bwauthorities is pulling its bandwidth files
> from two locations: Hong Kong, and Santiago, Chile. The Chile change was
> made relatively recently (approximately April 13th), and we are
> wondering if this had any measurable impact on the network.
To clarify, you mean you are round-robining the servers you host the
bandwidth files (32K and the like) from; not that you have two servers
performing measurements.
That is also about the time maatuska's bwauth disappeared, which would
mask changes in the above/below 90 day graph at the bottom of
https://consensus-health.torproject.org/graphs.html =/
> The theory is that Bandwidth Authority location matters because Tor
> sends more Guard and Middle bandwidth to relays close to the bandwidth
> scanner, and more Exit bandwidth to relays close to the bandwidth
> server.
>
> Additionally, location likely has an impact on tor clients. The
> current bandwidth authority locations mean that relays in North America
> and Europe handle more traffic[0]:
>
> . The Tor network is faster for all clients. Clients are more likely to
> choose a path with relays that are near each other. This affects hidden
> services the most, because they have 6-hop long paths.
>
> . But Exits in North America and Europe can get overloaded, and slow
> down the network for all clients. This does not happen as much for
> Guards, because there are more Guards than Exits. (TODO: measure Exit
> congestion)
>
> . Tor clients in North America and Europe are even faster, because their
> Guard is closer (on average).
>
> . Websites with servers in North America or Europe are faster through
> Tor Exits (on average).
>
> . Websites that use a CDN are faster if the CDN DNS sends the connection
> to a nearby data center, and if the CDN has many servers in North America
> or Europe.
>
> Another theory is that adding or moving bandwidth authorities changes
> relay measurements: relays closer to the new location will be measured
> higher. A bandwidth scanner affects Guard and Middle measurements. A
> bandwidth server affects Exit measurements.
>
> So... What happens if we add a bandwidth server in South America to a
> bandwidth authority?
>
> Teor sketched out a number of theories that it would be interesting to
> validate with some metric data:
>
> . Adding a bandwidth server in South America will shift Exit bandwidth
> away from Europe, and maybe North America.
>
> . Websites in South America will become faster through Tor. (Tor is
> mainly used for web traffic.)
>
> . The average distance between middles and exits will increase, so tor
> will become slightly slower. But there will be less load on European
> Exits, which will make them faster for all clients. (TODO: work out
> which effect wins?)
>
> . Guards and Middles that are closer to Exits that are closer to South
> America get more bandwidth. But it matters much more if Guards and
> Middles are close to the scanner.
>
> . People will put more Exits (and relays) in South America, because they
> measure better.
>
> . There will be a small change to a small number of relays. We use the
> median measurement, so changing 1/5 bandwidth authorities will not
> change many relays. And changing 1/2 bandwidth servers on 1 authority
> makes the size of the impact small. We would need to change scanners and
> servers on 3/5 bandwidth authorities to change a lot of relay
> measurements.
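The median argument in that last point can be sketched in a few lines of
Python. This is only an illustration: the measurement values are made up, the
units are arbitrary, and replace_one is a hypothetical helper, not anything
from the bwauth or dirauth code.

```python
from statistics import median

def replace_one(measurements, index, new_value):
    """Return a copy of the list with one authority's measurement changed."""
    changed = list(measurements)
    changed[index] = new_value
    return changed

# Made-up measurements of a single relay by five bandwidth authorities.
before = [900, 1000, 1100, 1200, 1300]

# One authority changes location and now measures the relay much higher.
# Try it for the authority that was lowest, in the middle, and highest.
was_lowest = replace_one(before, 0, 5000)
was_middle = replace_one(before, 2, 5000)
was_highest = replace_one(before, 4, 5000)

print(median(before))       # 1100
print(median(was_lowest))   # 1200
print(median(was_middle))   # 1200
print(median(was_highest))  # 1100
```

Even a very large change on one of five authorities moves the median at most
to the neighbouring measurement, and sometimes not at all.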
>
> So how can we figure these things out? TorPerf/OnionPerf tracks the Tor
> network's overall speed, so they could help us work out whether the
> network is faster or slower after the change. Are we using TorPerf or
> OnionPerf for reporting? Where are the current TorPerf measurement
> nodes?
>
> The current OnionPerf nodes are in EU/US/HK. We could put OnionPerf
> nodes in South America, Africa, and Australia to make the measurements
> more representative:
>
> https://trac.torproject.org/projects/tor/wiki/org/operations/Infrastructure/onionperf
>
> In addition to watching overall network performance with TorFlow, there
> are existing graphs for measured relays here:
> https://consensus-health.torproject.org/graphs.html#bwauthstatus
>
> But that's not really enough. I think it would also be helpful to know:
>
> Graph for number of relays that bwauths decided the median for
> https://trac.torproject.org/projects/tor/ticket/21882 (There is a sample
> graph on that ticket)
This is implemented actually, if you hadn't seen it, at the bottom of
https://consensus-health.torproject.org/graphs.html
There is historical analysis here:
https://ritter.vg/misc/bwauth-historical/graphs-historical.html
The full dataset is in a sqlite database here:
https://consensus-health.torproject.org/historical.db
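A minimal Python sketch for getting oriented in that database, assuming
historical.db has already been downloaded locally. It makes no assumptions
about the schema and only lists the tables and their columns; list_schema is
a name made up for this example.

```python
import sqlite3

def list_schema(db_path):
    """Return {table name: [column names]} for every table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalogue of tables and indexes.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
        schema = {}
        for table in tables:
            quoted = table.replace('"', '""')
            # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
            columns = conn.execute('PRAGMA table_info("%s")' % quoted).fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()

# Example use, with the local path being an assumption:
#   for table, columns in list_schema("historical.db").items():
#       print(table, columns)
```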
> What is the distribution of a bandwidth authority's measurements?
> https://trac.torproject.org/projects/tor/ticket/21994 I'm not sure if we
> want this, so if you do, comment on the ticket!
I had been planning on implementing this at some point in the future.
I also have been planning on opening tickets to document the changes
needed to make the consensushealth graphs more in line with metrics
and potentially show them there also.
> In the future, the plan is to roll out a series of other bandwidth
> authority changes, and see how they affect things. The idea would be to
> make one change at a time, and then observe what it does to the network.
>
> 1. bw server on a CDN, with one bandwidth authority
> 2. run a scanner with and without PreferIPv6 to measure IPv6 exit
> performance
> 3. Try to use the global, dual-stack, HTTP/2 CDN map
> 4. Change the default set of servers in the bwauth code to use the CDN
> 5. Convince all bwauthorities to use multiple redundant servers
> 6. Run a single-onion bw server
> 7. Run an ipv6 bw server
> 8. ?
If we do this, I would strongly urge us to log additional bwauth data.
This will let us do more in-depth analysis at a later date if we so
desire. Specifically:
a) Make sure you have this patch:
https://gitweb.torproject.org/torflow.git/commit/?id=7e4ef735858acf5d2fbb183b6f8418b7fc2b364a
b) Record the raw bwauth file the dirauth ingests. This will need to
be done manually on the bwauth (or dirauth), but there are tickets to
make this 'automatic' here:
https://trac.torproject.org/projects/tor/ticket/21377
https://trac.torproject.org/projects/tor/ticket/21378
Someone should probably look closely at that file and confirm that
there isn't anything else we may want to log.
Regrettably, I lost all of maatuska's historical data when the server died =(
-tom