[tor-bugs] #12131 [Analysis]: Measure connectivity patterns between relays

Tue May 27 01:55:19 UTC 2014

#12131: Measure connectivity patterns between relays
----------------------+---------------------
 Reporter:  arma      |          Owner:
     Type:  project   |         Status:  new
 Priority:  normal    |      Milestone:
Component:  Analysis  |        Version:
 Keywords:            |  Actual Points:
Parent ID:            |         Points:
----------------------+---------------------
 https://lists.torproject.org/pipermail/tor-relays/2014-May/004598.html
 makes me wonder how many relays are firewalling certain outbound ports
 (and thus messing with connectivity inside the Tor network). It would be
 great if somebody would start scanning pairs of relays to see which of
 them can reach each other and which can't, with the goal of understanding
 how far from a clique our network topology actually is, and then helping
 with an awareness campaign to correct it if it's a problem.

 Tools that might be helpful building blocks here:
 - Meejah's exitscanner builds circuits, and makes sure it isn't building
 too many at once. Uses txtorcon and thus twisted.
 https://github.com/meejah/txtorcon/blob/exit_scanner/apps/exit_scanner/guard-exit-coverage.py
 - phw's exitmap does something similar, but with stem rather than
 txtorcon. https://gitweb.torproject.org/user/phw/exitmap.git/tree

 Other thoughts:
 - You likely want to turn on FastFirstHopPK on the client, so it doesn't
 waste CPU power on handshakes at the first relay.
 - If you make each relay connect to 6000 other relays in succession, and
 some of the relays can't handle 6000 open file descriptors at once, then
 you might misinterpret "could not extend to that relay" as a
 property of the link between the relays when actually it's a property of
 the first relay. One option is to scan 500 and then move on to another
 first hop. Another option is to declare this a feature, and try to detect
 which relays can and which can't handle 6000 open file descriptors at
 once.
 - n^2 where n is 5000 is actually a heck of a lot of circuits. Should you
 just build circuits forever in the background, or are there some smarter
 algorithms for finding interesting patterns without making all 25 million
 circuits? In particular, there will be a background failure rate anyway,
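 from e.g. relays that happen to be overloaded at that moment. So even 25
 million circuits won't be enough: a single failed extend can't tell you
 whether the link is blocked or the relay was just busy at that moment.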
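
 A rough sketch of two of the ideas above, in plain Python with no Tor
 controller attached. The names batched_schedule and attempts_needed are
 made up for illustration, and the retry estimate assumes that background
 failures are independent between attempts, which probably only holds if
 the retries are spread out in time. It also assumes that the first hop's
 connections from one batch are closed or idle out before it gets its next
 batch.

```python
import math


def batched_schedule(fingerprints, batch_size=500):
    """Yield (first_hop, targets) batches covering every ordered relay pair.

    Each first hop gets at most `batch_size` targets per batch, and the
    scan then moves on to another first hop, so no relay is asked to
    extend to thousands of relays in one go.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    relays = list(fingerprints)
    for start in range(0, len(relays), batch_size):
        window = relays[start:start + batch_size]
        for first_hop in relays:
            # A relay never needs to extend to itself.
            targets = [t for t in window if t != first_hop]
            if targets:
                yield first_hop, targets


def attempts_needed(background_failure_rate, false_alarm_rate):
    """Failed attempts in a row needed before blaming the link itself.

    Assumption: each attempt fails for unrelated reasons (overload etc.)
    with probability `background_failure_rate`, independently. Then k
    failures in a row happen by chance with probability rate**k, and we
    want that at or below `false_alarm_rate`.
    """
    for name, value in (("background_failure_rate", background_failure_rate),
                        ("false_alarm_rate", false_alarm_rate)):
        if not 0 < value < 1:
            raise ValueError(name + " must be strictly between 0 and 1")
    ratio = math.log(false_alarm_rate) / math.log(background_failure_rate)
    return max(1, math.ceil(ratio))


if __name__ == "__main__":
    # Toy run with three made-up relay names and a tiny batch size.
    for first_hop, targets in batched_schedule(["A", "B", "C"], batch_size=2):
        print(first_hop, "->", targets)
    # With n = 5000 there are about 25 million ordered pairs, so the
    # per-pair false alarm rate has to be tiny to keep false reports rare.
    print(attempts_needed(0.05, 1e-6))
```

 The second function is one way to put a number on the "background failure
 rate" worry: only pairs that keep failing across that many spaced-out
 attempts would be worth reporting, and everything else can be dropped from
 the retry queue after its first success.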

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12131>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online