[tor-bugs] #6414 [Ooni]: Automating Bridge Reachability Testing

Sun Jul 22 01:55:23 UTC 2012

#6414: Automating Bridge Reachability Testing
--------------------------------------------------------------------------------+
 Reporter:  isis                                                                |          Owner:  isis
     Type:  project                                                             |         Status:  new 
 Priority:  normal                                                              |      Milestone:      
Component:  Ooni                                                                |        Version:      
 Keywords:  bridge-reachability metrics-db automation testing SponsorF20121101  |         Parent:      
   Points:                                                                      |   Actualpoints:      
--------------------------------------------------------------------------------+

Comment(by isis):

 Replying to [comment:1 karsten]:
 > As for the text that is now in the description,  I could imagine that it
 will turn into a tech report that we can make part of the sponsor
 deliverable.  How do you think about starting a LaTeX document by cloning
 my public tech-reports.git repo (branch fivereports) and creating a new
 directory 2012/automatic-bridge-reachability-testing/ there?  You could
 ask for your own public tech-reports.git repo to host your report sources.

 Cloned it, and it'll take me a second to re-remember for the zillionth
 time how LaTeX works. I'll poke weasel for a repo.

 >
 > Also, I think I can help with two of your questions:
 >
 > > 1. Should this automation be considered part of OONI? Or BridgeDB? Or
 is it part of some other project?
 >
 > I'd think the scanner should be considered part of OONI.

 Hum. I should see what ioerror and hellais think. I don't want them to
 feel like OONI is getting cluttered with tickets that I'm the only one
 working on. And I know hellais is too busy to take on this project, but
 perhaps ioerror would want to hack on it as well. I'll poke them too!

 > There's already a defined interface for BridgeDB to use the scanner's
 results (#5484).  Ideally, the scanner would output its results in that
 format, so that BridgeDB can make its decisions which bridges to give out
 to which users.  Also, metrics-db should learn about the very same file,
 sanitize any sensitive information in it, archive it, and make it public.
 >

 Seems simple enough: "BridgeDB will process a file with lines "fingerprint
 address:port cc,cc,cc" meaning that the bridge running on the given
 address and port is unreachable from the given countries."
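
 A minimal sketch of reading and writing lines in that format, assuming
 whitespace-separated fields and comma-separated country codes exactly as
 quoted above. The names UnreachableBridge, parse_line, and format_line are
 hypothetical and are not part of BridgeDB or OONI.

```python
from typing import List, NamedTuple


class UnreachableBridge(NamedTuple):
    """One line of the (assumed) 'fingerprint address:port cc,cc,cc' format."""
    fingerprint: str
    address: str
    port: int
    countries: List[str]


def parse_line(line):
    """Parse one line into an UnreachableBridge.

    Raises ValueError if the line does not have exactly three fields
    or if the port is not an integer.
    """
    fingerprint, addrport, ccs = line.split()
    # rpartition splits on the last colon, so a bracketed IPv6 address
    # such as [2001:db8::1]:443 keeps its brackets in the address part.
    address, _, port = addrport.rpartition(":")
    countries = [cc.lower() for cc in ccs.split(",") if cc]
    return UnreachableBridge(fingerprint, address, int(port), countries)


def format_line(entry):
    """Serialize an UnreachableBridge back into a single line."""
    return "%s %s:%d %s" % (
        entry.fingerprint,
        entry.address,
        entry.port,
        ",".join(entry.countries),
    )
```

 The scanner could build one UnreachableBridge per blocked bridge and write
 format_line(entry) for each, so that BridgeDB and metrics-db read the same
 file.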

 > Related to the overall architecture question, you briefly discussed a
 feedback loop between metrics and the scanner to learn about passively
 obtained reachability information.  Note that metrics-db is dumb, and it
 should stay dumb; it collects and sanitizes Tor network information, but
 it doesn't do smart things with it.  If the bridge scanner wants to
 extract information from collected bridge descriptors containing
 statistics, it should do that itself.  I'm happy to discuss how to extract
 that information, but the code should live in the bridge scanner codebase,
 maybe in a different module than the active scanning code.  Of course, if
 we want to archive the results from looking at passive stats and how they
 influence which bridges we scan, that would be something for metrics-db to
 collect, sanitize, archive, and publicize.

 Right. I was imagining that the scanner would take the usage statistics,
 probably the number of connections, from metrics-db, and when the number
 of connections from clients in a specific country drops drastically, the
 bridge scanner would jump in and try to figure out what was going on by
 testing the reachability of a subset of bridges from that country. Also,
 the scanner would be the one polling metrics-db to figure out when
 connections are dropping.
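
 A rough sketch of that trigger, under stated assumptions: the scanner has
 already pulled per-country connection counts out of the bridge statistics
 into a dict of lists (oldest first), and "drastically" means the latest
 value falls below some fraction of the average of the earlier ones. The
 function name, the 0.5 ratio, and the minimum baseline are all made up for
 illustration; the real thresholds would need tuning against actual data.

```python
def countries_with_drop(history, drop_ratio=0.5, min_baseline=10.0):
    """Return country codes whose latest count dropped drastically.

    history maps a country code to a list of connection counts, oldest
    first. A country is flagged when its latest count is below
    drop_ratio times the mean of all earlier counts. Countries whose
    baseline is under min_baseline are skipped, since small numbers
    are too noisy to say anything.
    """
    flagged = []
    for cc, counts in sorted(history.items()):
        if len(counts) < 2:
            continue  # nothing to compare against yet
        *previous, latest = counts
        baseline = sum(previous) / len(previous)
        if baseline >= min_baseline and latest < baseline * drop_ratio:
            flagged.append(cc)
    return flagged
```

 The scanner would then schedule active tests for a subset of bridges from
 each flagged country, rather than scanning everything all the time.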

 > At least that's what I came up with when thinking about the architecture
 a bit.  Does that make sense to you?

 Yep! Totally makes sense.

 > > 5. What percentage of current bridges are running on port 443?
 >
 > You can look this up in the sanitized bridge network statuses, similar
 to how you looked up the numbers in the microdescriptor consensus.  You'll
 find the last three days of sanitized bridge network statuses here:
 >
 > {{{
 > $ rsync -arz metrics.torproject.org::metrics-recent/bridge-descriptors/statuses/ statuses
 > $ cd statuses/
 > $ grep -B1 "^s.* Running" 20120719-103704-* | grep "^r .* 443 " | wc -l
 >      499
 > $ grep -B1 "^s.* Running" 20120719-103704-* | grep "^r" | wc -l
 >      998
 > }}}
 >
 > So, half of them.  (No, I'm not faking the numbers here, it's 50.0% for
 real!)

 Sweet, this makes it easier to test things by pretending to be an
 SSLObservatory, which is also already in Python.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6414#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online