[tor-bugs] #6414 [Ooni]: Automating Bridge Reachability Testing
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Sun Jul 22 01:55:23 UTC 2012
#6414: Automating Bridge Reachability Testing
--------------------------------------------------------------------------------+
Reporter: isis | Owner: isis
Type: project | Status: new
Priority: normal | Milestone:
Component: Ooni | Version:
Keywords: bridge-reachability metrics-db automation testing SponsorF20121101 | Parent:
Points: | Actualpoints:
--------------------------------------------------------------------------------+
Comment(by isis):
Replying to [comment:1 karsten]:
> As for the text that is now in the description, I could imagine that it
> will turn into a tech report that we can make part of the sponsor
> deliverable. How do you think about starting a LaTeX document by cloning
> my public tech-reports.git repo (branch fivereports) and creating a new
> directory 2012/automatic-bridge-reachability-testing/ there? You could
> ask for your own public tech-reports.git repo to host your report sources.
Cloned it, and it'll take me a second to re-remember for the zillionth
time how LaTeX works. I'll poke weasel for a repo.
>
> Also, I think I can help with two of your questions:
>
> > 1. Should this automation be considered part of OONI? Or BridgeDB? Or
> > is it part of some other project?
>
> I'd think the scanner should be considered part of OONI.
Hum. I should see what ioerror and hellais think. I don't want them to
feel like OONI is getting cluttered with tickets that I'm the only one
working on. And I know hellais is too busy to take on this project, but
perhaps ioerror would want to hack on it as well. I'll poke them as well!
> There's already a defined interface for BridgeDB to use the scanner's
> results (#5484). Ideally, the scanner would output its results in that
> format, so that BridgeDB can make its decisions which bridges to give out
> to which users. Also, metrics-db should learn about the very same file,
> sanitize any sensitive information in it, archive it, and make it public.
>
Seems simple enough: "BridgeDB will process a file with lines 'fingerprint
address:port cc,cc,cc' meaning that the bridge running on the given
address and port is unreachable from the given countries."
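A minimal sketch of reading and writing that line format in Python. The names UnreachableBridge, parse_line and format_line are made up for illustration and are not part of BridgeDB or #5484; the sketch also assumes fields are separated by single spaces and that country codes are comma-separated with no spaces.

```python
from typing import FrozenSet, NamedTuple


class UnreachableBridge(NamedTuple):
    """One line of the (assumed) 'fingerprint address:port cc,cc,cc' file."""
    fingerprint: str
    address: str
    port: int
    countries: FrozenSet[str]


def parse_line(line):
    """Parse one line; raises ValueError if it does not have three fields."""
    fingerprint, addrport, ccs = line.split()
    # rpartition splits on the LAST colon, so a bracketed IPv6 address
    # such as "[2001:db8::1]:443" keeps its own colons in the address part.
    address, _, port = addrport.rpartition(":")
    return UnreachableBridge(fingerprint, address, int(port),
                             frozenset(ccs.split(",")))


def format_line(bridge):
    """Write a record back out, with country codes sorted for stable output."""
    return "%s %s:%d %s" % (bridge.fingerprint, bridge.address, bridge.port,
                            ",".join(sorted(bridge.countries)))
```

The fingerprint and address a caller passes in would come from the scanner's own results; nothing here talks to BridgeDB.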
> Related to the overall architecture question, you briefly discussed a
> feedback loop between metrics and the scanner to learn about passively
> obtained reachability information. Note that metrics-db is dumb, and it
> should stay dumb; it collects and sanitizes Tor network information, but
> it doesn't do smart things with them. If the bandwidth scanner wants to
> extract information from collected bridge descriptors containing
> statistics, it should do that itself. I'm happy to discuss how to extract
> that information, but the code should live in the bridge scanner codebase,
> maybe in a different module than the active scanning code. Of course, if
> we want to archive the results from looking at passive stats and how they
> influence which bridges we scan, that would be something for metrics-db to
> collect, sanitize, archive, and publicize.
Right. I was imagining that one would take the usage statistics, probably
the number of connections, from metrics-db, and when the connections
coming from clients in a specific country drop drastically, then
the bridge scanner would jump in and try to figure out what was going on
by testing a subset of bridges from that country. Also, the scanner would
be the one polling metrics-db to figure out when connections are dropping.
> At least that's what I came up with when thinking about the architecture
> a bit. Does that make sense to you?
Yep! Totally makes sense.
> > 5. What percentage of current bridges are running on port 443?
>
> You can look this up in the sanitized bridge network statuses, similar
> to how you looked up the numbers in the microdescriptor consensus. You'll
> find the last three days of sanitized bridge network statuses here:
>
> {{{
> $ rsync -arz metrics.torproject.org::metrics-recent/bridge-descriptors/statuses/ statuses
> $ cd statuses/
> $ grep -B1 "^s.* Running" 20120719-103704-* | grep "^r .* 443 " | wc -l
> 499
> $ grep -B1 "^s.* Running" 20120719-103704-* | grep "^r" | wc -l
> 998
> }}}
>
> So, half of them. (No, I'm not faking the numbers here, it's 50.0% for
> real!)
Sweet, this makes it easier to test things by pretending to be an
SSLObservatory, which is also already in Python.
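For doing the same count from inside a Python scanner instead of with grep, something like the following sketch might work. It assumes each "r" line ends with "... IP ORPort DirPort", so the ORPort is the second-to-last field, and it pairs every "s" line with the most recent "r" line, which is slightly looser than grep -B1. The function name count_running is made up.

```python
def count_running(lines, port=443):
    """Return (running_on_port, running_total) for one network status.

    lines: iterable of status lines ("r ..." entries followed by "s ..." flags).
    """
    running = on_port = 0
    or_port = None  # ORPort of the most recent "r" line, if any
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r" and len(fields) >= 3:
            # Assumed layout: the ORPort is the second-to-last field.
            or_port = int(fields[-2])
        elif fields[0] == "s":
            if "Running" in fields[1:] and or_port is not None:
                running += 1
                if or_port == port:
                    on_port += 1
            or_port = None  # each "r" entry is counted at most once
    return on_port, running
```

Calling it with open("20120719-103704-...") as the lines argument (file name elided as in the listing above) gives the two numbers to divide; with the counts quoted above, 499 out of 998 is 50.0%.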
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6414#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online