[tor-dev] Why the seeming anticorrelation between obfs3 and vanilla bridges in metrics graphs?
David Fifield
david at bamsoftware.com
Sat Nov 1 00:44:46 UTC 2014
On Fri, Oct 31, 2014 at 03:42:06PM -0700, David Fifield wrote:
> On Sun, Oct 26, 2014 at 09:08:49AM +0100, Karsten Loesing wrote:
> > On 23/10/14 19:32, David Fifield wrote:
> > > In the past few months of bridge user graphs, there is an apparent
> > > negative correlation between obfs3 users and vanilla users: when one
> > > goes up, the other goes down. If you draw a horizontal line at about
> > > 5500, they are almost mirror images of each other. I don't see it with
> > > any other transport pairs. Any idea why it might be?
> >
> > I briefly looked at the raw data behind this graph, but didn't find any
> > obvious problems with the algorithm. I'm running out of time now, but I
> > can share some preliminary results in case you want to dig deeper:
> >
> > - https://people.torproject.org/~karsten/volatile/bridge-users-obfs3-or-mean.png
> > is the graph that you posted with a third line for mean values.
> >
> > - https://people.torproject.org/~karsten/volatile/bridge-responses.csv.xz
> > contains numbers of responses (for requested consensuses) by bridge,
> > transport, and time interval.
>
> I don't understand this file. Is it the number of times a bridge
> answered a directory request? The number of times a bridge appeared in a
> consensus? The number of times it was given out by BridgeDB?

Actually maybe I understand. It must be the number of times a bridge
answered a directory request, à la
https://research.torproject.org/techreports/counting-daily-bridge-users-2012-10-24.pdf.

> > - https://onionoo.torproject.org/details?fingerprint=231E2DE81DC4314F2035D2C0D0D043A425FF8999
> > is the bridge reporting those high numbers for <OR> responses. Is
> > PacificSunset maybe one of the bundled bridges?
>
> It is indeed:
> pref("extensions.torlauncher.default_bridge.obfs3.5", "obfs3 208.79.90.242:35658 BA61757846841D64A83EA2514C766CB92F1FB41F");
>
> I don't understand your line of reasoning in singling it out, though.
> What do high numbers for <OR> responses suggest to you?

This might be the key to the mystery. It must be that PacificSunset,
despite being an obfs3 bridge, doesn't have ExtORPort enabled, so all
its obfs3 connections are being counted as <OR> connections.

$ grep 231E2DE81DC4314F2035D2C0D0D043A425FF8999 bridge-responses.csv
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-03 19:14:20,2014-08-04 00:00:00,21.4
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-04 00:00:00,2014-08-04 19:14:20,86.6
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-05 14:26:12,2014-08-06 00:00:00,3151.1
231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-06 00:00:00,2014-08-07 00:00:00,8054.7
...

That would mean that when a user configures obfs3, their tor (randomly?)
chooses one of the 7 default obfs3 bridges. When they happen to get
PacificSunset, the obfs3 user count goes down and the <OR> count goes up
by an equal amount, the sum remaining unchanged.
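
The accounting effect described above can be illustrated with a small
sketch. Everything in it is hypothetical except the hypothesis itself: the
bridge name "other", the user counts, and the function name are made up
for illustration, and the only thing taken from this thread is the idea
that a bridge without ExtORPort labels its obfs3 users as <OR>.

```python
def reported_totals(users_per_bridge, no_extorport):
    """Sum users by the transport label each bridge would report.

    users_per_bridge: dict mapping bridge name -> number of obfs3 users.
    no_extorport: set of bridge names assumed to lack ExtORPort, so that
    every obfs3 connection they see is labeled <OR>.
    """
    totals = {"obfs3": 0, "<OR>": 0}
    for bridge, users in users_per_bridge.items():
        label = "<OR>" if bridge in no_extorport else "obfs3"
        totals[label] += users
    return totals


# Hypothetical counts for two days. The number of obfs3 users is 1000 on
# both days; only the share that lands on PacificSunset changes.
day1 = reported_totals({"PacificSunset": 100, "other": 900}, {"PacificSunset"})
day2 = reported_totals({"PacificSunset": 300, "other": 700}, {"PacificSunset"})
print(day1)
print(day2)
```

Under these assumed numbers the two reported series move in opposite
directions by the same amount while their sum stays constant, which is the
mirror-image pattern in the graph.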
Here are the bundled obfs3 bridges and their hashed fingerprints:
https://gitweb.torproject.org/builders/tor-browser-bundle.git/blob/a7f6793b95dba3e6f4bb1f2128eacfc83eb95ffa:/Bundle-Data/PTConfigs/bridge_prefs.js

$ for a in A09D536DD1752D542E1FBB3C9CE4449D51298239 AF9F66B7B04F8FF6F32D455F05135250A16543C9 58D91C3A631F910F32E18A55441D5A0463BA66E2 BA61757846841D64A83EA2514C766CB92F1FB41F 1E05F577A0EC0213F971D81BF4D86A9E4E8229ED 4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A; do echo $a "->" $(python -c "import hashlib; print hashlib.sha1(\"$a\".decode(\"hex\")).hexdigest().upper()"); done
A09D536DD1752D542E1FBB3C9CE4449D51298239 -> 3E0908F131AC417C48DDD835D78FB6887F4CD126
AF9F66B7B04F8FF6F32D455F05135250A16543C9 -> 6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C
58D91C3A631F910F32E18A55441D5A0463BA66E2 -> FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E
BA61757846841D64A83EA2514C766CB92F1FB41F -> 231E2DE81DC4314F2035D2C0D0D043A425FF8999
1E05F577A0EC0213F971D81BF4D86A9E4E8229ED -> A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87
4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A -> 73D8FF840444F84EC50DD755FBAD44CF1F0DE28B
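
The one-liner above uses Python 2 syntax (the print statement and
str.decode("hex")). A rough Python 3 equivalent of the same step, SHA-1
over the raw fingerprint bytes, might look like the following; the function
name hashed_fingerprint is my own and not part of any Tor tool.

```python
import hashlib


def hashed_fingerprint(fingerprint_hex):
    """Return the uppercase hex SHA-1 of the decoded fingerprint bytes."""
    raw = bytes.fromhex(fingerprint_hex)  # 40 hex digits -> 20 raw bytes
    return hashlib.sha1(raw).hexdigest().upper()


# PacificSunset's fingerprint from the bundled bridge line.
print(hashed_fingerprint("BA61757846841D64A83EA2514C766CB92F1FB41F"))
```

This should print the same hashed fingerprint that the listing above shows
for PacificSunset.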
It looks like PacificSunset is the only one of the bundled bridges that
has this bug (reports <OR> and no other transports). The others report
small numbers for <OR>, presumably because they get a few <OR> users
from BridgeDB.

$ for a in A09D536DD1752D542E1FBB3C9CE4449D51298239 AF9F66B7B04F8FF6F32D455F05135250A16543C9 58D91C3A631F910F32E18A55441D5A0463BA66E2 BA61757846841D64A83EA2514C766CB92F1FB41F 1E05F577A0EC0213F971D81BF4D86A9E4E8229ED 4C331FA9B3D1D6D8FB0D8FBBF0C259C360D97E6A; do grep -i ^$(python -c "import hashlib; print hashlib.sha1(\"$a\".decode(\"hex\")).hexdigest()") bridge-responses.csv; done | awk -F, "{print \$1,\$3}" | uniq
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs2
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs3
3E0908F131AC417C48DDD835D78FB6887F4CD126 obfs4
3E0908F131AC417C48DDD835D78FB6887F4CD126 <OR>
3E0908F131AC417C48DDD835D78FB6887F4CD126 scramblesuit
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C obfs2
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C obfs3
6CE1370EDFE977E7A3124B7C1E543B533A1C6E9C <OR>
FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E obfs3
FAEABF422ECB91C1D96492B06DE2539EDD6BFB0E <OR>
231E2DE81DC4314F2035D2C0D0D043A425FF8999 <OR>
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 obfs2
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 obfs3
A72D5DB45D9DE4B244D3F6C4AD22A66F40BF5B87 <OR>
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B obfs2
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B obfs3
73D8FF840444F84EC50DD755FBAD44CF1F0DE28B <OR>
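
For anyone who prefers to do the grep/awk/uniq step in one script, here is
a minimal Python 3 sketch. It assumes the column layout that the sample
rows earlier in this message suggest (hashed fingerprint first, transport
third); I have not checked that against any format documentation, and the
function name transports_by_bridge is made up.

```python
import csv
from collections import defaultdict


def transports_by_bridge(lines, fingerprints):
    """Map each wanted hashed fingerprint to the set of transports it reports.

    lines: any iterable of CSV text lines (an open file works).
    fingerprints: hashed fingerprints to look for, in any letter case.
    """
    wanted = {fp.upper() for fp in fingerprints}
    seen = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or truncated lines
        fp, transport = row[0].upper(), row[2]
        if fp in wanted:
            seen[fp].add(transport)
    return dict(seen)


# Two of the PacificSunset rows quoted earlier in this message.
sample = [
    "231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-03 19:14:20,2014-08-04 00:00:00,21.4",
    "231E2DE81DC4314F2035D2C0D0D043A425FF8999,responses,<OR>,2014-08-06 00:00:00,2014-08-07 00:00:00,8054.7",
]
result = transports_by_bridge(sample, ["231E2DE81DC4314F2035D2C0D0D043A425FF8999"])
print(result)
```

To run it over the real data, pass open("bridge-responses.csv") as the
first argument after decompressing the file.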
I'll also note that PacificSunset is one of the problematic bridges
listed at https://trac.torproject.org/projects/tor/ticket/13504#comment:2.

David Fifield