[metrics-team] Where do the by-transport counts come from in userstats-bridge-combined?

Karsten Loesing karsten at torproject.org
Sun Jun 4 15:00:24 UTC 2017


Hi David,

On 31.05.17 05:00, David Fifield wrote:
> On Mon, May 29, 2017 at 10:04:25AM +0200, Karsten Loesing wrote:
>> On 24.05.17 23:15, David Fifield wrote:
>>> I am referring to https://bugs.torproject.org/19544:
>>>> What we could also do as first approximation is find a lower and upper
>>>> bound of users by country and transport. The lower bound would
>>>> probably be defined as something like max(0, PT + CC - 1) (not just 0
>>>> to account for cases where CC > 1 - PT) and the upper bound as min(PT,
>>>> CC), even though I could be convinced that other formulas are even
>>>> more correct.
>>>
>>> I thought I understood this but I guess I do not. Does PT come from
>>> dirreq-v3-reqs and CC from bridge-ip-transports? That wouldn't make
>>> sense to me, because they are measuring different things. Or is it that
>>> CC is still using bridge-ips (I don't know the current status of that;
>>> see https://bugs.torproject.org/18167).
>>
>> We're still using dirreq-v3-reqs and either one of the bridge-ip* lines
>> combined to get the number of requests per country or transport or IP
>> version.
> 
> Okay--I think I see. It's as covered in Section 5 "Breaking down to user
> numbers by country" in
> https://research.torproject.org/techreports/counting-daily-bridge-users-2012-10-24.pdf
> 	"We sum up unique IP addresses and calculate a fraction of IP
> 	addresses for every country and day."
> 
> So if I understand correctly, suppose we had
> 	dirreq-v3-resp ok=96,not-enough-sigs=0,unavailable=0,not-found=0,not-modified=8,busy=0
> 	bridge-ips aa=24,bb=24,cc=24,dd=24
> 	bridge-ip-transports obfs3=8,obfs4=32
> 	bridge-ip-versions v4=8,v6=8
> Then of the bridge's 96 total responses, we would say that
> 	25% (24/96) were from country aa, 25% from bb, 25% from cc, 25% from dd
> 	20% (8/40) were using obfs3, 80% obfs4
> 	50% (8/16) were using IPv4, 50% using IPv6
> In other words, the by-transport and by-version number of responses are
> assumed to be proportional to the corresponding number of unique IP
> addresses.

That's almost correct.  The only difference is that we're "unbinning"
reports, which are rounded up to the next multiple of 8, by subtracting
4 from reported numbers.  So, we assume that there are 92 "ok"
responses, 20 unique IPs from aa, etc.
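
To make the arithmetic concrete, here is a rough Python sketch using the
example numbers quoted above. It is not the actual metrics code: the
function names are hypothetical, and clamping at zero is an assumption.

```python
def unbin(reported, bin_size=8):
    """Estimate the count behind a value rounded up to a multiple of bin_size.

    Subtracts half a bin (4 for bin_size=8). Clamping at 0 is an
    assumption made for this sketch.
    """
    return max(0, reported - bin_size // 2)


def estimated_responses_by_key(ok_responses, counts):
    """Split unbinned "ok" responses in proportion to unbinned unique IPs."""
    total_responses = unbin(ok_responses)
    unbinned = {key: unbin(value) for key, value in counts.items()}
    total_ips = sum(unbinned.values())
    if total_ips == 0:
        return {key: 0.0 for key in counts}
    return {key: total_responses * value / total_ips
            for key, value in unbinned.items()}


# Numbers taken from the example descriptor lines quoted above.
by_country = estimated_responses_by_key(
    96, {"aa": 24, "bb": 24, "cc": 24, "dd": 24})
by_transport = estimated_responses_by_key(96, {"obfs3": 8, "obfs4": 32})
```

With these inputs each country still gets 25% of the 92 responses, but
obfs3 comes out at 12.5% (4/32) rather than 20% (8/40), because
subtracting 4 matters more for small reported values.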

> When you say you are "still" using dirreq-v3-reqs and either one of the
> bridge-ip* lines, is that because there now exists dirreq-v3-reqs, which
> breaks down the countries by number of directory requests, rather than
> number of unique IP addresses? (I.e., the subject of #18167.) If I'm not
> mistaken, there's no counterpart to dirreq-v3-reqs for transport and IP
> version, so even if dirreq-v3-reqs were used for countries, it would
> still be necessary to combine dirreq-v3-resp and bridge-ip-transports or
> bridge-ip-versions for transports and IP versions.

Uhm, when I said dirreq-v3-reqs above I meant dirreq-v3-resp.

And yes, there exist dirreq-v3-reqs lines with by-country numbers
reported by newer bridge versions (I don't recall which version).

But there are no similar lines for numbers by transport or version yet.
That's something we planned to do in the following ticket but haven't
implemented yet:

https://trac.torproject.org/projects/tor/ticket/8786

>> If you think the result will be interesting for Metrics website
>> visitors, would you want to start working on a similar patch?
> 
> My immediate goal is just to be able to compare the IPv4 and IPv6 usage
> of a single bridge, one of the default obfs4 bridges, the only one that
> has an IPv6 address: https://bugs.torproject.org/22429. I was originally
> going to ask the operator to use a separate fingerprint for IPv4 and
> IPv6, to make it easier, but then I thought that it would be possible to
> get bounds using the existing statistics. It looks like for this, all I
> have to do is look at the ratio of v4 and v6 in bridge-ip-versions.
> 
> And, it looks like Onionoo already does what I was thinking of:
> https://onionoo.torproject.org/clients?fingerprint=D9C805C955CB124D188C0D44F271E9BE57DE2109
> 	"versions":{"v4":0.9999944}

This is correct.

Note that we're considering removing this field in Onionoo, because
you're probably the first person who found (and publicly stated) that
this is useful information. ;)

https://trac.torproject.org/projects/tor/ticket/22033

But you can always look at bridge-ip-versions lines and learn this
information, as you did here, so you don't really need Onionoo for this.
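
A minimal sketch of reading such a line directly, in Python. The helper
names are hypothetical. Whether to subtract 4 before taking the ratio
(as described earlier in this message) is left as a parameter, since how
Onionoo computes its own fractions is not stated here.

```python
def parse_stats_line(line):
    """Parse a line like 'bridge-ip-versions v4=8,v6=8' into (keyword, dict)."""
    keyword, _, rest = line.strip().partition(" ")
    counts = {}
    for item in rest.split(","):
        if not item:
            continue  # tolerate an empty value list
        key, _, value = item.partition("=")
        counts[key] = int(value)
    return keyword, counts


def fractions(counts, bin_size=8):
    """Return each key's share of the total after subtracting half a bin.

    Pass bin_size=0 to use the reported numbers unchanged.
    """
    adjusted = {key: max(0, value - bin_size // 2)
                for key, value in counts.items()}
    total = sum(adjusted.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in adjusted.items()}


# The example line from earlier in this thread.
keyword, versions = parse_stats_line("bridge-ip-versions v4=8,v6=8")
shares = fractions(versions)
```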

Hope this helps!

All the best,
Karsten

