[tor-dev] CollecTor data: mapping bridge-network-status to bridge-server-descriptor to bridge-extra-info
Karsten Loesing
karsten at torproject.org
Thu Jul 9 08:26:55 UTC 2015
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 09/07/15 05:39, Roger Dingledine wrote:
> On Wed, Jul 08, 2015 at 07:45:04PM -0700, David Fifield wrote:
>> I'm trying to use CollecTor data to find out how much bandwidth
>> is offered by different pluggable transports over time. I.e., I
>> want to be able to say something like, "On July 1, bridges with
>> obfs3 offered X MB/s, bridges with obfs4 offered Y MB/s," etc.
>
> Great!
>
>> I'm having trouble because sometimes, a router digest listed in
>> a bridge-network-status document is not found in the same
>> tarball.
> [snip]
>> Here's an example of where it goes wrong.
>> bridge-descriptors-2015-07/statuses/01/20150701-060138-4A0CCD2DDC7995083D73F5D667100C8A5831F16D
>
>>
> Yeah, I'm not surprised it goes wrong, since the descriptor from
> 0701-06:01 was likely published in the previous month.
>
>> However, I did find it in the previous month's tarball,
>
> Yep.
I think you picked the wrong example for something going wrong,
because that descriptor is actually included in the 2015-07 tarball.
But there are indeed cases when a status published in 2015-07
references a server descriptor that was published in 2015-06, and that
server descriptor would be contained in the 2015-06 tarball. Example
from the same status:
bridge-descriptors-2015-07/statuses/01/20150701-060138-4A0CCD2DDC7995083D73F5D667100C8A5831F16D
contains a line:
r Unnamed ABQ4ZADwj8WkfgApkhVTFalGweU GqjwHG/sFpFzY4sx9SWuzVTcHag 2015-06-30 12:59:03 10.135.171.161 443 0
which references the following server descriptor:
bridge-descriptors-2015-06/server-descriptors/1/a/1aa8f01c6fec169173638b31f525aecd54dc1da8
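To make the mapping concrete, here is a minimal Python sketch that
derives that path from the "r" line. The function name is made up for
illustration, and the path layout (publication month, then the first
two hex characters of the digest as subdirectories) is only inferred
from the one example above, so treat it as an assumption rather than a
description of how CollecTor is specified:

```python
import base64


def server_descriptor_path(r_line):
    """Guess the archive path of the server descriptor referenced by an "r" line.

    Hypothetical helper; the directory layout is inferred from one example.
    """
    fields = r_line.split()
    # Expected layout:
    # r <nickname> <identity> <digest> <date> <time> <address> <orport> <dirport>
    if len(fields) != 9 or fields[0] != "r":
        raise ValueError("not a complete r line: %r" % r_line)
    digest_b64 = fields[3]
    published_date = fields[4]
    # The status omits the trailing "=" padding of the base64 digest.
    padded = digest_b64 + "=" * (-len(digest_b64) % 4)
    digest_hex = base64.b64decode(padded).hex()
    # Assumption: the tarball is chosen by the descriptor's publication month.
    month = published_date[:7]
    return "bridge-descriptors-%s/server-descriptors/%s/%s/%s" % (
        month, digest_hex[0], digest_hex[1], digest_hex)


R_LINE = ("r Unnamed ABQ4ZADwj8WkfgApkhVTFalGweU GqjwHG/sFpFzY4sx9SWuzVTcHag "
          "2015-06-30 12:59:03 10.135.171.161 443 0")
print(server_descriptor_path(R_LINE))
```

The point is that the publication date in the "r" line, not the date of
the status, tells you which month's tarball to look in.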
>> It seems rare that the bridge-server-descriptor is missing. In
>> the 2015-07 tarball, it happened for 5891/477496 relays (1.2%).
> [snip]
>> How do you handle cases like this? I had a browse through the
>> Onionoo source code, but did not quickly understand it.
Onionoo typically reads descriptors from CollecTor's recent/ directory,
which holds descriptors published in the past 72 hours. It does not
read the tarballs in the archive/ directory, which are organized by
publication month.
>> Should I just always include the month preceding the earliest
>> month I want to process?
Yes, you should do that.
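As a rough sketch of what that means in practice, the following Python
snippet lists the tarballs to load for a range of months, including the
one preceding the first month. The function name is hypothetical, and
the "bridge-descriptors-YYYY-MM" naming is taken from the paths quoted
in this thread:

```python
from typing import List


def tarballs_to_load(first_month: str, last_month: str) -> List[str]:
    """Return tarball names for first_month..last_month plus the preceding month.

    Months are given as "YYYY-MM". Hypothetical helper for illustration.
    """
    first_year, first_mon = map(int, first_month.split("-"))
    last_year, last_mon = map(int, last_month.split("-"))
    # Count months linearly so that year boundaries need no special case.
    first_index = first_year * 12 + (first_mon - 1)
    last_index = last_year * 12 + (last_mon - 1)
    if last_index < first_index:
        raise ValueError("last_month is earlier than first_month")
    # Start one month early to catch descriptors published before the
    # first status that references them.
    return ["bridge-descriptors-%04d-%02d" % (i // 12, i % 12 + 1)
            for i in range(first_index - 1, last_index + 1)]


print(tarballs_to_load("2015-07", "2015-07"))
```

You would read the server descriptors from all listed tarballs, but
only the statuses from the months you actually want to process.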
> How many of the 5891 cases does that resolve?
If you happen to find cases which are not explained by that, please
let me know.
All the best,
Karsten
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: GPGTools - http://gpgtools.org
iQEcBAEBAgAGBQJVnjBPAAoJEJD5dJfVqbCrfjYH/1kYG9hl10sekKpfhV7y3nAq
wjm/hhyz7bqz9uPJmXs9d8+rkgJBIhUGC+LWqdmmgU8VNRb4NpCq7vBO6MIRJQQG
a7C3XNYRw10+Bs+jfBiE5D6z4i2rLXGDqaFkmKCEbrh6To5pqo2ziJkWUP6Y/8gH
EHjsEINFB4doV2EAccAAAjN6L1cLQPLBEVVAPtN7Pm78hcNuZ9D+n8TA+XWfmOvV
JG26kerEMkA2XPj3nbPvBLTYM5AMvMr/lDQpAuaSZYHb0E8DiLcVlUcaX4Y/IpY8
SqwLmheZdrFItxCH3Fd8c3hxiZ/Qs6iVZ6EPFRuqbBSOu7VLvyo7N4aXrk2bt6c=
=OKle
-----END PGP SIGNATURE-----