[tor-bugs] #18798 [Metrics/CollecTor]: analysis of descriptor completeness
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue May 24 19:19:07 UTC 2016
#18798: analysis of descriptor completeness
-------------------------------+-----------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: task | Status: needs_information
Priority: Medium | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+-----------------------------------
Comment (by karsten):
Here are some ideas about why referenced descriptors might be missing
on your CollecTor instance:
1. My CollecTor instance does not download descriptors from
194.109.206.212, because in May 2014 that directory authority left
connections open without writing any bytes, and we're not handling
timeouts very well yet. I don't expect that to still be the case, but if
you're seeing log lines with that authority indicating problems, maybe
take that address out of `DownloadFromDirectoryAuthorities`.
1. I'm not setting `DownloadAllServerDescriptors` and
`DownloadAllExtraInfoDescriptors` in my CollecTor instance. These settings
might be the reason why your number of missing descriptors goes down once
per day. Maybe the logs tell you more, or maybe you'll need to add log
statements for whenever your instance downloads all descriptors. By the way, here's
what I noted down when I disabled those settings: "By downloading "all"
descriptors, we only learn the most recent descriptors for all known
servers, not all known descriptors. That’s not exactly what we’d expect.
There’s also a potential problem with the result: the authority’s own
descriptor ends with a double newline which might confuse metrics-lib;
unless we split up concatenated descriptors differently in metrics-db.
Found out both things on May 8, 2014 when looking more into #11648."
1. I'm also not setting `CompressRelayDescriptorDownloads`. Here's what
I noted down about why: "2014-04-29: changed to 0, because of
"java.io.EOFException: Unexpected end of ZLIB input stream"". I also
noted down this: "The reason for broken compressed downloads might have
been #11648, which should be fixed by August/September. The current
default in metrics-db for compressing downloads is 0. That's bad.
Consider fixing this once all directory authorities have upgraded. Have
they?"
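To make the first and third points more concrete, here is a rough sketch of the two failure modes: an authority that accepts a connection but never writes any bytes, and a compressed response that ends early. It is written in Python purely for illustration and is not CollecTor code. The function names `fetch` and `inflate_checked` are made up, and treating the compressed response as a plain zlib stream is an assumption based on the Java error message above.

```python
import urllib.request
import zlib


def inflate_checked(payload: bytes) -> bytes:
    """Inflate a zlib stream and fail loudly if it was cut short.

    Hypothetical helper: raises EOFError when the input ends before the
    zlib end-of-stream marker, which is roughly the situation behind
    "Unexpected end of ZLIB input stream".
    """
    inflater = zlib.decompressobj()
    data = inflater.decompress(payload) + inflater.flush()
    if not inflater.eof:
        raise EOFError("compressed response ended before end of zlib stream")
    return data


def fetch(url: str, timeout_seconds: float = 60.0) -> bytes:
    """Download a URL, giving up if the server stalls.

    Hypothetical helper: the timeout applies to each blocking socket
    operation, so a server that accepts the connection and then never
    writes any bytes causes an exception instead of blocking forever.
    """
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read()
```

Something like `inflate_checked(fetch(url))` would then either return the complete descriptors or raise, rather than silently handing a truncated or stalled download to the parser. Logging which authority raised would also help to tell whether a single address is still misbehaving.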
And here are some suggestions for finding out more about the missing
descriptors:
- For the missing server descriptors (referenced by votes), can you plot how
many of them are referenced by how many votes? I wouldn't worry so much about
missing descriptors being referenced from a single vote but more about
missing descriptors being referenced from (almost) all votes.
- Regarding the missing extra-info descriptors, were these published by
relays that only ever published a single server descriptor, or by relays that
have been around for a longer time? Again, I'd be more worried in the latter case.
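In case it helps, the two breakdowns suggested above could be computed along these lines. This is only a sketch in Python, not CollecTor or metrics-lib code. The function names and the input shapes are assumptions: a mapping from vote identifier to the server descriptor digests it references, a set of digests that are actually stored, and a per-relay count of known server descriptors.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def missing_by_vote_count(votes: Dict[str, Iterable[str]],
                          stored: Set[str]) -> Counter:
    """Return {number of referencing votes: number of missing descriptors}."""
    references = Counter()
    for digests in votes.values():
        for digest in set(digests):  # count each vote once per digest
            if digest not in stored:
                references[digest] += 1
    # Turn "digest -> votes referencing it" into a histogram of those counts.
    return Counter(references.values())


def split_by_relay_history(missing_relays: Iterable[str],
                           descriptors_per_relay: Dict[str, int]) -> Dict[str, int]:
    """Split relays with missing extra-info descriptors by how many
    server descriptors we know from them (one at most vs. several)."""
    result = {"single_descriptor": 0, "longer_running": 0}
    for fingerprint in missing_relays:
        if descriptors_per_relay.get(fingerprint, 0) <= 1:
            result["single_descriptor"] += 1
        else:
            result["longer_running"] += 1
    return result


# Made-up example data: three votes, one stored descriptor.
example_votes = {"vote-a": ["d1", "d2"], "vote-b": ["d1", "d3"], "vote-c": ["d1"]}
example_histogram = missing_by_vote_count(example_votes, stored={"d3"})
```

With the made-up data above, `d1` is missing and referenced by all three votes, while `d2` is missing and referenced by only one, so the histogram has one entry at 3 and one at 1. A large bar near the total number of votes would be the worrying case.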
You'll notice that I'm mostly guessing here, because I don't know what
could be going wrong. But I think you're in a good position to spot a bug
or three here. Thanks for looking into this!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18798#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online