[tor-bugs] #18798 [Metrics/CollecTor]: analysis of descriptor completeness
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Apr 22 15:26:23 UTC 2016
#18798: analysis of descriptor completeness
-------------------------------+-----------------------------------
Reporter: iwakeh | Owner: iwakeh
Type: task | Status: needs_information
Priority: Medium | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: ctip | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+-----------------------------------
Comment (by karsten):
Thanks for the update! As I mentioned briefly yesterday, and to have it on
record here: the disk filled up on April 1st, and that is what caused those
problems.
I agree that there aren't many referenced descriptors missing. That's a
good outcome of this analysis, and it's a sign that the current logic to
fetch missing descriptors is working okay.
To be honest, I don't have a good explanation for the large number of missing
microdescriptors. The week-long pattern there is probably the result of
most microdescriptors being replaced after a week. I don't yet understand
why CollecTor wouldn't be able to fetch missing microdescriptors during
that week. We might be looking at a bug there, either in the collection
or in the logging. But I'd say put that under low priority for now,
because microdescriptors are the least important descriptors we're
collecting. In theory (!), we would be able to generate our own
microdescriptors from server descriptors, so missing some of them is not a
big deal.
But here's something that we're still missing in the analysis and that just
crossed my mind! We're only looking at missing ''referenced''
descriptors, but we're totally ignoring missing ''referencing''
descriptors, namely consensuses, microdescriptor consensuses (less
important), and votes. There are no log lines for those descriptors, but
there should be 1 new consensus, 1 new microdescriptor consensus, and 9
new votes every hour. The part that makes it so important to get these
descriptors is that they become unavailable after that hour. That's
different from referenced descriptors, which stay available for longer and
can still be fetched later; that is why we seem to be recovering well from
those disk-full problems. That might look different with consensuses,
microdescriptor consensuses, and votes.
In theory, finding out whether any of those are missing should be a matter
of fetching descriptor tarballs and counting files. Would you want to
make graphs for those counts and put them next to the missing
''referenced'' descriptors graphs?
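
The counting step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not CollecTor's actual code: it assumes that archived file names begin with a "YYYY-MM-DD-HH-MM-SS-" timestamp followed by "consensus", "consensus-microdesc", or "vote-...", and the function names (names_in_tarball, count_per_hour, find_gaps) are made up for this example. The expected counts per hour (1, 1, and 9) are the ones given in this comment.

```python
import re
import tarfile
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Expected new descriptors per hour, as stated in the comment above.
EXPECTED_PER_HOUR = {"consensus": 1, "microdesc-consensus": 1, "vote": 9}

# ASSUMPTION: file names look like "2016-04-01-00-00-00-consensus",
# "2016-04-01-00-00-00-consensus-microdesc", or "2016-04-01-00-00-00-vote-...".
NAME_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}-\d{2})-\d{2}-\d{2}-(consensus-microdesc|consensus|vote)\b"
)
KIND = {
    "consensus-microdesc": "microdesc-consensus",
    "consensus": "consensus",
    "vote": "vote",
}


def names_in_tarball(path):
    """Return the names of all regular files in a (possibly compressed) tarball."""
    with tarfile.open(path) as tar:
        return [member.name for member in tar.getmembers() if member.isfile()]


def count_per_hour(names):
    """Map each hour to a Counter of descriptor kinds found for that hour."""
    counts = defaultdict(Counter)
    for name in names:
        match = NAME_RE.match(name.rsplit("/", 1)[-1])
        if match:
            hour = datetime.strptime(match.group(1), "%Y-%m-%d-%H")
            counts[hour][KIND[match.group(2)]] += 1
    return counts


def find_gaps(names, start, end):
    """List (hour, kind, found, expected) for every shortfall in [start, end)."""
    counts = count_per_hour(names)
    gaps = []
    hour = start
    while hour < end:
        for kind, expected in EXPECTED_PER_HOUR.items():
            found = counts[hour][kind]
            if found < expected:
                gaps.append((hour, kind, found, expected))
        hour += timedelta(hours=1)
    return gaps
```

The per-hour counts from count_per_hour could be plotted directly, while find_gaps only lists the hours that fall short. Feeding it the combined output of names_in_tarball for the consensus, microdescriptor consensus, and vote tarballs of a month would cover all three descriptor types at once.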
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/18798#comment:7>