[metrics-bugs] #23421 [Metrics/CollecTor]: Use persistence functionality throughout all modules

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Nov 22 10:20:48 UTC 2017


#23421: Use persistence functionality throughout all modules
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  metrics-team
     Type:  enhancement        |         Status:  needs_information
 Priority:  High               |      Milestone:
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  metrics-2017       |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------

Comment (by karsten):

 Replying to [comment:9 iwakeh]:
 > The thought that invalid descriptors are mainly due to CollecTor's
 parsing mechanism not recognizing them as valid is a good point in favor
 of storing and syncing invalid descriptors.
 > There might be invalid descriptors - mangled or not complying with the
 spec - but even these will be useful for analysis and troubleshooting.
 > As we only sync between highly trusted instances the possibility of
 maliciously malformed descriptors can be ruled out (well, if that happens
 there is another bigger problem to deal with).
 > So, given that syncing only takes place between trusted instances and
 data loss is the main evil to prevent, the sync&store-all approach is fine:
 > Only during import of sensitive data are descriptors that cannot be
 sanitized skipped; other than that, all descriptors are stored.

 Makes sense to me.

 > Possible next steps (if we agree on the above):
 > 1) Make the webstats module use the above approach from the beginning; if
 it seems easier, also immediately change the overall sync process.
 > 2) Unless the change was made for all modules in step one, make the
 entire sync process keep all descriptors.
 > 3) Change and adapt all other CollecTor modules accordingly using
 persistence classes throughout.

 How about we don't enable syncing for the new webstats module at all?
 Let's face it, it's not a use case we're planning to support, so why
 should we write or keep the necessary code to do it?

 And going even one step further (out of scope for this ticket), how about
 we disable syncing for all other modules ''except'' for the relaydescs
 module where we turn it into yet another data source like downloading from
 the authorities or reading from cached descriptors files? We could still
 keep the code in a form that we can re-use in other modules in the
 future, but only as long as that doesn't make the overall code more
 complex than it has to be. But: new ticket. Just writing this here to
 discuss the general direction.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23421#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online