[metrics-bugs] #23421 [Metrics/CollecTor]: Use persistence functionality throughout all modules

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Nov 22 11:33:00 UTC 2017


#23421: Use persistence functionality throughout all modules
-------------------------------+-----------------------------------
 Reporter:  iwakeh             |          Owner:  metrics-team
     Type:  enhancement        |         Status:  needs_information
 Priority:  High               |      Milestone:
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:  metrics-2017       |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+-----------------------------------

Comment (by karsten):

 This discussion should probably take place elsewhere. Some quick thoughts
 anyway:
  - I'd want us to integrate synchronization into the relaydescs module as
 closely as possible while keeping it general-purpose.
    - For example, I'd like to do it before downloading missing descriptors
 from directory authorities, because it might save us a few requests there.
    - Further, right now the sync run produces its own file in the
 `recent/` directory, but ideally there should only be a single such file
 per execution. That's mostly a cosmetic issue, though. The previous one is
 more serious.
  - Researchers or other non-Tor-Metrics users don't need synchronization;
 they need download functionality. They could simply use
 `DescriptorCollector`, which performs basic synchronization functionality
 like skipping files that exist remotely and deleting files that don't
 exist remotely anymore.
  - Even if we don't have to add much code now, we might be keeping a lot
 of code that we don't need, and that produces maintenance effort. This
 doesn't make it super urgent to get rid of that code. But it's worth
 thinking about removing that code in the future if we don't need it.
  - If we need to think about what to do with synchronization whenever we
 add or change a module, that creates overhead, too.
  - We should keep code if we believe that it either has a benefit now or
 will be useful in the future. Let's think about possible uses of
 synchronization outside of the relaydescs module (where it has turned out
 to be tremendously useful!). I don't see how we're using it elsewhere. I
 could be overlooking it. But if it turns out to be very unlikely that
 we're deploying synchronization for another module in the next 24 months,
 let's make plans to remove it and integrate it further into the relaydescs
 module.
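
 To make the "basic synchronization functionality" mentioned above concrete,
 here is a minimal sketch of the mirroring decision. It is not the
 `DescriptorCollector` implementation: the function name `plan_sync` and the
 path-to-timestamp dictionaries are hypothetical, and it assumes that
 "skipping" means not re-fetching a file that already exists locally with the
 same last-modified time.

```python
def plan_sync(remote, local):
    """Decide what a simple mirror run would do.

    remote, local: dicts mapping a relative file path to its
    last-modified timestamp (hypothetical representation).
    Returns (paths to download, paths to delete), both sorted.
    """
    # Fetch files that are missing locally or whose timestamp differs;
    # files already present with the same timestamp are skipped.
    to_download = sorted(
        path for path, modified in remote.items()
        if local.get(path) != modified
    )
    # Delete local files that no longer exist remotely.
    to_delete = sorted(path for path in local if path not in remote)
    return to_download, to_delete
```

 A caller would build both dictionaries from a remote index and a local
 directory listing, then act on the two returned lists.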

 However, I think we agree on the actual topic which is persistence. So,
 let's move forward with that and table the (somewhat related)
 synchronization discussion.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23421#comment:12>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

