[tor-bugs] #32265 [Metrics/Exit Scanner]: MS: Format an exit list from a previous exit list and exitmap output

Wed Nov 20 14:49:24 UTC 2019

#32265: MS: Format an exit list from a previous exit list and exitmap output
----------------------------------+------------------------------
 Reporter:  irl                   |          Owner:  irl
     Type:  task                  |         Status:  needs_review
 Priority:  Medium                |      Milestone:
Component:  Metrics/Exit Scanner  |        Version:
 Severity:  Normal                |     Resolution:
 Keywords:                        |  Actual Points:
Parent ID:  #29654                |         Points:
 Reviewer:  karsten               |        Sponsor:
----------------------------------+------------------------------

Comment (by karsten):

 Replying to [comment:11 irl]:
 > Replying to [comment:10 karsten]:
 > > Replying to [comment:9 irl]:
 > > > Replying to [comment:8 karsten]:
 > > > > Actually, I think it's harmful to download exit lists from
 CollecTor and merge them with the scanner's own measurements. We should
 instead merge new scan results with previous local results. Downloading
 from CollecTor also adds yet another dependency that is not really
 needed. I'd say kill this code.
 > > >
 > > > Ok, it's gone.
 > >
 > > But it's still merging with the last-written local exit list?
 >
 > Yes, it keeps a few on disk like OnionPerf does, but only reads the last
 one.

 Great!
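
 For illustration, the "keep a few on disk, only read the last one"
 behaviour could look roughly like the sketch below. This is not the
 scanner's actual code: the directory layout, the timestamp-style file
 names, the helper names, and the value of KEEP are all assumptions made
 up for this example.

```python
from pathlib import Path

KEEP = 3  # assumed value; the ticket only says "a few"


def latest_exit_list(directory):
    """Return the newest exit list in `directory`, or None if there is none.

    Assumes file names sort chronologically, e.g. 2019-11-20-14-02-00.
    """
    lists = sorted(p for p in Path(directory).iterdir() if p.is_file())
    return lists[-1] if lists else None


def prune_exit_lists(directory, keep=KEEP):
    """Delete all but the `keep` newest exit lists; return the removed paths."""
    lists = sorted(p for p in Path(directory).iterdir() if p.is_file())
    removed = lists[:-keep] if keep > 0 else lists
    for path in removed:
        path.unlink()
    return removed
```

 The merge step would then read only the result of latest_exit_list() and
 call prune_exit_lists() after writing a new list.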

 > > I don't think we're using it (I'd have to check), nor do I know about
 others using it. But I'd be careful removing it or filling it with
 approximately correct data.
 > >
 > > Can we somehow access the consensus used for scanning and fill in
 these fields as part of the merge script? Maybe we can extend exitmap to
 dump that consensus to disk at the time of making a list of relays to
 scan?
 >
 > We can make that change, but I'd say it is not a priority until we're
 further along. We still have to fix up check and the DNS server and if all
 the time is spent on the scanner we still end up with a broken service.

 Agreed. Please put this on the list somewhere, so that we don't forget.

 > > One question though: If scanning takes 45 minutes right now, can we
 schedule scans in a way that they will still work when scanning takes 75
 minutes (larger network) or 15 minutes (fewer/faster exits)? For example,
 we should avoid concurrent runs, and if we do scans continuously, we
 should avoid too frequent scans.
 >
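 > {{{
 > import time
 >
 > MIN_INTERVAL = 40 * 60  # seconds between scan starts
 >
 > while True:
 >     start = time.monotonic()
 >     run_scanner()  # placeholder for the actual scan; blocks until done
 >     # Sleep out the rest of the interval instead of busy-waiting.
 >     time.sleep(max(0, MIN_INTERVAL - (time.monotonic() - start)))
 > }}}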

 Cool!
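
 To spell out why the loop quoted above covers both cases from my
 question, here is a small sketch that separates the waiting logic so it
 can be checked without running a scan. The names seconds_to_wait and
 run_forever are hypothetical, and run_scanner is passed in as a
 placeholder for whatever actually starts exitmap.

```python
import time

MIN_INTERVAL = 40 * 60  # seconds between scan starts, as in the quoted loop


def seconds_to_wait(start, finish, min_interval=MIN_INTERVAL):
    """Return how long to sleep so that scan starts are >= min_interval apart."""
    return max(0, min_interval - (finish - start))


def run_forever(run_scanner, min_interval=MIN_INTERVAL):
    """Run scans one after another in a single process, never concurrently."""
    while True:
        start = time.monotonic()
        run_scanner()  # blocks until the scan has finished
        time.sleep(seconds_to_wait(start, time.monotonic(), min_interval))
```

 With these numbers, a 15-minute scan is followed by a 25-minute pause,
 while a 45-minute or 75-minute scan is followed by the next scan right
 away. Runs cannot overlap because the next scan only starts after
 run_scanner() has returned.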

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32265#comment:12>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online