[tor-bugs] #26002 [Metrics/Statistics]: Simplify graph with number of bytes spent on answering directory requests

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed May 2 10:10:35 UTC 2018


#26002: Simplify graph with number of bytes spent on answering directory requests
------------------------------------+--------------------------
     Reporter:  karsten             |      Owner:  metrics-team
         Type:  enhancement         |     Status:  new
     Priority:  Medium              |  Milestone:
    Component:  Metrics/Statistics  |    Version:
     Severity:  Normal              |   Keywords:
Actual Points:                      |  Parent ID:
       Points:                      |   Reviewer:
      Sponsor:                      |
------------------------------------+--------------------------
 While looking at the code that aggregates data for our
 [https://metrics.torproject.org/dirbytes.html Number of bytes spent on
 answering directory requests graph] I found two things:

  1. Contrary to the graph description, we're only including directory
 traffic from directory mirrors, not from directory authorities.

  2. As the graph description says, we're extrapolating whatever statistics
 we get to an estimated network total; however, that formula is really
 complex and not very intuitive.

 I suggest we simplify this graph by a) showing traffic from all
 directories (including mirrors ''and'' authorities) and b) taking out the
 extrapolation step.
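
 A minimal sketch of what the simplified aggregation could look like,
 assuming each reported statistic has already been reduced to a (date,
 is_authority, bytes) tuple. The function name, the tuple layout, and the
 sample values are hypothetical and only serve as illustration; this is
 not the existing metrics code.

```python
from collections import defaultdict


def total_dir_bytes_per_day(records):
    """Sum reported directory bytes per day over all directories.

    `records` is an iterable of (date, is_authority, dir_bytes) tuples.
    This layout is an assumption made for the sketch.
    """
    totals = defaultdict(int)
    for date, is_authority, dir_bytes in records:
        # a) No filter on is_authority: mirrors and authorities both count.
        # b) No extrapolation factor: reported bytes are added as-is.
        totals[date] += dir_bytes
    return dict(totals)


# Made-up sample values, not real measurements.
sample = [
    ("2018-04-01", False, 1000),  # directory mirror
    ("2018-04-01", True, 250),    # directory authority
    ("2018-04-02", False, 700),   # directory mirror
]
totals = total_dir_bytes_per_day(sample)
```

 With this approach the published number is a plain sum of what
 directories reported, which anyone with the same input data can recompute.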

 For what it's worth, that extrapolation step was useful in the beginning
 when only a few relays reported these statistics. But that was many years
 ago. By now, all running tor versions support these statistics, and they
 have always been turned on by default.

 I'm attaching a graph that compares the current approach to the approach
 suggested here. It only covers April 2018, because we don't have older
 data in the database anymore. Covering earlier months would require
 re-importing the archives locally, which I'd be happy to do.

 The main advantage of making this change is that our data will be easier
 to specify and reproduce for others.

 Setting to needs_review to get input on whether we should do this. If
 there's a reason not to, I wouldn't start reprocessing the archives, but
 currently I don't see one.
 Thoughts?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/26002>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

