[tor-bugs] #6232 [Analysis]: Make entropy-over-time graph

Tue Jul 17 13:30:30 UTC 2012

#6232: Make entropy-over-time graph
-------------------------+--------------------------------------------------
 Reporter:  arma         |          Owner:                
     Type:  enhancement  |         Status:  needs_revision
 Priority:  normal       |      Milestone:                
Component:  Analysis     |        Version:                
 Keywords:               |         Parent:                
   Points:               |   Actualpoints:                
-------------------------+--------------------------------------------------
Changes (by karsten):

  * status:  needs_review => needs_revision

Comment:

 A few comments after re-reading the whole ticket:

  - I merged George's patch (thanks!) that outputs degree of anonymity
 instead of plain entropy.  I'll run it shortly and will post the resulting
 graph once I have it.

  - Should we add a second graph that plots entropy and maximum entropy as
 two lines, as Rob suggested above?  That graph should probably consist of
 2 x 2 subgraphs for the four cases we distinguish.  It should be trivial
 to extend the script to output entropy and max_entropy along with their
 quotient.  I'll look into that and write the graphing code in a bit.

  - I wonder if entropies based on subsets of Exit- and Guard-flagged
 relays are correct.  I spent yesterday afternoon trying to learn how path
 selection really works
 ([https://trac.torproject.org/projects/tor/ticket/5755#comment:11 #5755]).
 I think we'll have to take bandwidth weights as reported in the footer
 section of a consensus into account, too.  Those bandwidth weights
 determine, for example, how the consensus weight of a relay with only the
 Exit flag is scaled for the exit position compared to a relay with both
 the Exit ''and'' Guard flags.  In a consensus published yesterday, the
 former was weighted with Wee=1.0, whereas the latter was weighted with
 Wed=0.4272.  Similarly, bandwidth weights for the guard position were
 Wgd=0.2864 and Wgg=0.6446, so they differ considerably.  If we only look
 at the Exit ''or'' Guard flag of a relay, we might be quite off.  But
 before we change anything here, I want to hear back from Mike or Roger
 whether my understanding of path selection is correct.

  - The GeoIP database is part of the sources in metrics-tasks.git, right?
 Can we change that and have users provide their own geoip file?  I'm
 worried that the current "a1" madness influences the results, and I'd like
 to swap the current database with the one from February, which didn't have
 "a1" relays all over.

  - Can we add AS-based entropy values, too?  There's an AS database from
 Maxmind that we could use here.  Again, users could provide that database
 file, so there's no need to commit it to the Git repo.

  - In the longer term, do we want to include family diversity?  That
 metric would consider all relays in the same relay family as one entity,
 similar to how we consider all relays in the same country as one entity in
 the country diversity metric.  I admit that it's hard to extract families
 using the current code, because we'd have to parse server descriptors for
 that, too.  I'm also not certain that the results will be meaningful.  So,
 longer-term.

  - A shorter-term goal could be to compute bandwidth diversity based on
 the relays' advertised bandwidths, not based on their consensus weights.
 Relays report their advertised bandwidth in their server descriptor; it's
 the minimum of bandwidth rate, burst, and observed bandwidth.  We'll want
 to compute bandwidth diversity for all relays and for the exit/guard
 subsets, just as we do for location diversity.  This is what Roger was
 referring to in the second-to-last paragraph of the ticket description.
 Again, I admit that it's non-trivial to extract advertised bandwidths,
 because we'll have to
 parse server descriptors.  But it's easier to compute than relay families.
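
 A rough sketch of the entropy, max_entropy, and quotient output from the
 second item, combined with the position weighting from the third item.
 This is not the script in metrics-tasks.git: the function names and the
 relay dictionaries are made up for illustration, relays without the Exit
 flag are simply dropped for the exit position, and max_entropy is assumed
 to be log2 of the number of relays with non-zero weight.  Only Wee and Wed
 are the values from the consensus mentioned above.

```python
import math


def entropy(weights):
    """Shannon entropy in bits of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    h = 0.0
    for w in weights:
        if w > 0:
            p = w / total
            h -= p * math.log2(p)
    return h


def degree_of_anonymity(weights):
    """Return (entropy, max_entropy, quotient).

    Assumption: max_entropy is log2 of the number of relays that have a
    non-zero weight, i.e. the entropy of a uniform choice among them.
    """
    n = sum(1 for w in weights if w > 0)
    h = entropy(weights)
    h_max = math.log2(n) if n > 1 else 0.0
    quotient = h / h_max if h_max > 0 else 0.0
    return h, h_max, quotient


def exit_position_weights(relays, wee, wed):
    """Scale consensus weights for the exit position.

    Exit-only relays are scaled by wee, Exit+Guard relays by wed.
    Assumption: relays without the Exit flag are left out entirely.
    """
    scaled = []
    for relay in relays:
        flags = relay["flags"]
        if "Exit" not in flags:
            continue
        factor = wed if "Guard" in flags else wee
        scaled.append(relay["weight"] * factor)
    return scaled


# Hypothetical relays; weights and flags are invented for the example.
relays = [
    {"weight": 100, "flags": {"Exit"}},
    {"weight": 100, "flags": {"Exit", "Guard"}},
    {"weight": 100, "flags": {"Guard"}},
]

scaled = exit_position_weights(relays, wee=1.0, wed=0.4272)
h, h_max, quotient = degree_of_anonymity(scaled)
print("entropy=%.4f max_entropy=%.4f quotient=%.4f" % (h, h_max, quotient))
```

 The guard position would work the same way with Wgg and Wgd in place of
 Wee and Wed, assuming my reading of path selection holds up.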

 gsathya, are you up for more coding fun?  Didn't you worry that this task
 might be too trivial for a thesis?  Hah! :)
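
 To make the advertised-bandwidth and "one entity per group" ideas above a
 bit more concrete, here is a minimal sketch.  It assumes the three
 bandwidth values have already been extracted from a server descriptor, and
 it uses hypothetical dictionary keys ("country", "bw"); the same grouping
 would apply to an AS number or a family identifier.

```python
import math
from collections import defaultdict


def advertised_bandwidth(rate, burst, observed):
    """Advertised bandwidth: minimum of rate, burst, and observed."""
    return min(rate, burst, observed)


def group_entropy(relays, group_key, weight_key):
    """Entropy in bits after summing weights per group.

    All relays that share the same value for group_key (country, AS,
    family, ...) are treated as a single entity.
    """
    totals = defaultdict(float)
    for relay in relays:
        totals[relay[group_key]] += relay[weight_key]
    grand_total = sum(totals.values())
    if grand_total <= 0:
        return 0.0
    h = 0.0
    for value in totals.values():
        if value > 0:
            p = value / grand_total
            h -= p * math.log2(p)
    return h


# Hypothetical relays; the numbers are invented for the example.
sample = [
    {"country": "de", "bw": advertised_bandwidth(50, 80, 60)},
    {"country": "de", "bw": advertised_bandwidth(70, 50, 90)},
    {"country": "us", "bw": advertised_bandwidth(100, 200, 150)},
]
print(group_entropy(sample, "country", "bw"))
```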

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/6232#comment:32>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online