[tor-dev] Guard node security: ways forward (An update from the dev meeting)
George Kadianakis
desnacked at riseup.net
Mon Feb 24 19:10:44 UTC 2014
George Kadianakis <desnacked at riseup.net> writes:
> A main theme in the recent Tor development meeting was guard node
> security, as discussed in Roger's blog post and in the paper by Tariq et al. [0].
>
> Over the course of the meeting we discussed various guard-related
> subjects. Here are some of them:
>
> a) Reducing the number of guards to 1 or 2 (#9273).
>
> b) Increasing the guard rotation period (to 9 months or so) (#8240).
>
> <snip>
>
> i) If we restrict the number of guards to 1, what happens to the
> unlucky users that pick a slow guard? What's the probability of
> being an unlucky user? Should we bump up the bandwidth threshold
> for being a guard node? How does that change the diversity of our
> guard selection process?
>
> <snip>
>
> To move forward, we decided that proposals should be written for (a)
> and (b). We also decided that we should first evaluate whether (a)
> and (b) are good ideas at all, especially with regard to (i).
>
I started working on evaluating whether reducing the number of guards
to a single guard will result in horrible performance for users, and
whether increasing the bandwidth threshold required for being a guard
node will result in a less diverse guard selection process.
You can find the script here: https://git.gitorious.org/guards/guards.git
https://gitorious.org/guards/guards
This is also a thread to collect feature requests about what we need to
find out before going ahead and implementing our ideas.
For now, I wrote an analysis.py script, which reads a user-supplied
consensus, calculates the entropy of the guard selection process (as
we did in #6232), removes the slow guards from the consensus (using a
user-supplied threshold value), redoes the bandwidth weights, and
recalculates the diversity of the guard selection process. It then
prints the two entropy values. This is supposed to give us an idea of
how much diversity we lose by pruning the slow guard nodes. For more
info, see the analysis() function.
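To give a rough idea of the computation (this is just an illustrative
sketch, not the actual analysis() from the repo; it assumes we already
have the consensus bandwidth values of the relays with the Guard flag,
and it ignores the Wgg-style bandwidth-weight parameters that real path
selection applies):
"""
import math

def guard_entropy(bandwidths):
    # Shannon entropy (in bits) of bandwidth-weighted guard selection:
    # each guard is picked with probability proportional to its bandwidth.
    total = float(sum(bandwidths))
    probs = [bw / total for bw in bandwidths]
    return -sum(p * math.log(p, 2) for p in probs if p > 0)

def compare_pruned(bandwidths, threshold_kbs):
    # Entropy before and after dropping guards below the threshold,
    # next to the maximum possible entropy (uniform selection).
    fast = [bw for bw in bandwidths if bw >= threshold_kbs]
    print("Original: %f (max %f, %d guards)" %
          (guard_entropy(bandwidths), math.log(len(bandwidths), 2),
           len(bandwidths)))
    print("Pruned:   %f (max %f, %d guards)" %
          (guard_entropy(fast), math.log(len(fast), 2), len(fast)))
"""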
Here is an execution of the script:
""""
$ python analysis.py consensus 400 # kB/s
WARNING:root:Entropy: 9.41949523094 (max entropy of 1999 guards: 10.9650627567).
WARNING:root:Before pruning: 1999 guards. After pruning: 1627 guards
WARNING:root:Entropy: 9.36671628404 (max entropy of 1627 guards: 10.6679985357).
WARNING:root:Original: 9.41949523094. Pruned: 9.36671628404
"""
In this case, the entropy of the original guard selection process was
9.4 bits, and after we pruned the slow guard nodes (below 400kB/s), we
got an entropy of about 9.37 bits. Is this good or bad? It's unclear,
since directly comparing two Shannon entropy values does not tell us
much (especially since they are on a logarithmic scale).
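One crude way to put the two numbers on a linear scale (just an idea,
not something the script does yet) is to exponentiate them: 2**H is the
size of a set of equally-likely guards with the same entropy, i.e. an
"effective number of guards":
"""
print(2 ** 9.41949523094)   # ~685 equally-likely guards before pruning
print(2 ** 9.36671628404)   # ~660 equally-likely guards after pruning at 400kB/s
"""
Seen that way, pruning at 400kB/s costs us roughly 25 "effective"
guards out of about 685.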
So here is a TODO list on how to make this analysis script more
useful:
* Instead of printing the entropy, visualize the probability
distribution of guard selection. A histogram of the probability
distribution, for example, might make the loss of diversity
clearer.
* Find a way to compare entropy values in a meaningful way. We can use
the maximum entropy of each consensus to see how far from max
entropy we are in each case. Or we can try to linearize the entropy
value somehow.
* Find other ways to measure diversity of guard node selection.
* Given another speed threshold, print the probability that the guard
selection will give us a guard below that threshold.
Even if we restrict guards to above 600kB/s, we still want to know
the chance of ending up with a guard below 1000kB/s. (There is a rough
sketch of this computation after this list.)
* Fix errors in the current script.
For example, I'm not sure if I'm using the correct bandwidth
values. I'm currently using the value in the 'w' lines of the consensus
('w Bandwidth=479'). I used to think that this is a unitless number,
but looking at dir-spec.txt it seems to be "kilobytes per
second". Is this the value I should be using to figure out which
guards to cut off?
* Implement ideas (d) and (e) and use historical data to evaluate
whether they are worth doing:
> d) The fact that authorities assign flags based on knowledge they
> acquired while they were up. They don't use historical data to
> assign flags, which means that an authority that has been up for 1
> month only knows 1 month's worth of information about each relay
> (#10968).
>
> e) We discussed introducing a weight parameter that makes guards that
> have been guards for a long time more likely to be used as
> guards.
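Regarding the threshold-probability item above, here is a rough sketch
(mine, not in the repo) of how that number could be computed straight
from the 'w Bandwidth=' values; whether those are the right bandwidth
numbers to use is exactly the open question in the list, and the sketch
again ignores the consensus bandwidth-weight parameters:
"""
def guard_bandwidths(consensus_path):
    # Collect the consensus bandwidth value of every relay carrying the
    # Guard flag, by scanning the 's' (flags) and 'w' (bandwidth) lines.
    bandwidths = []
    is_guard = False
    with open(consensus_path) as f:
        for line in f:
            if line.startswith("r "):
                is_guard = False
            elif line.startswith("s "):
                is_guard = "Guard" in line.split()
            elif is_guard and line.startswith("w Bandwidth="):
                bandwidths.append(int(line.split("=")[1].split()[0]))
    return bandwidths

def prob_guard_below(bandwidths, threshold_kbs):
    # Probability that bandwidth-weighted selection hands us a guard
    # slower than the given threshold.
    total = float(sum(bandwidths))
    return sum(bw for bw in bandwidths if bw < threshold_kbs) / total

# e.g. the chance of ending up below 1000 kB/s after pruning at 600 kB/s:
fast = [bw for bw in guard_bandwidths("consensus") if bw >= 600]
print(prob_guard_below(fast, 1000))
"""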
What else do we need to do?
PS: Here is another execution of the script with a higher cut-off
value (100MB/s):
"""
$ python analysis.py consensus 100000 # kB/s
WARNING:root:Entropy: 9.41949523094 (max entropy of 1999 guards: 10.9650627567).
WARNING:root:Before pruning: 1999 guards. After pruning: 17 guards
WARNING:root:Entropy: 3.76986046797 (max entropy of 17 guards: 4.08746284125).
WARNING:root:Original: 9.41949523094. Pruned: 3.76986046797
"""