Proposal: Bring Back PathlenCoinWeight

Wed Apr 18 21:35:04 UTC 2007

Filename: 1xx-PathlenCoinWeight.txt
Title: Pathlen Coin Weight
Version:
Last-Modified:
Author: Mike Perry
Created:
Status: Open

Overview:

  The idea is that users should be able to choose a weight which
  probabilistically chooses their path lengths to be 2 or 3 hops. This
  weight will essentially be a biased coin that indicates an
  additional hop (beyond 2) with probability P. The user should be
  allowed to choose 0 for this weight to always get 2 hops and 1 to
  always get 3. 

  This value should be modifiable from the controller, and should be 
  available from Vidalia.

Motivation:

  The Tor network is slow and overloaded. Increasingly often I hear 
  stories about friends and friends of friends who are behind firewalls, 
  annoying censorware, or under surveillance that interferes with their 
  productivity and Internet usage, or chills their speech. These people 
  know about Tor, but they choose to put up with the censorship because 
  Tor is too slow to be usable for them. In fact, to download a fresh, 
  complete copy of levine-timing.pdf for the Anonymity Implications 
  section of this proposal over Tor took me 3 tries.

  There are many ways to improve the speed problem, and of course we 
  should and will implement as many as we can. Johannes's GSoC project 
  and my reputation system are longer term, higher-effort things that 
  will still provide benefit independent of this proposal.

  However, reducing the path length to 2 for those who do not need the
  (questionable) extra anonymity 3 hops provide not only improves
  their Tor experience but also reduces their load on the Tor network by
  33%, and can be done in less than 10 lines of code. That's not just
  Win-Win, it's Win-Win-Win.

  Furthermore, when blocking resistance measures insert an extra relay
  hop into the equation, 4 hops will certainly be completely unusable 
  for these users, especially since it will be considerably more
  difficult to balance the load across a dark relay net than balancing
  the load on Tor itself (which today is still not without its flaws).

Anonymity Implications:

  It has long been established that timing attacks against mixed
  networks are extremely effective, and that regardless of path
  length, if the adversary has compromised your first and last 
  hop of your path, you can assume they have compromised your 
  identity for that connection. 

  In [1], it is demonstrated that for all but the slowest, lossiest 
  networks, error rates for false positives and false negatives were 
  very near zero. Only for constant streams of traffic over slow and 
  (more importantly) extremely lossy network links did the error rate 
  hit 20%. For loss rates typical to the Internet, even the error rate 
  for slow nodes with constant traffic streams was 13%.

  When you take into account that most Tor streams are not constant,
  but probably much more like their "HomeIP" dataset, which consists
  mostly of web traffic that exists over finite intervals at specific
  times, error rates drop to fractions of 1%, even for the "worst"
  network nodes.

  Therefore, the user has little benefit from the extra hop, assuming
  the adversary does timing correlation on their nodes. The real
  protection is the probability of getting both the first and last hop, 
  and this is constant whether the client chooses 2 hops, 3 hops, or 42.

  Partitioning attacks form another concern. Since Tor uses telescoping
  to build circuits, it is possible to tell a user is constructing only
  two hop paths at the entry node. It is questionable if this data is
  actually worth anything though, especially if the majority of users
  have easy access to this option, and do actually choose their path
  lengths semi-randomly.

  Nick has postulated that exits may also be able to tell that you are
  using only 2 hops by the amount of time between sending their
  RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
  see from the OP. I doubt that they will be able to make much use
  of this timing pattern, since it will likely vary widely depending
  upon the type of node selected for that first hop, and the user's
  connection rate to that first hop. It is also questionable if this
  data is worth anything, especially if many users are using this
  option (and I imagine many will).

  Perhaps most seriously, two hop paths do allow malicious guards 
  to easily fail circuits if they do not extend to their colluding peers
  for the exit hop. Since guards can detect the number of hops in a
  path, they could always fail the 3 hop circuits and focus on
  selectively failing the two hop ones until a peer was chosen.

  I believe currently guards are rotated if circuits fail, which does 
  provide some protection, but this could be changed so that an entry 
  guard is completely abandoned after a certain number of extend or 
  general circuit failures, though perhaps this also could be gamed 
  to increase guard turnover. Such a game would be much more noticeable 
  than an individual guard failing circuits, though, since it would
  affect all clients, not just those who chose a particular guard.

Why not fix Pathlen=2?:

  The main reason I am not advocating that we always use 2 hops is that 
  in some situations, timing correlation evidence by itself may not be 
  considered as solid and convincing as an actual, uninterrupted, fully 
  traced path. Are these timing attacks as effective on a real network 
  as they are in simulation? Would an extralegal adversary or authoritarian 
  government even care? In the face of these situation-dependent unknowns, 
  it should be up to the user to decide if this is a concern for them or not.

Implementation:

  new_route_len() can be modified directly with a check of the 
  PathlenCoinWeight option (converted to percent) and a call to 
  crypto_rand_int(0,100) for the weighted coin.

  The Vidalia setting should probably be in the network status window
  as a slider, complete with tooltip, help documentation, and perhaps
  an "Are you Sure?" checkbox.

  The entry_guard_t structure could have a num_circ_failed member
  such that if it exceeds N circuit extend failure to a second hop, 
  it is removed from the entry list. N should be sufficiently high 
  to avoid churn from normal Tor circuit failure, and could possibly be
  represented as a ratio of failed to successful circuits through that
  guard.

Migration:

  Phase one: Re-enable config and modify new_route_len() to add an
  extra hop if coin comes up "heads".

  Phase two: Experiment with the proper ratio of circuit failures 
  used to expire garbage or malicious guards.

  Phase three: Make slider or entry box in Vidalia, along with help entry 
  that explains in layman's terms the risks involved.

[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf

-- 
Mike Perry
Mad Computer Scientist
fscked.org evil labs