Proposal: Path Selection Improvements
Fallon
leftinspace at gmail.com
Sun Jul 6 01:46:34 UTC 2008
Hi everyone,
Here's a proposal for improving path selection through collecting build time
statistics. We're still collecting data, so it's quite short. Comments are
appreciated.
Fallon
Filename: path-selection-improvements.txt
Title: Improving Tor Path Selection
Version:
Last-Modified:
Author: Fallon Chen, Mike Perry
Created: 5-Jul-2008
Status: Draft
Overview
The performance of paths selected can be improved by adjusting the
CircuitBuildTimeout and the number of guards. This proposal describes
a method of tracking buildtime statistics, and using those statistics to
adjust the
CircuitBuildTimeout and the number of guards.
Motivation
Tor's performance can be improved by excluding those circuits that
have long buildtimes (and by extension, high latency). For those Tor users
who require better performance and have lower requirements for anonymity,
this would be a very useful option to have.
Implementation
Learning the CircuitBuildTimeout
Based on studies of build times, we found that the distribution of
circuit buildtimes appears to be a Pareto distribution. The number of
circuits to observe (ncircuits_to_observe) before changing the
CircuitBuildTimeout will be tunable. From our preliminary measurements,
it is likely that ncircuits_to_observe will be somewhere on the order of
1000. The values can be represented compactly in Tor in milliseconds as
a
circular array of 16 bit integers. More compact long-term storage
representations can be implemented by simply storing a histogram with
50 millisecond buckets when writing out the statistics to disk.
Calculating the preferred CircuitBuildTimeout
Circuits that have longer buildtimes than some x% of the estimated CDF
of
the Pareto distribution will be excluded. x will be tunable as well.
Circuit timeouts
In the event of a timeout, backoff values should include the 100-x% of
expected CDF of timeouts.
Also, in the event of network failure, the observation mechanism should
stop collecting timeout data.
Other notes
Since this follows a Pareto distribution, large reductions on the
timeout
can be achieved without cutting off a great number of the total paths.
However, hard statistics on which cutoff percentage gives optimal
performance have not yet been gathered.
Issues
Impact on anonymity
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20080705/ab3d5ba3/attachment.htm>
More information about the tor-dev
mailing list