Proposal: Speed up Tor
Michael_google gmail_Gersten
keybounce at gmail.com
Fri Jun 1 17:30:32 UTC 2007
Proposal for Tor.
This proposal has the following goals:
1. An easy way to toggle between "At least speed X" (for
single-threaded web browsing) and "Any speed, many connections" (for
downloads).
2. A way to keep nodes from being CPU-starved by encryption
processing (high-bandwidth nodes).
3. A way to keep nodes from being bandwidth starved (the main limit on
middle-speed nodes).
Motivation: Speed up Tor.
Design:
1. Add a control message for switching torrc files. Add support in
Vidalia to toggle between them.
Flaw: Ideally the determination (high speed vs. high numbers) would
be made based on who is making the request. For example, while the
downloader is fetching 10 slow parts at once, I still want to browse.
2. If the protocol for extending a circuit to a new node does not
permit the new node to reject the connection, then add this ability.
Otherwise, start using it. Nodes can prevent being CPU starved by
refusing new connections when they are "full".
3. When a circuit is being built, estimates of bandwidth needed are
transmitted as well. Similar to #2, nodes will reject new connections
if the bandwidth isn't there.
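A minimal sketch of the CPU check in design item 2, using the load threshold of .8 suggested in the specification below. This is Python for illustration only, not Tor code; the names LOAD_LIMIT and accepts_new_circuit are hypothetical.

```python
import os

# Threshold taken from the proposal ("load is .8 or higher"); the
# constant name is a hypothetical setting, not an existing Tor option.
LOAD_LIMIT = 0.8


def accepts_new_circuit(load_limit=LOAD_LIMIT, loadavg=None):
    """Return True if the node should accept another circuit."""
    if loadavg is None:
        # os.getloadavg() returns the 1-, 5- and 15-minute load
        # averages (Unix only); use the 1-minute figure, as "uptime" shows.
        loadavg = os.getloadavg()[0]
    # Refuse once the load reaches the limit.
    return loadavg < load_limit
```

The load average is not normalized by the number of CPUs, so on a multi-core machine a node might reasonably divide it by os.cpu_count() before comparing.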
Security implications: Absolutely no idea. How does having large
numbers of connections affect Tor's traceability?
Specification (incomplete):
1. New control message would either take a filename of the new torrc,
or the contents of the new torrc. I do not know Tor's inner workings,
and cannot tell which is "better".
2. Nodes can measure the CPU cost per circuit, and tell how many they
can afford CPU wise. There may be a configuration parameter to
indicate how much CPU it can use; maybe the output of "uptime" is read
to see what the CPU levels are (and Tor stops accepting when the load
is .8 or higher.)
3. The simplest way to handle this is to put numbers in the config
file, and pass them along. For example, if I'm in "single threaded
browsing", I'll have numbers specifying a max speed of 150 KB/s, a
burst speed of 100 KB/s, and an average speed of 10 KB/s. If I'm in
"multi-threaded download", I'll specify a max speed of 25 KB/s, an
average speed of 15 KB/s, and a burst speed of 18 KB/s.
What to do with these numbers? Well, if the sum of the averages of all
incoming circuits exceeds my actual bandwidth, I say "No" when someone
tries to connect. Similarly, if I cannot support the burst speed, I
say "No", to avoid slowing them down (this becomes the minimum speed
needed). Finally, I know that the worst case for this circuit is the
max speed, and I can do ... ? with it.
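The admission rule above could be sketched roughly as follows. This is an illustration in Python, not Tor code. The class and field names are hypothetical, and the reading of "cannot support the burst speed" as "burst exceeds the capacity left after the committed averages" is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CircuitRequest:
    """Bandwidth estimates sent with a circuit request (all in KB/s)."""
    max_kb_s: float    # worst case; the proposal leaves its use open
    burst_kb_s: float  # treated as the minimum speed the client needs
    avg_kb_s: float    # long-run average demand


@dataclass
class Node:
    capacity_kb_s: float
    circuits: list = field(default_factory=list)

    def try_accept(self, req: CircuitRequest) -> bool:
        # Bandwidth already promised, measured by the sum of averages.
        committed = sum(c.avg_kb_s for c in self.circuits)
        headroom = self.capacity_kb_s - committed
        if req.avg_kb_s > headroom:
            return False  # sum of averages would exceed capacity
        if req.burst_kb_s > headroom:
            return False  # assumption: burst must fit in what is left
        self.circuits.append(req)
        return True


# The two example profiles from the text.
BROWSING = CircuitRequest(max_kb_s=150, burst_kb_s=100, avg_kb_s=10)
DOWNLOAD = CircuitRequest(max_kb_s=25, burst_kb_s=18, avg_kb_s=15)
```

With a node capacity of 120 KB/s, one browsing circuit and one download circuit are accepted, and a second browsing circuit is refused because its 100 KB/s burst no longer fits in the remaining 95 KB/s.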
The idea here: On my DSL, I cannot get more than 150 KB/s. While I
want to get that full speed, I'll be happy to get 100 KB/s. On
average, while I'm surfing, I'm not fetching pages all the time --
hence, an average speed of 10 KB/s representing fetch, read, fetch,
read, fetch, read.
Now, it's not perfect. I'm thinking that "Busy percentage" might make
more sense -- 10% busy for web surfing, 95% busy for downloaders. This
would also help CPU overhead calculations. It also helps tell when to
say "This circuit has been idle for a while. It isn't active at all,
and while it is inactive, we will regard it as having a speed demand
of 0". This will prevent a node from filling up with "idle"
connections and having its capacity wasted.
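One possible reading of the "busy percentage" idea, sketched in Python. The 60-second idle timeout, the class name, and the choice to compute demand as (speed while active) times (busy fraction) are all assumptions made for illustration; the proposal does not fix any of them.

```python
import time

# Hypothetical setting: how long a circuit may be silent before its
# demand is counted as zero.
IDLE_TIMEOUT_S = 60.0


class TrackedCircuit:
    def __init__(self, active_kb_s, busy_fraction, now=None):
        self.active_kb_s = active_kb_s      # speed while transferring
        self.busy_fraction = busy_fraction  # e.g. 0.10 surfing, 0.95 download
        self.last_active = time.monotonic() if now is None else now

    def touch(self, now=None):
        """Record that traffic was just seen on this circuit."""
        self.last_active = time.monotonic() if now is None else now

    def demand(self, now=None):
        """Bandwidth (KB/s) this circuit counts for in admission decisions."""
        now = time.monotonic() if now is None else now
        if now - self.last_active > IDLE_TIMEOUT_S:
            return 0.0  # idle circuits reserve nothing
        return self.active_kb_s * self.busy_fraction
```

Under this reading, a surfing circuit that runs at 100 KB/s while active and is busy 10% of the time counts for about 10 KB/s, which matches the average in the browsing example, and counts for nothing once it has gone idle.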
I'm also realizing that my concept of "burst" isn't quite right, and
I'm hoping that someone else has a better idea. For downloading,
"burst" means that while the average demand for a 10 part download is
15 KB/s per circuit, there will be variance, and a node might see a
higher burst. Yet I will be happy even if a node can only give me 10
KB/s, because I have 9 other circuits that will each get slightly more
speed. So I think we need "This is my minimum acceptable speed, reject
this circuit if you can't give me this much", "This is my average",
and "This is my worst case / initial burst" (a lot of circuits will be
busy at first, and idle afterwards), as well as "percentage of time I
expect the circuit to be used".
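The four values listed above could travel together as one small record. A sketch in Python, with hypothetical names; how such a record would actually be encoded in Tor's circuit-building messages is not addressed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandwidthHint:
    """Per-circuit estimates a client might send (speeds in KB/s)."""
    min_kb_s: float       # reject the circuit below this speed
    avg_kb_s: float       # expected average demand
    peak_kb_s: float      # worst case / initial burst
    busy_fraction: float  # share of time the circuit is in use, 0..1

    def __post_init__(self):
        if not (0 <= self.min_kb_s <= self.avg_kb_s <= self.peak_kb_s):
            raise ValueError("expected 0 <= min <= avg <= peak")
        if not (0.0 <= self.busy_fraction <= 1.0):
            raise ValueError("busy_fraction must be between 0 and 1")

    def acceptable(self, offered_kb_s: float) -> bool:
        """True if a node offering this speed meets the client's minimum."""
        return offered_kb_s >= self.min_kb_s


# One circuit of the 10-part download from the text: 15 KB/s average,
# still acceptable at 10 KB/s, 95% busy.
download_part = BandwidthHint(min_kb_s=10, avg_kb_s=15, peak_kb_s=25,
                              busy_fraction=0.95)
```

Here a node able to offer 10 KB/s would keep the circuit, while one offering 8 KB/s would reject it.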
Compatibility: The only change in how nodes talk to each other is in
circuit building. I am not familiar enough with the current system to
know how this will change things.