Bandwidth throttling (was Re: Padding)

Mon Jul 8 08:12:26 UTC 2002

On Thu, Jul 04, 2002 at 04:24:53AM -0400, Roger Dingledine wrote:
> Bandwidth throttling and flow control are closely tied to traffic shaping.
> There are several issues to solve here. We need to make sure a node:
>  
> 1) doesn't send too many data cells per timeslice
> 2) doesn't send too many bytes per timeslice
> 3) doesn't receive too many bytes per timeslice

I've talked about 1, but not 2 and 3. Here goes.

(I start out assuming constant link padding and equal bandwidth between
all nodes. I remove these assumptions as I go.)

We want to limit nodes to sending a fixed rate of traffic at any given
time. In order to get good mixing, we send one cell to every other node,
every time click. So if we want to be sending 10 kilobytes per second
per connection, then that's 80 cells per second, or a cell every 12.5
milliseconds (round to 13). So every 13 milliseconds, you add another
128 bytes to the outbuf of each connection (either a non-padding cell
if you've got one, else a padding cell), and try to flush it.

If all senders obey the sending rate, then receivers won't need to limit
their receiving rate. But we must handle misbehaving senders. Nodes
use a standard token bucket to limit received cells: each node keeps a
"bucket" of tokens for each connection. Every second it adds 10k tokens
to the bucket (but not to exceed some max, e.g. 10 seconds worth), and
for each byte that it receives, it removes a token from the bucket. When
the bucket is empty, it stops reading.

So far so good. Now because "whether we send a packet" is completely
separated from "when we receive data cells", we don't have to send a
cell to every node each time. As long as we send cells out predictably,
we can send to each connection independently. [Somebody please tell me
that we're ok with the anonymity implications here. I think in order
for an adversary to take advantage of this, he needs to own lots of
nodes around us, in which case he's got better attacks already? But it's
critical that we investigate this question more.]

So each connection can have a different bandwidth set for it, and the
only complication this adds is that I have to keep track of timing for
each of them in the code. Now to add in the notion of traffic shaping.

When a connection is established, the two participants negotiate a
maximum bandwidth. This is the number of tokens the receiver adds to
his token bucket each second. The sender, on the other hand, generally
wants to send quite a bit less than this maximum.

Every five minutes, the sender decides how much bandwidth he wants to
use for the next 5 minute interval. He randomly determines a padding
chance $p$, between 10% and 24%. (This addresses Andrei's issues from
before. Whenever a node is going to send a data cell he checks $p$ and
possibly sends a padding cell instead.) If he's been using less than
33% of the channel for data cells, then he drops his bandwidth by one
class. If he's been using more than 66% of his bandwidth for data cells,
he increases his bandwidth by one class. Else he stays at the same place.

By "class", I mean a change by 10KB/s if under 100KB/s, else a change
by 10%. The bandwidth can't go less than 10, and if changing would imply
going above the max then it doesn't change.

So that's the idea. There are still plenty of details to be worked out
(and the numbers really did come out of nowhere).

Please break or clarify this.

Thanks,
--Roger