Refactored or_connection_t buffers into cell queues.
Nick Mathewson
nickm at freehaven.net
Tue Mar 27 22:43:20 UTC 2007
Hi, all. Roger asked me to brain-dump here to explain the refactoring
I just did on how Tor handles buffering on OR connections. (This went
in as svn revisions 9904 through 9907.)
Previously, we did all of our buffering on the OR connection
structures: We'd pull a cell from an or_connection_t (or package one
from the edge), {en|de}crypt it, use the relevant circuit to figure
out what or_connection_t it should go onto, and append it to that
or_connection_t's outbuf.
The patches I just checked in add another step: after a cell is
encrypted and the relevant circuit is found, the cell is queued in
a linked list at the circuit_t. When the or_connection_t's outbuf
falls below a certain size, we pull cells from the appropriate cell
queues until the outbuf is reasonably full again.
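The per-circuit queue can be as simple as a singly linked list with head
and tail pointers. Here's a minimal sketch of that idea; the type and
function names are illustrative stand-ins, not necessarily the real fields:

```c
#include <stddef.h>

#define CELL_WIRE_SIZE 512  /* assumed encoded size of a cell on the wire */

/* Toy versions of a packed cell and the per-circuit FIFO queue. */
typedef struct packed_cell_t {
  struct packed_cell_t *next;
  char body[CELL_WIRE_SIZE];
} packed_cell_t;

typedef struct cell_queue_t {
  packed_cell_t *head;
  packed_cell_t *tail;
  int n;
} cell_queue_t;

/* Append a cell at the tail of the queue. */
static void cell_queue_append(cell_queue_t *q, packed_cell_t *cell)
{
  cell->next = NULL;
  if (q->tail)
    q->tail->next = cell;
  else
    q->head = cell;
  q->tail = cell;
  ++q->n;
}

/* Pop the oldest cell from the head, or NULL if the queue is empty. */
static packed_cell_t *cell_queue_pop(cell_queue_t *q)
{
  packed_cell_t *cell = q->head;
  if (!cell)
    return NULL;
  q->head = cell->next;
  if (!q->head)
    q->tail = NULL;
  --q->n;
  return cell;
}
```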
There are a few more details here that I'll explain below. First,
I'll talk about why it's a good idea:
- The ring buffer implementation in buffer.c amortizes resizing costs
  by doubling buffers when they're full and halving them when
  they're 3/4-empty. Thus, many buffers spend most of their
  time largely empty: this means that we wind up malloc'ing a lot
  more memory than we need.
- There are a lot of features we want to add that require us to
assign different priority to different kinds of traffic. (For
example, some relay-incentive schemes need us to give priority to
traffic from other ORs.)
- We need to kill connections when their buffers use up too much
RAM. But when that happens, we kill every circuit on the
connection. It might be better to kill only the circuits that have
a lot of data queued.
- Tracking circuit queues separately from one another lets us fill
circuits in a fair fashion: We can start reading from exit
connections only at a rate that the circuits are able to flush
themselves, rather than at the maximum rate possible.
Implementation issues:
1) I've also done a patch to stop reading on edge connections when the
circuit gets too full, and start reading on the edge connections
again when the circuit gets emptier.
[Maybe we should aim to read from edge connections at constant
rates rather than stopping and starting like this. That will be
harder, but will probably give better TCP performance.]
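Using the 256-cell and 64-cell edge thresholds from point 4 below, the
stop/start decision can be sketched as a simple hysteresis check. The
function name and the exact shape of the rule are my own illustration:

```c
/* Assumed edge-packaging thresholds, in cells (see point 4). */
#define CELL_QUEUE_HIGHWATER 256
#define CELL_QUEUE_LOWWATER   64

/* Returns 1 if we should be reading from the edge connection, 0 if we
 * should stop.  Between the two thresholds we keep doing whatever we
 * were doing, so we don't flap on every single cell. */
static int edge_should_read(int queued_cells, int currently_reading)
{
  if (queued_cells >= CELL_QUEUE_HIGHWATER)
    return 0;                 /* too much backlog: stop packaging */
  if (queued_cells <= CELL_QUEUE_LOWWATER)
    return 1;                 /* drained enough: start reading again */
  return currently_reading;   /* in the hysteresis band: no change */
}
```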
2) The way the code is currently implemented, we check whether to add
more cells to an outbuf whenever we flush from that outbuf. This
doesn't work when adding the first few cells to circuit queues,
since there's nothing to flush. Therefore, whenever we add a cell
to a cell queue and the or_conn's outbuf is empty, we write the
cell straight to the outbuf.
[I think this rule should maybe become "add cells straight to the
outbuf whenever it is small", but I don't know whether that's really
an improvement.]
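The rule in point 2 amounts to a single branch at queue time. Here's a
toy sketch with stand-in structures (an "outbuf" reduced to a byte
count), just to show the shape of the special case:

```c
#include <stddef.h>

#define CELL_WIRE_SIZE 512  /* assumed encoded size of a cell */

/* Toy stand-ins for or_connection_t and circuit_t. */
typedef struct {
  size_t outbuf_len;   /* bytes currently waiting in the outbuf */
} fake_or_conn_t;

typedef struct {
  int queued_cells;    /* cells waiting in this circuit's queue */
} fake_circuit_t;

/* If the outbuf is empty, no future flush would ever check the queues,
 * so write the cell straight out; otherwise queue it on the circuit. */
static void queue_or_write_cell(fake_circuit_t *circ, fake_or_conn_t *conn)
{
  if (conn->outbuf_len == 0)
    conn->outbuf_len += CELL_WIRE_SIZE;  /* straight to the outbuf */
  else
    circ->queued_cells++;                /* flushed later */
}
```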
3) To avoid having to walk all the circuits on an or_connection (there
   can be quite a lot of them), we keep the _active_ circuits (those
   with pending cells) in a doubly-linked ring on each or_connection.
   An or_circuit is potentially attached to two or_connections, so
   each or_circuit can be in up to two of these rings, and so has two
   next and two prev pointers. This makes the link and unlink logic a
   little trickier than I'd like, but I can't immediately see how to
   simplify it. Have a look and drop me a line if you have a good
   idea.
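To make the two-rings business concrete, here's a sketch of the link
step. Every name here is illustrative, not the real code; the trick is
just that each circuit picks which next/prev pair to use based on which
connection it's being linked into:

```c
#include <stddef.h>

struct fake_conn;

/* A circuit attached to up to two connections, with one pair of ring
 * pointers per attachment. */
typedef struct fake_circ {
  struct fake_conn *n_conn, *p_conn;
  struct fake_circ *next_on_n, *prev_on_n;
  struct fake_circ *next_on_p, *prev_on_p;
} fake_circ_t;

typedef struct fake_conn {
  fake_circ_t *active_circuits;  /* any member of the ring, or NULL */
} fake_conn_t;

/* Select the next/prev pointer pair for this (circuit, connection). */
static fake_circ_t **next_ptr(fake_circ_t *c, fake_conn_t *conn)
{
  return (conn == c->n_conn) ? &c->next_on_n : &c->next_on_p;
}
static fake_circ_t **prev_ptr(fake_circ_t *c, fake_conn_t *conn)
{
  return (conn == c->n_conn) ? &c->prev_on_n : &c->prev_on_p;
}

/* Link circ into conn's ring of active circuits, at the tail. */
static void make_active(fake_conn_t *conn, fake_circ_t *circ)
{
  if (!conn->active_circuits) {
    *next_ptr(circ, conn) = circ;       /* singleton ring */
    *prev_ptr(circ, conn) = circ;
    conn->active_circuits = circ;
  } else {
    fake_circ_t *head = conn->active_circuits;
    fake_circ_t *tail = *prev_ptr(head, conn);
    *next_ptr(tail, conn) = circ;
    *prev_ptr(circ, conn) = tail;
    *next_ptr(circ, conn) = head;
    *prev_ptr(head, conn) = circ;
  }
}
```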
4) The stop-adding-cells and start-adding-cells thresholds for moving
cells from cell queues to or_connections are currently 32K and 16K.
The stop-adding-cells and start-adding-cells thresholds for
packaging cells from the edge are currently 256 cells and 64 cells.
[I have no good justification for any of these numbers besides the
16K one; for efficient bandwidth usage, we want as many full TLS
frames going out as possible.]
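As a sanity check on the 32K/16K outbuf numbers: once the outbuf drains
below the low-water mark, the refill amount falls out of the two
thresholds and the cell size. This helper (name and exact rounding are
mine, assuming 512-byte cells on the wire) shows the arithmetic:

```c
#include <stddef.h>

#define OR_CONN_HIGHWATER (32*1024)  /* stop refilling the outbuf here */
#define OR_CONN_LOWWATER  (16*1024)  /* start refilling below this */
#define CELL_WIRE_SIZE 512           /* assumed encoded cell size */

/* How many whole cells to pull from the active circuits to bring the
 * outbuf back up toward the high-water mark; 0 if it's full enough. */
static int cells_to_pull(size_t outbuf_len)
{
  if (outbuf_len >= OR_CONN_LOWWATER)
    return 0;
  return (int)((OR_CONN_HIGHWATER - outbuf_len) / CELL_WIRE_SIZE);
}
```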
5) The code currently does all the crypto before adding cells to
queues, but does the cell-to-bytes encoding when adding cells to
   outbufs. We could maybe get a little more RAM-efficient by
   packaging in advance too.
6) Cells are about to become the most heavily allocated and freed
structures in Tor. This is probably going to incur some malloc
overhead, especially on platforms where malloc() sucks. Since
cells are fixed-size, this arguably calls for arena allocation.
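Since cells are all the same size, even the crudest version of this
helps: keep freed cells on a free list and reuse them before calling
malloc() again. A real arena would also grab cells in big chunks; this
sketch (names mine) only shows the reuse half:

```c
#include <stdlib.h>

#define CELL_WIRE_SIZE 512  /* assumed encoded cell size */

/* A freed cell doubles as a free-list node, since its body is unused. */
typedef union free_cell {
  union free_cell *next;
  char body[CELL_WIRE_SIZE];
} free_cell_t;

static free_cell_t *free_list = NULL;

/* Reuse a previously freed cell if we have one; else malloc a new one. */
static void *cell_alloc(void)
{
  if (free_list) {
    free_cell_t *c = free_list;
    free_list = c->next;
    return c;
  }
  return malloc(sizeof(free_cell_t));
}

/* Return a cell to the free list instead of calling free(). */
static void cell_free(void *p)
{
  free_cell_t *c = p;
  c->next = free_list;
  free_list = c;
}
```

The union trick keeps the free-list pointer inside the cell itself, so
the list costs no extra memory.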
7) I stuck most of the cell queue logic in relay.c. Roger: let me
know if it belongs somewhere else instead.
8) There is no number eight.
peace,
--
Nick Mathewson