bridge and bridge authority proposal
Roger Dingledine
arma at mit.edu
Mon May 7 03:42:25 UTC 2007
Hi folks,
Here are some details on my plans for bridges and bridge authorities.
They're still fluid because I haven't actually built it, so it's hard
to know if they will turn out to be the right plans when it comes down
to coding, but it's at least a start.
If you don't know what I'm talking about, go take a look at the
blocking resistance draft design (and slides and video if you prefer)
at https://tor.eff.org/documentation#DesignDoc
There are three components that need to be added: bridge directory
authorities; bridges themselves; and the client side ("bridge users" or
"bridge clients"). (We need a better name than "bridge user" -- perhaps
a suitably ethnic but suitably inoffensive version of Alice? Or does a
name exist that matches those two constraints? :)
Piece one: bridge directory authorities.
This part is easy. I've added a new config option
BridgeAuthoritativeDir. I've also revamped the code so you can opt
to be a V1 or V2 authority, or you can be a bridge authority, but you
don't need to be both. (In fact, I suspect we will want to think harder
about our logic ("what exactly do we serve") if some authority wants
to be both -- but I don't see a need for that quite yet, so I'm leaving
it for another time.)
There's a new wrapper authdir_mode_bridge() that tells whether we are
acting as a bridge directory authority. When we are, we decline to
generate or serve v1 directories/running-routers or v2 network statuses.
Otherwise we answer dir queries as normal, and we allow uploads of
server descriptors as normal.
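As a rough illustration of that rule, here is a minimal Python sketch
(not Tor's C code). Only the names BridgeAuthoritativeDir and
authdir_mode_bridge() come from the text above; the Options class,
should_serve(), and the request-kind labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Options:
    """Hypothetical stand-in for the parsed config; only one flag matters here."""
    bridge_authoritative_dir: bool = False


# Document kinds a bridge authority declines to generate or serve.
# These string labels are invented for this sketch.
DECLINED_BY_BRIDGE_AUTH = {
    "v1-directory",
    "v1-running-routers",
    "v2-networkstatus",
}


def authdir_mode_bridge(options: Options) -> bool:
    """True when we are acting as a bridge directory authority."""
    return options.bridge_authoritative_dir


def should_serve(options: Options, kind: str) -> bool:
    """Decide whether to answer a directory request of the given kind."""
    if authdir_mode_bridge(options) and kind in DECLINED_BY_BRIDGE_AUTH:
        return False
    # Everything else (e.g. server descriptor fetches) is answered as normal.
    return True
```

Descriptor uploads are not modeled here, since the text says they are
accepted as normal in either mode.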
One day we'll want some way to enumerate the bridges we hear about,
rather than just listing them internally and never publishing the
list. My first plan, once all the work described in this email
is finished, is to have bridge authorities write a list of bridge
descriptors to disk, and then the humans can manually tell the IP:ORPort
of a few bridges to testers and to people in need. After that works
we can produce a second plan.
Piece two: bridges.
Bridges are just Tor servers that publish to a different location.
My next step is to change the DirServer config syntax so it can
take a 'bridge' flag too, marking that authority as a bridge authority.
Then we need a way to tell servers where they should publish. I was
thinking of adapting the PublishServerDescriptor config option for
that. Currently it's only used for controllers like Blossom, and
it's a boolean, but we might make it more general and let it take
"v1", "v2", and/or "bridge" arguments too. We could retain "0"
for "don't publish to anything" and "1" for "publish to whatever
you think best" for backward compatibility. Or we could retain "0"
for "don't publish to anything" and "1" for "a synonym for v2" for a
different sort of backward compatibility that we mark as deprecated. Or
if adapting this config option is dumb, we could add a separate
PublishServerDescriptorToWhere config option, but that seems overkill.
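To make the backward-compatibility choices concrete, here is a small
Python sketch of how the extended option value could be interpreted.
It is only an illustration under assumptions: the comma-separated
syntax and the function name parse_publish_option() are hypothetical,
and mapping "1" to "v2" is just the second of the variants described
above.

```python
VALID_TARGETS = {"v1", "v2", "bridge"}


def parse_publish_option(value, default_for_one=("v2",)):
    """Return the set of authority types to publish to.

    "0" means publish nowhere.  "1" is kept for backward compatibility
    and maps to default_for_one.  Anything else is read as a
    comma-separated list of v1, v2 and/or bridge (assumed syntax).
    """
    value = value.strip()
    if value == "0":
        return set()
    if value == "1":
        return set(default_for_one)
    targets = {t.strip().lower() for t in value.split(",") if t.strip()}
    unknown = targets - VALID_TARGETS
    if unknown or not targets:
        raise ValueError(f"unrecognized PublishServerDescriptor value: {value!r}")
    return targets
```

Passing default_for_one=("v1", "v2") instead would approximate the
"publish to whatever you think best" reading of "1".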
Bridges would likely want to set RelayBandwidthRate and
RelayBandwidthBurst. Good thing they mostly work now.
Piece three: clients.
This is the trickiest part. Users of bridges want to use a set of
bridges as their first hops -- rather than entry guards. So the easy
part is a new config option "UseBridges 0|1", and a new LINELIST
config option
"Bridge IP:ORPort [fingerprint]".
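For illustration, a Python sketch of parsing the arguments of one such
Bridge line follows. The names BridgeLine and parse_bridge_line() are
hypothetical, and the sketch assumes an IPv4 address and a fingerprint
written as 40 hex characters (optionally split into space-separated
groups); the option's final syntax may differ.

```python
import ipaddress
from typing import NamedTuple, Optional


class BridgeLine(NamedTuple):
    address: str
    or_port: int
    fingerprint: Optional[str]  # 40 uppercase hex chars, or None if omitted


def parse_bridge_line(args: str) -> BridgeLine:
    """Parse the arguments of a 'Bridge IP:ORPort [fingerprint]' line."""
    parts = args.split()
    if not parts:
        raise ValueError("missing IP:ORPort")
    host, sep, port_text = parts[0].rpartition(":")
    if not sep:
        raise ValueError("expected IP:ORPort")
    ipaddress.IPv4Address(host)  # raises a ValueError subclass if malformed
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("ORPort out of range")
    # Join any remaining tokens so a space-grouped fingerprint is accepted.
    fingerprint = "".join(parts[1:]).upper() or None
    if fingerprint is not None:
        if len(fingerprint) != 40 or any(
            c not in "0123456789ABCDEF" for c in fingerprint
        ):
            raise ValueError("fingerprint must be 40 hex characters")
    return BridgeLine(host, port, fingerprint)
```

Since the option is a LINELIST, a client would call this once per
Bridge line and keep the results as its list of candidate first hops.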
Now, when UseBridges is set, it is necessary that all circuits
and dir fetches traverse a bridge as their first hop. In order to
be able to bootstrap, users need to be able to learn networkstatus
documents. They could do this by
a) connecting to the bridge and sending it a begin_dir request. Not
so good because now every bridge needs to be a dir cache.
b) connecting to the bridge and sending a begin request to exit to
a directory authority's port. Not so good because now bridges can't
just have "reject *:*" as their exit policy.
c) Doing a create-fast to the bridge, and then some sort of
extend-fast to the directory authority, and querying the authority
via begin_dir from there. Not so good because the Tor protocol
doesn't support that (and it wouldn't get the full security that
the Tor extend provides, because the bridge could bluff).
For the first solution, I suggest we go with a) -- if the bridge has
a defined dirport, then it mirrors dir info quickly and often, and if
it doesn't, then it mirrors dir info just as a normal Tor client does,
but in any case the bridge user can dip into the bridge's directory info
and learn enough to bootstrap. So long as the bridge can make circuits,
this means the bridge user should be able to make those circuits too.
To make things simpler for the first go, we can just demand that for
now bridges must define their dirport.
(This choice has implications for future designs where Tor clients
know different pieces of the directory -- it will be harder to keep
secret which pieces you know if your bridge clients can just query you.)
As a little bonus, if the bridge user fetches his dir info from the
bridge, he'll be sure to ask for descriptors that he can get (since
they're the ones the bridge is trying to get too), and he saves some
bandwidth for the bridge (though only download bandwidth so that
doesn't matter as much).
I'm inclined to keep the "bridges" list on bridge clients separate
from the "entry guards" list, on the theory that sometimes people will
require bridges and sometimes they won't and we don't want to mingle
things. But the parallel between "bridge users use a bridge as their
first hop and do a begin-dir to it to learn dir info" and the future
plans of "Tor users use an entry guard as their first hop and do a
begin-dir to it to learn dir info" is eerie, and I expect that down
the road we will evaluate whether to merge them somehow.
So somebody watching a bridge will see it make connections to a fixed
handful of nodes: those are the circuits the bridge operator is
generating. The other circuits are probably relayed traffic
from the bridge user. This introduces anonymity research questions
("what are the implications", "can we do better"), which I leave open
for now in the interest of getting a first prototype up. Feel free to
answer them, and we can change our mind down the road.
The details of keeping state inside Tor, remembering that you need
to build your circuits through a bridge, having "one-hop" circuits vs
"three-hop" circuits, etc. are going to get messy, and that's where the
bulk of the work will come in. A lot of that work is already underway
with client-side support for begin_dir.
We probably want a way to cache bridge descriptors in the datadir and
keep them separate from "main" Tor server descriptors. Which leads to
the next section.
Descriptor purposes: how to tell them apart.
It turns out we've encountered a similar issue in the past, when
controllers wanted to give us router descriptors that Tor shouldn't use
when it's making its own paths. We solved it then by adding a 'purpose'
to descriptors -- 'general' purpose is for normal descriptors, whereas
'controller' purpose is for others. When Tor chooses nodes for its
paths, it only chooses from the general-purpose descriptors.
The controller specifies the purpose it has in mind when it invokes
postdescriptor. The descriptor itself doesn't contain its purpose --
after all, a Tor server is a Tor server, and different people can use
it in different contexts.
So how does this apply here? When we learn a bridge descriptor,
e.g. from connecting to the IP:ORPort and using a begin_dir to ask for
/tor/server/authority, or from asking a bridge authority for a new one,
we tag it as a 'bridge' purpose so we can remember what to use it for.
The specific problem we're solving is how to make sure that the first
hop is a 'bridge' purpose when UseBridges is set. But the more general
case is that we want a way to tell Tor to use certain purposes in
certain positions in the path. When we have more purposes out there,
I can imagine that onion_populate_cpath() and friends could assign a
desired purpose in each step of the cpath, so when we choose a router
for that step we choose from among the right pool of routers. This
would let us handle N different Tor networks down the road, and we
could build paths that traverse several of them. And we could put tags
on dirserver lines to specify the purpose that should be assigned to
all the descriptors we learn from it. And eventually we will need a
better word than 'purpose' to describe what we're doing with it. But
no need to solve this stuff until we get closer to it.
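A minimal Python sketch of the "desired purpose per step" idea is
below. It is not how onion_populate_cpath() works today; RouterList,
desired_purposes() and build_path() are hypothetical names, and the
sketch ignores entry guards, exit policies and bandwidth weighting.

```python
import random
from collections import defaultdict


class RouterList:
    """Keeps router nicknames grouped by the purpose we tagged them with."""

    def __init__(self):
        self._by_purpose = defaultdict(list)

    def add(self, nickname, purpose="general"):
        self._by_purpose[purpose].append(nickname)

    def choose(self, purpose, exclude=(), rng=random):
        """Pick one router of the given purpose that is not already in use."""
        pool = [r for r in self._by_purpose.get(purpose, []) if r not in exclude]
        if not pool:
            raise LookupError(f"no usable router with purpose {purpose!r}")
        return rng.choice(pool)


def desired_purposes(use_bridges, hops=3):
    """Purpose wanted at each step: a bridge first when UseBridges is set."""
    first = "bridge" if use_bridges else "general"
    return [first] + ["general"] * (hops - 1)


def build_path(routers, use_bridges, hops=3):
    """Choose one router per step from the pool matching that step's purpose."""
    path = []
    for purpose in desired_purposes(use_bridges, hops):
        path.append(routers.choose(purpose, exclude=path))
    return path
```

Supporting several networks later would then mostly mean returning a
different list from desired_purposes().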
Nick proposed that we add a little header section to each descriptor
before we write it to disk, explaining its purpose and maybe other
features about it. I think this is a great idea. Nick is better at
choosing formats for these things than I am, so I will propose a proof
of concept and let him improve it:
"Add the following two lines above the 'router' line:
local-status version-num
purpose foo
where version-num is the version of local-status we're using (always
1 for now), and foo is the purpose we'd like to remember for this
descriptor. Later we might add an 'origin' line or some other line. The
local-status section is over when we reach a 'router' line."
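The proof-of-concept format above can be exercised with a short Python
sketch. The function names annotate() and split_annotations() are
hypothetical, and the sketch assumes each stored entry holds exactly
one descriptor beginning with a 'router' line.

```python
LOCAL_STATUS_VERSION = 1


def annotate(descriptor, purpose):
    """Prepend the two local-status lines to a descriptor before storing it."""
    if not descriptor.startswith("router "):
        raise ValueError("descriptor must begin with a 'router' line")
    return (
        f"local-status {LOCAL_STATUS_VERSION}\n"
        f"purpose {purpose}\n"
        f"{descriptor}"
    )


def split_annotations(stored):
    """Return (annotations dict, descriptor text) for one stored entry."""
    annotations = {}
    lines = stored.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("router "):
            # The local-status section is over once we reach 'router'.
            return annotations, "".join(lines[i:])
        keyword, _, value = line.strip().partition(" ")
        annotations[keyword] = value
    raise ValueError("no 'router' line found")
```

A descriptor stored without any header simply yields an empty
annotations dict, so older cache files would still parse.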
We also have need of writing other statistics about a given router,
such as for directory authorities that collect stats about uptime
periods -- but these stats will change significantly more often than
the descriptor itself, so we should probably store them in some other
file, so I'll ignore that topic here.
There. This should be enough to tackle for now.
--Roger