Comments on proposals 121, 142, and 143.
Nick Mathewson
nickm at freehaven.net
Tue Jul 15 04:44:31 UTC 2008
Here are some comments on the open hidden service proposals. As
before, if you're replying about just one, please edit the subject
line in your reply.
PROPOSAL 121: Hidden Service Authentication
Proposal 121 adds generic authorization features to hidden services,
and describes a specific implementation of this authorization feature
for providing password-like authorization to a small set (say, up to
16) of users.
The following changes should be made to section one (the generic part):
- In 1.2, rather than making two kinds of INTRODUCE1 cells and using
voodoo and duct tape to tell them apart, introduce an INTRODUCE1V
relay cell type (the V stands for versioned) and make all new kinds
of INTRODUCE1 cells use this cell type. (Proposal 142 already suggests
something like this.)
- The replay avoidance approach can be far better. Instead of the
approach in the proposal, which still allows a number of replays,
try including a timestamp and a nonce in the authenticated portion
of the INTRODUCE2 cell. Require the timestamp to be no more than T
seconds in the past or future, and require that the nonce has not
been used for the last 2*T seconds. This requires less storage,
and prevents all replays. [Instead of a nonce, we can and should
use a cryptographic hash of the rendezvous cookie, or the g^x data
from the INTRODUCE2 cell, or the entire INTRODUCE2 cell contents.]
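The timestamp-plus-hash check above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ReplayCache, the choice of SHA-256, and the in-memory dict are all hypothetical, and cell_body is assumed to be the authenticated portion of the INTRODUCE2 cell (including the timestamp), so an attacker cannot alter the timestamp without breaking authentication.

```python
import hashlib
import time


class ReplayCache:
    """Hypothetical sketch: reject payloads that are stale or already seen."""

    def __init__(self, window_seconds):
        self.window = window_seconds   # "T" in the text above
        self.seen = {}                 # digest -> time the cell was accepted

    def _expire(self, now):
        # A digest accepted more than 2*T ago can no longer pass the
        # timestamp check, so it does not need to be remembered.
        cutoff = now - 2 * self.window
        self.seen = {d: t for d, t in self.seen.items() if t >= cutoff}

    def accept(self, cell_body, timestamp, now=None):
        now = time.time() if now is None else now
        self._expire(now)
        if abs(now - timestamp) > self.window:
            return False               # too old, or too far in the future
        digest = hashlib.sha256(cell_body).digest()
        if digest in self.seen:
            return False               # same contents seen within 2*T
        self.seen[digest] = now
        return True
```

Storage is bounded by the number of cells accepted in any 2*T window, since older digests are dropped on each call.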
Karsten is revising section 2 a bit as well to discuss some motivation
issues, and we're going to figure out (but not necessarily build for
0.2.1.x) an authorization system that scales to more users better than
that of proposal 121's section 2. Such a system may not provide the
same security as the one in section 2: the goal is to do better than
the status quo for security, and better than section 2 for
scalability.
We'll also want to think here about what to do when we want
interactive authorization protocols, or to support methods requiring
more data than can fit in the space currently available in Tor cells.
PROPOSAL 142: Combine Introduction and Rendezvous Points
Karsten says this won't be ready on an 0.2.1.x timeframe, so I'll
gloss over it here. I think there is an important insight here, but I
have some questions:
- Chris -- it would be helpful if you could summarize more detail
from your thesis about the relevant timing issues. Since the
point of this proposal is to reduce latency, we really need to get
all the measurement we can of its efficacy. (If you can include a
URL for the thesis and a page reference, that would help people
who don't have a copy on hand.)
- As I read it, I don't see how the proposal results in a separate
circuit existing from the hidden server ("Bob") to the client.
When a normal RP gets a RENDEZVOUS2 cell, it splices the two
*circuits* together so that relay cells sent from Alice eventually
arrive at Bob, and Alice can send RELAY_BEGIN cells to open
streams to Bob's hidden service. The rendezvous point never sees
the content of these cells; it just passes them from one circuit to
another.
In this proposal, though, it looks like circuits from multiple
Alices wind up joined to a single circuit from Bob. (This is
complicated by the proposal saying "connection" when I think it
means "circuit". [*]) Tor doesn't work that way! If two Alices
send a RELAY_BEGIN cell with the same streamID, how is Bob to tell
the streams apart? When Bob sends a data cell back along the
circuit, which of the Alice circuits should the introduction point
send it to? Remember, the rendezvous point can't see the insides
of these RELAY cells.
There are possible fixes for this, but none of the ones I can
think of offhand are too attractive. For example, Bob could
pre-build numerous circuits to the introduction point and use one
of those to send the RENDEZVOUS2 cell.
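To make the ambiguity concrete, here is a toy model of the forwarding decision at the splicing node. It is not Tor code: the circuit names and the next_hop helper are made up for illustration. The node only knows which circuit a cell arrived on, never what is inside it.

```python
def next_hop(splices, circuit_id):
    """Return the single circuit that cells arriving on circuit_id go to.

    The node cannot read relay cell contents, so the arrival circuit is
    the only information available for this decision.
    """
    partners = splices[circuit_id]
    if len(partners) != 1:
        raise LookupError(
            f"{circuit_id} has {len(partners)} possible partners")
    return partners[0]


# One-to-one splicing, as at a normal rendezvous point.
one_to_one = {
    "alice1": ["bob1"], "bob1": ["alice1"],
    "alice2": ["bob2"], "bob2": ["alice2"],
}

# Several Alice circuits joined to a single Bob circuit.
many_to_one = {
    "alice1": ["bob"], "alice2": ["bob"],
    "bob": ["alice1", "alice2"],
}
```

With one_to_one, next_hop always has an answer in both directions. With many_to_one, cells from either Alice reach Bob, but next_hop(many_to_one, "bob") fails because nothing outside the encrypted payload says which Alice a reply belongs to.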
PROPOSAL 143: Distributed storage improvements
This is an omnibus proposal with 8 separate ideas. Going one by one:
1. Report Bad Directory Nodes
This seems like a fine idea, though the additional complexity is
not insignificant.
I worry that a clever adversary could distinguish a publication
attempt from an HS authority. After all, hidden services do not
generally upload the same descriptor twice: if somebody sends you
a descriptor shortly after you received the same descriptor, and
it's a descriptor you're trying to censor, you can tell it's the
authority.
The "blacklist all nodes in the same /24 or /16" rules seem far
too harsh: they let an adversary cut out huge swaths of the
network using only one or two targeted hosts.
The voting rule listed makes the BadHSDir flag follow different
rules from all other networkstatus flags. This would require a
version bump in the voting method.
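For a sense of scale on the /24 and /16 rules mentioned above, the sketch below uses Python's standard ipaddress module to show which relays a single bad address would take down with it. The addresses come from documentation ranges and the helper names are hypothetical; real relay distributions would have to be measured.

```python
import ipaddress


def blacklisted_block(bad_addr, prefix_len):
    """Network that would be blacklisted because of one bad address."""
    return ipaddress.ip_network(f"{bad_addr}/{prefix_len}", strict=False)


def relays_cut(relay_addrs, bad_addr, prefix_len):
    """Relays that fall inside the blacklisted block."""
    block = blacklisted_block(bad_addr, prefix_len)
    return [a for a in relay_addrs if ipaddress.ip_address(a) in block]


# Made-up relay list using documentation address ranges.
relays = ["198.51.100.7", "198.51.100.200", "198.51.7.9", "203.0.113.5"]
```

Calling relays_cut(relays, "198.51.100.7", 16) also removes 198.51.7.9, which shares nothing with the bad host beyond its first two octets. A /16 block spans 65536 addresses, against 256 for a /24.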
2. Publish Fewer Replicas
This is worthwhile, but no reason is given to think that the 85.7%
reliability figure will hold given future networks and network
conditions. It would be better to look into adaptive solutions
that will continue to work no matter what the reliability is in
the future. See my recent comments on proposal 151: most apply
here.
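One very simple form an adaptive rule could take: pick the replica count from a measured reliability rather than a constant. The sketch below assumes replicas fail independently and treats the reliability figure as a per-replica probability of a successful fetch. Both are assumptions made for illustration, not claims about how the proposal derived its number, and replicas_needed is a hypothetical helper.

```python
import math


def replicas_needed(per_replica_reliability, target_availability):
    """Smallest n such that 1 - (1 - p)**n >= target.

    Assumes independent failures, which is an idealization.
    """
    p, target = per_replica_reliability, target_availability
    if not (0 < p < 1 and 0 < target < 1):
        raise ValueError("both arguments must be strictly between 0 and 1")
    # Solve (1 - p)**n <= 1 - target for n and round up.
    return math.ceil(math.log(1 - target) / math.log(1 - p))
```

The answer moves with the measured reliability: a network where each replica is reachable only half the time needs noticeably more replicas for the same target than one where each replica is usually reachable.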
3. Change Default Value of Being Hidden Service Directory
Seems entirely reasonable. Overdue, even. :)
BTW, how many of the numbers in the rest of this proposal are
derived from the existing HSDir nodes? If the number of HSDir
nodes is small, then most of the measurements in the rest of this
proposal are based on a worryingly small sample set.
4. Make Descriptors Persistent on Directory Nodes
Plausible, but measurements are needed to make sure this is a good
idea. If a server goes down, how often does it occur that it
starts up again in time to serve the hidden service descriptors
it's holding? If the odds are good, this is a good idea.
Otherwise, not?
5. Store and Serve Descriptors Regardless of Responsibility
Good idea. We need an answer for DoS attacks here, though.
6. Avoid periodic descriptor re-publication.
Good idea. Seems obviously correct to me.
7. Discard Expired Descriptors
Good idea. Should descriptors contain an expiration time?
8. Shorten Client-side descriptor fetch history
I don't understand this one fully, I think.
[*] Long ago, we were saying "connection" to mean "an
application-level end-to-end connection relayed over Tor"; "a TCP
connection between two servers"; and "a multi-hop path along Tor
servers". This led to confusion until we wound up trying to call
the first thing a "stream", the second thing a "connection", and
the third thing a "circuit".
The original Onion Routing authors went through this too;
unfortunately, we weren't careful enough to follow their wisdom,
so we had to learn through experience. :)