0.0.8pre1 works: now what?
Marc Rennhard
rennhard at tik.ee.ethz.ch
Wed Jul 28 18:36:05 UTC 2004
> In my eyes, there are three big issues remaining. We need to:
> 1) let clients describe how comfortable they are using unverified servers
> in various positions in their paths --- and recommend good defaults.
> 2) detect which servers are suitable for given streams.
> 3) reduce dirserver bottlenecks.
>
> ---------------------------------------------------------------------
> Part 1: configuring clients; and good defaults.
>
> Nodes should have an 'AllowUnverified' config option, which takes
> combinations of entry/exit/middle, or any.
I agree it's difficult to decide on a reasonable default value. Using
clients only as exit nodes increases a user's risk only slightly,
because the exit node can't learn much without colluding with a server
node that may be picked as the first hop. Using clients as exit plus
middle nodes is also quite safe. Picking clients as entries is risky
because the client may own/observe the web (or whatever) server. Using
clients as both entry and exit in a path is the highest risk because the
adversary then no longer needs to own/observe the web server.
A quite paranoid default could therefore be "use clients only as exit
and/or middle nodes in a path"; a less paranoid one is "use clients as
entry/middle/exit but never as both entry and exit in the same path".
Other choices do not make much sense in my opinion.
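The two defaults above could be sketched roughly as follows. This is an
illustrative Python sketch, not Tor's code; the function name, the node
representation, and the AllowUnverified-style policy set are all my own
assumptions:

```python
import random

def pick_path(nodes, path_len=3, allow_unverified=frozenset({"middle", "exit"})):
    """Pick a path honoring an AllowUnverified-style policy (sketch).

    A node may fill a position only if it is verified (dirserver-approved
    server) or that position is listed in allow_unverified.  The stricter
    "never unverified as both entry and exit" variant would need one
    extra check across the finished path.
    """
    positions = ["entry"] + ["middle"] * (path_len - 2) + ["exit"]
    path = []
    for pos in positions:
        candidates = [n for n in nodes
                      if (n["verified"] or pos in allow_unverified)
                      and n not in path]
        if not candidates:
            raise ValueError("no eligible node for %s position" % pos)
        path.append(random.choice(candidates))
    return path
```

With the default policy, unverified clients can never end up as the
entry, which matches the "quite paranoid" variant above.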
What's funny is that the really paranoid user wants others to use his
node as an entry node, yet picking clients as entries is relatively high
risk. Similarly, most users would happily use clients as exits, but only
a few clients will be willing to act as exits. Could this mean that the
potential to "unload" traffic onto the clients is quite small? Or to put
it differently, can we somehow shift the incentive for clients from
entry (which currently gives you better anonymity) to exit (which
currently could give you trouble) nodes? I don't have an answer, but I
believe this is a key problem to be solved on the way towards a hybrid
Tor where a significant fraction (in fact most) of the traffic is
handled by clients.
> ---------------------------------------------------------------------
> Part 2: choosing suitable servers.
>
> If we want to maintain the high quality of the Tor network, we need a
> way to determine and indicate bandwidth (aka latency) and reliability
> properties for each server.
>
> Approach one: ask people to only sign up if they're high-quality nodes,
> and also require them to send us an explanation in email so we can approve
> their server. This works quite well, but if we take the required email
> out of the picture, bad servers might start popping out of the woodwork.
> (It's amazing how many people don't follow instructions.)
Maybe not a bad choice to start with, at least until you get very many
e-mails a day.
> Approach two: nodes track their own uptime, and estimate their max
> bandwidth. The way they track their max bandwidth right now is by
> recording whenever bytes go in or out, and remembering a rolling average
> over the past ten seconds, and then also the maximum rolling-average
> observed in the past 12 hours. Then the estimated bandwidth is the smaller
> of the in-max and the out-max. They report this in the descriptor they
> upload, rounding it down to the nearest 10KB, and capping anything over
> 100KB to 100KB. Clients could be more likely to choose nodes with higher
> bandwidth entries (maybe from a linear distribution, maybe something
> else -- thoughts?).
Sounds reasonable. Simply picking clients at random
(bandwidth-weighted) for a circuit may not be the best option, though.
Better would be classifying clients according to their usefulness for
specific applications. E.g. a 10KB node (or even less) that is online
for 23+ hours a day is a great choice for remote logins, while a 100KB
client that is usually online just one hour a day is just fine for web
browsing. The disadvantage is that this requires the circuit setup to be
application-aware.
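The self-measurement scheme quoted above (10-second rolling average,
maximum rolling average in each direction, the smaller of the two,
rounded down to the nearest 10KB and capped at 100KB) could be sketched
as follows. Class and method names are illustrative, not Tor's actual
code, and the 12-hour reset of the maxima is omitted for brevity:

```python
from collections import deque

KB = 1024

class BandwidthEstimator:
    """Sketch of the advertised-bandwidth estimate (assumed API).
    Call observe() once per second with that second's byte counts."""

    def __init__(self, window=10):
        self.window = window
        self.in_samples = deque(maxlen=window)   # last 10 one-second samples
        self.out_samples = deque(maxlen=window)
        self.in_max = 0.0    # max 10-second rolling average seen so far
        self.out_max = 0.0   # (Tor would reset these every 12 hours)

    def observe(self, bytes_in, bytes_out):
        self.in_samples.append(bytes_in)
        self.out_samples.append(bytes_out)
        self.in_max = max(self.in_max, sum(self.in_samples) / self.window)
        self.out_max = max(self.out_max, sum(self.out_samples) / self.window)

    def advertised_bandwidth(self):
        # Smaller of the two directions, rounded down to 10KB, capped at 100KB.
        est = min(self.in_max, self.out_max)
        est = (int(est) // (10 * KB)) * (10 * KB)
        return min(est, 100 * KB)
```

Taking the minimum of the two directions matters because a relay is only
as useful as its slower direction.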
> Since uptime is published too, some streams (such as irc or aim) prefer
> reliability to latency. Maybe we should prefer latency by default,
> and have a Config option StreamPrefersReliability to specify by port
> (or by addr:port, or anything exit-policy-style), that looks at uptime
> rather than advertised bandwidth.
OK, that's about what I meant above :-)
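A per-port StreamPrefersReliability policy could look something like
this. The function name, the descriptor fields, and the default port set
(6667 for IRC, 5190 for AIM) are my assumptions for illustration; the
weighting uses a simple linear (weight-proportional) distribution, one
of the options floated above:

```python
import random

def choose_node(descriptors, port, prefer_reliability=frozenset({6667, 5190})):
    """Weight candidates by uptime for long-lived ports (IRC, AIM),
    otherwise by advertised bandwidth (sketch, assumed field names)."""
    key = "uptime" if port in prefer_reliability else "bandwidth"
    weights = [d[key] for d in descriptors]
    # random.choices draws proportionally to the weights, i.e. a
    # linear distribution over the advertised values.
    return random.choices(descriptors, weights=weights, k=1)[0]
```

An addr:port or exit-policy-style matcher would replace the plain port
set in a real implementation.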
> And of course, notice that we're trusting the servers to not lie. We
> could do spot-checking by the dirservers, but I'm not sure what we would
> do if one dirserver thought something was up, and the others were fine.
> At least for jerks the dirservers can agree about, maybe we could
> configure the dirservers to blacklist descriptors from certain IP spaces.
In general, including clients is extremely risky, although I fully agree
that making use of the clients is a necessary step. First of all, new
users may soon be frustrated and leave if performance is poor because of
a few unreliable clients (I agree that we have the same problem if the
Tor core, i.e. the servers, gets overloaded). This means that the
clients offering service should offer really good service, which will be
extremely hard to guarantee. The other issue is that some users may find
it funny to disrupt the QoS. I'm not so much concerned about adversaries
running many nodes (not until Tor really gets big), but more about
clients that randomly drop the circuits of others. This would easily
reduce Tor to its core again, because nobody would use other clients.
In any case, all your questions are extremely difficult to answer. I
believe the right way to go is to come up with a reasonable design to
start with (it won't be perfect; only time and experience will tell) and
give it a try, the same way Tor has done since its first public release.
Falling back to the Tor core only is always an option. What follows
depends on the popularity of Tor. If 100 clients act as relays, not many
problems should arise (despite poor service for the other 99900
clients... I'm very curious about how many of the clients will relay
data for others). Node discovery will likely have to change if 1000s of
clients are offering to relay data for others. Maybe there will be too
many clients offering poor service and Tor will perform poorly. It may
then be time to think about a reputation scheme again...
--Marc