[or-cvs] r16531: {} Added proposal draft (projects/hidserv/trunk/doc)
chrisw at seul.org
chrisw at seul.org
Wed Aug 13 21:11:34 UTC 2008
Author: chrisw
Date: 2008-08-13 17:11:34 -0400 (Wed, 13 Aug 2008)
New Revision: 16531
Added:
projects/hidserv/trunk/doc/xxx-hidden-service-performance-improvements.txt
Log:
Added proposal draft
Added: projects/hidserv/trunk/doc/xxx-hidden-service-performance-improvements.txt
===================================================================
--- projects/hidserv/trunk/doc/xxx-hidden-service-performance-improvements.txt (rev 0)
+++ projects/hidserv/trunk/doc/xxx-hidden-service-performance-improvements.txt 2008-08-13 21:11:34 UTC (rev 16531)
@@ -0,0 +1,167 @@
+Filename: xxx-hidden-service-performance-improvements.txt
+Title: Hidden Service Performance Improvements
+Version: $Revision: XXXXX $
+Last-Modified: $Date: 2008-08-13 22:09:20 +0000 (Wed, 13 Aug 2008) $
+Author: Christian Wilms, Karsten Loesing
+Created: 13-Aug-2008
+Status: Open
+Target: 0.2.2.x
+
+Change history:
+
+ 15-Aug-2008 Initial proposal for or-dev
+
+Overview:
+
+ This proposal deals with the performance of hidden services. It includes a
+ number of changes to improve the performance and accelerate the process of
+ accessing hidden services.
+
+ The proposal is structured as follows: The next section motivates why the
+ performance of hidden service needs to be increased. Then a number of
+ changes is described in detail, followed by security implication and
+ compatibility issues. At the end the implementation of the changes is
+ outlined.
+
+Motivation:
+
+ Tor enables users to anonymize communication by routing connection traffic
+ over a number of relays. Hence it provides anonymity at the expense of
+ performance experienced by the user. Anonymously connecting to a
+ public server uses a single circuit consisting of three Tor relays and leads
+ to a latency of a few seconds on average, compared to fractions of a
+ second for a not anonymized connection. A connection to an anonymous
+ hidden service uses several of those circuits, which decreases the latency
+ of connecting to a hidden service to a mean of 39.24 seconds including
+ the transfer of a simple user request and the associated response. This
+ high average latency is accompanied by a high variability of latencies, so
+ that users have no indicator, if the hidden service connection is
+ established after ten seconds or if they have to wait for two minutes and
+ more until a successful connection establishment is achieved.
+
+ This performance issue becomes a usability issue, if users are not willing
+ to wait so long or are frustrated by the variability and therefore quit
+ requesting or providing hidden services.
+
+ The changes proposed in this proposal aim at the improvement of hidden
+ service performance to make them more attractive to users. All changes are
+ hidden service specific only and do not accelerate the anonymous access to
+ public websites. Improvements of general path selection for example, which
+ increase performance of the latter scenario, automatically affect hidden
+ service performance, because the hidden service protocol uses the same
+ circuit building routines to build a number of circuits.
+
+
+Details
+
+ Several different changes follow, that independently improve the hidden
+ service performance.
+
+ (1) Client contacting two introduction points over different circuits
+
+ In the current protocol the client cannibalizes a single preemptively built
+ circuit to contact the hidden service's introduction point. Every second
+ of extending this circuit fully matters to the user. This becomes
+ especially annoying, when extending the circuit fails, which is discovered
+ after 60 seconds, the general circuit building timeout.
+
+ We propose that the client extends two different pre-built circuits to
+ contact two of the three introductions points, published in the rendezvous
+ service descriptor, simultaneously. This also takes advantage of the
+ distribution of circuit extension times. A pre-evaluation has shown that
+ building two introduction circuits to the same introduction point decreases
+ the average connection establishment time by 4.9 seconds.
+ %TODO: cite http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf
+
+ By using two different introduction points instead of one we additionally
+ avoid a delay, caused by an introduction point not being available.
+
+ To be able to cannibalize two instead of one circuit to contact the
+ introduction points, it is necessary to have one more preemptively built
+ internal circuit available, i.e. three instead of two in the current
+ protocol (the third circuit is used to establish a rendezvous point).
+
+ After receiving a hidden service request and fetching the rendezvous
+ service descriptor from a hidden service directory, the client randomly
+ selects two introduction points from the descriptor. I cannibalizes two
+ circuits and tries to extend them to the two introduction point. Both
+ circuits are labeled as introduction circuits for the specific hidden
+ service. The first successfully opened circuit is actually used as
+ introduction circuit and an INTRODUCE1 cell is sent over it. When the
+ second circuit is opened later, it is relabeled as circuit for general
+ purpose and can be used later. The only difference to other circuits is,
+ that it now consists of four hops instead of three.
+
+ (2) The hidden service preemptively opens a fixed number of two circuits
+ that can be cannibalized later to contact the requesting clients rendezvous
+ point. A very popular hidden service can receive so many client requests,
+ that it cannot build new circuits fast enough. If that's the case the full
+ circuit building also affects the user's waiting time instead of the circuit
+ extension only. Analyzing circuit building times shows that this leads to an
+ increase of hidden service connection establishment by 4.7 seconds for
+ popular hidden services.
+
+ To avoid a performance worsening for popular hidden services we propose to
+ dynamically determine the number of pre-build internal circuits on hidden
+ service side instead of using a fixed number. The hidden service keeps
+ track of its requests and increases the number if necessary. It also
+ decreases the number, if fewer requests arrive.
+
+ (3)...
+
+Security implications:
+
+ (1) The introduction points do not learn that they are the fourth hop of a
+ second introduction circuit, so for them it looks like being a normal last
+ node in a three-hop circuit.
+
+ (2) Approximately the same number of circuits is build in the current
+ protocol and the proposed change. In the change the circuits are built
+ earlier only and extended when needed.
+
+
+
+Compatibility:
+
+ (1) There are no compatibility issues with other versions, because opening
+ two introduction circuits affects the client only, but not the relays used
+ in its circuits nor the introduction points or other hidden service
+ protocol roles.
+
+ (2) Determining the number of pre-built internal circuits on hidden
+ services is a hidden service only change. It does not affect the hidden
+ service protocol and therefore there are no compatibility issues with
+ relays using other versions.
+
+
+Implementation:
+
+ (1) All changes for the two introduction circuits are made in the client
+ implementation.
+
+ /1/ Circuit opening
+ - Increase the number of pre-built internal circuits from two to three
+
+ /2/ Circuit extension
+ - Select two introduction points from rendezvous service descriptor
+ - Cannibalize two pre-built circuits to the two introduction points
+
+ /3/ On circuit completion
+ - Check whether another introduction circuit was already opened for this
+ hidden service
+ - If this is the first circuit, go on with regular protocol
+ - If this is the second circuit, relabel it as general purpose circuit
+
+ (2) Basically the implementation is done by writing and reading the dynamic
+ value at the right time.
+
+ /1/ Keeping track of hidden service requests
+ - Whenever a hidden service request arrives this affects the dynamic value
+ - Depending on how much time passes between the request, the value is
+ changed using exponential smoothing. This means the frequency of recent
+ requests has more impact on the number than the frequency long ago.
+
+ /2/ Reading dynamic value
+ - The circuit_predict_and_launch function is called every second and only
+ needs to use the dynamic value instead of the fixed value in the current
+ implementation
More information about the tor-commits
mailing list