A Brief Study on Circuit Construction Speed and Reliability
Ringo Kamens
2600denver at gmail.com
Sun Dec 17 19:42:22 UTC 2006
Thanks for that. It's interesting to have that data visualized.
On 12/16/06, Mike Perry <mikepery at fscked.org> wrote:
> While testing the latest release of my Tor scanner, I decided to do a
> study on circuit reliability: how long it takes to construct a
> circuit and then fetch the HTML of http://tor.eff.org, and how long it
> takes to fetch http://tor.eff.org again via that same circuit.
> Using tor- (actually SVN r9067), I sorted the routers by their
> bandwidth capacity, divided them up into 15% segments of the network
> from 0 to 90, and for each segment timed 250 circuits as well as used
> the new failure tracking abilities of my scanner to track the failure
> rates of nodes as well as the failure reasons for circuits and
> streams.
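For anyone who wants to reproduce the bucketing, here is a minimal sketch of the segmenting step as I read it: rank routers by bandwidth, then cut the ranked list into 15% slices from 0 to 90. The function name, the (name, bandwidth) tuples, and the sample routers are all made up for illustration; this is not code from the scanner.

```python
def percentile_segments(routers, width=15, stop=90):
    """Split routers into bandwidth-ranked percentile slices.

    routers: list of (name, bandwidth) pairs (hypothetical input format).
    Returns {(lo, hi): [names]}, fastest routers in the first slice.
    """
    # Highest bandwidth first, so the 0-15 slice holds the fastest nodes.
    ranked = sorted(routers, key=lambda r: r[1], reverse=True)
    n = len(ranked)
    segments = {}
    for lo in range(0, stop, width):
        hi = lo + width
        # Integer index bounds for the [lo%, hi%) slice of the ranking.
        start = n * lo // 100
        end = n * hi // 100
        segments[(lo, hi)] = [name for name, _ in ranked[start:end]]
    return segments


# Hypothetical network: 20 nodes with bandwidths 100, 200, ..., 2000.
routers = [("node%02d" % i, i * 100) for i in range(1, 21)]
segments = percentile_segments(routers)
for (lo, hi), names in segments.items():
    print("RANGE %d-%d: %s" % (lo, hi, ", ".join(names)))
```

The slowest 10% of routers fall outside every slice, which matches the 0 to 90 cutoff described above.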
> Times are seconds:
> RANGE 0-15 250 build+fetches: avg=20.89, dev=31.23
> RANGE 0-15 250 fetches: avg=3.66, dev=2.69
> RANGE 15-30 250 build+fetches: avg=33.44, dev=47.01
> RANGE 15-30 250 fetches: avg=7.28, dev=12.86
> RANGE 30-45 250 build+fetches: avg=81.47, dev=79.55
> RANGE 30-45 250 fetches: avg=12.66, dev=38.63
> RANGE 45-60 250 build+fetches: avg=63.56, dev=67.56
> RANGE 45-60 250 fetches: avg=7.51, dev=12.80
> RANGE 60-75 250 build+fetches: avg=40.85, dev=42.76
> RANGE 60-75 250 fetches: avg=10.13, dev=11.28
> RANGE 75-90 250 build+fetches: avg=48.87, dev=56.11
> RANGE 75-90 250 fetches: avg=6.82, dev=7.48
> As you can see, the high bandwidth nodes in 0-15% are much quicker
> than the rest both at using existing circuits and at building new
> ones. My guess is that the circuit build speed increase is likely due
> to the fact that running a fast node requires a fast machine to be
> able to do all the crypto, and thus crypto-intensive circuit builds
> execute faster on these nodes.
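A rough way to isolate the build cost from the table: subtract the fetch-only average from the build+fetch average for each range. This assumes the first fetch costs about the same as the repeat fetch over the same circuit, which the data cannot confirm, so treat the numbers as estimates only.

```python
# (build+fetch avg, fetch-only avg) in seconds, copied from the table above.
timings = {
    "0-15": (20.89, 3.66),
    "15-30": (33.44, 7.28),
    "30-45": (81.47, 12.66),
    "45-60": (63.56, 7.51),
    "60-75": (40.85, 10.13),
    "75-90": (48.87, 6.82),
}


def estimated_build_times(timings):
    """Estimate build-only time as (build+fetch avg) minus (fetch avg)."""
    return {rng: round(total - fetch, 2)
            for rng, (total, fetch) in timings.items()}


build = estimated_build_times(timings)
for rng, secs in build.items():
    print("RANGE %s estimated build: %.2f s" % (rng, secs))
```

Under that assumption the build step dominates the total in every range, and 0-15 is still the fastest class by a wide margin.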
> The rest of the results for circuit construction and speed seem only
> loosely tied to bandwidth, however. Probably other factors like
> network connection and stability come into play there. A few bad nodes
> can slow those averages down a lot, as is hinted at by the large std
> deviation in some of the classes.
> So what of the failure rates and reasons then? Let's have a look at the
> FAILTOTALS line from each class:
> 0-15.naive_fail_rates: FAILTOTALS 131/473 54+6/603 OK
> 15-30.naive_fail_rates: FAILTOTALS 224/750 135+40/726 OK
> 30-45.naive_fail_rates: FAILTOTALS 559/1221 130+29/737 OK
> 45-60.naive_fail_rates: FAILTOTALS 273/845 138+22/752 OK
> 60-75.naive_fail_rates: FAILTOTALS 140/592 85+33/678 OK
> 75-90.naive_fail_rates: FAILTOTALS 187/637 76+18/656 OK
> By looking at the README for the scanner, we see the format of these
> lines is:
> So it looks like nodes in the 30-45% range had a good deal higher
> rate of circuit failure than the rest (if you're wondering, the
> overall circuit failure rate is 33%).
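The per-class rates can be checked with a short script. This is only a sketch, and it assumes the first fraction after FAILTOTALS is failed circuits over total circuits. That reading is my guess rather than something taken from the README, though it does reproduce the 33% overall figure.

```python
import re

# FAILTOTALS lines copied from the message above.
lines = """\
0-15.naive_fail_rates: FAILTOTALS 131/473 54+6/603 OK
15-30.naive_fail_rates: FAILTOTALS 224/750 135+40/726 OK
30-45.naive_fail_rates: FAILTOTALS 559/1221 130+29/737 OK
45-60.naive_fail_rates: FAILTOTALS 273/845 138+22/752 OK
60-75.naive_fail_rates: FAILTOTALS 140/592 85+33/678 OK
75-90.naive_fail_rates: FAILTOTALS 187/637 76+18/656 OK
""".splitlines()


def circuit_counts(line):
    """Return (class name, failed, total) from one FAILTOTALS line.

    Assumes the first N/M pair is circuit failures over circuits tried.
    """
    name = line.split(".")[0]
    match = re.search(r"FAILTOTALS (\d+)/(\d+)", line)
    return name, int(match.group(1)), int(match.group(2))


counts = [circuit_counts(line) for line in lines]
rates = {name: failed / total for name, failed, total in counts}
overall = sum(f for _, f, _ in counts) / sum(t for _, _, t in counts)

for name, rate in rates.items():
    print("%s: %.1f%% circuit failures" % (name, 100 * rate))
print("overall: %.1f%%" % (100 * overall))
```

Read this way, the 30-45 class is the only one well above the overall rate.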
> Looking at the top of the 30-45.naive_fail_rates file shows us a
> handful of nodes with slightly higher failure rates than normal, but
> several of the other classes have a few bad nodes also. So why was
> this class so much slower?
> It turns out that if you look at the naive_fail_reasons file, the
> largest portion of failures comes from the CIRCUITFAILED:TIMEOUT reason:
> 250 REASONTOTAL 522/1277
> or 522 timeout failures out of 1277 total node failures. Note that
> reason-based failure counting and reason totals are node-based,
> whereas the FAILTOTALS lines just count circuits and streams, hence the
> large number there.
> In general, the most common failure reasons were circuit timeouts,
> stream timeouts, and OR connection closed (TCP connections between
> nodes mysteriously dying or failing to open).
> Here are the top failure reasons by class. When there are 3 reason terms
> paired together, the reason was reported from an upstream node and not
> deduced locally.
> 0-15:
> 1. CIRCUITFAILED:OR_CONNECTION_CLOSED (174/322 node failures)
> 2. CIRCUITFAILED:TIMEOUT (72/322 node failures)
> 3. STREAMDETACHED:TIMEOUT (41/322 node failures)
> 15-30:
> 30-45:
> 45-60:
> 60-75:
> 75-90:
> So if you total the two OR_CONN_CLOSED reasons (local and remote), you
> see that for some reason node-to-node TCP connections are fairly
> unreliable and prone to being closed (or are difficult to
> open/establish?). This is strange...
> I should also note that stream failure reasons are only counted for
> the exit node, whereas circuit failure reasons are counted for 2
> nodes - the last successful hop and the first unsuccessful one. So in
> effect, the STREAMDETACHED reason really is 2x more common than in
> those lists. On the other hand, it is mostly alleviated by making
> compute_socks_timeout() always return 15 (this was not done for this
> study, however).
> Well that's about all the detail I have time to go into right now. The
> complete results are up at
> http://fscked.org/proj/minihax/SnakesOnATor/speedrace.zip
> As soon as I finish polishing up my README and change log, I will put
> the new release of SoaT itself up. It should be out sometime today.
> --
> Mike Perry
> Mad Computer Scientist
> fscked.org evil labs