[tor-bugs] #4490 [Analysis]: Sensitivity analysis of different ways to sample relay capacity for simulations
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Tue Mar 13 18:37:55 UTC 2012
#4490: Sensitivity analysis of different ways to sample relay capacity for
simulations
------------------------------------+---------------------------------------
Reporter: arma | Owner:
Type: project | Status: new
Priority: normal | Milestone: Sponsor F: July 15, 2012
Component: Analysis | Version:
Keywords: performance simulation | Parent:
Points: | Actualpoints:
------------------------------------+---------------------------------------
Comment(by robgjansen):
I believe we solve the sampling issue in the draft of our CSET paper
(Section 3.2 and Figure 4). It's not ready for distribution, so I'll
briefly describe our approach here.
We sample k of n relays by breaking a list of n relays (sorted by
consensus weights) into k bins, and choosing the median of each bin. This
improves on the "deciles" approach (10 bins) and in fact produces the
"optimal" sampling, in that the distribution of relay weights in the
sample is closest to that of the original population. The argument is that
if our sampled weight distribution is the same as in Tor, then client load
will be distributed approximately the same as in Tor.
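A minimal sketch of the bin-median sampling described above. It assumes relays arrive as (relay_id, weight) pairs; the function name, the input shape, and the choice of the lower-middle element for even-sized bins are illustrative assumptions, not details taken from the paper draft.

```python
from typing import List, Sequence, Tuple

Relay = Tuple[str, float]  # (relay_id, consensus_weight); hypothetical shape


def sample_bin_medians(relays: Sequence[Relay], k: int) -> List[Relay]:
    """Sort relays by weight, split them into k bins, return each bin's median relay."""
    n = len(relays)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    ordered = sorted(relays, key=lambda relay: relay[1])
    sample = []
    for i in range(k):
        # Bin i covers ordered[start:end]; bin sizes differ by at most one.
        start = i * n // k
        end = (i + 1) * n // k
        # Middle element of the bin (lower-middle when the bin size is even).
        mid = start + (end - start - 1) // 2
        sample.append(ordered[mid])
    return sample
```

With k = 10 this reduces to picking one relay per decile; larger k simply uses narrower bins.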
We then assign bandwidth for the sampled relays using their reported
observed values from the server descriptors. What's missing in our paper is
an analysis of the resulting capacity when sampling, which the above
algorithm does not currently try to __directly__ optimize (because the
consensus weights are sometimes bad estimates of capacity).
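One simple starting point for such a capacity check, sketched under assumptions: compare the mean observed bandwidth of the sampled relays with the mean over all relays. The function name and the use of this ratio as a metric are illustrative only; the paper draft does not contain this analysis.

```python
from typing import Sequence


def capacity_ratio(population_bw: Sequence[float], sample_bw: Sequence[float]) -> float:
    """Mean observed bandwidth of the sample divided by that of the population."""
    if len(population_bw) == 0 or len(sample_bw) == 0:
        raise ValueError("both inputs must be non-empty")
    pop_mean = sum(population_bw) / len(population_bw)
    if pop_mean == 0:
        raise ValueError("population has zero total bandwidth")
    sample_mean = sum(sample_bw) / len(sample_bw)
    return sample_mean / pop_mean
```

A ratio far from 1.0 would suggest that the sample under- or over-represents per-relay capacity relative to the full network.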
Should we be sampling based on weights AND observed bandwidth? What else
could we do here to improve the analysis?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/4490#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online