[tor-relays] Rampup speed of Exit relay

Dennis Ljungmark spider at takeit.se
Thu Sep 22 18:55:44 UTC 2016


On Thu, Sep 22, 2016 at 6:30 PM, teor <teor2345 at gmail.com> wrote:
> Hi,
>
> I wanted to highlight the following parts of my reply:
>
> The Tor network is quite happy to allocate around 47 MByte/s (yes, 376 MBit/s) to your relay based on its bandwidth measurements, but it won't do that until your relay shows it is actually capable of sustaining that traffic over a 10-second period (the observed bandwidth). At the moment, your relay can only do 19.83 MByte/s, so that's what it's being allocated.
>
> Maybe your provider has good connectivity to the bandwidth authorities, but bad connectivity elsewhere?
> Maybe your provider is otherwise limiting output traffic?

The provider shouldn't be throttling or limiting traffic. There are
other consumers on the network, and I've been able to sustain
long-term high-volume traffic so far without issues.


>
> Do you run a local, caching, DNS resolver?
> That could be a bottleneck, as almost all of your observed bandwidth is based on exit traffic.

Now that is something I can try replacing. Thank you, I'll look into that.


>
> The details are below:
>
>> On 22 Sep 2016, at 03:58, D. S. Ljungmark <spider at takeit.se> wrote:
>>
>> On tor, 2016-09-22 at 06:29 +1000, teor wrote:
>>>>
>>>> On 22 Sep 2016, at 05:41, nusenu <nusenu at openmailbox.org> wrote:
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> So, how do we get tor to move past 100-200Mbit? Is it just a
>>>>>>> waiting game?
>>>>
>>>> I'd say just run more instances if you still have resources left
>>>> and want to contribute more bandwidth.
>>>> (Obviously your exit policy also influences how much your
>>>> instance is used.)
>>>
>>> In my experience, a single Tor instance can use between 100Mbps and
>>> 500Mbps.
>>> It's highly dependent on the processor, OpenSSL version, and various
>>> network factors.
>>
>> Acknowledged, the question is, how do you measure that.


Thank you, I'll look further into this during this upcoming week.

> There's a tool called "chutney" that can calculate an effective local bandwidth based on how fast a Tor test network can transmit data.
> Using it, I discovered that two machines that were meant to be almost identical had approximately 1.5 Gbps and 500 Mbps capacity, because one had a slightly older processor.
>
> This is how I test the local CPU capacity for Tor's crypto:
>
> git clone https://git.torproject.org/chutney.git
> chutney/tools/test-network.sh --flavor basic-min --data 10000000
>
> This is how I test the local file descriptor and other kernel data structure capacity:
>
> chutney/tools/test-network.sh --flavor basic-025
>
> This is how I test what a single client can push through a Tor Exit (I think Exits limit each client to a certain amount of bandwidth, but I'm not sure - maybe it's 1 Mbit/s or 1 MByte/s):
>
> tor DataDirectory /tmp/tor.$$ PidFile /tmp/tor.$$/tor.pid SOCKSPort 2000 ExitNodes <your-exit-1> &
> sleep 10
> curl --socks5-hostname 127.0.0.1:2000 <a-large-download-url-here>
>
>
>>> There are also some random factors in the bandwidth measurement
>>> process, including the pair selection for bandwidth measurement. And
>>> clients also choose paths randomly. This means some relays get more
>>> bandwidth than others by chance.
>>
>> That's interesting, how does the bandwidth scaling / metering work?
>> Where/how does it decide how much bandwidth is available vs. what it
>> announces to the world?
>
> Tor has 5 bandwidth authorities that measure each relay around twice per week.
> Then the median measurement is used for the consensus weight, which is the weight clients use to choose relays.
> (The relay's observed bandwidth is also used to limit the consensus weight.)
>
> In particular, your relay is currently limited by its observed bandwidth of 19.83 MByte/s.
> Mouse over the "Advertised Bandwidth" figure to check:
> https://atlas.torproject.org/#details/5989521A85C94EE101E88B8DB2E68321673F9405
>
> The gory details of each bandwidth authority's vote are in:
> https://collector.torproject.org/recent/relay-descriptors/votes/
>
> Faravahar Bandwidth=10000 Measured=43800
> gabelmoo Bandwidth=10000 Measured=78200
> moria1 Bandwidth=10000 Measured=47200
> maatuska Bandwidth=10000 Measured=40100
> longclaw Bandwidth=10000 Measured=49500
>
> (I usually look these up by hand, but I'm sure there's a faster way to do it using stem.)
>
> So the Tor network is quite happy to allocate around 47 MByte/s (yes, 376 MBit/s) to your relay based on its bandwidth measurements, but it won't do that until your relay shows it is actually capable of sustaining that traffic over a 10-second period (the observed bandwidth). At the moment, your relay can only do 19.83 MByte/s, so that's what it's being allocated.
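As a quick illustration of the median step described above, using the five Measured values from the votes (a shell sketch, not an official tool):

```shell
# The consensus weight is the median of the five bandwidth authority
# measurements; for five values that's the 3rd after numeric sorting.
printf '%s\n' 43800 78200 47200 40100 49500 | sort -n | sed -n '3p'
```

Here the median is 47200, which lines up with the ~47 MByte/s figure quoted above.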


Those are some interesting numbers, and they somewhat contradict what
I've seen in my own measurements so far, except for a few spikes where
I've seen 16k sockets allocated at once; I'm not sure where those come
from.


>
> Maybe your provider has good connectivity to the bandwidth authorities, but bad connectivity elsewhere?

Could be, I'm not in control of peering.

> Maybe your provider is otherwise limiting output traffic?

Shouldn't be, at least not according to the measurements I've done
elsewhere (iperf and similar).
>
> Do you run a local, caching, DNS resolver?
> That could be a bottleneck, as almost all of your observed bandwidth is based on exit traffic.
>
>> Right now I can comfortably pull/push 700Mbit in either direction on
>> this node, so that's where I left the setting. If there is a penalty
>> for stating more bandwidth available than the network can measure, then
>> I have a problem.
>
> No, the network agrees with your setting - 376 MBit/s / 40% average Tor network utilisation is 940 Mbps.
> (It's slightly higher than your actual bandwidth, possibly because it's an exit, or in a well-connected network position.)

Okay, that's good to know.


>>> If you want to use the full capacity of your Exit, run multiple Tor
>>> instances.
>>
>>> You can run 2 instances per IPv4 address, using different ports.
>>> Many people choose ORPort 443 and DirPort 80 as their secondary
>>> instance ports (this can help clients on networks that only allow
>>> those ports), but you can choose any ports you want.
>>
>> True, but there's a limit to how many nodes you can have in a /24, and
>> I really want to scale up a single node before adding more resources.
>
> No there's not, not for a /24.
> You can have 253 * 2 = 506 Tor instances in an IPv4 /24.
>
>> Throw more resources at it cause we don't know why it sucks seems like
>> such a devops thing to do.
>
> Add another instance on the same machine.
> Not more resources, but more processes, more threads using the existing resources.
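For anyone following along, a second instance on the same machine typically needs only its own data directory, pid file, and ports. A minimal torrc sketch (all paths, ports, and the nickname are placeholder assumptions, not from this thread):

```
## Second-instance torrc (placeholders; adjust to your setup)
DataDirectory /var/lib/tor2
PidFile /var/run/tor2.pid
ORPort 443
DirPort 80
Nickname myRelayTwo
```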
>
>>> Have you considered configuring an IPv6 ORPort as well?
>>> (It's unlikely to affect traffic, but it will help clients that
>>> prefer IPv6.)
>>
>> Not sure right now, I've had _horrid_ experiences with running Tor on
>> ipv6,  ranging from the absurd ( Needing ipv4 configured to set up
>> ipv6)
>

> IPv4 addresses are mandatory for relays for a few reasons:
> * Tor assumes that relays form a fully-connected clique - this isn't possible if some are IPv4-only and some are IPv6-only;
> * some of Tor's protocols only send an IPv4 address - they're being redesigned, but protocol upgrades are hard;
> * until recently, clients could only bootstrap over IPv4 (and they still can't using microdescriptors, only full descriptors);
> * and IPv6-only clients have poor anonymity, because they stick out too much.


>> to the inane ( Config keys named "Port" not valid for both ipv6
>> and ipv4, horrid documentation)
>
> We're working on the IPv6 documentation, and happy to fix any issues.
> What particular Port config?
> What was bad about the documentation?

DirPort tor.modio.se:888 won't actually bind on IPv6, even when the
name resolves to both IPv4 and IPv6 addresses.

DirPort 888 won't actually bind on IPv6.
DirPort [::]:888 won't actually bind on IPv6.

Adding more DirPort statements means you have to pick and choose, IPv6
or IPv4: only one can be advertised. ORPort behaves the same way.

Having to hard-code the IPv6 address (dynamic as it is) in the config
file is a pure failure, and I ended up writing an annoyingly complex
template file to fill it in at boot, simply because Tor can't bind to
a port when it's told to.
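A hypothetical sketch of the boot-time templating described above: pull the host's global IPv6 address out of `ip -6 addr` output and emit an ORPort line with the literal hard-coded. The function name, port, and torrc path are illustrative assumptions, not the author's actual script.

```shell
# Emit an ORPort line with the host's global IPv6 address hard-coded,
# since Tor (at the time) needed an explicit IPv6 literal to bind.
ip6_orport_line() {
  # $1: output of `ip -6 addr show scope global`; $2: port number
  addr=$(printf '%s\n' "$1" | awk '/inet6/ { sub(/\/.*/, "", $2); print $2; exit }')
  printf 'ORPort [%s]:%s\n' "$addr" "$2"
}

# Usage at boot (example): append the generated line to the config:
#   ip6_orport_line "$(ip -6 addr show scope global)" 443 >> /etc/tor/torrc
```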


>
> Feel free to log bugs against Core Tor/Tor, or we can log them for you:
> https://trac.torproject.org/
>
>> I've got quite a few ipv6 -only- networks, and I'd gladly put up a
>> number of relays/Exits there ( using ephemeral addressing) however
>> that's impossible last I tried it.
>
> Yes, it is, see above for why.
>
>> My general consensus from last I looked in depth at this is that "Tor
>> doesn't support ipv6. It claims to, but it doesn't."
>
> Choosing anonymous, random paths through a non-clique network (mixed IPv4-only, dual-stack, and IPv6-only) is an open research problem. We can't really implement IPv6-only relays until there are some reasonable solutions to this issue. Until then, we have dual-stack relays.
>
> And IPv6-only Tor clients can connect using IPv6 bridges, or bootstrap over IPv6 with ClientUseIPv4 0 and UseMicrodescriptors 0.
>
> That's pretty much the limit of what we can do with IPv6, until researchers come up with solutions to the non-clique issue.

You could fix the daemon to actually be capable of binding to its
ports, and to not require us to jump through annoying settings just to
get it to even _listen_ on IPv6.



>>>>>> How long has the relay been up?
>>>>>
>>>>> 4 years or so. ( current uptime: 11 hours since reboot, it
>>>>> reboots weekly )
>>>>
>>>> This relay (5989521A) has been first seen on 2014-04-10 according
>>>> to
>>>> https://onionoo.torproject.org (still long enough).
>>>>
>>>> Why do you reboot weekly? Memory leak workaround?
>>>
>>> If you reboot weekly, you will take time each week to re-gain
>>> consensus weight and various other flags. For example, you will only
>>> have the HSDir flag for ~50% of the time. (The Guard flag is also
>>> affected, but it's somewhat irrelevant for Exits.)
>>
>> Fancy that. "Don't upgrade your software because our software can't
>> handle it" is one of those things that really bug me.
>
> That's not what I said.
> Upgrade your software. Have your relay go up and down as much as you like. The network will handle it fine. Tor clients will be fine.
>
> But if you want to optimise your traffic, then fewer restarts are one thing to try. A restart per week certainly isn't typical of most Tor relays, so your relay looks less stable by comparison.

Right, I've adjusted that setting; let's see if that fixes anything
here. We'll know after a month or so, I guess.

>
>> How much downtime can the node have before losing consensus
>> weight/flags?
>
> A restart loses the HSDir flag for 72 hours, and the Guard flag for a period that is dependent on how old your relay is. (It should be inversely related, currently it seems to be positively correlated, which is a bug we're working on fixing.)
>
>> Is it just for restarting the tor process as well?
>
> Yes. Try sending it a HUP instead, when you've just changed the config.

HUP apparently isn't enough to keep it from restarting when it's
resolving addresses on IPv4/IPv6. It turns out it closes its sockets
first, then resolves the domain names, realizes it can't listen, and
then stops.


>
> Why do you (need to) restart weekly?
Same schedule as all our infrastructure. We've got a weekly maintenance
window (well, two) for restarting machines. It tests recovery,
failover, and the like. Not perfect, but better than being afraid of
restarts.

>
>>> I'd avoid the reboots if you can, there's a known bug affecting at
>>> least the Guard flag and restarts, where long-lived stable relays
>>> are disproportionately impacted compared with new relays. I haven't
>>> seen any evidence that it affects other flags or consensus weight,
>>> but you could try not restarting and see if that helps.
>>
>> Right, I can tune that for a week and see.
>
> Thanks. Hope it works out for you.

Me too. I'm hoping to figure out what the heck is going on and get a
stable load before I start adding more relays/exits.

The performance characteristics of an exit are... not very well documented.

