[tor-bugs] #24782 [Core Tor/Tor]: Set a lower default MaxMemInQueues value
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Jan 8 21:04:46 UTC 2018
#24782: Set a lower default MaxMemInQueues value
---------------------------------+------------------------------------
Reporter: teor | Owner: ahf
Type: defect | Status: assigned
Priority: Medium | Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor | Version:
Severity: Normal | Resolution:
Keywords: tor-relay, tor-ddos | Actual Points:
Parent ID: | Points: 0.5
Reviewer: | Sponsor:
---------------------------------+------------------------------------
Comment (by teor):
Replying to [comment:5 dgoulet]:
> We could also explore the possibility for that value to be a moving
> target at runtime. It is a bit more dicey and complicated, but because Tor
> at startup looks at the "Total memory" instead of the "Available memory"
> to estimate that value, things can go badly quickly if 4/16 GB of RAM are
> available, which will make Tor use 12GB as a limit... and even with a
> fairly good amount of swap, this is likely to be killed by the OOM of the
> OS at some point.
>
> On the flip side, a fast relay stuck with an estimation of 1GB or 2GB of
> RAM that Tor can use at startup won't be "fast" for much longer before the
> OOM kicks in and starts killing old circuits.
This is not what I have observed. I have some fast Guards. Under normal
load they don't ever use much more than 1 - 2 GB total RAM.
> It is difficult to tell what a normal fast relay will endure in terms of
> RAM for Tor over time, but from what I can tell with my relays so far,
> between 1 and 2 GB is usually what I see (in non-DoS conditions and
> non-Exit).
I usually see 1-2 GB for non-exits, and closer to 2 GB for exits.
> I do believe right now that the network is still fairly usable because
> we have big Guards able to use 5, 10, 12GB of RAM right now... Unclear to
> me if firing up the OOM more frequently would improve the situation but we
> should be very careful not to make every relay use a "too low amount
> of ram" :S.
If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB
of RAM is roughly 100 seconds of traffic. Is it really useful to buffer 100
seconds of traffic? (Or, under the current load, tens of thousands of
useless circuits?)
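
The arithmetic above can be checked with a short sketch. It assumes decimal
units (1 GB = 10^9 bytes) and a link that drains the queue at full line
rate; the function name is hypothetical and is not part of Tor.

```python
def buffered_seconds(queue_bytes: float, link_bits_per_second: float) -> float:
    """Time a link running at full rate would need to drain a queue of this size."""
    bytes_per_second = link_bits_per_second / 8  # 1 Gbps -> 125 MB per second
    return queue_bytes / bytes_per_second


GB = 10**9    # decimal gigabyte (assumption, matches the rounding in the text)
GBPS = 10**9  # 1 Gbps in bits per second

print(buffered_seconds(12 * GB, GBPS))  # 96.0, i.e. roughly 100 seconds
print(buffered_seconds(2 * GB, GBPS))   # 16.0
```

Under the same assumptions, a 2 GB limit still leaves about 16 seconds of
queued traffic at 1 Gbps.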
So I'm not sure if using more RAM for queues actually helps. In my
experience, it just increases the number of active connections and CPU
usage. I don't know how to measure if this benefits or hurts clients. (I
guess I could tweak my guard and test running a client through it?)
Here's what happened when I followed my own advice in this thread:
https://lists.torproject.org/pipermail/tor-relays/2018-January/014021.html
I have a few big guards that are very close to a lot of the new clients.
They were using 150% CPU, 4-8 GB RAM, and 15000 connections each. But they
were not actually carrying much useful traffic.
I tried reducing MaxMemInQueues to 2 GB and 1 GB, and they started using
3-7 GB RAM. This is on 0.3.0 with the destroy cell fix. (But on my slower
Guards and my Exit, MaxMemInQueues worked really well, reducing the RAM
usage to 0.5 - 1.5 GB, without reducing the consensus weight.)
I tried reducing the number of file descriptors, which reduced the CPU to
around 110%, because the new connections were closed earlier. It pushed a
lot of the sockets into the kernel TIME_WAIT state, about 10,000 on top of
the regular 10,000. (Maybe these new Tor clients didn't do exponential
backoff?)
I tried DisableOOSCheck 0, and it didn't seem to make much difference to
RAM or CPU, but it made a small difference to sockets (and it makes sure
that I don't lose important sockets, like new control port sockets, so I
left it on).
I already set RelayBandwidthRate, but now I also set
MaxAdvertisedBandwidth to about half the RelayBandwidthRate. Hopefully
this will make the clients go elsewhere. But this isn't really a solution
for the network.
So I'm out of options to try and regulate traffic on these guards. And I
need to have them working in about a week or so, because I need to run
safe stats collections on them.
I think my only remaining option is to drop connections when the number of
connections per IP goes above some limit. From the tor-relays posts, it
seems like up to 10 connections per IP is normal, but these clients will
make hundreds of connections at once. I think I should DROP rather than
RST, because that forces the client to time out, rather than immediately
making another connection.
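
A minimal sketch of the per-IP decision described above, as a model of the
logic only. In practice this would live in the packet filter rather than in
Python. The class and method names are hypothetical, and the default limit of
10 is only the "normal" figure mentioned in the tor-relays posts, not a tested
threshold.

```python
from collections import Counter


class PerIPConnectionLimiter:
    """Track open connections per source IP and drop those over a limit."""

    def __init__(self, max_per_ip: int = 10):
        self.max_per_ip = max_per_ip      # assumed threshold, see lead-in
        self.open_connections = Counter()

    def on_connect(self, ip: str) -> str:
        # Silently DROP (no RST) once the IP is at the limit, so the client
        # has to wait for its own timeout before retrying.
        if self.open_connections[ip] >= self.max_per_ip:
            return "DROP"
        self.open_connections[ip] += 1
        return "ACCEPT"

    def on_close(self, ip: str) -> None:
        # Release one slot; forget the IP entirely when it has no connections.
        if self.open_connections[ip] > 0:
            self.open_connections[ip] -= 1
            if self.open_connections[ip] == 0:
                del self.open_connections[ip]


limiter = PerIPConnectionLimiter(max_per_ip=10)
decisions = [limiter.on_connect("192.0.2.1") for _ in range(12)]
print(decisions.count("ACCEPT"), decisions.count("DROP"))  # 10 2
```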
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24782#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online