[tor-bugs] #24737 [Core Tor/Tor]: oft given MaxMemInQueues advice is wrong
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sun Dec 31 17:10:08 UTC 2017
#24737: oft given MaxMemInQueues advice is wrong
----------------------------+----------------------------------
Reporter: starlight | Owner: (none)
Type: defect | Status: new
Priority: Medium | Milestone: Tor: unspecified
Component: Core Tor/Tor | Version:
Severity: Normal | Resolution:
Keywords: doc, tor-relay | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
----------------------------+----------------------------------
Comment (by starlight):
Replying to [comment:1 teor]:
> I'm not sure what you want us to do in response to this ticket.
> If you can write up a short wiki page with some advice, we could point
> to it rather than trying to guess the right setting.
I suggest adding some text to the Tor manual, since that is where most
people look first when adjusting MaxMemInQueues.
>
> I don't think percentages are helpful - I think creating a table with
> free RAM to MaxMemInQueues values would be more helpful. (See below.)
> . . .
> To be more precise: MaxMemInQueues doesn't track destroy queues, nor
> does it track various other Tor data structures,
> So you have to set it at a level that allows space for a few hundred
> megabytes of Tor data, and then some destroy queues.
>
> At 1024 MB per instance, this means 512 MB or less.
>
> But with 10 GB per instance, it really is ok to allow 5-7 GB in queues.
> (I have a relay that allows the default 8 GB in queues, and it's fine.)
>
My observation is that by the time MaxMemInQueues triggers a circuit
kill, the daemon has consumed roughly twice the configured value in
physical memory. Of course YMMV on the precise amount, but this
observational rule of thumb is a long way from the suggestion that only
120-130% of MaxMemInQueues will be used.
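
As a rough illustration of that rule of thumb, the sketch below derives a
MaxMemInQueues value from the RAM available to one instance, assuming
resident memory peaks at about twice the setting. The function name, the
default factor of 2.0 and the reserve_mb parameter are illustrative
assumptions of this sketch; none of them come from Tor itself.

```python
def suggest_max_mem_in_queues_mb(ram_per_instance_mb, observed_factor=2.0,
                                 reserve_mb=0):
    """Return a candidate MaxMemInQueues value in MB (hypothetical helper).

    ram_per_instance_mb: physical RAM one tor instance may use.
    observed_factor: assumed ratio of peak resident memory to the
        MaxMemInQueues setting (about 2 per the observation above).
    reserve_mb: RAM held back for everything else on the host.
    """
    if ram_per_instance_mb <= 0:
        raise ValueError("ram_per_instance_mb must be positive")
    if observed_factor < 1.0:
        raise ValueError("observed_factor must be at least 1.0")
    usable_mb = max(ram_per_instance_mb - reserve_mb, 0)
    # Round down so the suggestion never exceeds the budget.
    return int(usable_mb / observed_factor)


if __name__ == "__main__":
    for ram_mb in (1024, 4096, 10240):
        value = suggest_max_mem_in_queues_mb(ram_mb)
        print(f"{ram_mb} MB RAM -> MaxMemInQueues {value} MB")
```

With the assumed factor of 2, 1024 MB per instance yields 512 MB, which
matches the "512 MB or less" figure quoted above.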
> > the aforementioned incorrect advice was followed in #22255 and the
> > operator continues to experience OOM failures
>
> Are you the operator?
> Have they tried 0.3.2.8-rc and reopened another ticket?
I am not the operator on that ticket. It came up in a search, and it
seems to me that the operator's MaxMemInQueues is too high relative to
RAM.
> The tor daemon will assert and exit if malloc returns NULL.
Ah, well then vm.overcommit_memory=2 will cause the daemon to die sooner
rather than later, instead of responding more gracefully by, say, killing
one circuit. That is still better than letting the Linux OOM killer choose
a victim.
Alternately, my advice for hardy souls willing to expend such effort:
1) leave the default vm.overcommit_memory=0 in effect
2) write a script to set /proc/<pid>/task/<tid>/oom_adj to -17 for every
process in the system
3) have a script set oom_adj=0 for a process you would rather have die
than the tor daemon
3b) if one sets -17 for every process, then Linux will suspend the memory
requester until some becomes available; this could result in a hung
system, a crashed system, or it could result in a semi-graceful recovery
in the case where socket buffer memory is freed as queues drain
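
The steps above could be scripted roughly as follows. This is only a
sketch under stated assumptions: it writes the per-process file
/proc/<pid>/oom_adj rather than the per-thread path mentioned above, the
function names are hypothetical, and newer kernels prefer oom_score_adj
(range -1000 to 1000) over the legacy oom_adj file. Planning and writing
are kept separate so the plan can be inspected before anything is changed.

```python
import os

# Legacy oom_adj value that exempts a process from the OOM killer.
OOM_DISABLE = -17


def plan_oom_adj(proc_root, sacrificial_pids):
    """Map each process's oom_adj path to the value it should receive.

    Processes listed in sacrificial_pids get 0 (eligible to be killed);
    every other numeric entry under proc_root gets OOM_DISABLE.
    """
    plan = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        path = os.path.join(proc_root, entry, "oom_adj")
        plan[path] = 0 if int(entry) in sacrificial_pids else OOM_DISABLE
    return plan


def apply_plan(plan):
    """Write the planned values, ignoring processes that cannot be updated."""
    for path, value in sorted(plan.items()):
        try:
            with open(path, "w") as handle:
                handle.write(str(value))
        except OSError:
            # The process exited, or we lack permission (root is required).
            pass
```

A run against the live system would look like
apply_plan(plan_oom_adj("/proc", {1234})), where 1234 stands in for the
PID of the process you would rather lose than the tor daemon.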
Additionally, one should set vm.min_free_kbytes=131072 or even 262144. By
default Linux sets this value so low that a sudden surge in arriving
network traffic can use up all free memory so quickly that the OOM killer
and dirty-cache writes can't keep pace, and the system will oops (hard
crash).
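
To see whether a host already meets that floor, a small check along these
lines may help. It is a sketch only: the helper names are hypothetical,
and the proc_sys parameter exists so the reader can be pointed at a test
directory instead of the real /proc/sys.

```python
import os


def read_sysctl_int(name, proc_sys="/proc/sys"):
    """Read an integer sysctl such as "vm.min_free_kbytes" via procfs."""
    path = os.path.join(proc_sys, *name.split("."))
    with open(path) as handle:
        return int(handle.read().split()[0])


def min_free_kbytes_ok(current_kb, floor_kb=131072):
    """True if the reserve is at least the floor suggested above."""
    return current_kb >= floor_kb


if __name__ == "__main__":
    current = read_sysctl_int("vm.min_free_kbytes")
    verdict = "ok" if min_free_kbytes_ok(current) else "below suggested floor"
    print(f"vm.min_free_kbytes = {current} ({verdict})")
```

The same reader works for vm.overcommit_memory, which lives at
/proc/sys/vm/overcommit_memory.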
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24737#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online