[tor-relays] preventing DDoS is more than just network filtering
Scott Bennett
bennett at sdf.org
Tue Nov 15 07:26:06 UTC 2022
Chris <tor at wcbsecurity.com> wrote:
> <html>
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
> </head>
> <body>
> <p><br>
> </p>
> <div class="moz-cite-prefix">On 11/10/2022 2:38 AM, Scott Bennett
> wrote:<br>
> </div>
> <blockquote type="cite"
> cite="mid:202211100738.2AA7cw7d026293 at sdf.org">
> <pre class="moz-quote-pre" wrap="">Toralf Förster <a class="moz-txt-link-rfc2396E" href="mailto:toralf.foerster at gmx.de"><toralf.foerster at gmx.de></a> wrote:
>
> </pre>
> <blockquote type="cite">
> <pre class="moz-quote-pre" wrap="">On 11/8/22 10:57, Chris wrote:
> </pre>
> <blockquote type="cite">
> <pre class="moz-quote-pre" wrap="">The main reason is that a simple SYN flood can quickly fill up your
> conntrack table and then legitimate packets are quietly dropped and you
> won't see any problems thinking everything is perfect with your server
> unless you dig into your system logs.
> </pre>
> </blockquote>
> <pre class="moz-quote-pre" wrap="">
> Hhm, my system log doesn't show any problems, maybe due to (or
> regardless of?):
> CONFIG_SYN_COOKIES=y
> ?
I surmise that the above is a LINUXism that is approximately equivalent to
a pf rule using synproxy.
> </pre>
> </blockquote>
> <pre class="moz-quote-pre" wrap="">
> On FreeBSD 12.3 I use pf and have gone back to using synproxy on the
> "pass in" statements for the ORPort and DirPort, but I doubt it has actually
> made any difference </pre>
I should clarify my statement above: the SYN packets still have to be received
from my ISP before the rule can be applied, so yes, a SYN flood attack can still
tie up my Internet connection, but that does not appear to be the kind of attack
that my relay was experiencing. Specifying synproxy on the "pass in" rules for tor
means that the kernel simply drops any pending connection that fails to complete
the TCP three-way handshake (SYN, SYN-ACK, ACK) within a short time instead of
passing it on to tor to deal with; IOW, no incoming connections are passed to tor
unless they complete that handshake first.
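
To illustrate the idea only, here is a toy model in Python. It is not how pf implements synproxy; the class name, method names, and timeout value are all made up for the example. It shows a table of half-open connections from which an entry is handed on only if the final ACK arrives in time.

```python
class ToySynProxy:
    """Toy model: hold half-open connections until the client's final ACK arrives."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds allowed to finish the handshake (assumed value)
        self.pending = {}        # client -> time its SYN arrived
        self.passed = []         # clients handed on to the daemon behind the proxy

    def on_syn(self, client, now):
        # The proxy would answer with SYN-ACK itself; the daemon sees nothing yet.
        self.pending[client] = now

    def on_ack(self, client, now):
        # Only a timely final ACK gets the connection passed on.
        started = self.pending.pop(client, None)
        if started is not None and now - started <= self.timeout:
            self.passed.append(client)

    def expire(self, now):
        # Drop half-open entries that never completed the handshake.
        self.pending = {c: t for c, t in self.pending.items()
                        if now - t <= self.timeout}


proxy = ToySynProxy(timeout=5.0)
proxy.on_syn("client-a", now=0.0)
proxy.on_syn("client-b", now=0.0)   # never sends its ACK
proxy.on_ack("client-a", now=1.0)
proxy.expire(now=10.0)
print(proxy.passed)    # ['client-a']
print(proxy.pending)   # {}
```

In this model "client-b" never reaches the daemon at all, which is the whole point of putting the handshake check in front of tor.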
The second reason I made that statement was that all the attacks I have seen in
recent months have tied up my inbound (and sometimes outbound) data capacity for some
time, and the next set of heartbeat messages from tor shows an increase
in the INTRODUCE2 rejections of 2,000 to 3,000 or occasionally more. I suspect the
"occasionally more" cases occur when two of the bot attacks hit my relay at the same
or overlapping times. All of the above was true before I began using synproxy again
and appears to be the case still. If you have seen SYN flood attacks, then that is
grounds enough for me to continue to leave it in the rules for tor indefinitely. The
cost to the system for using synproxy is too small to be detected, but the potential
for sparing cost to tor appears to be significant.
> </blockquote>
> <p><font size="-1"><font face="Arial">The quote about SYN Flood is
> actually from my post which went only to toralf and wasn't
> displayed on the group. My bad. To explain further, I didn't
> say the current attack includes SYN floods, what I meant was
Ah. I see.
> whenever we have some conntrack rules in our iptables, it's
> prudent to have some rate limiting rules before it, because if
> the attacker knows we rely on conntrack and intends to do some
Not being a LINUX user, I am unaware of what "conntrack" does. pf has a "keep state"
option that tells it to keep state for each connection, but many years ago pf was changed
to keep state anyway, whether one tells it to or not, so nowadays it is effectively a
comment. I don't know of any method by which one can tell pf *not* to keep state.
> damage, the attacker can easily flood our conntrack table with
> SYN flood and then we start dropping legitimate packets
> without notice. However you're correct, currently there are no
> SYN floods.</font></font><br>
Understood. Thank you for the clarification.
> </p>
> <p><br>
> </p>
> <blockquote type="cite"
> cite="mid:202211100738.2AA7cw7d026293 at sdf.org">
> <pre class="moz-quote-pre" wrap="">because the only attacks I've seen so far were coming
> via other relays and triggered tor's rejections of INTRODUCE2 cells by the
> thousands. Instead, what has been very effective has been to increase the
> NumCPUs count drastically. </pre>
> </blockquote>
> <p><font size="-1"><font face="Arial">You're correct yet again. The
> number of CPUs make a huge difference. Tor automatically
> detects up to 16 CPUs if you have them. Anything above that,
> Tor can't see. I've never tried adding it to my torrc though,
> it might see more if you tell it to look for them.</font></font></p>
It only looks for the number of CPU threads actually available if you don't
specify a value for NumCPUs. You can put any natural number there that you want,
unless there's some upper limit I don't know about, e.g., 255.
> <p><font size="-1"><font face="Arial">On my relays which are run on
> VMs, I simply added more CPUs to the VM and somewhere around
> 10 CPUs seemed to be the magic number when all the warning
> messages disappeared. They are currently happily running on
> 12.</font></font></p>
> <p><br>
> </p>
> <blockquote type="cite"
> cite="mid:202211100738.2AA7cw7d026293 at sdf.org">
> <pre class="moz-quote-pre" wrap="">On a non-hyperthreaded quad-core CPU I now have
> it set as "NumCPUs 20". </pre>
> </blockquote>
> <p><font size="-1"><font face="Arial">OK I'm confused now, Are you
> saying that it's possible to tell Tor to use non existent CPUs
> and it actually works? That would be really cool. Is it
> because Tor assigns multiple worker threads to the same CPU?<br>
Of course, it's possible. NumCPUs only tells tor how many worker threads to
start. tor does not assign any CPU affinity, so everything gets handled by the OS's
scheduler. When the main thread encounters an onionskin that must be decrypted, it
places that onionskin onto a queue for some worker thread to pick up as soon as a
worker becomes available. Apparently how fast that occurs determines whether tor
begins dropping connections and issuing warning/error messages, so having a lot of
workers means that one is usually available or becomes available very soon, so the
timeout for decryption of that onionskin to begin doesn't happen. IOW, the timeout
seems to depend upon how long the queued onionskin waits for decryption to *begin*,
not to *complete*.
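
A small Python sketch of that "wait to begin" measure follows. It is a toy FIFO model, not tor's code: the function name, burst size, and service time are hypothetical, and it treats every worker as an independent server, so it ignores the CPU contention that would make each decryption finish more slowly when workers outnumber cores.

```python
import heapq


def waits_before_start(arrivals, service_time, workers):
    """Return how long each queued job waits before some worker picks it up."""
    # free_at holds, for each worker, the time at which it next becomes idle.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    waits = []
    for arrived in sorted(arrivals):
        earliest_idle = heapq.heappop(free_at)
        start = max(arrived, earliest_idle)   # job begins when a worker is free
        waits.append(start - arrived)
        heapq.heappush(free_at, start + service_time)
    return waits


burst = [0.0] * 8   # eight onionskins queued at the same instant (made-up numbers)
print(max(waits_before_start(burst, 1.0, workers=4)))   # 1.0
print(max(waits_before_start(burst, 1.0, workers=8)))   # 0.0
```

With enough workers for the burst, nothing waits in the queue before work on it begins, even though the total amount of decryption work is unchanged.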
Anytime I've seen lots of workers active in top(1), they've been showing less
than 1% CPU usage apiece, so they usually have a higher priority than the main thread
unless, of course, the main thread is waiting for a select(2) or some other I/O
operation to be posted complete, in which case the main thread will have a priority
in the single digits anyway, but isn't actually doing anything at the time. Given
that they use less than 1% CPU, it is frankly rather difficult to find one actually
running at any given instant with top(1). Instead they are usually in "kqueue" state
or some similarly waiting state. When tor is being assaulted with an INTRODUCE2
attack, the main thread is usually running at 8% to 15% CPU usage. (These are
attacks coming via other relays, so naturally the synproxy condition is satisfied and
has no effect.)
All that having been written, I would like to point out that greatly increasing
NumCPUs does not *solve* the problem of the INTRODUCE2 attacks, nor do I have any
suggestions for how this type of attack can be prevented/stopped. It is just a
workaround that provides a way for a relay to survive them and keep running, though
at the cost of many thousands of unnecessary, undesirable onionskin decryptions.
On that scale, onionskin decryptions do become significantly expensive, and the more so
the larger the capacity of the relay's Internet connection(s).
> </font></font></p>
> <p><br>
> </p>
> <blockquote type="cite"
> cite="mid:202211100738.2AA7cw7d026293 at sdf.org">
> <pre class="moz-quote-pre" wrap="">Each worker thread uses almost no CPU time, but
> having enough of them waiting to grab an onionskin off the queue instantly
> seems to stop all messages about cells, onionskins, or connections being
> dropped.
> During an attack I often saw all workers in top(1) screen updates with
> "NumCPUs 16", so I increased to 20 for the next restart, but I hadn't gotten
> any of the aforementioned error/warn messages at 16. Unfortunately, I have
> yet to see what happens at 20 because before the next restart Comcast made
> a change that blocks me from running a relay. :-( I intend to find out very
> soon whether I can afford to switch to their business network right away, so
> that I might resume running my relay or will have to wait until things happen
> next summer that should free up some of my limited income first.
>
BTW, it is generally poor practice to post HTML to mailing lists. I usually
skip and delete HTML messages, but my eyeballs and brain are feeling fresher than
usual this evening, and your Subject: line was one I had responded to previously,
so I decided to wade through your message after all.
Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at sdf.org *xor* bennett at freeshell.org *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good *
* objection to the introduction of that bane of all free governments *
* -- a standing army." *
* -- Gov. John Hancock, New York Journal, 28 January 1790 *
**********************************************************************