[tor-bugs] #24665 [Core Tor/Tor]: sched: In KIST, the extra_space kernel value needs to be allowed to be negative
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Dec 21 03:29:06 UTC 2017
#24665: sched: In KIST, the extra_space kernel value needs to be allowed to be
negative
--------------------------+------------------------------------
Reporter: dgoulet | Owner: dgoulet
Type: defect | Status: needs_review
Priority: Very High | Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor | Version: Tor: 0.3.2.1-alpha
Severity: Normal | Resolution:
Keywords: tor-sched | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
--------------------------+------------------------------------
Comment (by yawning):
Replying to [comment:3 dgoulet]:
> Replying to [comment:2 yawning]:
> > The branch looks sensible to me.
> >
> > My inner pedant will say that "If a connection to a relay was
> > unreliable meaning tor was struggling to flush bytes towards the relay" is
> > misleading at best, since the congestion window shrinking (by quite a bit)
> > is an expected part of how TCP/IP works and not particularly indicative of
> > an overloaded condition on its own.
>
> Ah I think I failed to explain my comment correctly. The point of that
> line was a reason for KIST to actually consider the `notsent` queue size
> *because* it could be that the connection is struggling towards the relay.
>
> How would you phrase it in proper English?
"The KIST scheduler did not correctly account for data already enqueued in
each connection's send socket buffer, particularly in cases when the
TCP/IP congestion window was reduced between scheduler calls. This
situation led to excessive per-connection buffering in the kernel, and a
potential memory DoS. Fixes bug 24665; bugfix on 0.3.2.1-alpha."
Maybe not human friendly enough.
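For readers following along, here is a rough sketch of the accounting that entry describes. It is not the code from the branch: the struct, field names, and `write_limit()` are hypothetical, and the formula is only an assumption about how the limit could be derived from the congestion window, the unacked count, and the `notsent` queue size. The point it shows is that the extra space is kept signed, so a large `notsent` queue after the congestion window shrinks pulls the limit down instead of being ignored.

```c
#include <stdint.h>

/* Hypothetical per-socket snapshot; the field names are illustrative. */
typedef struct {
  uint32_t cwnd;     /* congestion window, in segments */
  uint32_t unacked;  /* segments sent but not yet acknowledged */
  uint32_t mss;      /* maximum segment size, in bytes */
  uint32_t notsent;  /* bytes queued in the socket buffer, not yet sent */
} sock_snapshot_t;

/* Return how many bytes the scheduler may still hand to the kernel. */
static uint64_t
write_limit(const sock_snapshot_t *s, double sock_buf_size_factor)
{
  /* Room left in the congestion window; zero if the window shrank below
   * the number of segments already in flight. */
  int64_t tcp_space = 0;
  if (s->cwnd >= s->unacked)
    tcp_space = (int64_t)(s->cwnd - s->unacked) * (int64_t)s->mss;

  /* Signed on purpose: this goes negative when more is already queued
   * than the buffer allowance permits. */
  int64_t extra_space =
    (int64_t)((double)s->cwnd * (double)s->mss * sock_buf_size_factor)
    - (int64_t)s->notsent;

  int64_t total = tcp_space + extra_space;
  return total > 0 ? (uint64_t)total : 0;
}
```

With an unsigned or clamped-at-zero `extra_space`, a connection with a full window of free space would always be allowed to write that full window, regardless of how much was already sitting unsent in the kernel.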
> > While you're here, assuming the scheduler is called significantly
> > faster than the RTT of most links (read that as "If 10 ms is lower than
> > the RTT of most if not all links"), you can/should reduce
> > `sock_buf_size_factor` as well, because you aren't going to get a full
> > congestion window worth of ACKs back between scheduler calls in common
> > cases.
>
> Interesting... if the channel is quite active, yes the scheduler tick
> for it should be 10ms.
>
> What is a reasonable size factor in your opinion? It seems we can get
> some RTT information with the `getsockopt()` call within `struct
> tcp_info`, maybe we could adjust a scaling factor based on those values?
> (If that is an idea, we should open a ticket to make way for this one to
> be merged)
There isn't a good "one size fits all" solution. Setting it too low will
gimp performance on fast, low-latency links; setting it too high right now
bloats the various buffers. I would personally opt more toward avoiding
the latter given all the Fun that's happening.
As you noted, `tcpi_rtt` gives the smoothed RTT estimate (and
`tcpi_rttvar` the RTT variance if you need it), which is probably
sufficient to make a more reasonable guess here. As a first pass, I
would recommend doing something based on the scheduler interval to
smoothed RTT ratio, with a hard maximum at `1.0`, but as you noted this is
probably best discussed in a separate ticket.
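To make the first-pass idea concrete, a minimal sketch of the ratio with its cap might look like the following. The function name and the fallback for a missing RTT estimate are assumptions made for illustration, not anything from the branch. The RTT input would come from `tcpi_rtt` via `getsockopt()`, which on Linux is reported in microseconds, so the scheduler interval is taken in the same unit.

```c
#include <stdint.h>

/* Hypothetical helper: scale the socket buffer factor by the ratio of the
 * scheduler interval to the smoothed RTT, capped at 1.0. Both arguments
 * are in microseconds. */
static double
scaled_sock_buf_factor(uint32_t sched_interval_usec, uint32_t srtt_usec)
{
  /* Assumption: with no RTT estimate yet, fall back to the cap. A more
   * conservative default may be preferable in practice. */
  if (srtt_usec == 0)
    return 1.0;

  double ratio = (double)sched_interval_usec / (double)srtt_usec;

  /* Hard maximum: never allow more than a full window's worth. */
  return ratio > 1.0 ? 1.0 : ratio;
}
```

For example, a 10 ms scheduler interval on a link with a 40 ms smoothed RTT would give a factor of 0.25, while any link with an RTT at or below 10 ms would stay at 1.0.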
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24665#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online