[metrics-bugs] #34257 [Metrics/Onionperf]: Analyze unusual distribution of time to extend to first hop in circuit
Tor Bug Tracker & Wiki
blackhole at torproject.org
Sun May 31 02:35:21 UTC 2020
#34257: Analyze unusual distribution of time to extend to first hop in circuit
-------------------------------+--------------------------------
Reporter: karsten | Owner: metrics-team
Type: defect | Status: new
Priority: Medium | Milestone:
Component: Metrics/Onionperf | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor: Sponsor59-must
-------------------------------+--------------------------------
Comment (by arma):
Neat stuff. I haven't looked into all the data, but here are some first-
thought theories, to make sure you've considered them:
(1) When you're establishing the first hop of the circuit, you might need
to establish a TCP connection and ORConn with the first relay. That
process is a lot of extra round-trips -- first to do TCP, then to do TLS,
then to do Tor's v3 handshake with netinfo cells and so on. In fact, you
might even see the bimodal distributions that Dennis points out, where a
few circuits build quickly because there's already an underlying ORConn in
place, and most of them build slowly because you need to set up the ORConn
too. 100ms latency to the guard, amplified by n round-trips, could make a
real difference. That's part of why we build preemptive circuits, because
it means building preemptive ORConns too.
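
To put rough numbers on (1), here is a minimal back-of-the-envelope sketch in
Python. The round-trip counts are illustrative assumptions, not measurements
(the TLS count in particular depends on the TLS version and on session
resumption), and the function name is made up for this example. It ignores
relay processing time and only multiplies the guard RTT by the number of round
trips.

```python
# Rough model of time-to-first-hop as (guard RTT) x (number of round trips).
# All round-trip counts below are assumptions chosen for illustration.
ROUND_TRIPS = {
    "tcp_handshake": 1,       # SYN / SYN-ACK
    "tls_handshake": 2,       # assumed; varies by TLS version and resumption
    "tor_link_handshake": 1,  # assumed; versions/netinfo exchange
    "create_cell": 1,         # CREATE sent, CREATED received
}


def first_hop_time_ms(rtt_ms, orconn_exists):
    """Estimate first-hop extend time in milliseconds.

    If an ORConn to the guard already exists, only the CREATE round trip
    is paid; otherwise every step in ROUND_TRIPS is paid.
    """
    if orconn_exists:
        trips = ROUND_TRIPS["create_cell"]
    else:
        trips = sum(ROUND_TRIPS.values())
    return rtt_ms * trips


if __name__ == "__main__":
    for exists in (True, False):
        print("ORConn exists:", exists, "->", first_hop_time_ms(100, exists), "ms")
```

Under these assumed counts, a 100ms RTT gives one fast mode and one mode
several times slower, which is the kind of bimodal shape described above.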
(2) Guards might be overloaded with CREATE cells, causing them to respond
more slowly to circuit create attempts than other relays do. This overload
might happen because they have a higher consensus weight than other relays
(which is what helped get them the Guard flag), and thus attract more
users.
(2a) Alternatively, another difference for guards is that they have to
handle many more TLS handshakes because they get direct connections from
clients. We've somewhat-recently changed the parameters for when we expire
these TLS conns (#17592, 0.3.1.1-alpha, prop#251). Maybe these more
frequent TLS conns are using enough of their CPU that they are worse at
handling CREATE cells. In theory the CREATE cells are handled in worker
threads though, so there shouldn't be direct competition. But maybe the
worker threads aren't tuned well, so they don't keep up when the main
thread is distracted.
I don't think (2) or (2a) look like good explanations though, because they
should be happening to the Dutch onionperf too. But (1) looks plausible.
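
One way to check (1) against the data would be to split the first-hop extend
times by whether the circuit reused an existing ORConn, and compare the two
groups. A minimal sketch follows, assuming that flag can be recovered per
circuit; the record layout and the function name are hypothetical and do not
reflect Onionperf's actual output format.

```python
from statistics import median


def median_by_orconn(records):
    """Return the median first-hop extend time per group.

    records: iterable of (extend_ms, reused_orconn) tuples (hypothetical
    layout). The result maps True/False (ORConn reused or not) to the
    median extend time, or None if that group is empty.
    """
    groups = {True: [], False: []}
    for extend_ms, reused in records:
        groups[bool(reused)].append(extend_ms)
    return {k: (median(v) if v else None) for k, v in groups.items()}


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [(100, True), (120, True), (480, False), (520, False), (500, False)]
    print(median_by_orconn(sample))
```

If the two medians differ by roughly the extra round trips, that would support
(1); if they look the same, the ORConn setup is probably not the cause.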
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/34257#comment:14>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online