[tor-bugs] #21394 [Core Tor/Tor]: connection timeouts are affecting Tor Browser usability
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Oct 26 13:27:54 UTC 2017
#21394: connection timeouts are affecting Tor Browser usability
-------------------------------------------------+-------------------------
 Reporter:  arthuredelstein                      |          Owner:  (none)
     Type:  defect                               |         Status:  new
 Priority:  Very High                            |      Milestone:  Tor: 0.3.3.x-final
Component:  Core Tor/Tor                         |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  tbb-performance, tbb-usability,      |  Actual Points:
            performance, tbb-needs               |
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by teor):
* milestone: Tor: 0.3.2.x-final => Tor: 0.3.3.x-final
Comment:
Replying to [comment:23 arthuredelstein]:
> Replying to [comment:22 teor]:
> > Replying to [comment:20 arthuredelstein]:
> > > I did some more experiments:
> > >
> > > ...
> > > Indeed I got 9/50 timeouts for the domain with http or https, but no
> > > timeouts for IPv4 and only a single timeout for IPv6.
> > >
> > > Does this ring any bells for Tor core experts? What might be
> > > happening with DNS here?
> >
> > Some exits may be overloading their resolvers. Or our code may be
> > buggy. It would be helpful to identify the particular exits that are
> > experiencing these timeouts, and work out if they are in the same AS or
> > using the same resolvers.
>
> Makes sense. If the DNS resolve fails at an exit, does the exit send an
> error message back to the client? Or does it silently fail, meaning the
> client has to wait for the full 10-second timeout?
It depends on how it fails.
If the resolve times out at the exit, it also times out at the client.
If the resolve fails fast, an error cell is sent to the client.
I don't think we can make this faster.
> > I also wonder if we should ask bandwidth authorities to use DNS
> > whenever possible, so they see DNS timeouts, and downgrade exits that
> > have them. See #24010.
>
> Nice idea. Would it also be feasible to have exits periodically run
> diagnostics to see if their DNS resolution is working properly
Yes. Exits already check DNS at startup, and turn off exit traffic if it
fails. I opened #24014 in 0.3.3 to make them check periodically.
> and if not, report the problem to bandwidth authorities
There's no way for relays to report anything directly to the bandwidth
authorities.
Instead, relays modify their descriptors in response to self-checks.
In this case, the relay would disable exit traffic until a DNS check
succeeds, and clients would find out about it when they next download its
(micro)descriptor after the next consensus.
> and notify their relay operator?
Yes, this would be part of #24014: we will log a warning when we disable
exit traffic.
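
As a rough illustration of the self-check behaviour described above (this
is not tor's actual C code, and not the design in #24014), here is a
minimal Python sketch. The class name, the `resolve_ok` probe, and the
`descriptor_dirty` flag are all hypothetical. They only model "disable exit
traffic until a DNS check succeeds, publish a changed descriptor, and log a
warning".

```python
import logging
from typing import Callable

log = logging.getLogger("exit_dns_check")


class ExitDnsSelfCheck:
    """Toy model of a relay that gates exit traffic on a DNS self-check."""

    def __init__(self, resolve_ok: Callable[[], bool]) -> None:
        # resolve_ok is a hypothetical probe: it returns True when a test
        # lookup succeeded. A real relay would query its configured resolver.
        self._resolve_ok = resolve_ok
        self.exit_enabled = True
        # Set to True whenever the exit state changes, meaning a new
        # descriptor would need to be published.
        self.descriptor_dirty = False

    def run_check(self) -> bool:
        """Run one check; return whether exit traffic is enabled afterwards."""
        ok = self._resolve_ok()
        if ok != self.exit_enabled:
            self.exit_enabled = ok
            self.descriptor_dirty = True
            if ok:
                log.info("DNS check succeeded; re-enabling exit traffic")
            else:
                log.warning("DNS check failed; disabling exit traffic")
        return self.exit_enabled
```

Calling `run_check()` on a timer would give the periodic behaviour. Clients
only see the change once they fetch the updated (micro)descriptor, so the
reaction time is bounded by the consensus schedule rather than by the check
interval.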
> > The only node in a tor path that uses DNS is an exit, so if DNS
> > breaks, it causes issues at the exit.
>
> That seems sensible. I'm only a little puzzled that it seems more common
> than I would expect that I saw not a single timeout, but a double, triple
> or quadruple timeout (see instances of 2,3,4 in my raw data). Presumably
> it's switching to a new exit node after each individual timeout, so why do
> I frequently see multiple timeouts for a single connection? Maybe it's
> just bad luck, but it made me wonder if I'm seeing something that goes
> wrong for the whole connection attempt and not just individual circuits.
You could also have a slow guard, or a site that has slow DNS.
But the most likely explanation is that some exits are massively
overloaded, and DNS bears the brunt of that overloading.
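
For a rough "is it just bad luck?" baseline, here is a small Python sketch.
It assumes (and these are only assumptions) that the 9/50 figure quoted
earlier is the per-attempt timeout rate, and that each retry on a new exit
times out independently of the previous one. The function name is
illustrative.

```python
def prob_consecutive_timeouts(p: float, k: int) -> float:
    """Probability that k independent attempts in a row all time out."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if k < 0:
        raise ValueError("k must be non-negative")
    return p ** k


# Assumed per-attempt timeout rate, taken from the 9/50 figure in the thread.
p = 9 / 50

for k in (1, 2, 3, 4):
    print(f"{k} timeout(s) in a row: {prob_consecutive_timeouts(p, k):.4f}")
```

Under those assumptions a double timeout would happen in roughly 3% of
connection attempts, and triples or quadruples well under 1%. If the raw
data shows them much more often than that, the retries are probably not
independent, which would fit a shared cause such as a slow guard, slow DNS
for the site, or a group of overloaded exits.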
We could encourage relay operators to use a local DNS cache, but threads
on this come up every month or two on tor-relays, so I'm not sure starting
another would be useful.
Another task that's in progress is to shift exit bandwidth away from the
US east coast and Western Europe, because there's an over-allocation in
that area at the moment. (It is where most bandwidth authorities have
their HTTPS servers.)
I would suggest that we find a way of monitoring this, so we can check if
our fixes make a difference.
This might be a task for metrics. I'll leave it to you to open a ticket,
because you know what needs to be done to test for timeouts.
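
As a starting point for that kind of monitoring, here is a minimal Python
sketch that turns a log of connection attempts into a per-exit timeout
rate. The record format (an exit identifier plus a timed-out flag) and the
exit names are made up for illustration; real measurements would need to
record which exit each attempt actually used.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def timeout_rate_by_exit(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the fraction of attempts that timed out for each exit."""
    attempts: Dict[str, int] = defaultdict(int)
    timeouts: Dict[str, int] = defaultdict(int)
    for exit_id, timed_out in records:
        attempts[exit_id] += 1
        if timed_out:
            timeouts[exit_id] += 1
    return {exit_id: timeouts[exit_id] / attempts[exit_id] for exit_id in attempts}


# Hypothetical sample data: (exit identifier, did the attempt time out?)
sample = [
    ("exitA", False),
    ("exitA", True),
    ("exitB", False),
    ("exitB", False),
    ("exitA", True),
    ("exitA", False),
]

rates = timeout_rate_by_exit(sample)
for exit_id in sorted(rates):
    print(exit_id, rates[exit_id])
```

Running the same aggregation before and after a change would show whether
the timeout rate moved. Grouping by AS or resolver instead of by exit only
needs a different key in the records.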
There's nothing in this ticket that core tor can bugfix in 0.3.2, so I'm
moving it to 0.3.3.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21394#comment:24>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online