[tor-bugs] #24767 [Core Tor/Tor]: All relays are constantly connecting to down relays and failing over and over

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Feb 5 21:20:04 UTC 2018


#24767: All relays are constantly connecting to down relays and failing over and
over
-------------------------------------------------+-------------------------
 Reporter:  arma                                 |          Owner:  dgoulet
     Type:  enhancement                          |         Status:
                                                 |  accepted
 Priority:  Very High                            |      Milestone:  Tor:
                                                 |  0.3.3.x-final
Component:  Core Tor/Tor                         |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:  must-fix-before-033-stable,          |  Actual Points:
  tor-relay, tor-dos, performance                |
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by dgoulet):

 * keywords:  must-fix-before-033-stable => must-fix-before-033-stable,
     tor-relay, tor-dos, performance
 * owner:  (none) => dgoulet
 * status:  new => accepted
 * priority:  Medium => Very High


Comment:

 I can take this one, but I want to get a few things clear first. Here is a
 summary of what we have and what we need to decide:

 * Fortunately, we already track OR connection failures in `rephist` (see
 `rep_hist_note_connect_failed()`), so we could use the `or_history_t` to
 decide whether we should try to connect.

 * We'll need to decide on a "how long before I retry connecting" timing.
 We could do a simple "backoff algorithm" such as 1 sec, 5 sec, 10 sec, and
 then give up.

 * Once I've identified that tor can't reach a specific relay, how long
 should we consider it "unreachable"? Or should we repeat the above if no
 retries are pending?

 * Should we hold on to the EXTEND cells for a reasonable amount of time
 while waiting to reconnect? That is, if a connection attempt fails, should
 we just DESTROY the circuit, or retry the connection and send the cell(s)
 later?

 * Should we also record disconnects and "I waited 60 sec before I got
 anything back from you" cases as potential failures?
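
 To make the backoff and "unreachable" questions above concrete, here is a
 rough sketch, not a proposed patch. Everything in it is hypothetical:
 `conn_history_t` is a stand-in and not tor's real `or_history_t`, the
 helper names are invented for illustration, the 1/5/10 sec schedule is
 just the example from above, and the 60 sec "unreachable" period is an
 arbitrary placeholder for the value we still need to decide on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical retry schedule: the "1 sec, 5 sec, 10 sec" example. */
static const time_t RETRY_DELAYS[] = { 1, 5, 10 };
#define N_RETRY_DELAYS (sizeof(RETRY_DELAYS) / sizeof(RETRY_DELAYS[0]))

/* Hypothetical placeholder: how long a relay stays "unreachable" once we
 * have given up on it. The real value is an open question. */
#define UNREACHABLE_PERIOD 60

/* Hypothetical per-relay record; NOT tor's real or_history_t. */
typedef struct conn_history_t {
  unsigned n_failures;  /* consecutive failed connection attempts */
  time_t last_failure;  /* time of the most recent failure */
} conn_history_t;

/* Record a failed connection attempt at time <b>now</b>. */
static void
conn_history_note_failure(conn_history_t *h, time_t now)
{
  assert(h);
  h->n_failures++;
  h->last_failure = now;
}

/* Record a successful connection: forget all previous failures. */
static void
conn_history_note_success(conn_history_t *h)
{
  assert(h);
  h->n_failures = 0;
  h->last_failure = 0;
}

/* Return true if we may try to connect to this relay at time <b>now</b>. */
static bool
conn_history_may_connect(conn_history_t *h, time_t now)
{
  assert(h);
  if (h->n_failures == 0)
    return true;  /* no known failure: connect right away */
  if (h->n_failures <= N_RETRY_DELAYS) {
    /* Still backing off: wait the delay matching the failure count. */
    return now - h->last_failure >= RETRY_DELAYS[h->n_failures - 1];
  }
  /* All retries failed: the relay is "unreachable" until the period
   * elapses, after which the whole schedule starts over. */
  if (now - h->last_failure >= UNREACHABLE_PERIOD) {
    h->n_failures = 0;
    return true;
  }
  return false;
}
```

 With this shape, the code handling an EXTEND cell would ask
 `conn_history_may_connect()` before opening a connection. Whether a
 "false" answer means DESTROY right away or holding the cell for a while
 is still the open question above.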

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24767#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

