[tor-bugs] #30291 [Core Tor/Tor]: Optimize our path selection code

Thu Apr 25 18:45:16 UTC 2019

#30291: Optimize our path selection code
-------------------------+-------------------------------------------------
     Reporter:  dgoulet  |      Owner:  (none)
         Type:           |     Status:  new
  enhancement            |
     Priority:  Medium   |  Milestone:  Tor: unspecified
    Component:  Core     |    Version:
  Tor/Tor                |   Keywords:  tor-performance tor-hs path-
     Severity:  Normal   |  selection refactoring tor-dos
Actual Points:           |  Parent ID:  #30221
       Points:  10       |   Reviewer:
      Sponsor:           |
  Sponsor27-can          |
-------------------------+-------------------------------------------------
 (This is a more complete ticket than #13739 and thus supersedes it.)

 An onion service can be ask to open many rendezvous circuit by a client.
 With a vanilla Tor, the resource consumption there is roughly 1:2
 (service:client) because the client needs to open two circuits, one to the
 RP and one to the IP and then sends one cell on that circuit to introduce.
 The service will do a bit less than that because its introduction circuit
 is already established so it simply needs to open a rendezvous circuit.

 However, an attacker (DoS) can make a client skip the RP circuit creation,
 pin relays in a path to the introduction point and simply open an
 introduction circuit each time through that path (which in theory should
 be made very fast and also controlled by the attacker).

 This means that the work ratio goes from 1:2 to 1:1 because the client
 only opens one circuit, send a single cell, close it and repeat.

 But, in reality, it gets worst for the service because of the performance
 of our path selection code. It is heavy. For starter, we need to randomly
 select 3 nodes from the consensus for each rendezvous circuit. But to
 select those nodes, we go over all of them for each node to exclude some,
 include others, go through a series of tests such has `tor_addr_family()`
 and `node_has_preferred_descriptor()` that are quite heavy.

 The CPU profile of an overloaded service shows that the majority of CPU is
 spent in selecting these nodes. Here is a perf report output for the top
 5:

 {{{
 +   11.10%  tor      tor                    [.] smartlist_remove
 +   10.96%  tor      tor                    [.] fmonty
 +    8.77%  tor      tor                    [.] tor_memeq
 +    6.56%  tor      tor                    [.] tor_addr_family
 +    5.40%  tor      tor                    [.] tor_addr_is_null
 }}}

 The top `smartlist_remove()` comes from:

 {{{
    - smartlist_remove
       - 11.10% smartlist_subtract
            router_choose_random_node
            choose_good_middle_server
            onion_extend_cpath
            onion_populate_cpath
            circuit_establish_circuit
          - circuit_launch_by_extend_info
             - 10.93% launch_rendezvous_point_circuit
 }}}

 Amazingly enough, we spend more time selecting path than doing crypto on a
 service that gets DDoS by introduction requests (#29607).

 There aren't many straight forward approach here. Optimizing the code path
 bits by bits could be an approach. Most likely, our first step is to have
 a benchmark for our current path selection code and try to improve on it
 incrementally.

 There are also more hackish approach like pre-building a list of RP nodes
 at each consensus (or when torrc options are changed like `ExcludeNodes`)
 and then randomly picking nodes from there would go faster because they
 wouldn't need to go through long iterations of tests, it would be already
 done.

 One example is `router_choose_random_node()` that goes over the entire
 nodes list just to pick nodes to exclude, then again to test each nodes
 `router_add_running_nodes_to_smartlist()` and then potentially does a
 series of `smartlist_subtract()` that iterate over the entire list again.
 This is in theory `O(n) + O(n) + O(n)` actually theoretically being `O(n)`
 but reality is far from the theory here in CPU consumption ;).

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30291>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online