[tor-dev] Journey to the core of Tor: Why does Roger have so many guards?
George Kadianakis
desnacked at riseup.net
Mon Jun 23 23:51:48 UTC 2014
During our meeting in Iceland, we talked a lot about guard nodes. Some
of that discussion eventually turned into proposal 236 [0].
During our discussions, we looked into the state file of Roger, and we
noticed that there are 50 or so guard nodes in there. And that made us
wonder: "Why does Roger have so many guards?".
Roger is not the problem in this case; my state file also has many
guards. Most people who don't use bridges or hardcoded EntryNodes have
shitloads of guards. This post tries to explain why.
So, Tor, in its memory, has an ordered list of entry guards (the
global `entry_guards` smartlist in `src/or/entrynodes.c`). This list
can be lengthy: it usually contains more than $NumEntryGuards entry
guards. You can see this beautiful list just on your right below that
beautiful stalagmite:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l64
This happens because on its first startup, Tor adds $NumEntryGuards
nodes to that list. However, if one of them is not Stable and Tor needs
to build a Stable circuit, Tor will need to append a Stable guard to
the list. Similarly, if one of the guards is down, Tor will need to
compensate for that and append [1] one more guard to the list. The
same happens if Tor needs to fetch directory documents, but its guards
are not directory mirrors.
So, if Tor walks to the end of the guard node list and it still hasn't
found enough guard nodes with the needed property to make a pick, it
picks a random entry guard from the consensus and adds it to the
list. It's amazing and yet real, look straight ahead (and don't look
directly into the light):
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l1092
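The walk described above can be sketched as a toy model. This is
Python, not Tor's C, and it is a simplification under assumptions: the
names `Guard`, `usable` and `pick_guard`, the boolean flags, and the
default of 3 for `num_entry_guards` are all made up for illustration.
The real code in `entrynodes.c` handles many more cases.

```python
import random
from dataclasses import dataclass


@dataclass
class Guard:
    name: str
    reachable: bool = True  # did our last connection attempt work?
    stable: bool = True     # has the Stable flag
    is_dir: bool = True     # is a directory mirror


def usable(guard, need_stable=False, need_dir=False):
    """True if the guard is up and has every property the circuit needs."""
    return (guard.reachable
            and (guard.stable or not need_stable)
            and (guard.is_dir or not need_dir))


def pick_guard(entry_guards, consensus, num_entry_guards=3,
               need_stable=False, need_dir=False, rng=random):
    """Walk the list from the top; append guards until enough qualify."""
    while True:
        # Earlier entries win: only the first num_entry_guards usable ones count.
        live = [g for g in entry_guards if usable(g, need_stable, need_dir)]
        live = live[:num_entry_guards]
        if len(live) >= num_entry_guards:
            return rng.choice(live)
        # Reached the end of the list without enough candidates:
        # append (never prepend) a random suitable relay from the consensus.
        known = {g.name for g in entry_guards}
        candidates = [g for g in consensus
                      if g.name not in known
                      and usable(g, need_stable, need_dir)]
        if not candidates:
            return rng.choice(live) if live else None
        entry_guards.append(rng.choice(candidates))


consensus = [Guard(f"relay{i}") for i in range(10)]
guards = [Guard("relay0"), Guard("relay1", stable=False), Guard("relay2")]
pick_guard(guards, consensus, need_stable=True)
print(len(guards))  # 4: one Stable guard was appended at the end
```

Note that the non-Stable guard keeps its place in the list; the extra
guard only joins at the bottom.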
But this still does not explain why Roger has so many guards. Usually
a list of 5 or 6 nice guards is sufficient to satisfy the needs of any
circuit (alive, stable, fast, directory mirror).
The reason for Roger's surplus of guards is the following very
interesting functionality of Tor. Consider this scenario: you start
Tor while your network is down. Tor starts picking nodes from your
list and attempts to connect to them. All connections fail, since
your network is down. So now, Tor needs to add a new guard node to the
list. There are two cases now:
If Tor fails to connect to this new guard node (your network is still
down), Tor removes the new guard node from the entry guard list
(that's good; otherwise the list would be full of nodes added while
the network is down). Look on your left, you can see this beautiful
phenomenon happening here:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l741
However, let's say that your network is back up, and Tor manages to
connect to this new guard node! That's great! But should Tor keep the
connection to this guard? The answer is probably that it shouldn't:
Tor should recognize this problem and attempt to reconnect to the
primary guards on the top of the list.
And that's exactly what Tor does. Nature is truly amazing! Just relax
and witness this behavior happening right in front of your eyes:
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776
So, when Tor manages to connect to this newly added entry guard, it
assumes that the network is back, and walks through the list of entry
guards and marks them all as "needs to be retried". It also marks the
connection to the new entry guard as rotten and kills it. This to me
is very interesting, because it ensures that the primary guards (the
ones at the top of the list) are going to be tried again after the
network is back up; otherwise we would leak connections to new guards
all the time!
And all that fluff is related to this post, because this new guard
(that made us realise that the network is back up) actually stays in
our guard list. So, basically every time the network goes down and Tor
does this little dance, a new entry guard is appended to our list and
our statefile. And that's why Roger has so many guards! Or at least,
that's why *I* have so many guards [2].
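To make the dance concrete, here is a toy simulation of it. It is a
sketch in Python under assumptions, not Tor's implementation: the
`Guard` class, the `outage` helper and the return convention (True
means "keep the connection") are invented for illustration, and only
the behavior described in this post is modeled.

```python
from dataclasses import dataclass


@dataclass
class Guard:
    name: str
    made_contact: bool = False  # have we ever connected to it?
    unreachable: bool = False   # did the last attempt fail?
    can_retry: bool = False     # marked as "needs to be retried"


def register_connect_status(entry_guards, guard, succeeded):
    """Record the result of one connection attempt.

    Returns True if the connection should be kept, False otherwise.
    """
    if not succeeded:
        if not guard.made_contact:
            # A new guard we never reached: drop it from the list.
            entry_guards.remove(guard)
        else:
            guard.unreachable = True
            guard.can_retry = False
        return False

    first_contact = not guard.made_contact
    guard.made_contact = True
    guard.unreachable = False
    guard.can_retry = False

    if first_contact:
        # The network seems to be back: retry every earlier guard that
        # we had given up on, and refuse this connection.
        position = entry_guards.index(guard)
        stale = [g for g in entry_guards[:position]
                 if g.made_contact and g.unreachable]
        if stale:
            for g in stale:
                g.can_retry = True
            return False
    return True


def outage(entry_guards, new_name):
    """Network is down for every known guard, then returns for a new one."""
    for g in list(entry_guards):
        register_connect_status(entry_guards, g, succeeded=False)
    newcomer = Guard(new_name)
    entry_guards.append(newcomer)
    return register_connect_status(entry_guards, newcomer, succeeded=True)


guards = [Guard(n, made_contact=True) for n in ("A", "B", "C")]
for round_no in range(1, 4):
    keep = outage(guards, f"new{round_no}")
    for g in guards:  # the network is back, so retried guards connect
        if g.can_retry:
            register_connect_status(guards, g, succeeded=True)
    print(round_no, keep, len(guards))
# prints:
# 1 False 4
# 2 False 5
# 3 False 6
```

Each simulated outage refuses the connection to the newcomer, yet the
newcomer stays in the list, so the list grows by one per outage.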
Apart from this being wonderful on its own, there are two interesting
points here:
a) There is always a bug:
The more often this happens, the bigger our guard list gets and
the longer it takes to walk it.
Dig this race condition:
Tor starts up with the network being down, so the connections to
our primary guards fail, but the network comes back while we are
walking our entry guard list and trying to connect to the rest of
our guards. If we manage to connect to one of the guards in our
list (the lucky guard), the code at
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776
doesn't get triggered because `first_contact` is not true (that
node was already in the guard node list). So, we stick with that
lucky guard even though it's not our primary guard, and since the
network is back up, a connection to our primary guards would work
too.
What stinks here is that all the guards above that lucky guard are
marked as unreachable, so next time Tor starts up, it will ignore
them and jump directly to the lucky guard.
This probably needs to be fixed somehow. I opened trac ticket
#12450 for this issue [3].
b) While writing proposal 236 we were thinking about how new guard
nodes should be picked. Should we pick new guard nodes at the point
they are needed? Or should we pick a surplus of guard nodes in the
beginning, and then when the primary ones expire, we use the extra
ones? You can read more about this behavior here:
https://gitweb.torproject.org/torspec.git/blob/2ecd06fcfd883e8c760f0694f3591d854ba40045:/proposals/236-single-guard-node.txt#l47
The insight here is that apparently we are already doing the latter
approach, because all these guard nodes that get added when our
network comes back up will remain in our guard list. And when our
primary guards expire, the ones at the bottom will rise to the top
(till they expire themselves).
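Going back to the race in point a), a very small toy model shows the
effect. This is a Python sketch under assumptions: the dict layout and
the function names are hypothetical, every guard is assumed to have
been contacted before, and any later retrying of unreachable guards is
ignored.

```python
def connect_in_order(guards, network_returns_at):
    """Try guards from the top; attempts before network_returns_at fail."""
    for attempt, guard in enumerate(guards):
        if attempt < network_returns_at:
            guard["unreachable"] = True
            continue
        # This guard was already in the list, so this is not a first
        # contact and nobody above it gets marked for retry.
        guard["unreachable"] = False
        return guard
    return None


def first_usable(guards):
    """What a later walk from the top of the list would settle on."""
    return next((g for g in guards if not g["unreachable"]), None)


guards = [{"name": n, "unreachable": False}
          for n in ("primary1", "primary2", "primary3", "extra1", "extra2")]
lucky = connect_in_order(guards, network_returns_at=4)
print(lucky["name"])                 # extra2
print(first_usable(guards)["name"])  # extra2, the primaries are skipped
```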
So if you are wondering "when does Tor add new entry guards?", the
answer is "when you move your laptop to a new location; just before
you connect to the wifi" ;)
Greetings from the core,
have a good day!
[0]: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/236-single-guard-node.txt
[1]: Note that the word "append" is vital here. The extra guards are
appended to the end of the list, and when Tor wants to pick a
guard node it walks the list from the top. So, these newly added
guards have lower priority so to say (most of them will not even
be considered if the ones above are sufficient for building a
circuit).
[2]: Here is a grep of my logs. Look at how the guard counter
increments by one every time we hit
https://gitweb.torproject.org/tor.git/blob/d064773595f1d0bf1b76dd6f7439bff653a3c8ce:/src/or/entrynodes.c#l776
$ zgrep "Marking earlier" /var/log/tor/notices.log.3.gz
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 0/2 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 0/3 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 3/4 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 4/5 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 5/6 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 8/9 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 6/8 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 7/9 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 8/10 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 9/11 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 10/12 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 11/13 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 12/14 entry guards usable/new.
[warn] Connected to new entry guard XXX. Marking earlier entry guards up. 13/15 entry guards usable/new.
[3]: https://trac.torproject.org/projects/tor/ticket/12450#ticket