[tor-dev] Clarifications on guard-spec.txt
George Kadianakis
desnacked at riseup.net
Fri May 19 11:30:59 UTC 2017
Roger Dingledine <arma at mit.edu> writes:
> Hi folks!
>
> I'm catching up on my proposals, and I really like guard-spec.txt.
> Nicely done!
>
> I made some fixes that I hope are straightforward:
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
>
Hello,
all three commits above LGTM. Thanks for cleaning up the proposal!
> And I have four remaining questions/topics that could use some feedback
> from the spec authors/implementors rather than just having me bust in
> and change things.
>
> ---------------------------------------------------------------
>
> A) We don't really say what we mean by "adding" a guard to an ordered
> list. In particular:
>
>> We add new members to {CONFIRMED_GUARDS} when we mark a circuit
>> built through a guard as "for user traffic."
>
> When we say "add", do we mean "append" here?
>
>> To compute primary guards, take the ordered intersection of
>> {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
>> {N_PRIMARY_GUARDS} elements. If there are fewer than
>> {N_PRIMARY_GUARDS} elements, add additional elements to
>> PRIMARY_GUARDS chosen _uniformly_ at random from
>> ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
>
> Similarly, does "add" mean "append"?
>
>> Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
>> until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
>> elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
>
> This one looks like a bug as currently stated: by "proceed", do we mean
> "precede"?
>
All the suggestions above seem correct.
We should open a ticket and make a patch.
> ---------------------------------------------------------------
>
> B) In Sec 4.10, whenever we get a new consensus, we:
>
>> For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
>> {FIRST_UNLISTED_AT}.
>
> In the old design, we also believed the Running flag in the new
> consensus, that is, we marked the guards as reachable again. It looks
> from sampled_guards_update_from_consensus() like we no longer do that.
>
> Is that old behavior considered a bug and we intentionally stopped,
> or did we not consider it?
>
> I am ok with "that was stupid behavior, so we stopped", but if we didn't
> know that we used to do it, maybe we should decide whether it is stupid
> behavior. :)
>
Hmm, I also don't see us currently doing this, but it might be worth doing.
I imagine it's not going to do much in most cases since primary guards
have a pretty short retry period, but still it might be a good idea
since it will keep our guard list more up to date.
Perhaps we should consider adding this behavior to the spec and code.
> ---------------------------------------------------------------
>
> C) In Sec A.4, in the state file we have
>
>> "bridge_addr" -- If the guard is a bridge, its configured
>> address and OR port. Optional.
>
> How does this play with bridges that have pluggable transports? In my
> "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is
> not listed.
>
> It looks from
> https://trac.torproject.org/projects/tor/ticket/21027
> like we did some fixing here, but the spec didn't get an update?
>
Hmm, seems like that part of the spec is not really accurate indeed!
Maybe we can just update the spec and say that `bridge_addr` contains
the configured address and port of the bridge which can be either the
ORPort or PT port?
> ---------------------------------------------------------------
>
> D) In Sec 4.8, when a circuit succeeds:
>
>> * If this circuit was <usable_if_no_better_guard>, it is now
>> <waiting_for retry>. You may not yet attach streams to it.
>> Then check whether the {last_time_on_internet} is more than
>> {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
>>
>> * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
>> reachable.
>>
>> * If it is not, update the list of waiting circuits. (See
>> [UPDATE_WAITING] below)
>
> [Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
>
> To make sure I understand this one, consider the following scenario:
>
> * We go offline, and our current circuits fail.
> * We try to make new circuits through each of our primary guards,
> failing for each (because we're offline) and marking them down.
> * Then we move to the remaining confirmed+usable ones, and mark those
> down too.
> * Then we work through the rest of USABLE_FILTERED_GUARDS, marking them
> down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks,
> causing us to add new elements to SAMPLED_GUARDS to replenish
> USABLE_FILTERED_GUARDS.
> * After we've brought SAMPLED GUARDS to 60, we stop adding new ones.
> At this point, we are out of tries:
>
Seems like a reasonable description. I haven't looked at this part of
the code in a while to be 100% sure of what you described being true.
>> * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
>> all the sampled guards. In this case we proceed by marking all guards
>> as <maybe> reachable so that we can keep on sampling.
>
> Does this mean that we mark our *already-sampled* guards as reachable,
> thus making USABLE_FILTERED_GUARDS big again, and we loop through the
> above steps again, and we keep looping until we come back online?
>
> The above "so that we can keep on sampling" phrase confuses me, because it
> seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS
> is empty" without also having SAMPLED_GUARDS at its maximum size:
>
Hmm, I think "we can keep on sampling" actually means keep on sampling
new guards for circuits, and not "keep on sampling new
SAMPLED_GUARDS". I think we overloaded the term "sample" there; we
should fix it.
(FWIW, that functionality was added in #21052.)
>> Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
>> and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
>> add new elements to {SAMPLED_GUARDS} until one of the following
>> is true:
>>
>> * {USABLE_FILTERED_GUARDS} is large enough,
>> OR
>> * {SAMPLED_GUARDS} is at its maximum size.
>
> So, perhaps we mean "so that we can keep on trying circuits"?
>
Yep.
> Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling
> from GUARDS"?
>
> Ok, proceeding with my scenario: let's further say that we're offline
> in such a way that we get quick failures for the connect attempts,
> so we burn through each connect attempt at one per second, meaning we
> blow through our CONFIRMED set within the first 10 minutes of going
> offline. Let's also say we come back online within those 10 minutes. A
> circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is
> now of type <waiting_for_better_guard>.
>
> Since last_time_on_internet isn't very long ago, and there aren't any
> other circuits open, the new circuit becomes <complete>, and we mark
> this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
>
> We keep using the newly CONFIRMED guard (since it's the only one
> we think is reachable) for about a half hour, at which point the
> PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards,
> and we switch back to our favorite primary guard.
>
> Do I have it right?
>
I think so yes.
More information about the tor-dev
mailing list