[tor-dev] Latest state of the guard algorithm proposal (prop259) (April 2016)

Wed Apr 20 16:36:16 UTC 2016

Hi,

> Hello Fan and team,
>
> I think I'm not a big fan of the pending_guard and pending_dir_guard
> concept. To me it seems like a quick hack that tries to address fundamental
> issues with our algorithm that appeared when we tried to adapt the
> proposal to
> the tor codebase.
>
>
Yeah agree, this pending_guard hack was trying to avoid some implementation
problem, we need to redesign.
I haven't got any good idea about this, that will be nice if you already
got some thoughts.


> I think one of the main issues with the current algorithm structure is that
> _one run of the algorithm_ can be asked to _setup multiple circuits_, and
> each
> of those circuits has different requirements for guards. That is, since we
> do
> filtering on START based on the requirements of circuit #1, this means
> that any
> other circuits that appear before END is called, also have to adapt to the
> requirements of circuit #1. This is obvious in the code since we use
> guard_selection->for_directory throughout the whole algorithm run, even
> though
> for_directory was just the restriction of circuit #1.
>
> Specifically about the pending_guard trick, I feel that it interacts in
> unpredictable ways with other features of the algorithm. For example,
> consider
> how it interacts with the primary guards heuristic. It could be time for
> the
> algorithm to reenter the primary guards state and retry the top guards in
> the
> list, but because of the pending_guard thing we actually return the 15th
> guard
> to the circuit.
>
> IMO we should revisit the algorithm so that one run of the algorithm can
> accomodate multiple circuits by design and without the need for hacks.
> Here is
> an idea towards that direction:
>
>   I think one very important change that we can do to simplify things is to
>   remove the need to filter guards based on whether they are dirguards,
> fast,
>   or stable. My suggestion here would be to *only* consider guards that are
>   dirguards _and_ fast _and_ stable. This way, any circuit that appears
> will be
>   happy with all the guards in our list and there is no need to do the
>   pending_dir_guard trick. See [0] on why I think that's safe to do.
>
>   This is easy to do in the current codebase. You just need to call
>   entry_is_live() with need_uptime, need_capacity and for_directory all
>   enabled (instead of flags being 0).
>
>   If you do the above, your sampled guard set will be able to accomodate
> any
>   circuit that comes its way and that should simplify logic considerably.
>
>
Sounds great, that can simplify the logic a lot, I've done the change, no
more pending_dir_guard.

Let me know if the above does not make sense.
>
> Here are some more comments:
>
> - So the above idea addresses a large part of the filtering logic that
> happens
>   on START. The rest of the filtering logic has to do with ClientUsesIPv6,
>   ReachableAddreses, etc. . I think it's fine to conduct that filtering on
>   START as well.
>
> - I tried to run the branch as of bb3237d, but it segfaulted. Here is
> where it crashed:
>
>      #1  0x000055555567eb25 in guards_update_state (next=0x5555559c3f40,
> next at entry=0x5555559c35e8, guards=guards at entry=0x5555559c4620,
>       config_name=config_name at entry=0x55555570c47e "UsedGuard") at
> src/or/prop259.c:1137
>      1137                   !strchr(e->chosen_by_version, ' ')) {
>
>   Let me know if you need more info here.
>
> Never saw this before, will look into it.

- There is a memleak on 'extended' in filter_set().
>
>   In general, I feel that logic in that function is a bit weird. The
> function
>   is called filter_set() but it can actually end up adding guards to the
> list.
>   Maybe it can be renamed?
>
>
Split it up to filter_set & expand_set probably can make this clear.

- What's up with the each_remaining_by_bandwidth() function name?
>
>
I guess it should be iterate_remaining_guards_by_bandwidth.

---
>
>
> [0]: I think that's OK to do and here is why:
>
>         All Guards are Fast.
>         About 95% of Guards are Stable (this will become 100% with #18624)
>         About 80% of Guards are V2Dir/dirguards (this will become 100%
> with #12538)
>
>      #12538 got merged in 0.2.8, so if prop259 gets merged in 0.2.9, by the
>      time prop259 becomes active, almost all guards will be dirguards.
>
>      So I think it's fine to only consider guards that are dirguards &&
> fast &&
>      stable now, since by the time prop259 gets deployed that will be the
> case
>      for almost 100% of guards.
>


-- 
____


Fan Jiang 蒋帆
Amateur Code Chef
Thoughtworks, Inc.
mobile +86-150-9189-3714
skype fan at torchz.net
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20160420/07cadd8c/attachment.html>