[tor-dev] BridgeDB - Bridge Distribution Modifications
Matthew Finkel
matthew.finkel at gmail.com
Tue May 14 12:37:50 UTC 2013
On Tue, May 14, 2013 at 09:42:47AM +0200, Karsten Loesing wrote:
> On 5/14/13 8:08 AM, Matthew Finkel wrote:
> > Hi all,
> >
> > Over the last few weeks I've been working with George and Aaron on
> > updating BridgeDB's code with respect to how it handles pluggable
> > transports.
>
> Hi Matthew,
>
> I didn't read your proposed BridgeDB changes in detail (sorry!), but I'd
> like to ask for something: can you make sure that the
> bridge-pool-assignments file stays useful when your changes are deployed?
>
> https://metrics.torproject.org/formats.html#bridgepool
>
> We're not processing bridge pool assignment files automatically, but
> we'll include them in Atlas once it supports searching for bridges.
> Right now the strings make some sense to bridge operators by giving them
> an idea whether and how BridgeDB distributes their bridge. If possible,
> this should still be the case in the future.
>
> Thanks!
> Karsten
>
Hi Karsten,
Absolutely. To be honest, I don't expect these modifications to impact
that file much, and I see no reason to alter the format of it, but I'll
verify everything remains sane throughout the updates.
Thanks for raising your concern!
Matt
>
> > I've made some decent progress, but there are some
> > questions that I'd like to ask (because I'm not sure I should be the
> > one making the decision). I've also started updating the spec and
> > there are some parts on which I'd like some clarification. I'll try to
> > summarize the thoughts on the matter we/I have thus far. See [A] if
> > you're unfamiliar with the BridgeDB code/spec/idea.
> >
> > 1) How should BridgeDB decide the number of transports, and types, it
> > should hand out?
> >
> > - My current patch returns transports in proportion to their share of
> > all running bridges, so that if we hand out four bridges and obfs2
> > bridges account for 3/10 of all running bridges, then BridgeDB will
> > hand out (4*(3/10)) = 1.2 obfs2 bridges with each request, on
> > average.
> > - I've also added an option into bridgedb.conf to set the (expected)
> > minimum and maximum number of bridges which support a specific PT
> > that BridgeDB should hand out per request.
> > - I have a verification check that tries to force us to meet these
> > values, however, with its current implementation it's not
> > guaranteed, only probabilistic. I think this is okay for now.
> > - So, is this enough? Do we want/need a deterministic method of
> > supplying bridges with a supported set of transports?
> > - Another option is to place each transport into its own subring and
> > select from each of the subrings to ensure we meet the requirement.
> > The more I've thought about this, the more I think this defeats
> > the purpose of constructing the rings, though.
> > - Last (for now), if a bridge supports multiple PTs, should we return
> > all of them to the user or randomly select one or select one with a
> > bias? We agreed that we really shouldn't do the first because that
> > would just accelerate the ability of a censor to block more bridges.
> > The middle option works, but given that many bridges now support
> > obfs2 and obfs3, is it a good idea to, again, probabilistically
> > return each type (roughly) half the time?
> >
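The proportional rule from question 1 can be sketched roughly as follows. This is only an illustration, not the patch itself: the function name and the `minimum`/`maximum` parameters are hypothetical stand-ins for the bridgedb.conf options mentioned above, and the clamp is applied to the expected value only, so it does not reproduce the probabilistic verification check.

```python
def expected_pt_count(bridges_per_request, pt_running, total_running,
                      minimum=0, maximum=None):
    """Expected number of bridges of one transport in a single response.

    Hypothetical helper: scales the response size by the transport's share
    of all running bridges, then clamps to the configured bounds.
    """
    if total_running <= 0:
        return 0.0
    expected = bridges_per_request * (pt_running / total_running)
    if maximum is None:
        # Assumption: we never hand out more than the response size.
        maximum = bridges_per_request
    return max(minimum, min(maximum, expected))


# The example from the mail: 4 bridges per request, obfs2 is 3/10 of the pool.
print(expected_pt_count(4, 3, 10))
```

Since the result is an average, a real response has to round it somehow (for instance, hand out 1 obfs2 bridge most of the time and 2 occasionally), which is where the "probabilistic, not guaranteed" behaviour comes from.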
> > 2) Should we prefer to distribute PT bridges over regular bridges which
> > have their ORPort on 443?
> > - Right now returning ORPorts on 443 is the highest priority and
> > transports are a secondary best-effort operation.
> >
> > 3) Unless I incorrectly understand the code, the bridges never rotate.
> > The bridge interval is set to NoSchedule(), which means it returns
> > a static time. Is there a reason for this? This is counter to the
> > spec. Just wondering. :)
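To illustrate why a static interval means no rotation: the ring position is derived from the interval and the area (see step 8 in [A]), so if the interval never changes, neither does the position. The sketch below uses made-up names (`static_interval`, `daily_interval`, `ring_position`) and a placeholder key; it is not BridgeDB's actual schedule code, and the message format is an assumption.

```python
import hashlib
import hmac
import time

KEY = b"hypothetical-hmac-key"  # assumption: placeholder, not a real key


def static_interval(when=None):
    """Mimics a schedule that ignores the clock and returns a fixed label."""
    return "1970-01-01"


def daily_interval(when=None):
    """Hypothetical rotating schedule: one interval per UTC day."""
    if when is None:
        when = time.time()
    return time.strftime("%Y-%m-%d", time.gmtime(when))


def ring_position(interval, area):
    """Position in the ring as an hmac over the interval and the area."""
    msg = ("<%s>%s" % (interval, area)).encode("utf-8")
    return hmac.new(KEY, msg, hashlib.sha1).hexdigest()


day = 86400
area = "192.0.2.0"
# Static schedule: same position no matter when the request arrives.
print(ring_position(static_interval(0), area) ==
      ring_position(static_interval(10 * day), area))
# Daily schedule: the position moves when the day changes.
print(ring_position(daily_interval(0), area) ==
      ring_position(daily_interval(day), area))
```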
> >
> >
> > (I had some other points I wanted to raise, but I'm blanking on them
> > now. I think this is a good start, though.)
> >
> > Please also let me know and correct anything I may have gotten wrong.
> >
> > Thanks everyone, and thanks to George and Aaron for their help, as well.
> >
> > - Matt
> >
> >
> >
> >
> > A. For those who don't know the details of the code, the simplified
> > version is as follows:
> >
> > 1) All bridges send their bridge descriptors and misc information
> > to the Bridge Authority.
> > 2) Bridge Authority provides a network status file containing all
> > known bridges described by their name, fingerprint, digest,
> > time of publication, IP addr, ORPort, DirPort. Bridge Auth also
> > provides a bridge descriptor file specifying the bridges'
> > IP addr, ORPort, and fingerprint. Last, it supplies an extra-info
> > file that contains all the extra info that the bridges
> > provide - mainly their transports, in our case.
> > 3) BridgeDB parses all of these files and associates the information
> > to a single instance of a bridge.
> > 4) BridgeDB assigns each running bridge to a distributor (website,
> > email, etc) based on an hmac of the bridge's ID. Once assigned,
> > the bridge is inserted into the distributor's list of bridges.
> > 5) BridgeDB then further organizes the bridges assigned to each
> > distributor by moving them into rings and subrings.
> > - A ring is simply a sorted list of hmacs of the bridges' IDs
> > which, when traversed, wraps around to the beginning if it ever
> > reaches the end.
> > - The hmac of the bridge's ID is used to retrieve the actual
> > bridge instance from a hash, which is stored alongside the ring.
> > 6) Some distributors, such as https, are 'initialized' with a few
> > rings based on filters.
> > - https starts out with a ring containing all bridges assigned to
> > it, a ring only containing bridges which support IPv4
> > connections, and a ring only containing bridges which support
> > IPv6 connections.
> > - Every ring also contains two subrings (currently). One subring
> > is the subset of bridges from the parent ring which have their
> > ORPort listening on port 443. The other subring is the subset
> > of bridges from the parent ring which have the stable flag set.
> > - For example,
> > - Cluster 1 Ring
> > - subring (stable)
> > - subring (https)
> > - Cluster 2 Ring
> > - subring (stable)
> > - subring (https)
> > - IPv4 Cluster 1 Ring
> > - subring (stable)
> > - subring (https)
> > - IPv4 Cluster 2 Ring
> > - subring (stable)
> > - subring (https)
> > - IPv6 Cluster 1 Ring
> > - subring (stable)
> > - subring (https)
> > - IPv6 Cluster 2 Ring
> > - subring (stable)
> > - subring (https)
> > 7) When BridgeDB receives a request for bridges from its website, it
> > forwards the query on to the IP distributor. The details will
> > include if a specific PT was requested, IP version bridge
> > supports, country within which the bridge should not be blocked,
> > requesting IP address, and interval.
> > 8) The distributor then decides on the "area" of the IP address,
> > currently the /24 mask, and then finds the "cluster" within that
> > area (by taking the first eight bytes of an hmac of the area and
> > using the result (modulus "the number of clusters")). A filter is
> > then constructed based on the requested information. If a ring
> > already exists that satisfies exactly these filters then that is
> > used. Else a new ring (with subrings) is constructed to satisfy
> > this request. The distributor also computes the position in the
> > ring as the hmac of the interval and the area.
> > 9) Once the correct ring exists, it determines how many bridges it
> > can find in the ring's subrings to satisfy the request. This is
> > done by taking the previously computed position and finding it
> > in the list of bridges' ID hmacs and then selecting the next
> > consecutive "requested number of bridges" from the list (wrapping
> > around to the beginning, if necessary). The same is then done for
> > the main ring. The results from these searches are then joined and
> > the first "requested number of bridges" unique keys are selected
> > from the list. This list is then sorted and returned, propagating
> > back up to the user.
> > 10) Similar actions are taken by the other distributors. For example,
> > the email distributor doesn't use an "area" to decide which
> > bridges to distribute, it uses the normalized requesting/source
> > mail address.
> > 11) Misc:
> > - Because the rings are sorted by an hmac of the bridge's ID, I
> > expect that they will be uniformly distributed around the ring.
> > As such, I don't expect there to be a bias for one type of
> > bridge/transport/ORPort over any other. (Is this incorrect?)
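For anyone who prefers code to prose, steps 8 and 9 can be sketched roughly like this. It is a simplified illustration under several assumptions: the key, the `Ring` class, and the helper names are hypothetical, the hmac uses SHA-1 purely for the example, and subrings and filters are left out, so only the main-ring lookup with wraparound is shown.

```python
import bisect
import hashlib
import hmac
import ipaddress

KEY = b"hypothetical-hmac-key"  # assumption: placeholder, not a real key


def digest(data):
    """hmac of a string, returned as raw bytes (sortable, comparable)."""
    return hmac.new(KEY, data.encode("utf-8"), hashlib.sha1).digest()


def area_of(ip):
    """The "area" of an IPv4 address: its /24 network."""
    network = ipaddress.ip_network(ip + "/24", strict=False)
    return str(network.network_address)


def cluster_of(area, n_clusters):
    """First eight bytes of the area's hmac, modulo the number of clusters."""
    return int.from_bytes(digest(area)[:8], "big") % n_clusters


class Ring:
    """Sorted list of bridge-ID hmacs plus a hash back to the bridges."""

    def __init__(self, bridge_ids):
        self.by_key = {digest(b): b for b in bridge_ids}
        self.keys = sorted(self.by_key)

    def select(self, position, count):
        """Return `count` bridges starting at `position`, wrapping around."""
        if not self.keys:
            return []
        count = min(count, len(self.keys))
        start = bisect.bisect_left(self.keys, position)
        picked = [self.keys[(start + i) % len(self.keys)]
                  for i in range(count)]
        return [self.by_key[k] for k in picked]


ring = Ring(["bridge-%d" % i for i in range(10)])
area = area_of("192.0.2.77")
position = digest("2013-05-14" + area)  # hmac of the interval and the area
print(cluster_of(area, 4), ring.select(position, 3))
```

Two clients in the same /24 during the same interval get the same position and therefore the same bridges, which is the property the area/interval scheme is after.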
> > _______________________________________________
> > tor-dev mailing list
> > tor-dev at lists.torproject.org
> > https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
> >
>