[tor-dev] Improving the structure of indirect-connection PTs (meek/flashproxy)

Wed Apr 16 14:56:05 UTC 2014

Ximin Luo <infinity0 at torproject.org> writes:

> ## Background
>
> Pluggable Transports are proxy programs that help users bypass censorship.
>
> [App client] -> XXX EVIL CENSOR HAS YOU XXX ACCESS DENIED XXX
> [App client] -> [PT client] -> (the cloud!) -> [PT server] -> [App server]
>
> The structural design, on the client side, is roughly:
>
> 1. App client specifies an endpoint to reach
> 2. PT client receives an instruction, via SOCKS, to connect to this endpoint
> 3. PT client does its thing, magic happens (intentionally vague)
>
> ## In Tor
>
> Each endpoint is specified by a Bridge line, in the form of an IP address and an
> optional fingerprint (for authentication).
>
> This point is not emphasized in existing docs, but it is important for
> the topic of this email: both the IP address and the fingerprint are potential
> *identifiers* of the endpoint. The former is an impure name, the latter a pure
> name.
>
> Currently, we have two main types of PT:
>
> - direct PTs - connect to the endpoint directly via a TCP connection
>   - these PTs don't try to hide the fact that you're contacting X on addrX.
>   - instead, they usually transform the traffic so it's not identifiable
>   - e.g. obfs3, fte, scramblesuit
>
> - indirect PTs - connect to the endpoint indirectly, via special means
>   - flashproxy - connects via an ephemeral browser proxy
>   - meek - connects via an online web service
>
> I will now argue that indirect PTs should do things in a specific way, which is
> *not* the way meek and flashproxy currently do things.
>
> ## Meek and flashproxy
>
> Meek and flashproxy provide an indirect way of accessing Tor. Instead of
> connecting directly to a Bridge (which might be blocked), the client connects via a
> midpoint that is harder to block. Very roughly:
>
>                     (meek/fp controller)
> [meek/fp client] -> [meek/fp midpoint] -> [freedom!]
>
> <snip>
>
> So instead of having, as currently:
>
> (old, hacky) Bridge flashproxy (dummy addr)
>
> We would have the following cases:
>
> (1) Bridge flashproxy (real addr)
> (2) Bridge flashproxy (real addr) (fingerprint)
> (3, not-ideal) Bridge flashproxy (dummy addr) (fingerprint)
>
> Option (3) is quite nice, since in indirect PTs the actual address is
> irrelevant - the Tor client never tries to connect to it. I suggest that we
> have a special syntax for it though, to explicitly discourage hacks that {use
> dummy addresses but which are treated as real addresses by the underlying
> application}, since this breaks assumptions of the PT spec.
>

Hm, but this kind of kills the magic of indirect PTs, right? That is,
users who want to use flashproxy in the way above will have to know
an address or a fingerprint of the bridge beforehand. What is the use
case? Advanced users? I guess most users (people who use the TBB) will
still need to use the current scheme, right?

Also, if all traffic goes over the midpoint, how can we make sure
that the midpoint will connect us to the bridge requested with:
> (1) Bridge flashproxy (real addr)
?

FWIW, I liked your argument with regard to authentication, and
David's reply citing a few tickets that detail the (lack of) threat
model for Tor bridges...
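
To make the quoted Bridge line cases concrete, here is a minimal sketch
of how a line could be sorted into the old dummy-address form or one of
cases (1)-(3), which also shows which cases give the client a
fingerprint to authenticate against. This is not code from Tor or any
PT. The names parse_bridge_line, classify and dummy_addrs are
hypothetical, the dummy address used below is only a placeholder, and
the only assumption about the format is "Bridge <transport> <addr:port>
[<40-hex-digit fingerprint>]".

```python
import re
from typing import NamedTuple, Optional, Set

# A relay fingerprint is written as 40 hex digits.
FINGERPRINT_RE = re.compile(r"^[0-9A-Fa-f]{40}$")


class BridgeLine(NamedTuple):
    transport: str
    addr: str                    # "host:port", kept as an opaque string here
    fingerprint: Optional[str]   # None when the line carries no fingerprint


def parse_bridge_line(line: str) -> BridgeLine:
    """Split 'Bridge <transport> <addr> [<fingerprint>]' into its fields."""
    parts = line.split()
    if parts and parts[0] == "Bridge":
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError("expected at least a transport and an address")
    fingerprint = None
    if len(parts) >= 3 and FINGERPRINT_RE.match(parts[2]):
        fingerprint = parts[2].upper()
    return BridgeLine(parts[0], parts[1], fingerprint)


def classify(bridge: BridgeLine, dummy_addrs: Set[str]) -> int:
    """Return 1, 2 or 3 for the quoted cases, 0 for the old dummy-only form.

    dummy_addrs is a caller-supplied set of placeholder addresses; which
    addresses count as dummies is an assumption of this sketch.
    """
    is_dummy = bridge.addr in dummy_addrs
    if is_dummy:
        return 3 if bridge.fingerprint else 0
    return 2 if bridge.fingerprint else 1


if __name__ == "__main__":
    dummies = {"0.0.1.0:1"}  # placeholder value, for illustration only
    fp = "0123456789ABCDEF0123456789ABCDEF01234567"
    for text in (
        "Bridge flashproxy 0.0.1.0:1",
        "Bridge flashproxy 192.0.2.7:9001",
        "Bridge flashproxy 192.0.2.7:9001 " + fp,
        "Bridge flashproxy 0.0.1.0:1 " + fp,
    ):
        bridge = parse_bridge_line(text)
        print(classify(bridge, dummies), bridge.fingerprint is not None)
```

Only cases (2) and (3) carry a fingerprint, so in this sketch they are
the only ones where the client has something to check the far end
against, whatever the midpoint does.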