[tor-bugs] #29278 [Circumvention/Pluggable transport]: Assess HTTP proxy
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri May 17 22:22:57 UTC 2019
#29278: Assess HTTP proxy
-----------------------------------------------+---------------------------
Reporter: cohosh | Owner: phw
Type: task | Status: assigned
Priority: Low | Milestone:
Component: Circumvention/Pluggable transport | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points: 2
Reviewer: | Sponsor: Sponsor19
-----------------------------------------------+---------------------------
Comment (by phw):
Here are some general thoughts:
* I quite like the concept. httpsproxy is the closest we've ever gotten to
a transport that "looks like HTTP". It uses HTTP's CONNECT method
(conceptually similar to SOCKS), which makes it flexible and low-overhead.
It also means that anyone who runs a web server could turn on CONNECT
(and, to prevent abuse, limit outgoing connections to IP addresses of
guard relays), effectively turning the web server into a snowflake-like
bridge that doesn't run a Tor client, which conveniently fixes #7349.
This, however, requires non-trivial changes to BridgeDB, as I explain
below. (A minimal sketch of such a restricted CONNECT handler follows
this list.)
* In my opinion, httpsproxy's biggest problem is that it still suffers
from the proxy distribution problem. No matter how well httpsproxy can
disguise Tor traffic, we still end up trying to distribute a small number
of long-lived bridges while hoping that our adversaries are having a hard
time collecting them all. We don't know how many of our bridges have been
collected (#9316 may shed light on this), but it's
[https://censorbib.nymity.ch/pdf/Matic2017a.pdf certainly easier than we
would like it to be].
* I worry that the crowd that can run an httpsproxy bridge may be smaller
than the crowd that can run an obfs4 bridge. httpsproxy supports two
deployment scenarios:
"[https://trac.torproject.org/projects/tor/ticket/26923#Naiveproxy naive
proxy]" and
"[https://trac.torproject.org/projects/tor/ticket/26923#FullBridge full
bridge]". The "naive proxy" scenario is similar to snowflake and expects
you to already be running a web server. We may have many motivated
volunteers, but I'm afraid that only a small fraction of them run their
own web server. The "full bridge" scenario doesn't require this, but it
comes at the cost of being less resistant to fingerprinting. In
comparison, snowflake's barrier to entry is significantly lower,
especially once we have a web extension (#23888).
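To make the CONNECT idea from the first bullet a bit more concrete, here
is a minimal sketch (in Go, like httpsproxy itself, but not httpsproxy's
actual code) of a CONNECT handler that only tunnels to an allowlist of
destinations, standing in for a bridge's guard relay addresses. The
allowlist entry and the listening port are placeholders:
{{{
// Minimal sketch of a CONNECT handler restricted to an allowlist of
// destinations (e.g. the OR addresses of a bridge's guard relays).
// Not httpsproxy's actual code.
package main

import (
    "io"
    "log"
    "net"
    "net/http"
)

// allowedDests stands in for "IP:port of permitted guard relays".
var allowedDests = map[string]bool{
    "192.0.2.10:9001": true, // placeholder address (RFC 5737 range)
}

func handleConnect(w http.ResponseWriter, r *http.Request) {
    // Refuse anything that isn't CONNECT to an allowed destination.
    if r.Method != http.MethodConnect || !allowedDests[r.Host] {
        http.Error(w, "forbidden", http.StatusForbidden)
        return
    }
    upstream, err := net.Dial("tcp", r.Host)
    if err != nil {
        http.Error(w, "unreachable", http.StatusBadGateway)
        return
    }
    // Take over the client's TCP connection and splice it to upstream.
    hj, ok := w.(http.Hijacker)
    if !ok {
        upstream.Close()
        http.Error(w, "hijacking unsupported", http.StatusInternalServerError)
        return
    }
    client, _, err := hj.Hijack()
    if err != nil {
        upstream.Close()
        return
    }
    client.Write([]byte("HTTP/1.1 200 Connection Established\r\n\r\n"))
    go func() { io.Copy(upstream, client); upstream.Close() }()
    io.Copy(client, upstream)
    client.Close()
}

func main() {
    log.Fatal(http.ListenAndServe(":8080", http.HandlerFunc(handleConnect)))
}
}}}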
Here are my thoughts on what deployment would entail:
* httpsproxy is written in golang. It's not a lot of code (the HTTP logic
comes from the [https://github.com/mholt/caddy caddy module]) and the
concept behind it is relatively simple, meaning that we would be able to
maintain it even if the original author were to vanish.
* The "naive proxy" deployment scenario won't work with our bridge
authority and BridgeDB because they assume that a tor client and its
pluggable transport run on the same machine. To make the "naive proxy"
scenario work, we would probably have to come up with a new channel
that allows tor-less httpsproxies to announce themselves to BridgeDB (a
purely hypothetical announcement request is sketched below). Since this
is similar to the way snowflake works, a snowflake-style broker
mechanism may come in handy here. Unlike snowflake, however, httpsproxy
is affected by the bridge distribution problem, so the broker would
need some of the smarts that BridgeDB already has (see #29296).
* Alternatively (or in parallel), we can deploy httpsproxy in the orthodox
"full bridge" scenario, which is similar to obfs4. In this case, a tor
client ships with a web server (currently [https://caddyserver.com
caddy]). This will work out of the box with our bridge authority and
BridgeDB, but we will have a number of additional issues:
1. Bridges will expose a web server ''and'' an OR port. Because of
#7349, this will enable confirmation attacks à la "Not sure if this web
server runs a Tor bridge? Just port scan it and look for an OR port"
(see the port scan sketch below). This isn't a new problem, but it
somewhat defeats the purpose of shipping a well-designed pluggable
transport.
2. All bridges will run the same web server, and if this web server
isn't particularly popular on the Internet, censors could fingerprint
and block them all. I don't know how popular caddy is, but I had never
heard of it before I started learning about httpsproxy.
3. The content hosted on the bridge's web server needs to look
"natural". A web server that gives you a simple 404 or 403 for its
landing page may look suspicious. Or maybe not? I don't think we can
expect our bridge operators to be creative and serve "natural" content
on their httpsproxy web servers.
The benefit of this scenario is that it doesn't require architectural
changes to BridgeDB. In fact, we could move forward with deploying the
"full bridge" scenario and start supporting the "naive proxy" approach
later on.
* There are a bunch of fingerprinting issues that we would have to think
about. Sergey, the author of httpsproxy, already did a great job
discussing them
[https://trac.torproject.org/projects/tor/ticket/26923#Fingerprinting over
here]. I'm particularly worried about
[https://trac.torproject.org/projects/tor/ticket/26923#Probingwebserverwithproxyrequestswithoutasecret
an active probing attack] that allows a censor to confirm whether a web
server supports CONNECT (a sketch of such a probe is included below).
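On the "naive proxy" announcement channel I mentioned above, here is
roughly what a registration could look like from the proxy's side. To
be clear, everything in this sketch is made up: neither BridgeDB nor
the snowflake broker exposes such an endpoint today, and the URL, JSON
format, and field names are purely illustrative:
{{{
// Purely hypothetical sketch of a tor-less httpsproxy announcing itself
// to a BridgeDB/broker-style service. The endpoint and message format
// do not exist today; they only illustrate the kind of channel the
// "naive proxy" scenario would need.
package main

import (
    "bytes"
    "encoding/json"
    "log"
    "net/http"
)

type announcement struct {
    Address   string `json:"address"`   // reachable https://host:port of the proxy
    Transport string `json:"transport"` // e.g. "httpsproxy"
    Secret    string `json:"secret"`    // secret that clients must present
}

func main() {
    a := announcement{
        Address:   "https://proxy.example.com:443",
        Transport: "httpsproxy",
        Secret:    "correct horse battery staple",
    }
    body, err := json.Marshal(a)
    if err != nil {
        log.Fatal(err)
    }
    // "broker.example.com" is a placeholder, not an existing service.
    resp, err := http.Post("https://broker.example.com/announce",
        "application/json", bytes.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()
    log.Println("broker replied:", resp.Status)
}
}}}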
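The port scan confirmation attack from issue 1 above is essentially the
following. 9001 is merely tor's customary ORPort value; a real censor
would scan more ports and combine the result with other signals:
{{{
// Sketch of the OR port confirmation attack: given a web server's
// address, check whether a typical OR port is open as well.
package main

import (
    "fmt"
    "net"
    "time"
)

func hasORPort(host string) bool {
    conn, err := net.DialTimeout("tcp", net.JoinHostPort(host, "9001"),
        3*time.Second)
    if err != nil {
        return false
    }
    conn.Close()
    return true
}

func main() {
    host := "192.0.2.10" // placeholder address of a suspected bridge
    if hasORPort(host) {
        fmt.Println(host, "also listens on 9001; likely a Tor bridge")
    }
}
}}}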
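And the active probe I'm worried about boils down to this: connect to
the suspected server over TLS and issue a CONNECT request without
knowing any secret. If the server answers 200, it is trivially
confirmed (and an open proxy); the harder question is whether its
refusal looks any different from a stock web server's. The target
address is a placeholder:
{{{
// Sketch of a censor probing a suspected httpsproxy for CONNECT support.
package main

import (
    "bufio"
    "crypto/tls"
    "fmt"
    "log"
    "net/http"
)

func main() {
    target := "suspected-proxy.example.com:443" // placeholder
    conn, err := tls.Dial("tcp", target, &tls.Config{})
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Ask the server to open a tunnel to an arbitrary destination,
    // without presenting any secret.
    fmt.Fprintf(conn, "CONNECT example.com:443 HTTP/1.1\r\nHost: example.com:443\r\n\r\n")
    resp, err := http.ReadResponse(bufio.NewReader(conn), nil)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("server answered:", resp.Status)
}
}}}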
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29278#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online