[tor-dev] Thoughts on Proposal 203 [Avoiding censorship by impersonating a HTTPS server]

Fri Sep 13 07:59:43 UTC 2013

On 2013-09-12 22:00 , Kevin Butler wrote:
[..]
> I should have made my assumptions clearer. I am assuming the CA is
> compromised in this idea. I have assumed it is easy to make a
> counterfeit and valid cert from the root but it is hard(read infeasible)
> to generate one with the same fingerprint of the cert the server
> actually has.
> 
> This is the key point that I think helps against a MITM, if the
> fingerprint of the cert we received doesn't match with what the server
> sent us in the hmac'd value, then we assume MITM and do nothing.

That should take care of that indeed.
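
A minimal sketch of that check, under assumptions that are not in the
proposal text quoted here: the client and bridge share a secret (called
shared_secret below, e.g. from the bridge line), the fingerprint is
SHA-256 over the DER-encoded cert, and the HMAC covers only that
fingerprint. All function names are hypothetical.

```python
import hashlib
import hmac


def cert_fingerprint(cert_der: bytes) -> bytes:
    """SHA-256 fingerprint of a DER-encoded certificate (assumed hash)."""
    return hashlib.sha256(cert_der).digest()


def server_tag(shared_secret: bytes, cert_der: bytes) -> bytes:
    """What the bridge would send: an HMAC over its own cert fingerprint."""
    return hmac.new(shared_secret, cert_fingerprint(cert_der),
                    hashlib.sha256).digest()


def looks_like_mitm(shared_secret: bytes, received_cert_der: bytes,
                    tag: bytes) -> bool:
    """True if the cert seen on the wire does not match the HMAC'd value."""
    expected = server_tag(shared_secret, received_cert_der)
    # Constant-time comparison, so the check itself leaks no timing signal.
    return not hmac.compare_digest(expected, tag)
```

A counterfeit cert from a compromised CA validates fine at the TLS
layer, but its fingerprint differs, so looks_like_mitm() returns True
and the client does nothing further.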

[..]
>     >   * The user's Tor client (assuming they added the bridge), connects to
>     >     the server over https(tls) to the root domain. It should also
>     >     download all the resources attached to the main page, emulating a
>     >     web browser for the initial document.
> 
>     And that is where the trick lies, you basically would have to ask a real
>     browser to do so as timing, how many items are fetched and how,
>     User-Agent and everything are clear signatures of that browser.
> 
>     As such, don't ever emulate. The above project would fit this quite well
>     (though we avoid any use of HTTPS due to the cert concerns above).
> 
> I was hoping we could do some cool client integration with selenium or
> firefox or something, but it's really out of scope of what I was
> thinking about.

Or a very minimal plugin into the browser that talks to a daemon that
does most of the heavy lifting. That way there is no need for selenium
or anything else that might differ from a real browser and plugins can
exist for a variety of browsers (chrome/chromium is what we have at the
moment), and when a new one comes out people can just upgrade as it is
not that tightly bound to it.

[..]
>     >         This cookie should pretty much look like any session
>     >         cookie that comes out of rails, drupal, asp, anyone who's
>     >         doing cookie sessions correctly. Once the cookie is added to
>     >         the headers, just serve the document as usual. Essentially this
>     >         should all be possible in an apache/nginx module as the page
>     >         content shouldn't matter.
> 
>     While you can likely do it as a module, you will likely need to store
>     these details outside due to differences in threading/forking models of
>     apache modules (likely the same for nginx, I did not invest time in
>     making that module for our thing yet, though with an externalized part
>     that is easy to do at one point)
> 
> I'm hoping someone with more domain knowledge on this can comment here
> :) But yeah, I'm sure it's implementable.

The know-how is there, and we also have a module on the server side; we
just have not had the time to get everything working in that setup. Once
that works, nginx will be done too. (At the moment the way to do it is to
have nginx on the front, let it proxy to Apache and have the module there.)

Finishing it and 'getting it out there' will hopefully happen soon, but
likely around the end of October... depending on a lot of factors though.

[..]
>     The moment you do a ratelimit you are denying possibly legit clients.
>     The only thing an adversary has to do is create $ratelimit amount of
>     requests, presto.
> 
> Hadn't considered that, good point. We could rely on probabilities, but
> I would prefer some kind of hellban ability once a censor's IP has been
> determined (act normal, just don't let their actions ever do anything)

As some censors just use the IP of the client, blocking the 'censor' is
the same as blocking the client. IP based is not the way to go
unfortunately.
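
The ratelimit problem quoted above can be sketched as follows. This is a
toy fixed-window limiter with a single shared budget; the class name and
the limit of 5 are made up for illustration and are not part of the
proposal.

```python
class WindowLimiter:
    """Hypothetical limiter: one shared request budget per time window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def allow(self) -> bool:
        """Admit a request only while the window's budget is not used up."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


limiter = WindowLimiter(limit=5)

# The adversary simply sends $ratelimit requests first...
adversary_results = [limiter.allow() for _ in range(5)]

# ...and the next, legitimate client in the same window is turned away.
legit_client_allowed = limiter.allow()
```

Keying the budget per IP does not help when the adversary's requests
arrive from the same address as the client.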

[..]
>     >   * So, how does the client figure out the url to use for wss://?
>     >     Using the cache headers, the client should be able to determine
>     >     which file is F.
> 
>     I think this is a cool idea (using cache times), though it can be hard
>     to get this right, some websites set nearly unlimited expiration times
>     on very static content. Thus you always need to be above that, how do
>     you ensure that?
> 
> I guess I should have outlined that more clearly. F is determined by whatever
> file has the longest cache time of the document served normally, if they
> put it to 50 years, we use that one, if they put two to an equal time,
> then the client and server will just use the first one that appears in
> the document. We are not to generate our own files for the computation
> process as that will lead our servers to be identifiable. Plus remember
> we have the ability to change headers, so if they're setting everything
> to some invalid infinity option, we just change it to 10 years on the
> fly, I don't see this being a blocker.

Very good points, thanks for the elaboration.
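
A minimal sketch of that selection rule, assuming the cache time is read
only from the max-age directive of Cache-Control (Expires and the
"rewrite invalid values to 10 years" step are left out) and that the
client has the resources in document order. The function names are
hypothetical.

```python
import re
from typing import List, Optional, Tuple


def max_age(cache_control: str) -> int:
    """Seconds from a Cache-Control max-age directive, or 0 if absent."""
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else 0


def pick_f(resources: List[Tuple[str, str]]) -> Optional[str]:
    """Return the URL with the longest cache time.

    resources is a list of (url, cache_control_header) in document order.
    The strict '>' keeps the first resource when two have equal times.
    """
    best_url, best_age = None, -1
    for url, cache_control in resources:
        age = max_age(cache_control)
        if age > best_age:
            best_url, best_age = url, age
    return best_url
```

Both sides run the same rule over the same document, so they agree on F
without exchanging anything extra on the wire.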

>     Also, it kind of assumes that you are running this on an existing
>     website with HTTPS support...
> 
> Yes, the website will need to support https, but these days you're being
> negligent to your users anyway if you're not allowing them https. 

With SNI it is getting easier to just have multiple single-host certs on
the same webserver, but otherwise one has to resort to a wildcard cert
and those typically cost some dear money every year.

CACert.org unfortunately is not a standard root CA yet, and using CACert
means that your audience is not seeing the lock either; thus a censor who
wants to block such sites likely won't hurt too many folks.

Note that scanning sites for SSL certs, and thus seeing the hostname for
that site, allows the censor to do a lot of things: blocking on
properties of the cert, or checking whether the forward DNS lookup for
the hostname in that cert matches the host it is served on.

IMHO certs in general give off too many details about a site, which makes
scanning possible and easier to do, and blocking easier as well.

> Does that clear any of your concerns at all? 

Definitely.

Greets,
 Jeroen