Tor seems to have a huge security risk--please prove me wrong!
Paul Syverson
syverson at itd.nrl.navy.mil
Sat Aug 28 15:20:41 UTC 2010
Hi Hikki,
What you describe is known in the literature as a website fingerprinting
attack, and several research papers have been published about it.
Consult freehaven.net/anonbib or type "website fingerprinting"
in your favorite search engine. I think the most recent paper on this
is "Website fingerprinting: attacking popular privacy enhancing
technologies with the multinomial naïve-bayes classifier" by Herrmann
et al. at the 2009 ACM CCSW (Cloud Computing Security Workshop). It
will cite much of the relevant previous literature.
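
For a rough feel of what such a classifier does, here is a toy sketch
of a multinomial naive Bayes classifier over observed packet sizes. It
is not the method or code from the paper: the site names, packet
sizes, and function names are all made up for illustration, and a
uniform prior over sites is assumed.

```python
import math
from collections import Counter

def train(traces_by_site, vocabulary):
    """Build per-site log-probabilities of each packet size.

    traces_by_site: {site: [trace, ...]}, each trace a list of packet sizes.
    vocabulary: the set of packet sizes the model knows about.
    """
    model = {}
    for site, traces in traces_by_site.items():
        counts = Counter()
        for trace in traces:
            counts.update(trace)
        # Laplace (add-one) smoothing so unseen sizes keep nonzero probability.
        total = sum(counts[v] for v in vocabulary) + len(vocabulary)
        model[site] = {v: math.log((counts[v] + 1) / total) for v in vocabulary}
    return model

def classify(model, trace):
    """Return the site whose size distribution best explains the trace."""
    def score(site):
        # Sizes outside the vocabulary are ignored; the prior is uniform.
        return sum(model[site][size] for size in trace if size in model[site])
    return max(model, key=score)

# Hypothetical training traces (packet sizes in bytes) for two made-up sites.
training = {
    "site_a": [[300, 300, 1500, 1500, 1500], [300, 1500, 1500, 1500]],
    "site_b": [[300, 300, 300, 700, 700], [300, 300, 700, 700, 700]],
}
vocabulary = {300, 700, 1500}
model = train(training, vocabulary)
guess = classify(model, [300, 1500, 1500])
```

The adversary never decrypts anything here; it only compares the mix
of sizes it observes against mixes it recorded earlier.
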
Roughly, while Tor is not invulnerable to such an attack, it fares
pretty well, much better than the other systems that this and earlier
papers examined, mostly because the uniform-size cells that Tor uses
to move all data add a lot of noise.
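
A minimal sketch of why fixed-size cells blur object sizes, assuming a
512-byte cell and ignoring cell headers (so the usable payload per
cell is overstated here). The function names are hypothetical and this
is not Tor's code.

```python
import math

CELL_SIZE = 512  # bytes per fixed-size cell; header overhead is ignored here

def cells_needed(payload_bytes, cell_size=CELL_SIZE):
    """Number of fixed-size cells needed to carry payload_bytes."""
    return max(1, math.ceil(payload_bytes / cell_size))

def observed_size(payload_bytes, cell_size=CELL_SIZE):
    """Bytes an observer sees on the wire once data is padded into cells."""
    return cells_needed(payload_bytes, cell_size) * cell_size

# Objects of different sizes can look identical after padding.
small = observed_size(700)
large = observed_size(1000)
```

Under these assumptions a 700-byte object and a 1000-byte object both
occupy two cells, so an observer counting cells cannot tell them apart.
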
The ability to identify destinations without seeing the destination
end of a connection (even with pretty low probability of typical
success) remains worthy of continued examination and analysis.
But end-to-end correlation remains the most significant
fact-of-life for all practical low-latency systems, including Tor.
aloha,
Paul
On Sat, Aug 28, 2010 at 06:51:13AM -0400, hikki at Safe-mail.net wrote:
> There are a lot of discussions going on over at the Onion Forum, a Tor hidden service board, regarding a possible attack on Tor's anonymity and safety. It's called "classifier attacks" and seems to be a high probability attack that may in a way unmask the encryption used by Tor, and in addition to that reveal the source as in the user using Tor as the first part of the chain.
>
> This subject seems to be either very unknown or very well silenced. So therefore I'm very interested in what the users of this mailing list have to say about this.
>
> ----------
>
> http://l6nvqsqivhrunqvs.onion/index.php?do=topic&id=12078
>
> Here are two concerning posts:
>
> -- QUOTE START --
>
> It's really not that hard to understand the attack I don't see why everyone is having such a hard time to get it.
>
> You encrypt X with a key and the output is Y. There are 2^256 possible Y values, with a 256 bit Initialization vector. This means each time you encrypt X, even with the same key, the resulting Y is a different bit string. The Bit string of X becomes impossible to get unless you have the key and Y. So, the encrypted information itself can not be fingerprinted because there are 2^256 possible ciphertexts for a given plaintext/key.
>
> However, the SIZE that X will be after encrypted can be determined. X always produces a Y of the same size when encrypted with a given key length, even though there are 2^256 possible ciphertexts there is ONE possible size for Y.
>
> This by itself isn't that bad for small data. Cat and Dog produce the same output size for the same key. Once you start getting into really big things, like motion pictures etc, then it starts to be a lot more damaging because there are not a whole lot of things that are 329,384,394,231 bits, and by looking at the Y value you can tell how many bits the X value was if you know the algorithm used. Classifier attacks work better with SIZE.
>
> However, complexity is another issue. If there is a website with 25 small images on it, then the adversary can see the size of all these different encrypted images you are loading. Each image can be seen by the adversary as a different object, and the size of these objects can be determined. Also, if you follow links on a page that you visit, the adversary can see the same data for each of these pages and become more and more certain of what you are doing. Classifier attacks work better with COMPLEXITY.
>
> If you encrypt LARGE data, or COMPLEX SETS of data, it does not matter if you use AES-256....the bitstring of X can not be derived with Y without the key, but enough characteristics of X stay in Y that the adversary can with high probability say what Y would PROBABLY decrypt into if they had the key. This does require the adversary to have SEEN the value of X at some point prior to it being encrypted, but this is not really that hard now is it? Tor is used to PROTECT YOU in case there IS an insider in your group....but an insider in your group can fingerprint X regardless of if it is CP, a drug forum or a secret military document.
>
> Understand?
>
> -- QUOTE END --
>
> -- QUOTE START --
>
> Oh yeah, it can be done with layers too so its not just the entry node / infrastructure to worry about, although that is the biggest worry since you are next in the chain.
>
> X -> Y
> Y -> Z
> Z -> U
>
> U can be used to determine the size of Z, Z can be used to determine the size of Y, Y can be used to determine the size of X.
>
> Layer encrypted data can still be classified, its just the relay node isn't looking for the fingerprint of X it is looking for the fingerprint of Y which it can get with Z.
>
> -- QUOTE END --
> ***********************************************************************
> To unsubscribe, send an e-mail to majordomo at torproject.org with
> unsubscribe or-talk in the body. http://archives.seul.org/or/talk/