Tor seems to have a huge security risk--please prove me wrong!

Paul Syverson syverson at itd.nrl.navy.mil
Mon Aug 30 12:52:59 UTC 2010


On Sun, Aug 29, 2010 at 05:13:21PM -0700, Mike Perry wrote:
> Thus spake Paul Syverson (syverson at itd.nrl.navy.mil):
> 
> > On Sun, Aug 29, 2010 at 12:54:59AM -0700, Mike Perry wrote:
> > > Any classifier needs enough bits to differentiate between two
> > > potentially coincident events. This is also why Tor's fixed packet
> > > size performs better against known fingerprinting attacks. Because
> > > we've truncated the lower 8 bits off of all signatures that use size
> > > as a feature in their fingerprint classifiers. They need to work to
> > > find other sources of bits.
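(To make the point about the truncated size bits concrete, here is a toy
sketch, assuming Tor's classic 512-byte cells; the numbers are invented
for illustration:

    import math

    CELL_SIZE = 512  # Tor pads application data into fixed-size cells

    def observed_size(app_bytes):
        # An eavesdropper sees only whole cells, i.e. the transfer size
        # rounded up to a multiple of 512; the low-order bits are gone.
        return CELL_SIZE * math.ceil(app_bytes / CELL_SIZE)

    # Two responses that differ only in their low-order size bits become
    # indistinguishable to a size-based feature extractor...
    print(observed_size(4301), observed_size(4499))   # both 4608
    # ...while a coarse difference still leaks through the cell count.
    print(observed_size(4301), observed_size(9000))   # 4608 vs. 9216

So a classifier that leaned on exact sizes has to find its bits elsewhere,
e.g. in cell counts, ordering, or timing.)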
> > 
> > I disagree. Most of what you say about base rates etc. is valid and
> > should be taken into account, but that is not the only thing that is
> > going on. First, you have just stated one reason that correlation
> > should be easier than fingerprinting but then tried to claim it as
> > some sort of methodological flaw. Truncating the lower 8 bits does
> > have a significant impact on fingerprinting but little impact on
> > correlation because of the windows and datasets, just like you said.
> > But way more importantly, fingerprinting is inherently a passive
> > attack. You are sifting through a pile of known fingerprints looking
> > for matches and that's all you can do as an attacker. But it's easy to
> > induce any timing signature you want during a correlation attack. (It
> > seems to be completely unnecessary because of point 1, but it would be
> > trivial to add that if you wanted to.) Tor's current design has no
> > mechanism to counter active correlation. Proposed techniques, such as
> > in the recent paper by Aaron, Joan, and me, are clearly too expensive
> > and iffy at this stage of research. This is totally different for
> > fingerprinting. One could have an active attack similar to
> > fingerprinting in which one tries to alter a fingerprint to make it
> > more unique and then look for that fingerprint.  I don't want to get
> > into a terminological quibble, but that is not what I mean by
> > fingerprinting and would want to call it something else or start
> > calling fingerprinting 'passive fingerprinting', something like that.
> > Then there is the whole question of how effective this would be,
> > plus a lot more details to say what "this" is, but anyway I think
> > we have good reason to treat fingerprinting and correlation as different
> > but related problems unless we want to say something trivial like
> > "They are both just instances of pattern recognition."
> 
> Ah, of course. What I meant to say then was that "passive
> fingerprinting" really is the same problem as "passive correlation". 

But there might be significant value, relative to cost, in solving just
passive fingerprinting, whereas the value of solving just
passive correlation seems really tiny if it leaves active correlation
untouched. More below.

> 
> I don't spend a whole lot of time worrying about the "global *active*
> adversary", because I don't believe that such an adversary can really
> exist in practical terms. However, it is good that your research
> considers active adversaries in general, because they can and do exist
> on more localized scales.
> 
> I do believe that the "global external passive adversary" does exist
> though (via the AT&T secret rooms that splice cables and copy off
> traffic in transit), and I think that the techniques used against
> "passive fingerprinting" can be very useful against that adversary. I
> also think a balance can be found to provide defenses against the
> "global external passive adversary" to try to bring their success
> rates low enough that their incentive might switch to becoming a
> "local internal adversary", where they have to actually run Tor nodes
> to get enough information to perform their attacks.
>  
> This is definitely a terminological quibble, but I think it is useful
> to consider these different adversary classes and attacks, and how
> they relate to one another. I think it is likely that we are able to
> easily defeat most cases of dragnet surveillance with very good
> passive fingerprinting defenses, but that various types of active
> surveillance may remain beyond our (practical) reach for quite some
> time.

I don't share your belief about global external passive adversaries on
the current Tor network. I do find it plausible that there could be
(though I have no idea whether there actually are) widespread adversaries
(internal and/or external) capable of attacking double-digit percentages
of the network; however, I don't think they would be anything approaching
global. But we can agree to disagree on our speculations here.  Your
paranoia may vary.

My main concern is that your characterization implies a false
dichotomy by assuming an adversary's capabilities are uniform wherever
he may exist, either active everywhere or passive everywhere (and
let's ignore for now that each of 'active' and 'passive' covers a
variety of attackers).

This is central to the distinction between fingerprinting and
correlation. An adversary who is interested in what you are looking at
via Tor only has to be active at your location (your network
connection to Tor, your guards, whatever) and passive everywhere else
(global for you, many places but somewhat less than global for me) to
defeat any measure that only works against passive adversaries. (I
think we agree that all the published research indicates this is
currently unnecessary---passive attacks are enough to identify Tor
connections at present.) My point here is one that I have not changed
for a decade, viz.: countermeasures aimed only at a passive adversary
can be trivially defeated by a minimally active adversary. And by
'minimally' I mean both the nature of the attack (just playing a bit
with the timing of traffic; published research already indicates that
one can do so in a way that creates an undetectable signal) and the
distribution of the attack (the adversary only has to be active at one
point).
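
To make "playing a bit with the timing" concrete, here is a toy sketch
of the general idea (Python; the numbers are made up, nothing is taken
from any particular paper, and real watermarking schemes in the
literature are far subtler):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 3000

    # A flow seen as inter-packet gaps (seconds). The attacker touches
    # the traffic at ONE point: delay some packets by ~10 ms according
    # to a secret 0/1 pattern.
    gaps = rng.exponential(0.05, size=n)
    pattern = rng.integers(0, 2, size=n)       # the secret watermark
    marked = gaps + 0.010 * pattern            # the only active step

    # Everywhere else the attacker is purely passive; the flow also
    # picks up ordinary jitter along the way.
    observed = marked + rng.normal(0.0, 0.01, size=n)
    unrelated = rng.exponential(0.05, size=n)  # some other flow

    def score(flow):
        # Correlate observed gaps against the secret pattern.
        return float(np.dot(flow - flow.mean(),
                            pattern - pattern.mean())) / n

    print("watermarked flow:", round(score(observed), 5))   # well above noise
    print("unrelated flow:  ", round(score(unrelated), 5))  # about zero

The point is only that the active part is tiny and local; everything
else is the same passive observation that the fingerprinting defenses
are aimed at.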

Looking at the destination side, an adversary wondering who is
visiting a particular web server need only be active at the
destination end. He can be passive anywhere else. But such a minimally
active adversary cannot attack connection sources via destination
fingerprinting.  (This is not to deny the usefulness of continuing to
research fingerprints of existing destinations that are common or
particularly interesting, nor the usefulness of researching actively
making a destination fingerprint more unique to facilitate attack---the
active fingerprinting that we have already agreed is not what either
of us meant by "fingerprinting" in this discussion.)

>  
> > > Personally, I believe that it may be possible to develop fingerprint
> > > resistance mechanisms good enough to also begin to make inroads
> > > against correlation, *if* the network is large enough to provide an
> > > extremely high event rate. Say, the event rate of an Internet-scale
> > > anonymity network.
> > > 
> > > For this reason, I think it is very important for academic research to
> > > clearly state their event rates, and the entropy of their feature
> > > extractors and classifiers. As well as source code and full data
> > > traces, so that their results can be reproduced on larger numbers of
> > > targets and with larger event rates, as I mentioned in my other reply.
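(Regarding the event-rate point a couple of paragraphs up: a toy
base-rate calculation, with invented numbers, shows why the event rate
matters so much here:

    # Even a very accurate matcher drowns in false positives once the
    # event rate is high enough. All numbers below are made up.
    true_positive_rate = 0.999     # chance the real match is flagged
    false_positive_rate = 1e-4     # chance an unrelated flow is flagged
    candidate_flows = 1_000_000    # flows in the observation window

    expected_false_alarms = false_positive_rate * (candidate_flows - 1)
    precision = true_positive_rate / (true_positive_rate + expected_false_alarms)

    print(f"expected false alarms: {expected_false_alarms:.0f}")        # ~100
    print(f"chance a flagged match is the real one: {precision:.1%}")   # ~1%

With a low event rate the same matcher would be damning; at an
Internet-scale event rate it is nearly useless on its own.)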
> > 
> > We don't have the luxury of fields like chemistry, or even behavioral
> > work like the population biology of some species of fish, of just
> > handing out full traces. There's this pesky little thing called user
> > privacy that creates a tension for us that those fields don't face. We
> > could also argue more
> > about the nature of research and publication criteria, but I suspect
> > that we will quickly get way off topic in such a discussion, indeed
> > have already started.
> 
> In most cases, we pretty intensely frown on these attacks on the live
> Tor network, even for research purposes, so I don't think anyone is
> asking for live user traces. However most of this research is done in
> simulation, and it is rare, if it happens at all, for the source code of
> the attack setup or the simulation traces to be provided.

Have you simply asked for it? Other than for privacy reasons, or because
of intellectual property or other release headaches, I don't think
this should be a problem.

> 
> As I said before, it would be great if we could develop a common
> gold-standard simulator that we could use for all of this research.
> The UCSD people may be building something like this, but I also seem
> to recall Steven Murdoch being interested in providing some model or
> corpus for base-line comparison of all timing attack literature.
> 
> I don't think this is too off-topic, because I am saying that this
> openness is what we need to be able to effectively study timing
> attack and defense. I don't think it will be possible to succeed
> without it.

I agree. What I meant was arguing in broad generalizations about the
value of whole areas of study and their research/publication standards, rather
than talking about what we need from research to solve our problems
and what Tor needs to do so that researchers will want to work on
them. In that sense Tor has had a very good track record to date.

aloha,
paul