[tor-dev] WTF-PAD and the future
George Kadianakis
desnacked at riseup.net
Sun Jul 29 13:42:43 UTC 2018
Mike Perry <mikeperry at torproject.org> writes:
> George Kadianakis:
>> Hello Mike,
>>
>> I had a talk with Marc and Mohsen today about WTF-PAD. I now understand
>> much more about WTF-PAD and how it works with regard to histograms. I
>> think I might even understand enough to start some sort of conversation
>> about it:
>>
>> Here are some takeaways:
>>
>> 1) Marc and Mohsen think that WTF-PAD might not be the way forward
>> because of its various drawbacks and its complexity. Apparently there
>> are various attacks on WTF-PAD that Roger has discovered (SENDME
>> cells side-channels?) and also the deep learning crowd has done some
>> pretty good damage to the WTF-PAD padding (90%-60% accuracy?). They
>> also told me that achieving needed precision on the timings might be
>> a PITA.
>
> Are there citations for any of this? Last I heard Matt Wright was
> working on a deep learning study but the results were mixed.
>
I think this is the best we have in terms of public results:
https://arxiv.org/abs/1801.02265
>> 2) From what I understand you are also hoping to use WTF-PAD to protect
>> against circuit fingerprinting and not just website
>> fingerprinting. They told me that while this might be plausible,
>> there is no current research on how well it can achieve that. Are we
>> hoping to do that? And what research remains here? How can I help?
>> Which parts of the Tor circuit protocol are we hoping to hide?
>
> I am designing WTF-PAD to be a framework for deploying padding against
> arbitrary traffic analysis attacks. It is meant to allow us to define
> histograms on the fly (in the Tor consensus) as these are studied. The
> fact that they have not yet been studied is not super relevant to
> deploying the framework for it now.
>
ACK.

What other traffic analysis attacks are we looking at addressing here?
I'm thinking of stuff like "circuit fingerprinting of onion services",
but I wonder if histograms and random sampling are too crude to actually
be able to help against sophisticated attacks. I don't have a suggestion
for something better currently.

On that topic, is it decided whether the adaptive padding of WTF-PAD
will also happen during circuit construction, or only after that?
>> 3) Marc and Mohsen suggested using application-layer defences because
>> the application-layer has much better view of the actual structures
>> that are sent on the wire, instead of the black box view that the
>> network layer has.
>>
>> In particular they were mainly concerned about onion services
>> fingerprinting because they are part of a restricted closed world,
>> whereas they were less concerned about the entire internet because of
>> its vast size.
>>
>> They suggested that we could investigate using the service-side
>> "alpaca" library for onion services (e.g. as part of securedrop?)
>> which should resolve the most pressing concern of HS identification.
>
> I mean yeah application-layer defenses are useful for website traffic
> fingerprinting, but that is a very narrow slice of the traffic analysis
> problems that I want this framework to solve.
>
> WTF-PAD also doesn't rule out hidden service operators using alpaca,
> either.
>
Agreed.
>> 4) They also told me of research by Tobias Pulls which eliminates the
>> need for histograms in WTF-PAD and instead samples from the
>> probability distribution directly. They think that this can simplify
>> things somewhat. Any thoughts on this?
>
> Yes this is actually exactly what I want to do with the next iteration
> of WTF-PAD! The question is what form/model to use for these probability
> distributions. Right now we're encoding inter-burst and inter-packet
> timings with some weird geometric distribution determining how long
> these bursts should go on for, when it might be more natural to encode
> and sample from length-based distributions/histograms.
>
> (Histograms vs distribution is not the problem -- it's what they encode
> and how they encode it that matters).
>
> I don't see this paper on Tobias's website. Is it up anywhere yet?
>
Hmm. Looking at the README of wtfpad (see the APE section), I think this
blog post is the best resource we have on this:
https://www.cs.kau.se/pulls/hot/thebasketcase-ape/
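
To make the histogram-vs-distribution point a bit more concrete, here is
a rough sketch of the two ways of drawing a padding delay. This is not
code from WTF-PAD, APE, or Tor; the function names, bin edges, counts,
and the choice of a log-normal distribution are all assumptions made up
for illustration.

```python
import random


def sample_delay_from_histogram(bin_edges, counts, rng):
    """Pick a bin in proportion to its count, then a uniform point inside it."""
    if len(bin_edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    # random.choices performs the weighted pick; zero-count bins are skipped.
    i = rng.choices(range(len(counts)), weights=counts, k=1)[0]
    return rng.uniform(bin_edges[i], bin_edges[i + 1])


def sample_delay_from_distribution(mu, sigma, rng):
    """Draw a delay directly from a parametric distribution.

    Log-normal is an arbitrary choice here (assumption), not something
    the thread or the referenced work prescribes.
    """
    return rng.lognormvariate(mu, sigma)


if __name__ == "__main__":
    rng = random.Random(2018)
    edges = [0.0, 0.001, 0.01, 0.1]  # hypothetical bin edges, in seconds
    counts = [3, 0, 1]               # hypothetical per-bin counts
    print(sample_delay_from_histogram(edges, counts, rng))
    print(sample_delay_from_distribution(-5.0, 1.0, rng))
```

In this sketch the histogram variant needs one number per bin plus the
bin edges, while the distribution variant needs only two parameters,
which is roughly the simplification being suggested. As Mike says, what
the samples encode matters more than which of the two forms is used.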