[tor-dev] Prototype Primitives for Website Traffic Fingerprinting defenses
Mike Perry
mikeperry at torproject.org
Tue Jul 29 23:57:07 UTC 2014
I've inserted a couple of corrections and clarifications below. I'm
leaving original text fully quoted for the archive.
Mike Perry:
> Hello all,
>
> What follows is a summary of the primitives that Marc Juarez aims to
> implement for his Google Summer of Code project on prototyping defenses
> for Website Traffic Fingerprinting and follow-on research.
>
> After a review of Tamaraw[1], Adaptive Padding[2], CS-BuFLO[3], and the
> Supersequence[4] work, as well as some discussion at the Tor dev
> meeting, we came up with this list of padding message primitives.
>
> These functions are meant to be sent as command cells from one endpoint
> to the other. Some are bidirectional commands, others can only be sent
> in one direction (client to relay, or relay to client).
>
> For now, they will be implemented as ad-hoc messages between two
> modified obfsproxy (called wfpadtools) instances. The source code for
> this fork lives here for the moment:
> https://bitbucket.org/mjuarezm/obfsproxy-wfpadtools/
>
> Long term, I am thinking that all of these should be specified as if
> they were their own RELAY_* cell type from tor-spec.txt:
> https://gitweb.torproject.org/torspec.git/blob/HEAD:/tor-spec.txt#l1215
>
> This means that padding would be a circuit-level property, and it would
> be possible to send it to and from any hop in the circuit, due to
> leaky-pipe topology (using the "recognized" field). In a world of very
> cheap and excessive middle and Guard node bandwidth, we would run this
> padding to the middle node. For the wfpadtools prototype, it will
> obviously only cover the first hop.
>
> Here's the list of primitives, broken down by research paper:
>
> Generic Messages (common to all defenses):
> RELAY_IGNORE()
> Description:
> Simple fixed-length (CELL_SIZE) padding cell.
> Direction:
> Bidirectional (client to relay, relay to client)
> Parameters:
> None
>
> RELAY_SEND_PADDING(N,t)
> Description:
> Send the requested number of padding cells in response.
> Direction:
> Client to relay
> Parameters:
> N:
> Number of padding cells to send in response to this cell
> t:
> microseconds delay before sending
>
> RELAY_APP_HINT(session_id, status)
> Description:
> A hint from the application layer for session start/stop
> Direction:
> Client to relay
> Parameters:
> session_id:
> A string identifying the session (ie keyed hash of
> url bar domain)
> status:
> "Start" or "Stop", indicating session start and end.
>
>
> Adaptive Padding Messages:
> RELAY_BURST_HISTOGRAM(histogram[], labels_ms[], remove_toks, fuzz, when)
> Description:
> Specifies a histogram that encodes a delay distribution
> representing the probability of sending a single padding packet
> after a given delay in response to either an upstream cell, or a
> client-originating cell.
> Direction:
> Client to relay
> Parameters:
> histogram[]:
> Contains delay distribution of sending an IGNORE packet
> after sending a real packet
> labels_ms[]:
> millisecond labels for the bins (with "Infinity"/"Ignore" bin to
> allow encoding the probability of not sending any padding packet in
> response to this packet).
In both of the Adaptive Padding primitives, a considerably more compact
form is possible for bin labeling. The original Adaptive Padding
literature used a 2^K exponential spacing between these bins, but didn't
rigorously analyze the choice. If experimentation finds this is indeed
reasonable/optimal, we can send a single parameter W instead of a
labels_ms. W would be bin width in milliseconds, and L is the length of
the histogram[] array. The labels would then be computed as W*2^K with K
ranging from 0 to L-2, with the addition of a zero-delay bin label.
For the prototype though, I think we should allow the labeling to remain
a free-form array, for maximum flexibility in research. We may decide we
want some form of polynomial spacing instead of exponential here, for
example.
> remove_toks:
> If true, follow Adaptive Padding token removal rules.
> If false, histograms are immutable.
> fuzz:
> If true, randomize the delay uniformly between bin labels.
> If false, use bin labels as exact delay values.
"fuzz" here might be better called "interpolate" in both this and the
below primitive, since linear interpolation is what I mean here.
I suppose we could imagine smoother types of interpolation, like
quadratic, cubic, etc, but I don't expect the actual interpolation
mechanism to matter a whole lot. I expect it to be most useful for
preventing the adversary from subtracting the padding due to its arrival
at a fixed delay quantity after another packet.
> when:
> If set to "receive", this histogram governs the probably of sending
> a padding packet after some delay in response to a packet
> originating from. If set to "send", this histogram governs padding
^
| insert "the client" after that "from"
> packets that are transmitted after a packet arrives from upstream
> (the middle node). In both cases, the padding packet is sent in the
> direction of the client.
>
> RELAY_GAP_HISTOGRAM(histogram[], labels_ms[], remove_toks, fuzz, when)
> Description:
> Specifies a histogram that encodes a delay distribution
> representing the probability of sending a single additional padding
> packet after a given delay following a padding packet that
> originated at this hop.
> Direction:
> Client to relay
> Parameters:
> histogram[]:
> Contains delay distribution of sending an IGNORE packet after sending
> an IGNORE packet
> labels_ms[]:
> millisecond labels for the bins (with "Infinity"/"Ignore" bin to
> allow encoding the probability of not sending any padding packet in
> response to this packet).
> remove_toks:
> If true, follow Adaptive Padding token removal rules.
> If false, histograms are immutable.
> fuzz:
> If true, randomize the delay uniformly between bin labels.
> If false, use bin labels as exact delay values.
>
> when:
> If "receive", this histogram applies to locally-inserted padding
> packets that were initially sent in response to client-originated
> data. If "send", this histogram applies to packets sent in response
> to locally-inserted padding packets sent in response to upstream
> data. Note that this means that implementations must maintain this
> metadata as internal state as the system transitions from
> BURST_HISTOGRAM initiated padding into GAP_HISTOGRAM initiated
> padding. In both cases, the padding packet is sent in the direction
> of the client.
>
> CS-BuFLO Messages:
> RELAY_TOTAL_PAD(session_id, t):
> Description:
> Requests that endpoint pad all batches to nearest 2^K cells total
> or until APP_HINT(session_id, stop)
> Direction:
> Client to relay
> Parameters:
> session_id:
> The session ID from APP_HINT()
> t:
> The number of microseconds to wait between cells to consider them
> part of the same batch.
>
> RELAY_PAYLOAD_PAD(session_id, t, R):
> XXX: The paper was not clear enough for me to understand this primitive
>
>
> Tamaraw:
> RELAY_BATCH_PAD(session_id, L, t)
> Description:
> Requests that endpoint pad all batches of cells to L cells total
> or until APP_HINT(session_id, stop)
> Direction:
> Client to relay
> Parameters:
> session_id:
> The session ID from APP_HINT()
> L:
> The multiple of cells to pad to.
> t:
> The number of microseconds to wait between cells to consider them
> part of the same batch.
>
>
>
> Comments, questions, and suggestions are welcome! In particular, I'm not
> sure I got either the CS-Buflo or the tamaraw primitives completely
> correct.
>
>
>
>
> 1. http://cacr.uwaterloo.ca/techreports/2013/cacr2013-30.pdf
> 2. http://freehaven.net/anonbib/cache/ShWa-Timing06.pdf
> 3. http://arxiv.org/abs/1401.6022
> 4. http://cacr.uwaterloo.ca/techreports/2014/cacr2014-05.pdf
>
>
> --
> Mike Perry
> _______________________________________________
> tor-dev mailing list
> tor-dev at lists.torproject.org
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
--
Mike Perry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Digital signature
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20140729/02ddc443/attachment.sig>
More information about the tor-dev
mailing list