[tor-dev] Proposal: Padding Negotiation

Mon Sep 14 01:43:00 UTC 2015

Tim Wilson-Brown - teor:
> Hi Mike,
> 
> Just a few questions about the proposal, inline below:
> 
> > On 12 Sep 2015, at 10:34, Mike Perry <mikeperry at torproject.org> wrote:
> > 
> > Here's a proposal describing some padding negotiation cell primitives that
> > should be useful to defend against website traffic fingerprinting, hidden
> > service circuit fingerprinting, and probably just about any other traffic
> > analysis attack under the sun.
> > 
> > The proposal is in my git remote at:
> > https://gitweb.torproject.org/user/mikeperry/torspec.git/tree/proposals/xxx-padding-negotiation.txt?h=padding_negotiation
> > 
> > ==============================
> > 
> > Filename: xxx-padding-negotiation.txt
> > Title: Padding Negotiation
> > Authors: Mike Perry
> > Created: 01 September 2015
> > Status: Draft
> > 
> > 
> > ...
> > 
> > 
> > 3. End-to-end circuit padding
> > 
> > For circuit-level padding, we need two types of additional features: the
> > ability to schedule additional incoming cells at one or more fixed
> > points in the future, and the ability to schedule a statistical
> > distribution of arbitrary padding to overlay on top of non-padding
> > traffic (aka "Adaptive Padding").
> > 
> > In both cases, these messages will be sent from clients to middle nodes
> > using the "leaky pipe" property of the 'recognized' field of RELAY
> > cells, allowing padding to originate from middle nodes on a circuit in a
> > way that is not detectable from the Guard node.
> > 
> > This same mechanism can also be used to request padding from the Guard
> > node itself, to achieve link-level padding without the additional
> > overhead requirements on middle nodes.
> > 
> > 3.1. Fixed-schedule padding message (RELAY_COMMAND_PADDING_SCHEDULE)
> > 
> > The fixed schedule padding will be encoded in a
> > RELAY_COMMAND_PADDING_SCHEDULE cell. It specifies a set of up to 80
> > fixed time points in the future to send cells.
> 
> Do we want to add a length field, for schedules with less than 80 time points?
> Or is this unnecessary?

At first, I thought this was unnecessary, as it is completely acceptable
to leave unused fields as 0s in my definition. Since Trunnel should make
working with variable fields more safe and less cumbersome, I've
simplified this to use variable lengths.

> > XXX: 80 timers is a lot to allow every client to create. We may want to
> > have something that checks this structure to ensure it actually
> > schedules no more than N in practice, until we figure out how to
> > optimize either libevent or timer scheduling/packet delivery. See also
> > Section 4.3.
> > 
> > The RELAY_COMMAND_PADDING_SCHEDULE body is specified in Trunnel as
> > follows:
> > 
> >    struct relay_padding_schedule {
> >       /* Number of microseconds before sending cells (cumulative) */
> >       u32 when_send[80];
> > 
> >       /* Number of cells to send at time point sum(when_send[0..i]) */
> >       u16 num_cells[80];
> > 
> >       /* Adaptivity: If 1, and server-originating cells arrive before the
> >          next when_send time, then decrement the next non-zero when_send
> >          index, so we don't send a padding cell then, too */
> >       u8 adaptive IN [0,1];
> 
> Is this the definition of “adaptivity" that we want?
>
> I can see that it works in the scenario where a client wants a minimum of
> X cells delivered at-or-before time T (adaptively), where T is a single schedule
> arbitrarily far in the future.

This cell is meant to request a one-shot amount of padding, not a
periodic cycle. Anything periodic like this should use Adaptive padding.

> But if a client wants a minimum of X cells delivered every T microseconds
> (adaptively), then there’s no way to reliably specify that using this scheme. If
> 10*X cells originate at the server due to regular traffic in the first interval, that will
> cause the next 10 intervals to be skipped.
> 
> I can see that the histograms work around this issue by refilling with the original
> figures, but I’m not sure that works for fixed-point padding.

I believe that we should ensure that the Adaptive Padding machine can
specify this and any other type of periodic padding. If it can't, we
should generalize it further, not add another primitive for specifying
periodic padding.

In fact, I've pushed updates to the adaptive padding rules in order to
allow three different versions of what you've asked, and provided the
example machine specifications for these examples. Please let me know if
you feel that we still need further flexibility to cover another related
example. You probably want a pen and paper handy to draw the state
machines and their transitions out as you read through the rules.

Example 1: X cells delivered over T microseconds, spread evenly. If a
non-padding cell is sent, padding is skipped for the next T/X
microseconds.

 num_machines = 1;

 machines[0].transition_burst_events = 0;

 machines[0].burst.histogram_len = 2;
 machines[0].burst.histogram = {1, 0};

 machines[0].burst.transition_reschedule_events = 
      RELAY_PADDING_TRANSITION_EVENT_NONPADDING_SENT |
      RELAY_PADDING_TRANSITION_EVENT_PADDING_SENT;

 machines[0].burst.remove_toks = 0;
 machines[0].burst.start_usec = T/X;
 machines[0].burst.max_sec = MAX_INT16; // Minimizes bin 0's width

 machines[0].burst.transition_start_events = 0;
 machines[0].burst.transition_gap_events = 0;

Example 2: X cells delivered in a consecutive burst, every T
microseconds. Bursts are delayed if there is any non-padding activity
prior to a burst. Any non-padding cell during a burst means one less
padding cell during that burst.

 num_machines = 1;

 machines[0].transition_burst_events = 0;

 machines[0].burst.histogram_len = 2;
 machines[0].burst.histogram = {1, 0};

 machines[0].burst.transition_reschedule_events =
    RELAY_PADDING_TRANSITION_EVENT_NONPADDING_SENT;

 machines[0].burst.remove_toks = 0;
 machines[0].burst.start_usec = T;
 machines[0].burst.max_sec = MAX_INT16; // Minimizes bin 0's width

 machines[0].burst.transition_start_events = 0;
 machines[0].burst.transition_gap_events =
   RELAY_PADDING_TRANSITION_EVENT_PADDING_SENT;

 machines[0].gap.histogram_len = 2;
 machines[0].gap.histogram = {X-1, 0};

 machines[0].gap.transition_reschedule_events =
    RELAY_PADDING_TRANSITION_EVENT_NONPADDING_SENT |
    RELAY_PADDING_TRANSITION_EVENT_PADDING_SENT;

 machines[0].gap.remove_toks = 1;
 machines[0].gap.start_usec = 0;
 machines[0].gap.max_sec = MAX_INT16; // Minimizes bin 0's width

 machines[0].gap.transition_start_events = 0;
 machines[0].gap.transition_burst_events =
   RELAY_PADDING_TRANSITION_EVENT_BINS_EMPTY;

Example 3: X cells delivered in a consecutive burst, every T
microseconds. Bursts are triggered if there is any non-padding activity
before T expires, and another burst is scheduled T microseconds later.
Any non-padding cell during a burst means one less padding cell during
that burst.

 num_machines = 1;

 machines[0].transition_burst_events = 0;

 machines[0].burst.histogram_len = 2;
 machines[0].burst.histogram = {1, 0};

 machines[0].burst.transition_reschedule_events = 0;

 machines[0].burst.remove_toks = 0;
 machines[0].burst.start_usec = T;
 machines[0].burst.max_sec = MAX_INT16; // Minimizes bin 0's width

 machines[0].burst.transition_start_events = 0;
 machines[0].burst.transition_gap_events =
   RELAY_PADDING_TRANSITION_EVENT_NONPADDING_SENT |
   RELAY_PADDING_TRANSITION_EVENT_PADDING_SENT;

 machines[0].gap.histogram_len = 2;
 machines[0].gap.histogram = {X-1, 0};

 machines[0].gap.transition_reschedule_events =
    RELAY_PADDING_TRANSITION_EVENT_NONPADDING_SENT |
    RELAY_PADDING_TRANSITION_EVENT_PADDING_SENT;

 machines[0].gap.remove_toks = 1;
 machines[0].gap.start_usec = 0;
 machines[0].gap.max_sec = MAX_INT16; // Minimizes bin 0's width

 machines[0].gap.transition_start_events = 0;
 machines[0].gap.transition_burst_events =
   RELAY_PADDING_TRANSITION_EVENT_BINS_EMPTY;

The examples required the following proposal changes (which I've pushed
to that remote):

1. A zero machines.transition_burst_events means an immediate
   transition to the burst state.
2. Added RELAY_PADDING_TRANSITION_EVENT_BINS_EMPTY to allow a state
   transition when the bins become empty.

> >    };
> > 
> > To allow both high-resolution time values, and the ability to specify
> > timeout values far in the future, the time values are cumulative. In
> > other words, sending a cell with when_send = [MAX_INT, MAX_INT, MAX_INT,
> > 0...] and num_cells = [0, 0, 100, 0...] would cause the relay to reply
> > with 100 cells in 3*MAX_INT microseconds from the receipt of this cell.
> > 
> > 3.2. Adaptive Padding message (RELAY_COMMAND_PADDING_ADAPTIVE)
> > 
> > ...
> > 
> > 3.2.4. Token removal and refill
> > 
> > If the remove_tok field is set to true for a given state's histogram,
> > then whenever a padding packet is sent, the corresponding histogram
> > bin's token count is decremented by one.
> > 
> > If a packet matching the current state's transition_reschedule_events
> > bitmask arrives from the server before the chosen padding timer expires,
> > then a token is removed from the nearest non-empty bin corresponding to
> > the delay since the last packet was sent, and the padding packet timer
> > is re-sampled from the histogram.
> 
> Does nearest mean “next highest” or “closest”?

The original Adaptive Padding paper did indeed specify "next highest".
"Closest" makes more sense to me, but I've gone ahead and parameterized
this in the proposal, in case optimal token removal policy turns out to
vary depending on application.

-- 
Mike Perry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Digital signature
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20150913/bf96c58e/attachment.sig>