[tor-bugs] #31788 [Core Tor/Tor]: Circuit padding trace simulator

Wed Sep 25 21:57:36 UTC 2019

#31788: Circuit padding trace simulator
-----------------------------------------------+------------------------
 Reporter:  mikeperry                          |          Owner:  (none)
     Type:  enhancement                        |         Status:  new
 Priority:  Medium                             |      Milestone:
Component:  Core Tor/Tor                       |        Version:
 Severity:  Normal                             |     Resolution:
 Keywords:  circpad-researchers-want, wtf-pad  |  Actual Points:
Parent ID:                                     |         Points:
 Reviewer:                                     |        Sponsor:
-----------------------------------------------+------------------------
Changes (by mikeperry):

 * cc: dgoulet, nickm (added)

Comment:

 Ok, so here are the specific details on the approach.

 This simulator needs two components. First: instrumentation of Tor itself,
 so that researchers can produce reference trace sets from a crawl. Second:
 A unit-test driven defense simulator, which reads trace files, applies
 padding machines to them, and outputs new defense-applied trace files.

 ~~~~~~~~~~~~

 The undefended reference crawl traces must be collected at three
 locations: the client, the guard, and the middle node.

 For the client and the middle node, the simplest place to collect these
 traces is in the circpad_cell_event_* callbacks, as well as the
 circpad_machine_event_* callbacks. Loglines can be added to those
 callbacks that print out a timestamp, an event type, and a cell direction.
 Note also that the patches in #29494 will make these callback locations
 more closely reflect actual network wire write time.
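
 As a rough sketch of what consuming such loglines could look like on the
 analysis side: the "timestamp event direction" layout, the field labels,
 and the names TraceEvent, parse_trace_line and format_trace_line below are
 all assumptions made for illustration. Nothing here is an existing Tor log
 format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    timestamp_ns: int  # time of the event in nanoseconds (assumed unit)
    event: str         # hypothetical label, e.g. "nonpadding" or "padding"
    direction: str     # hypothetical label: "sent" or "recv"


def parse_trace_line(line: str) -> TraceEvent:
    """Parse one 'timestamp event direction' logline (assumed layout)."""
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    ts, event, direction = fields
    if direction not in ("sent", "recv"):
        raise ValueError(f"unknown direction: {direction!r}")
    return TraceEvent(int(ts), event, direction)


def format_trace_line(ev: TraceEvent) -> str:
    """Serialize a TraceEvent back into the same one-line layout."""
    return f"{ev.timestamp_ns} {ev.event} {ev.direction}"
```

 Keeping one event per line means the defense simulator can write its
 defense-applied output in the same layout it reads.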

 For the guard node, we can allow our tracer clients to use a special magic
 third padding machine (if we bump CIRCPAD_MAX_MACHINES and increase the
 field width) just to hop 1. Then, to signal to the guard node which
 circuits to record, a PADDING_NEGOTIATE can be sent to the guard, so it can
 mark circuits as belonging to the researcher, for safe collection, by way
 of adding a dummy padding machine. The same circpad_cell_event callbacks
 as were used for the client and middle can then be used for guard
 collection in that case, since a padding machine will be open there, only
 on the researcher's circuits.

 Be warned though: this guard approach may get hairy because of the need to
 use index 3 on the client, but index 1 on the guard, as well as make sure
 the machine_index bitfield and CIRCPAD_MAX_MACHINES are bumped to 3 for
 the client.

 The timing differences for cell transit time from middle to guard and
 client to guard should be used to compute some statistics for simulating
 this delay upon defense application, as the simulator will not have a
 guard node.
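
 A minimal sketch of computing such statistics, assuming the timestamps for
 the same cells at two vantage points have already been paired up in order
 and are on a common (or offset-corrected) clock. The function names
 transit_delays and delay_stats are made up for this example.

```python
import statistics


def transit_delays(src_ts, dst_ts):
    """Per-cell transit delay, given timestamps paired by position."""
    if len(src_ts) != len(dst_ts):
        raise ValueError("timestamp lists must be the same length")
    return [dst - src for src, dst in zip(src_ts, dst_ts)]


def delay_stats(delays):
    """Summary statistics that a simulated guard delay could be drawn from."""
    return {
        "mean": statistics.mean(delays),
        "median": statistics.median(delays),
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(delays) if len(delays) > 1 else 0.0,
    }
```

 Which statistics are enough to model the delay faithfully is an open
 question. The three above are only a starting point.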

 Alternatively, all of the classifier input could simply use client traces,
 with an appropriate model for guard latency built in. This will eliminate
 the need to pin the guard node in all crawls. However, without at least
 some live traces for reference, this model may miss out on the varying
 congestion and queuing delays on the guard node in the live scenario.

 ~~~~~~~~~~~~~~~~

 For the simulator, our unit testing framework should be used to set up
 client and relay padding machines on mock circuits, with a mock circuitmux
 between them. The simulator can then read trace files, advance mocked
 monotime as per the timestamps in the trace, and call the appropriate
 circpad_cell_event_* and circpad_machine_event_* callbacks as per the
 trace file.

 Example unit tests that already do this sort of thing are
 test_circuitpadding_rtt(), test_circuitpadding_negotiation(), and
 test_circuitpadding_circuitsetup_machine() (see #30578 for a fixup of this
 last test... it may be the most useful) in
 ./src/test/test_circuitpadding.c.

 There is an extra wrinkle that after each event, the simulator must check
 if the next scheduled padding packet (if any) would happen before or after
 the next timestamp in the trace. If it would happen before the next
 timestamp, we need to only advance monotime to that next padding callback,
 and not to the next timestamp in the trace, so that the callback can fire
 with an appropriately correct timestamp from monotime.
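
 The time-advance rule above can be sketched as follows. This is a toy
 model, not the circpad API: ToyMachine, its next_padding field, and the
 replay function are hypothetical stand-ins for a real padding machine and
 mocked monotime. The toy machine schedules one padding cell a fixed gap
 after each non-padding cell.

```python
class ToyMachine:
    """Hypothetical machine: one padding cell `gap` after each real cell."""

    def __init__(self, gap):
        self.gap = gap
        self.next_padding = None  # absolute time of scheduled padding, if any

    def on_nonpadding(self, now):
        # A real cell (re)schedules the pending padding callback.
        self.next_padding = now + self.gap

    def on_padding_sent(self, now):
        # The padding callback fired; nothing further is scheduled.
        self.next_padding = None


def replay(trace_ts, machine):
    """Replay trace timestamps, firing padding that is due before each one."""
    out = []
    for ts in trace_ts:
        # If padding is scheduled before the next trace timestamp, advance
        # the clock only to the padding time and fire that callback first.
        while machine.next_padding is not None and machine.next_padding < ts:
            now = machine.next_padding
            out.append((now, "padding"))
            machine.on_padding_sent(now)
        # Only now advance to the trace timestamp and deliver the event.
        out.append((ts, "nonpadding"))
        machine.on_nonpadding(ts)
    # Fire padding still pending after the trace ends.
    while machine.next_padding is not None:
        now = machine.next_padding
        out.append((now, "padding"))
        machine.on_padding_sent(now)
    return out
```

 For example, replay([0, 100, 130], ToyMachine(50)) emits padding at 50 and
 at 180. The padding scheduled for 150 never fires, because the cell at 130
 reschedules it.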

 This simulator will also not measure the delays from libevent to the
 callbacks, because mocked monotime will prevent it from doing so. To
 capture this delay, tor_gettimeofday() can be called before advancing
 monotime and then again in the callbacks, to try to measure this delta and
 move mocked monotime forward appropriately. This may still fall short of
 reality, since on relays especially, other libevent event callbacks will
 delay padding callbacks if they are made first.

 I added nickm and dgoulet to Cc since they are most familiar with how
 circuitmux, etc. look on the wire, and how libevent callbacks in a mocked
 scenario may differ from reality, and to generally sanity check the above.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31788#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online