[tor-bugs] #31788 [Core Tor/Tor]: Circuit padding trace simulator
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Sep 25 21:57:36 UTC 2019
#31788: Circuit padding trace simulator
-----------------------------------------------+------------------------
Reporter: mikeperry | Owner: (none)
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Core Tor/Tor | Version:
Severity: Normal | Resolution:
Keywords: circpad-researchers-want, wtf-pad | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------------------------+------------------------
Changes (by mikeperry):
* cc: dgoulet, nickm (added)
Comment:
Ok, so here are the specific details on the approach.
This simulator needs two components. First: instrumentation of Tor itself,
so that researchers can produce reference trace sets from a crawl. Second:
A unit-test driven defense simulator, which reads trace files, applies
padding machines to them, and outputs new defense-applied trace files.
~~~~~~~~~~~~
The undefended reference crawl traces must be collected at three
locations: the client, the guard, and the middle node.
For the client and the middle node, the simplest place to collect these
traces is in the circpad_cell_event_* callbacks, as well as the
circpad_machine_event_* callbacks. Loglines can be added to those callbacks
that print out a timestamp, an event type, and a cell direction. Note also
that the patches in #29494 will make these callback locations more closely
reflect actual network wire write time.
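As a rough illustration of what such a logline could carry, here is a
minimal Python sketch of a trace record with the three fields named above.
The field layout, the event names, and the "in"/"out" direction labels are
assumptions made for this example; they are not an existing Tor log format.

```python
from dataclasses import dataclass

# Hypothetical event names for illustration only. The real hooks are the
# circpad_cell_event_* and circpad_machine_event_* callbacks in Tor.
EVENT_TYPES = {"nonpadding_sent", "nonpadding_received",
               "padding_sent", "padding_received"}


@dataclass(frozen=True)
class TraceEvent:
    timestamp_ns: int  # time of the event, in nanoseconds
    event: str         # one of EVENT_TYPES
    direction: str     # "out" or "in" (assumed labels)


def format_event(ev: TraceEvent) -> str:
    """Render one event as a whitespace-separated logline."""
    return f"{ev.timestamp_ns} {ev.event} {ev.direction}"


def parse_event(line: str) -> TraceEvent:
    """Parse a logline produced by format_event()."""
    ts, event, direction = line.split()
    if event not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event}")
    if direction not in ("in", "out"):
        raise ValueError(f"unknown direction: {direction}")
    return TraceEvent(int(ts), event, direction)
```

Keeping one event per line with fixed fields makes the trace files easy for
the simulator to read back in timestamp order.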
For the guard node, we can allow our tracer clients to use a special magic
third padding machine (if we bump CIRCPAD_MAX_MACHINES and increase the
field width) just to hop 1. Then, to signal to the guard node which
circuits to record, a PADDING_NEGOTIATE can be sent to the guard, so it can
mark circuits as belonging to the researcher, for safe collection, by way
of adding a dummy padding machine. The same circpad_cell_event callbacks
as were used for the client and middle can then be used for guard
collection in that case, since a padding machine will be open there, only
on the researcher's circuits.
Be warned though: this guard approach may get hairy because of the need to
use index 3 on the client, but index 1 on the guard, as well as the need to
make sure the machine_index bitfield and CIRCPAD_MAX_MACHINES are bumped to
3 for the client.
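The field-width concern is simple arithmetic, sketched here in Python. The
helper name is hypothetical, and it assumes machine indices are stored
zero-based as 0..N-1, which should be checked against the actual struct.

```python
def index_bits_needed(max_machines: int) -> int:
    """Smallest bitfield width that can hold indices 0..max_machines-1."""
    if max_machines < 1:
        raise ValueError("need at least one machine")
    # (n - 1).bit_length() is the number of bits in the largest index.
    return max(1, (max_machines - 1).bit_length())
```

Under that assumption, two machines fit in a 1-bit field, while three
machines need a 2-bit field.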
The timing differences for cell transit time from middle to guard and
client to guard should be used to compute some statistics for simulating
this delay upon defense application, as the simulator will not have a
guard node.
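A minimal Python sketch of that statistics step follows. It assumes the two
trace files list the same cells in the same order and that both timestamps
come from comparable clocks; neither assumption is guaranteed by the
instrumentation, and the function names are made up for this example.

```python
import statistics
from typing import Dict, List


def transit_delays(sent_ns: List[int], received_ns: List[int]) -> List[int]:
    """Per-cell transit delay, pairing the i-th send with the i-th receive."""
    if len(sent_ns) != len(received_ns):
        raise ValueError("traces must contain the same number of cells")
    return [r - s for s, r in zip(sent_ns, received_ns)]


def delay_summary(delays: List[int]) -> Dict[str, float]:
    """Summary statistics that a delay model could be fitted to."""
    return {
        "mean": statistics.fmean(delays),
        "stdev": statistics.pstdev(delays),  # population standard deviation
        "min": float(min(delays)),
        "max": float(max(delays)),
    }
```

For example, delay_summary(transit_delays(client_times, guard_times)) would
summarize the client-to-guard direction, where client_times and guard_times
are lists of timestamps read from the two trace files.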
Alternatively, all of the classifier input could simply use client traces,
with an appropriate model for guard latency built in. This will eliminate
the need to pin the guard node in all crawls. However, without at least
some live traces for reference, this model may miss the varying
congestion and queuing delays that the guard node adds in the live scenario.
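One way to picture the client-trace-only alternative is the Python sketch
below, which shifts client timestamps by a modeled guard delay. The sampler
is a placeholder supplied by the caller, and the rule that cells keep their
order is an assumption of this sketch, not something measured.

```python
from typing import Callable, List


def apply_guard_latency(client_times_ns: List[int],
                        sample_delay_ns: Callable[[], int]) -> List[int]:
    """Estimate when each client cell would be seen past the guard.

    sample_delay_ns is a hypothetical latency model: each call returns one
    delay in nanoseconds. Output times are kept non-decreasing, on the
    assumption that cells on one circuit are not reordered.
    """
    shifted: List[int] = []
    previous = None
    for t in client_times_ns:
        candidate = t + sample_delay_ns()
        if previous is not None and candidate < previous:
            candidate = previous  # a cell cannot overtake the one before it
        shifted.append(candidate)
        previous = candidate
    return shifted
```

A sampler fitted to live guard traces could be passed in where those exist,
and a fixed or synthetic distribution otherwise.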
~~~~~~~~~~~~~~~~
For the simulator, our unit testing framework should be used to set up
client and relay padding machines on mock circuits, with a mock circuitmux
between them. The simulator can then read trace files, advance mocked
monotime as per the timestamps in the trace, and call the appropriate
circpad_cell_event_* and circpad_machine_event_* callbacks as per the
trace file.
Example unit tests that already do this sort of thing are
test_circuitpadding_rtt(), test_circuitpadding_negotiation(), and
test_circuitpadding_circuitsetup_machine() (see #30578 for a fixup of this
last test... it may be the most useful) in
./src/test/test_circuitpadding.c.
There is an extra wrinkle that after each event, the simulator must check
if the next scheduled padding packet (if any) would happen before or after
the next timestamp in the trace. If it would happen before the next
timestamp, we need to only advance monotime to that next padding callback,
and not to the next timestamp in the trace, so that the callback can fire
with an appropriately correct timestamp from monotime.
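The control flow of that check can be sketched in Python as below. This is
a toy model, not Tor code: on_event stands in for the circpad callbacks and
returns the absolute time of the next scheduled padding (or None), and ties
between a padding time and a trace timestamp go to the trace event, which
is an arbitrary choice made for the sketch.

```python
from typing import Callable, List, Optional, Tuple


def simulate(trace_times: List[int],
             on_event: Callable[[int, str], Optional[int]],
             max_events: int = 10000) -> List[Tuple[int, str]]:
    """Replay a trace, firing padding whenever it is due before the next cell.

    trace_times must be sorted. The result is the defended trace as
    (time, kind) pairs, where kind is "trace" or "padding".
    """
    output: List[Tuple[int, str]] = []
    next_padding: Optional[int] = None
    i = 0
    while (i < len(trace_times) or next_padding is not None) \
            and len(output) < max_events:
        next_trace = trace_times[i] if i < len(trace_times) else None
        if next_padding is not None and (next_trace is None
                                         or next_padding < next_trace):
            now, kind = next_padding, "padding"  # advance only to the padding
        else:
            now, kind = next_trace, "trace"      # advance to the trace event
            i += 1
        output.append((now, kind))
        next_padding = on_event(now, kind)       # machine may reschedule
    return output


def pad_once_after_each_cell(now: int, kind: str) -> Optional[int]:
    """Toy machine: schedule one padding cell 10 ticks after each real cell."""
    return now + 10 if kind == "trace" else None
```

With the toy machine and trace times 0, 5, and 100, the padding scheduled
at time 10 is replaced when the cell at time 5 arrives, so padding fires at
15 and again at 110. The max_events bound is only a guard against a machine
that reschedules padding forever.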
This simulator will also not measure the delays from libevent to the
callbacks, because mocked monotime will prevent it from doing so. To
capture this delay, the simulator can call tor_gettimeofday() before
advancing monotime and again in the callbacks, then use the measured delta
to move mocked monotime forward appropriately. This may still fall short of
reality, since on relays especially, other libevent event callbacks will
delay padding callbacks if they run first.
I added nickm and dgoulet to Cc since they are most familiar with how
circuitmux, etc. look on the wire, and how libevent callbacks in a mocked
scenario may differ from reality, and to generally sanity check the above.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/31788#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online