[tor-commits] [tor-browser-spec/master] Separate general fingerprinting defenses from randomization discussion.
mikeperry at torproject.org
Wed May 6 00:40:51 UTC 2015
commit 646e0e732053c9e91b8194fb5f3f83babe115460
Author: Mike Perry <mikeperry-git at torproject.org>
Date: Tue May 5 17:09:24 2015 -0700
Separate general fingerprinting defenses from randomization discussion.
---
design-doc/design.xml | 217 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 153 insertions(+), 64 deletions(-)
diff --git a/design-doc/design.xml b/design-doc/design.xml
index 05d2e2b..47caa6e 100644
--- a/design-doc/design.xml
+++ b/design-doc/design.xml
@@ -1585,98 +1585,187 @@ url="https://amiunique.org/">Am I Unique</ulink>.
<title>General Fingerprinting Defenses</title>
<para>
-XXX: Stategies vs approaches? Approaches will include things like
-virtualization, spoofing, reimplementation, permissions, and disabling features..
+When implemented after an API or feature has been standardized and widely
+deployed, defenses to fingerprinting issues tend to take one of the following
+forms: value spoofing, subsystem reimplementation, virtualization, site
+permissions, and feature removal.
-Without looking at a particular fingerprinting vector there are basically two
-strategies to thwart fingerprinting attacks in general:
+ </para>
+ <orderedlist>
+ <listitem><command>Value Spoofing</command>
+ <para>
+
+Value spoofing can be used for simple cases where the browser provides some
+aspect of the user's configuration details, devices, hardware, or operating
+system directly to a website. It becomes less useful when the fingerprinting
+method is instead relying on API behavior.
+
+ </para>
+ </listitem>
+ <listitem><command>Subsystem Reimplementation</command>
+ <para>
+
+In cases where simple spoofing is not enough to properly conceal underlying
+device characteristics or operating system details, the underlying
+subsystem that provides the functionality for a feature or API may need
+to be completely reimplemented. This is most common in cases where
+customizable or version-specific aspects of the user's operating system are
+visible through the browser's featureset or APIs, usually because the browser
+directly exposes OS-provided implementations of underlying features. In these
+cases, such OS-provided implementations must be replaced by a generic
+implementation, or at least an implementation wrapper that makes an effort to
+conceal any user-customized aspects of the system.
-<orderedlist>
- <listitem>
- Making users uniform: This would render fingerprinting moot as it only works
- if there are detectable differences between targets.
+ </para>
</listitem>
- <listitem>
- Giving randomized values back: This would bury the real device
- characteristics within noise. That way a fingerprinter cannot be sure to
- identify a user upon (re-)visit of a website which is rendering
- fingerprinting ineffective.
+ <listitem><command>Virtualization</command>
+ <para>
+
+Virtualization is needed when simply reimplementing a feature in a different
+way is insufficient to fully conceal the underlying behavior. This is most
+common in instances of device and hardware fingerprinting, but since the
+notion of time can also be virtualized, it also can apply to any instance
+where an accurate measure of wallclock time is required for a fingerprinting
+vector to attain high accuracy.
+
+ </para>
</listitem>
- <listitem>Virtualization..</listitem>
- <listitem>Disabling features</listitem>
-</orderedlist>
+ <listitem><command>Site Permissions</command>
+ <para>
-Although there is some research <ulink
-url="http://research.microsoft.com/pubs/209989/tr1.pdf">suggesting</ulink> the
-second approach we think the former is currently a better suited heuristic for
-Tor Browser for a couple of reasons:
+In the event that virtualization is too expensive in terms of performance or
+engineering effort, and the relative expected usage of a feature is rare, site
+permissions can be used to prevent the usage of a feature except in cases
+where the user actually wishes to use it. Unfortunately, this mechanism
+becomes less effective once a feature becomes widely overused and abused by
+many websites, as warning fatigue quickly sets in for most users.
- <itemizedlist>
- <listitem>
+ </para>
+ </listitem>
+ <listitem><command>Feature/Functionality Removal</command>
+ <para>
+
+When extremely invasive features serve only a narrow domain or use case, or
+there are alternate ways of accomplishing the same task, features and/or
+certain aspects of their functionality may be simply removed.
-It might not be possible to randomize all fingerprintable characteristics.
-While it seems plausible that many end-user configuration details that the
-browser currently exposes may be replaced by false information, this approach
-seems to break down when it is applied to deeper issues. In particular, it is
-not clear how to randomize the capabilities of hardware attached to a computer
-in such a way that it convincingly behaves like other hardware, while still
-providing a consistent experience to the user from site to site. Similarly,
-concealing operating system version differences through randomization will
-require an implementation of the underlying support code for every version
-your randomization is trying to mimick.
+ </para>
+ </listitem>
+ </orderedlist>
+ </sect3>
+ <sect3>
+ <title>Randomization or Uniformity?</title>
+ <para>
-In both cases, randomizatin requires virtualization of many underlying
-implementations, where as uniformity only requires virtualization of one
-implementation.
+When applying a form of defense to a specific fingerprinting vector or source,
+there are two general strategies available. Either the implementation for all
+users of a single browser implementation can be made to behave as uniformly as
+possible, or the user agent can attempt to randomize its behavior, so that
+each interaction between a user and a site provides a different fingerprint.
-XXX Virtualization
+ </para>
+ <para>
- </listitem>
- <listitem>
-Usability.
- </listitem>
- <listitem>
+Although <ulink url="http://research.microsoft.com/pubs/209989/tr1.pdf">some
+research suggests</ulink> that randomization can be effective, so far striving
+for uniformity has generally proved to be a better strategy for Tor Browser
+for the following reasons:
-It might not be easy to randomize values in a way that they are not
-distinguishable from noise. In particular, naive randomization
+ </para>
+ <orderedlist>
+ <listitem><command>Randomization is not a shortcut</command>
+ <para>
+
+While it appears that many end-user configuration details that the browser
+currently exposes may be safely replaced by false information, randomization
+of these details must be just as exhaustive as an approach that seeks to make
+these behaviors uniform. In the face of either strategy, the adversary can
+still make use of those features which have not been altered to be either
+sufficiently uniform or sufficiently random.
+
+ </para>
+ <para>
+
+Furthermore, the randomization approach seems to break down when it is applied
+to deeper issues where underlying system functionality is directly exposed. In
+particular, it is not clear how to randomize the capabilities of hardware
+attached to a computer in such a way that it either convincingly behaves like
+other hardware, or where the exact properties of the hardware that vary from
+user to user are sufficiently randomized. Similarly, truly concealing operating
+system version differences through randomization may require reimplementation
+of the underlying operating system functionality to ensure that every version
+that your randomization is trying to blend in with is covered by the range of
+possible behaviors.
+
+ </para>
</listitem>
- <listitem>
+ <listitem><command>Evaluation and measurement difficulties</command>
+ <para>
+
+The fact that randomization causes behaviors to differ slightly with every
+visit makes it appealing at first glance, but this same property makes it very
+difficult to objectively measure its effectiveness. By contrast, an
+implementation that strives for uniformity is very simple to measure. Despite
+their current flaws, a properly designed version of <ulink
+url="https://panopticlick.eff.org/">Panopticlick</ulink> or <ulink
+url="https://amiunique.org/">Am I Unique</ulink> could report the entropy and
+uniqueness rates for all users of a single user agent version, without the
+need for complicated statistics about the variance of the measured behaviors.
+
+ </para>
+ <para>
-Hard to measure success.
+Randomization (especially incomplete randomization) may also provide a false
+sense of security. When a fingerprinting attempt makes naive use of randomized
+information, a fingerprint will appear unstable, but may not actually be
+sufficiently randomized to prevent a dedicated adversary. Sophisticated
+fingerprinting mechanisms may either ignore randomized information, or
+incorporate knowledge of the distribution and range of randomized values into
+the creation of a more stable fingerprint (by either removing the randomness,
+modeling it, or averaging it).
+ </para>
</listitem>
- <listitem>
+ <listitem><command>Usability issues</command>
+ <para>
-Completeness. Randomization may provide a false sense of security - any items
-that are not randomized, or for which the randomization can be averaged away
-will still be desirable targets.
+When randomization is introduced to features that affect site behavior, it can
+be very distracting for this behavior to change between visits of a given
+site. For simple cases such as when this information affects layout behavior,
+this will lead to visual nuisances. However, when this information affects
+reported functionality or hardware characteristics, sometimes a site will
+function one way on one visit, and another way on a subsequent visit.
+ </para>
</listitem>
- <listitem>
+ <listitem><command>Performance costs</command>
+
+ <para>
Randomizing involves performance costs. This is especially true if the
fingerprinting surface is large (like in a modern browser) and one needs more
elaborate randomizing strategies (including randomized virtualization) to
-ensure that the randomization fully conceals the true behavior.
+ensure that the randomization fully conceals the true behavior. Many calls to
+a cryptographically secure random number generator during the course of a page
+load will both exhaust available entropy pools and increase the computation
+required to load the page.
+ </para>
</listitem>
- <listitem>
- Randomizing itself might introduce a new fingerprinting vector as the
- process of generating the values for the fingerprintable attributes
- could be susceptible to timing side-channel attacks.
- </listitem>
- </itemizedlist>
- We'll see in the next section that the idea of making users uniform does not
- work either in the general way expressed above mainly due to usability issues.
- However, we believe that it avoids a lot of the complications involved in
- randomization even if just used as a guiding principle.
- </para>
- </sect3>
+ <listitem><command>Increased vulnerability surface</command>
+ <para>
+Randomization itself might introduce a new fingerprinting vector, as the process
+of generating the values for the fingerprintable attributes could itself be
+susceptible to side-channel attacks, analysis, or exploitation.
+ </para>
+ </listitem>
+ </orderedlist>
+ </sect3>
<sect3 id="fingerprinting-defenses">
- <title>Fingerprinting Defenses in the Tor Browser</title>
+ <title>Specific Fingerprinting Defenses in the Tor Browser</title>
<para>
The following defenses are listed roughly in order of most severe
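
Illustrative note (not part of the commit): the "Evaluation and measurement
difficulties" item added above says a fingerprinter may average away
randomness. The sketch below is one way to picture that, under assumptions
that are mine rather than the design document's: the defense adds independent
uniform noise to a numeric attribute on every read, and the adversary can read
it many times. The names randomized_reading and averaged_fingerprint and the
values 1920.0 and 50.0 are hypothetical.

```python
import random
import statistics


def randomized_reading(true_value, noise, rng):
    """Hypothetical naive defense: report the true value plus uniform noise."""
    return true_value + rng.uniform(-noise, noise)


def averaged_fingerprint(true_value, noise, samples, rng):
    """Adversary's estimate: the mean of many randomized readings."""
    return statistics.fmean(
        randomized_reading(true_value, noise, rng) for _ in range(samples)
    )


# Fixed seed so the sketch is reproducible.
rng = random.Random(0)

# A single reading can be off by up to the full noise amplitude.
single = randomized_reading(1920.0, 50.0, rng)

# The mean of many readings converges on the concealed value.
estimate = averaged_fingerprint(1920.0, 50.0, 10_000, rng)

print(f"single reading: {single:.2f}")
print(f"averaged estimate: {estimate:.2f}")
```

Any one reading looks unstable, yet the averaged estimate settles close to the
true value, which is the false sense of security the text warns about. Noise
that is fixed per site or per session would not average out this way, but it
brings back the linkability and usability trade-offs discussed in the other
items.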