[tor-dev] [GSOC16] Fingerprint Central - Status report n°2

Fri Jun 24 14:24:29 UTC 2016

On 06/24/2016 02:27 PM, Nicolas Vigier wrote:
> On Fri, 24 Jun 2016, Pierre Laperdrix wrote:
> 
>>
>>
>> On 06/23/2016 05:19 PM, Nicolas Vigier wrote:
>>> On Fri, 17 Jun 2016, Pierre Laperdrix wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> Here is my second status report for my GSOC project.
>>>> A little reminder that the repo is located on GitHub:
>>>> https://github.com/plaperdr/fp-central
>>>
>>> I have looked at this quickly, and the system to define the attribute
>>> tests seems nice. Is there an option at the end of the tests to download
>>> a file containing all the attributes collected?
>>>
>>
>> I don't know if this is what you are looking for but I've just added a
>> way to download a JSON file containing all the information in the
>> fingerprint. Here is the commit:
>> https://github.com/plaperdr/fp-central/commit/6f09ea5b88cf850f7be14af950e928327f0ded6c
> 
> Thanks!
> 
>>>> 1 - I have progressed faster than I expected in the last two weeks. Here
>>>> is everything that I have done:
>>>> - Storage of fingerprints in a MongoDB database
>>>> - Adding a small API to get statistics on stored variables
>>>> - Adding support of hashed variables for faster stats computation
>>>> - Adding collection of new attributes and support of HTTP headers
>>>> - Adding support of translation with the start of a French version
>>>>
>>>> 2 - I also started development of a page to tell if a user has an
>>>> "acceptable" fingerprint or not (I haven't pushed the code to GitHub
>>>> yet). So far, I'm verifying that the screen resolution is in the correct
>>>> bounds (i.e. not fullscreen) and that there are no plugins in the
>>>> browser. If anyone has any idea that I could implement to help users
>>>> have a less recognizable fingerprint, I'll be happy to add it. I have
>>>> also added steps to follow to help people better configure their browser.
>>>
>>> This "acceptable" fingerprint page is a good idea. However, unless I
>>> misunderstood your latest commit, it seems to be done as a separate
>>> thing from the attributes tests. Is there a reason for not using the
>>> collected attributes to check if the fingerprint is acceptable, rather
>>> than reimplementing the same tests separately?
>>
>> Right now, it is done as a separate thing because I'm only focusing on
>> two key attributes for the moment. It could be integrated into the main
>> collection page but I like the fact that it is separate. Also, I don't
>> want to deceive the user by running lots of automatic things that he
>> or she has no control over. That's why I put three buttons on the main
>> FP page to trigger different actions: I want the user to understand and
>> control what is happening. One additional reason behind the separate
>> page is to avoid a mega page with everything on it. Each page has its
>> own purpose (which is also a plus for maintainability).
>> I don't know if that makes sense but I can change how it is done if the
>> approach is not good.
> 
> Ah, then it looks like the goal of this page is different from what I
> had in mind. The feature I was thinking of would tell if a browser
> looks exactly the same as the current version of Tor Browser, based on
> all the attributes that we can collect. But your page is something else
> that you do before running the fingerprint tests?
> 
>>
>>>
>>> I think one way to do it would be to have a directory with a list
>>> of .json files containing attributes and their values, one file for each
>>> supported version/slider-setting/platform. And if the browser is
>>> matching one of the .json files, then it is considered good. The .json
>>> files would not include attributes such as screen.width or screen.height,
>>> but it could include other attributes indicating if they are rounded.
>>>
> 
> After thinking again about this, rather than having a set of .json
> files, with one for each supported version/slider-setting/platform
> containing a lot of duplicate information (most attributes will be the
> same in all the cases), another way to do it would be to list in the
> fingerprint/attributes/*.json files the expected values in the different
> cases.
> 
> Adding something like this to the .json files:
> 
>  "expected_values" : {
>    "variable_1" :
>      [
>        {
>          "security_slider": 4,
>          "value" : "X"
>        },
>        {
>          "security_slider": 3,
>          "value" : "Y"
>        },
>        {
>          "value" : "Z"
>        }
>      ]
>    }
> 

I think this could work since some attributes should hardly change
across versions and security levels.
What I could do is implement a small system that could work this way:
1) When a new version of the Tor browser is released, someone could go
on the website with an unmodified version of the browser and get the
complete FP by downloading the corresponding JSON file.
2) Scan the file with a specific Python script. If it detects a value
that is different from the previous ones, it automatically adds it to
the right definition JSON file. If the value is already known, it adds
the current version of TBB to the existing entry.

I'm envisioning an augmented version of what you described above:
{
  "name" : ...,
  "description" : ...,
  ...,
  "expected_values" : {
    "variable_1" :
      [
        {
          "versions": ["46.0","47.0","48.0"],
          "security_slider": [1,2,3],
          "value" : "X"
        },
        {
          "versions": ["46.0","47.0","48.0"],
          "security_slider": [4],
          "value" : "X"
        }
      ]
  }
}

Maybe this is not the best format but something along those lines could
work.
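
To make step 2 a bit more concrete, here is a rough Python sketch of
the update logic for one variable. Nothing like this is in the repo
yet: the name record_value and its arguments are only assumptions for
illustration, and I am assuming that a known value simply gets the new
version appended while an unseen value gets its own entry.

```python
def record_value(entries, value, version, slider):
    """Record that `value` was seen with a given TBB version and slider level.

    `entries` is the list stored under one variable in "expected_values".
    Returns True if a new entry had to be created, False otherwise.
    """
    for entry in entries:
        # Same value, and this entry already covers the slider level:
        # only the list of versions may need to grow.
        if entry["value"] == value and slider in entry["security_slider"]:
            if version not in entry["versions"]:
                entry["versions"].append(version)
            return False
    # Value never seen for this slider level: add a separate entry.
    entries.append({
        "versions": [version],
        "security_slider": [slider],
        "value": value,
    })
    return True


# Hypothetical definition for one variable, in the format sketched above.
entries = [{"versions": ["46.0"], "security_slider": [1, 2, 3], "value": "X"}]
record_value(entries, "X", "47.0", 1)  # known value: "47.0" is appended
record_value(entries, "Y", "47.0", 4)  # unseen value: a second entry is added
```

One known simplification: when an entry lists several slider levels,
appending a version marks it as valid for all of them even though only
one level was actually observed.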

>> I haven't thought about it that way but that could be a very good
>> approach. My main intuition behind the "acceptable" fingerprint change
>> was to follow the design principles of Tor and see if they are
>> respected. Now, I check the screen size and the absence of plugins.
>> I think comparing a browser's fingerprint with a list of
>> "normal"/"standard" ones could be good but in order for it to work, I
>> think we need to get some data. One reason behind that would be what
>> happens if the user's fingerprint does not match any "normal" ones. We
>> have to put in place some boundaries for each attribute to say this
>> value for this attribute is varying in a range we consider acceptable.
>> Also, if the user's fingerprint deviates a little, can the user do
>> anything about it? Can generic actions be applied or would it be more of
>> a case-by-case modification?
>> All in all, I put this on the roadmap because I definitely think
>> something cool can be done here but without some data first, I don't
>> think I can implement something really relevant.
> 
> I think that in theory, for a given version of Tor Browser, slider
> setting and platform, all attributes should be the same for everybody,
> with a few exceptions such as the screen size. If that is not the
> case, and an attribute is changing for some people, it might be a
> bug that we want to fix in Tor Browser. So the only data needed would be
> one fingerprint for each platform/slider setting (or to start, a subset
> of them).
> 
> If the user's fingerprint deviates a little, and if they are using the
> latest version, did not play with about:config or install addons, what
> they should do is open a ticket so we can investigate if this is a
> fingerprinting vector that we should try to fix.
> 

With the system described above, we could indicate with a color or a
sign if the collected value is one that is expected and then provide a
link to open a ticket on the bug tracker.
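
As a rough idea of the check behind that indicator, here is a minimal
Python sketch. The function names, the "ok"/"unexpected" labels and the
sample data are all hypothetical; it only assumes that each variable
maps to a list of entries with "versions", "security_slider" and
"value" keys, as in the format proposed earlier in this mail.

```python
def is_expected(entries, value, version, slider):
    """True if some entry lists this value for this version and slider level."""
    return any(
        entry["value"] == value
        and version in entry["versions"]
        and slider in entry["security_slider"]
        for entry in entries
    )


def check_fingerprint(fingerprint, expected_values, version, slider):
    """Return {attribute name: "ok" or "unexpected"} for defined attributes."""
    report = {}
    for name, value in fingerprint.items():
        entries = expected_values.get(name)
        if entries is None:
            # No definition for this attribute (e.g. the screen size):
            # nothing to compare against, so it is left out of the report.
            continue
        matched = is_expected(entries, value, version, slider)
        report[name] = "ok" if matched else "unexpected"
    return report


# Hypothetical data, only to show the shape of the inputs and the output.
expected_values = {
    "variable_1": [
        {"versions": ["46.0", "47.0"], "security_slider": [1, 2, 3], "value": "X"},
    ],
}
fingerprint = {"variable_1": "Y", "screen_width": 1000}
report = check_fingerprint(fingerprint, expected_values, "47.0", 2)
```

The page could then show a color or a sign for each entry of the
report and display the bug tracker link next to the "unexpected" ones.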

Right now, I'm focused on launching a beta version of the website but
after it is done, it seems to me that it can be a worthwhile addition
that gives additional information to devs and users about what may not
be normal.

Pierre
