[tor-dev] Anonymous Local Count Statistics Using PCSA - GSoC
Aaron Johnson
aaron.m.johnson at nrl.navy.mil
Sun Apr 2 13:22:34 UTC 2017
Also, I think that counting users by IP is still a fine way to do it (absent the privacy issue that PCSA tries to address). I was just stating that my understanding based on talking to the Tor Metrics people is that the plan is to handle the privacy issue by moving to per-connection country statistics instead of by implementing PCSA.
I also wonder how the privacy of PCSA actually compares to that of per-country (noisy) counting, especially if the local statistics could be stored in a differentially private way (again, this requires an accuracy analysis). As Tschorsch and Scheuermann note [0], the FM sketch used by PCSA can indicate the presence of an individual user (Sec. 4). Thus they propose to add noise by independently flipping some of the PCSA bits (Sec. 5). This seems quite similar to the differentially private technique of adding noise to a counter. It is not clear to me that it is better to suffer the inaccuracy of the PCSA sketching plus that of the added noise when one could simply rely on adding differentially private noise, especially when the latter provides a precise notion of privacy where the former does not.
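To make the comparison concrete, here is a rough Python sketch of the two approaches side by side. It is only an illustration under my own assumptions, not the construction from [0] and not anything Tor implements: the names (PCSASketch, flip_bits, noisy_count), the 64 bitmaps, the flip probability p, and the epsilon value are all hypothetical, the bit flipping follows the loose description above rather than the exact perturbation procedure in the paper, and the estimate after flipping is not corrected for the bias that flipping introduces.

```python
import hashlib
import random

PHI = 0.77351  # Flajolet-Martin correction constant


def _hash64(item: str) -> int:
    """Map an item (e.g. an IP address string) to a 64-bit integer."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")


def _rho(x: int, width: int) -> int:
    """Index of the least-significant 1 bit of x, capped at width - 1."""
    if x == 0:
        return width - 1
    return min((x & -x).bit_length() - 1, width - 1)


class PCSASketch:
    """Minimal PCSA (stochastic averaging over FM bitmaps). Illustrative only."""

    def __init__(self, num_maps: int = 64, width: int = 32):
        self.m = num_maps
        self.width = width
        self.bitmaps = [0] * num_maps

    def add(self, item: str) -> None:
        h = _hash64(item)
        idx = h % self.m    # which bitmap this item falls into
        rest = h // self.m  # remaining hash bits pick the bit position
        self.bitmaps[idx] |= 1 << _rho(rest, self.width)

    def estimate(self) -> float:
        # R = index of the lowest zero bit in each bitmap, averaged over maps.
        total = 0
        for b in self.bitmaps:
            r = 0
            while r < self.width and (b >> r) & 1:
                r += 1
            total += r
        return self.m / PHI * 2 ** (total / self.m)

    def flip_bits(self, p: float, rng: random.Random) -> None:
        """Flip every bit independently with probability p (assumed noise model)."""
        for i in range(self.m):
            for j in range(self.width):
                if rng.random() < p:
                    self.bitmaps[i] ^= 1 << j


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count: int, epsilon: float, rng: random.Random,
                sensitivity: float = 1.0) -> float:
    """Plain counter plus Laplace noise with scale sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    sketch = PCSASketch()
    for i in range(10000):
        sketch.add(f"ip-{i}")
    print("PCSA estimate:", sketch.estimate())
    sketch.flip_bits(0.01, rng)
    print("PCSA estimate after flipping:", sketch.estimate())
    print("Laplace-noised counter:", noisy_count(10000, 0.1, rng))
```

Running this for a range of counts, flip probabilities, and epsilon values would give a first feel for the accuracy question, though it is no substitute for an actual analysis.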
Best,
Aaron
[0] Florian Tschorsch and Björn Scheuermann, "An algorithm for privacy-preserving distributed user statistics," Computer Networks 57 (2013).
> On Apr 2, 2017, at 9:07 AM, Aaron Johnson <aaron.m.johnson at nrl.navy.mil> wrote:
>
> Sorry, I should have been more clear there. Tor Metrics estimates the total number of users by counting the number of directory downloads and dividing by an estimated expected number of directory downloads per user per day (10, I believe). This statistic is in the graph under the “Relay Users” tab on <https://metrics.torproject.org/userstats-relay-country.html>.
>
> Best,
> Aaron
>
>> On Apr 2, 2017, at 8:51 AM, Veer Kalantri <mads.531998 at gmail.com> wrote:
>>
>> Which stats are you talking about, Aaron?
>>
>>
>> On Sun, Apr 2, 2017 at 5:45 PM, Aaron Johnson <aaron.m.johnson at nrl.navy.mil> wrote:
>> > These statistics not only report the user's country but also keep
>> > track of the unique IP addresses connecting from each country. This is
>> > needed to present more realistic stats. If we incremented a counter on
>> > every connection instead of once per unique IP address, the statistics
>> > would also reflect users connecting again and again. If we don't count
>> > unique IPs, we get per-country usage rather than per-country users.
>> > We could do much better and implement a way (as described by the OP
>> > of the thread) that counts unique IPs while preserving privacy.
>>
>> It is true that this would count connections rather than unique IPs. However, Tor already infers the number of users by counting directory downloads and then adjusting that number based on how many each user is expected to make. In addition, each user doesn’t necessarily correspond to a different IP because of NAT, and so counting connections may actually be more accurate.
>>
>> Best,
>> Aaron
>> _______________________________________________
>> tor-dev mailing list
>> tor-dev at lists.torproject.org
>> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
>