[metrics-bugs] #21315 [Circumvention/Snowflake]: publish some realtime stats from the broker?
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue May 21 19:53:15 UTC 2019
#21315: publish some realtime stats from the broker?
-------------------------------------+--------------------------------
Reporter: arma | Owner: (none)
Type: enhancement | Status: needs_revision
Priority: Medium | Milestone:
Component: Circumvention/Snowflake | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: #29461 | Points:
Reviewer: irl | Sponsor: Sponsor19
-------------------------------------+--------------------------------
Comment (by cohosh):
Replying to [comment:14 irl]:
Thanks for this feedback. This was very helpful. What makes snowflake
statistics a little more complex than bridge or relay stats is that, while
stats about how many times a bridge was used don't closely reflect
client usage, each proxy handles either a single client or a small, fixed
number of clients (as determined by the individual proxy), so there's a
greater possibility of data leakage there.
> Number of currently available snowflake proxies is not sensitive. We do
> not make any efforts to hide the numbers of relays or bridges, and so this
> can be an exact count. The question here is not the count resolution but
> the time resolution. (Sorry to answer your question with a question.)
>
> If I'm an attacker, can I learn anything about a client if I can observe
> the client's traffic and the exact count of snowflakes? For example, what
> do I learn if a snowflake that a client is using disappears? I'm not sure
> what the snowflake protocol does in this case.
Possibly. As stated above, it depends on the type of proxy and how it's
set up. I think we're better off doing binning in this case. However, as
stated below, if we collect at a granularity of 24 hours, this shouldn't
leak client usage.
>
> I'm not sure what you mean with the GeoIP stats. If these are stats
> regarding the locations of proxies, again exact counts would be fine and
> would be in line with what we do for relays and bridges at the moment. If
> this is for clients, we should aim to provide differential privacy. I fear
> that at the moment, we are not seeing enough users that we can safely
> report GeoIP stats (usefully) for clients at all. With relays and bridges,
> we round the counts up to the nearest multiple of 8.
We're absolutely not collecting GeoIP stats of clients. These are only of
snowflake proxies. I originally thought to include GeoIP stats of proxies
that are actually handed out, but it's safer to do stats for available
proxies, since this shouldn't leak client usage if collected over a period
of 24 hours.
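
As a rough illustration of the binning discussed here, the following Python
sketch counts unique available proxy IPs per country and rounds each count up
to the nearest multiple of 8. The function names and the (ip, country code)
input format are assumptions made for this example; the broker's real data
structures and GeoIP lookup are not shown.

```python
from collections import defaultdict


def round_up(count: int, bin_size: int = 8) -> int:
    """Round count up to the nearest multiple of bin_size (0 stays 0)."""
    return -(-count // bin_size) * bin_size


def available_proxy_counts(polls, bin_size: int = 8) -> dict:
    """Return binned per-country counts of unique proxy IPs.

    polls is an iterable of (ip, country_code) pairs observed during one
    measurement interval (hypothetical input; the country code is assumed
    to come from a GeoIP lookup done elsewhere).
    """
    unique_ips = defaultdict(set)
    for ip, country_code in polls:
        # A set per country deduplicates proxies that poll repeatedly.
        unique_ips[country_code].add(ip)
    return {
        cc: round_up(len(ips), bin_size)
        for cc, ips in sorted(unique_ips.items())
    }
```

Only the rounded values would be exported, so a single proxy appearing or
disappearing changes the published number only when the exact count crosses
a bin boundary.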
>
> Round trip time of snowflake rendezvous sounds like a really useful
> metric for engineering work, but a dangerous one for safety. This would be
> a good candidate for PrivCount but without such a technique I wouldn't do
> this one. We currently measure performance of relays using active
> measurement, such that we are only analyzing our own traffic. We have
> extended that tool, OnionPerf, to also work for pluggable transports but
> it will do the end-to-end performance not just client->snowflake.
>
That's fair. This is really only for debugging purposes; we don't need to
export it as a metric, and I'd argue that it should only be logged
locally.
> Can you lay out in detail exactly what metrics you'd want, what
> resolution data you want (both in counts and in time) and what you might
> consider an attacker could learn, assuming they are in a position to
> monitor, or are running, a point in the network?
>
It looks like bridge stats default to 24 hours; that seems reasonable for
snowflake as well.
> Section 2.1.2 of dir-spec contains some examples of descriptions of
> metrics.
To summarize, and be more precise about what we want to collect, I've put
our proposed exported metrics in the Tor Directory Protocol Format:
{{{
"snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
    [At most once.]

    YYYY-MM-DD HH:MM:SS defines the end of the included measurement
    interval of length NSEC seconds (86400 seconds by default).

"snowflake-ips" CC=NUM,CC=NUM,... NL
    [At most once.]

    List of mappings from two-letter country codes to the number of
    unique IP addresses of available snowflake proxies, rounded up
    to the nearest multiple of 8.

"snowflake-available-count" NUM NL
    [At most once.]

    A count of the number of unique IP addresses corresponding
    to currently available snowflake proxies, rounded up to
    the nearest multiple of 8.

"snowflake-usage-count" NUM NL
    [At most once.]

    A count of the number of snowflake proxies that have been
    handed out by the broker to clients, rounded up to the
    nearest multiple of 8.
}}}
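
To make the proposed format concrete, here is a small Python sketch that
renders the four lines from counts that have already been binned. The function
name and its parameters are hypothetical, chosen for this example only, and
the output layout simply follows the descriptions in the block above; it is
not taken from any existing broker code.

```python
from datetime import datetime, timezone


def format_snowflake_stats(end: datetime, nsec: int, ips_by_country: dict,
                           available: int, usage: int) -> str:
    """Render the proposed snowflake stats lines, one per line.

    All counts are assumed to be rounded up to a multiple of 8 already.
    """
    ips = ",".join(f"{cc}={num}" for cc, num in sorted(ips_by_country.items()))
    lines = [
        f"snowflake-stats-end {end:%Y-%m-%d %H:%M:%S} ({nsec} s)",
        f"snowflake-ips {ips}",
        f"snowflake-available-count {available}",
        f"snowflake-usage-count {usage}",
    ]
    # Each item is terminated by a newline (NL in the format above).
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are made up for illustration.
    end = datetime(2019, 5, 21, 19, 53, 15, tzinfo=timezone.utc)
    print(format_snowflake_stats(end, 86400, {"us": 16, "de": 8}, 24, 40), end="")
```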
So in short, we'd collect over a 24-hour period:
- GeoIP stats of unique available snowflake proxies
- approximated count of unique, available snowflake proxies
- approximated count of the number of proxies handed to snowflake clients
(which would also be the same as the total number of client requests).
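
The collection over one interval could look roughly like the Python sketch
below: events are accumulated in memory, and only binned totals leave the
accumulator once the interval has elapsed. The class and method names are
hypothetical, and timestamps are passed in explicitly as seconds so the
example stays self-contained; this is an illustration under those
assumptions, not the broker's implementation.

```python
class IntervalStats:
    """Accumulates broker events for one measurement interval."""

    def __init__(self, start: float, nsec: int = 86400, bin_size: int = 8):
        self.start = start
        self.nsec = nsec
        self.bin_size = bin_size
        self.available_ips = set()
        self.handed_out = 0

    def proxy_polled(self, ip: str) -> None:
        # Unique IPs only: repeated polls from one proxy count once.
        self.available_ips.add(ip)

    def proxy_handed_out(self) -> None:
        self.handed_out += 1

    def _binned(self, count: int) -> int:
        # Round up to the nearest multiple of bin_size.
        return (count + self.bin_size - 1) // self.bin_size * self.bin_size

    def maybe_flush(self, now: float):
        """Return binned totals if the interval is over, else None."""
        if now - self.start < self.nsec:
            return None
        result = {
            "end": now,
            "available": self._binned(len(self.available_ips)),
            "usage": self._binned(self.handed_out),
        }
        # Start a fresh interval so exact counts are never retained.
        self.start = now
        self.available_ips = set()
        self.handed_out = 0
        return result
```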
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21315#comment:15>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online