[metrics-bugs] #21315 [Circumvention/Snowflake]: publish some realtime stats from the broker?

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue May 21 19:53:15 UTC 2019


#21315: publish some realtime stats from the broker?
-------------------------------------+--------------------------------
 Reporter:  arma                     |          Owner:  (none)
     Type:  enhancement              |         Status:  needs_revision
 Priority:  Medium                   |      Milestone:
Component:  Circumvention/Snowflake  |        Version:
 Severity:  Normal                   |     Resolution:
 Keywords:                           |  Actual Points:
Parent ID:  #29461                   |         Points:
 Reviewer:  irl                      |        Sponsor:  Sponsor19
-------------------------------------+--------------------------------

Comment (by cohosh):

 Replying to [comment:14 irl]:
 Thanks for this feedback; it was very helpful. What makes snowflake
 statistics a little more complex than bridge or relay stats is that, while
 stats about how many times a bridge was used don't closely reflect client
 usage, a proxy handles either a single client or a small, fixed number of
 clients as determined by the individual proxy, so there's a greater
 possibility of data leakage there.
 > Number of currently available snowflake proxies is not sensitive. We do
 > not make any efforts to hide the numbers of relays or bridges, and so this
 > can be an exact count. The question here is not the count resolution but
 > the time resolution. (Sorry to answer your question with a question.)
 >
 > If I'm an attacker, can I learn anything about a client if I can observe
 > the client's traffic and the exact count of snowflakes. For example, what
 > do I learn if a snowflake that a client is using disappears? I'm not sure
 > what the snowflake protocol does in this case.
 Possibly. As stated above, it depends on what type of proxy is involved and
 how it's set up. I think we're better off binning the counts in this case.
 However, as stated below, if we collect at a granularity of 24 hours, this
 shouldn't leak client usage.
 >
 > I'm not sure what you mean with the GeoIP stats. If these are stats
 > regarding the locations of proxies, again exact counts would be fine and
 > would be in line with what we do for relays and bridges at the moment. If
 > this is for clients, we should aim to provide differential privacy. I fear
 > that at the moment, we are not seeing enough users that we can safely
 > report GeoIP stats (usefully) for clients at all. With relays and bridges,
 > we round the counts up to the nearest multiple of 8.
 We're absolutely not collecting geoip stats of clients; these are only of
 snowflake proxies. I originally thought to include geoip stats of proxies
 that are actually handed out, but it's safer to do stats for available
 proxies, since this shouldn't leak client usage if collected over a period
 of 24 hours.
 >
 > Round trip time of snowflake rendezvous sounds like a really useful
 > metric for engineering work, but a dangerous one for safety. This would be
 > a good candidate for PrivCount but without such a technique I wouldn't do
 > this one. We currently measure performance of relays using active
 > measurement, such that we are only analyzing our own traffic. We have
 > extended that tool, OnionPerf, to also work for pluggable transports but
 > it will do the end-to-end performance not just client->snowflake.
 >
 That's fair. This is really only available for debugging purposes. We don't
 need to export it as a metric, and I'd argue that it should only be logged
 locally.
 > Can you lay out in detail exactly what metrics you'd want, what
 > resolution data you want (both in counts and in time) and what you might
 > consider an attacker could learn, assuming they are in a position to
 > monitor, or are running, a point in the network?
 >
 It looks like bridge stats default to 24 hours; that seems reasonable for
 snowflake as well.
 > Section 2.1.2 of dir-spec contains some examples of descriptions of
 > metrics.

 To summarize, and be more precise about what we want to collect, I've put
 our proposed exported metrics in the Tor Directory Protocol Format:
 {{{
     "snowflake-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
         [At most once.]

         YYYY-MM-DD HH:MM:SS defines the end of the included measurement
         interval of length NSEC seconds (86400 seconds by default).

     "snowflake-ips" CC=NUM,CC=NUM,... NL
         [At most once.]

         List of mappings from two-letter country codes to the number of
         unique IP addresses of available snowflake proxies, rounded up
         to the nearest multiple of 8.

     "snowflake-available-count" NUM NL
         [At most once.]

         A count of the number of unique IP addresses corresponding
         to currently available snowflake proxies, rounded up to
         the nearest multiple of 8.

     "snowflake-usage-count" NUM NL
         [At most once.]

         A count of the number of snowflake proxies that have been
         handed out by the broker to clients, rounded up to the
         nearest multiple of 8.

 }}}
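
 To make the format concrete, here is a minimal Python sketch of how one
 set of these lines could be produced. It is only an illustration, not the
 broker's code: the function names `round_up` and `format_snowflake_stats`
 and the input values are made up for the example, and it assumes a count
 of zero stays zero after binning.

```python
def round_up(count: int, multiple: int = 8) -> int:
    """Round count up to the nearest multiple; 0 stays 0 (an assumption)."""
    return -(-count // multiple) * multiple


def format_snowflake_stats(end, nsec, ips_by_country, available, usage):
    """Render the four proposed lines; every count is binned before output.

    end            -- "YYYY-MM-DD HH:MM:SS" end of the measurement interval
    nsec           -- interval length in seconds (86400 by default)
    ips_by_country -- exact unique-IP counts keyed by two-letter country code
    available      -- exact count of unique available proxy IPs
    usage          -- exact count of proxies handed out to clients
    """
    ips = ",".join(
        f"{cc}={round_up(n)}" for cc, n in sorted(ips_by_country.items())
    )
    lines = [
        f"snowflake-stats-end {end} ({nsec} s)",
        f"snowflake-ips {ips}",
        f"snowflake-available-count {round_up(available)}",
        f"snowflake-usage-count {round_up(usage)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical numbers, for illustration only.
example = format_snowflake_stats(
    "2019-05-21 00:00:00", 86400, {"us": 13, "de": 3}, 16, 41
)
```

 With these made-up inputs, the per-country counts 13 and 3 are published
 as 16 and 8, and the usage count 41 is published as 48.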

 So in short, we'd collect over a 24-hour period:
 - geoip stats of unique available snowflake proxies
 - approximated count of unique, available snowflake proxies
 - approximated count of the number of proxies handed to snowflake clients
 (which would also be the same as the total number of client requests).
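
 As a rough sketch of what collecting these three items over one interval
 might look like, the Python below accumulates unique proxy IPs per country
 and counts hand-outs, then resets at the end of the interval. The class
 and method names are hypothetical, the country code is assumed to be
 supplied by whatever GeoIP lookup the caller uses, and the counts it
 returns are exact, so they would still need to be rounded up to a multiple
 of 8 before being published.

```python
from collections import defaultdict


class SnowflakeStats:
    """Hypothetical in-memory accumulator for one measurement interval."""

    def __init__(self):
        # country code -> set of unique proxy IPs seen as available
        self.ips_by_country = defaultdict(set)
        # number of proxies handed out to clients
        self.usage = 0

    def record_available(self, ip: str, country_code: str) -> None:
        """Note that a proxy at this IP was available during the interval."""
        self.ips_by_country[country_code].add(ip)

    def record_handout(self) -> None:
        """Note that the broker handed one proxy to a client."""
        self.usage += 1

    def snapshot_and_reset(self):
        """Return exact counts for the interval, then start a new one."""
        per_country = {cc: len(ips) for cc, ips in self.ips_by_country.items()}
        available = len(set().union(*self.ips_by_country.values()))
        usage = self.usage
        self.ips_by_country.clear()
        self.usage = 0
        return per_country, available, usage


# Documentation-range addresses, used purely as placeholders.
stats = SnowflakeStats()
stats.record_available("192.0.2.1", "us")
stats.record_available("192.0.2.1", "us")  # same IP again: counted once
stats.record_available("198.51.100.7", "de")
stats.record_handout()
per_country, available, usage = stats.snapshot_and_reset()
```

 Keeping only sets of addresses and a single counter means nothing in the
 accumulator ties a particular hand-out to a particular proxy, which fits
 the goal of not leaking client usage.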

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/21315#comment:15>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

