[tor-commits] [torspec/master] Four new proposals based on experiments with download size
nickm at torproject.org
Fri Feb 24 16:26:31 UTC 2017
commit 2e5e0cb3f87f6813b789f09459daea6ebcaa4eb4
Author: Nick Mathewson <nickm at torproject.org>
Date: Fri Feb 24 11:23:31 2017 -0500
Four new proposals based on experiments with download size
---
proposals/000-index.txt | 8 ++
proposals/274-rotate-onion-keys-less.txt | 113 +++++++++++++++++++++++++
proposals/275-md-published-time-is-silly.txt | 119 +++++++++++++++++++++++++++
proposals/276-lower-bw-granularity.txt | 70 ++++++++++++++++
proposals/277-detect-id-sharing.txt | 59 +++++++++++++
5 files changed, 369 insertions(+)
diff --git a/proposals/000-index.txt b/proposals/000-index.txt
index 4e400c8..d3a4100 100644
--- a/proposals/000-index.txt
+++ b/proposals/000-index.txt
@@ -194,6 +194,10 @@ Proposals by number:
271 Another algorithm for guard selection [CLOSED]
272 Listed routers should be Valid, Running, and treated as such [FINISHED]
273 Exit relay pinning for web services [DRAFT]
+274 Rotate onion keys less frequently [OPEN]
+275 Stop including meaningful "published" time in microdescriptor consensus [OPEN]
+276 Report bandwidth with lower granularity in consensus documents [OPEN]
+277 Detect multiple relay instances running with same ID [OPEN]
Proposals by status:
@@ -249,6 +253,10 @@ Proposals by status:
256 Key revocation for relays and authorities
261 AEZ for relay cryptography
262 Re-keying live circuits with new cryptographic material
+ 274 Rotate onion keys less frequently [for 0.3.1.x-alpha]
+ 275 Stop including meaningful "published" time in microdescriptor consensus [for 0.3.1.x-alpha]
+ 276 Report bandwidth with lower granularity in consensus documents [for 0.3.1.x-alpha]
+ 277 Detect multiple relay instances running with same ID [for 0.3.??]
ACCEPTED:
140 Provide diffs between consensuses
172 GETINFO controller option for circuit information
diff --git a/proposals/274-rotate-onion-keys-less.txt b/proposals/274-rotate-onion-keys-less.txt
new file mode 100644
index 0000000..0d61d5d
--- /dev/null
+++ b/proposals/274-rotate-onion-keys-less.txt
@@ -0,0 +1,113 @@
+Filename: 274-rotate-onion-keys-less.txt
+Title: Rotate onion keys less frequently.
+Author: Nick Mathewson
+Created: 20-Feb-2017
+Status: Open
+Target: 0.3.1.x-alpha
+
+1. Overview
+
+ This document proposes that, in order to limit the bandwidth needed
+ for microdescriptor listing and transmission, we reduce the onion key
+ rotation rate from the current value (7 days) to something closer to
+ 28 days.
+
+ Doing this will reduce the total microdescriptor download volume
+ by approximately 70%.
+
+2. Motivation
+
+ Currently, clients must download a networkstatus consensus document
+ once an hour, and must download every unfamiliar microdescriptor
+ listed in that document. Therefore, we can reduce client directory
+ bandwidth if we can cause microdescriptors to change less often.
+
+ Furthermore, we are planning (in proposal 140) to implement a
+ diff-based mechanism for clients to download only the parts of each
+ consensus that have changed. If we do that, then by having the
+ microdescriptor for each router change less often, we can make these
+ consensus diffs smaller as well.
+
+3. Analysis
+
+ I analyzed microdescriptor changes over the month of January
+ 2017, and found that 94.5% of all microdescriptor transitions
+ were changes in onion key alone.
+
+ Therefore, we could reduce the number of changed "m" lines in
+ consensus diffs by approximately 94.5% * (3/4) =~ 70%,
+ if we were to rotate onion keys one-fourth as often.
+
+ The number of microdescriptors to actually download should
+ decrease by a similar number.
+
+ This amounts to a significant reduction: currently, by
+ back-of-the-envelope estimates, an always-on client that downloads
+ all the directory info in a month downloads about 449 MB of compressed
+ consensuses and something around 97 MB of compressed
+ microdescriptors. This proposal would save that user about 12% of
+ their total directory bandwidth.
+
+ If we assume that consensus diffs are implemented (see proposal 140),
+ then the user's compressed consensus downloads fall to something
+ closer to 27 MB. Under that analysis, the microdescriptors will
+ dominate again at 97 MB -- so lowering the number of microdescriptors
+ to fetch would save more like 55% of the remaining bandwidth.
+
+ [Back-of-the-envelope technique: assume every consensus is
+ downloaded, and every microdesc is downloaded, and microdescs are
+ downloaded in groups of 61, which works out to a constant rate.]
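The percentages in this section can be re-derived from the figures it quotes. The following is a minimal sketch of that arithmetic only; the variable names are illustrative and not part of the proposal.

```python
# Figures quoted in Section 3 of proposal 274 (names here are illustrative).
ONION_KEY_ONLY_FRACTION = 0.945  # transitions that change the onion key alone
ROTATION_REDUCTION = 3 / 4       # rotating one-fourth as often removes 3/4 of them

CONSENSUS_MB = 449.0       # monthly compressed consensuses, no diffs
CONSENSUS_DIFF_MB = 27.0   # monthly compressed consensuses, with proposal 140 diffs
MICRODESC_MB = 97.0        # monthly compressed microdescriptors

# Fraction of microdescriptor transitions that would disappear.
md_reduction = ONION_KEY_ONLY_FRACTION * ROTATION_REDUCTION

# Microdescriptor bytes saved per month, assuming downloads scale with transitions.
saved_mb = MICRODESC_MB * md_reduction

# Share of total directory bandwidth saved, without and with consensus diffs.
share_today = saved_mb / (CONSENSUS_MB + MICRODESC_MB)
share_with_diffs = saved_mb / (CONSENSUS_DIFF_MB + MICRODESC_MB)

if __name__ == "__main__":
    print(f"microdescriptor reduction: {md_reduction:.1%}")
    print(f"saved today:               {share_today:.1%}")
    print(f"saved with diffs:          {share_with_diffs:.1%}")
```

The results land close to the rounded values in the text (roughly 70%, 12%, and 55%); the small differences come from rounding in the prose.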
+
+ We'll need to do more analysis to assess the impact on clients that
+ connect to the network infrequently enough to miss microdescriptors:
+ nonetheless, the 70% figure above ought to apply to clients that connect
+ at least weekly.
+
+ (XXXX Better results pending feedback from ahf's analysis.)
+
+4. Security analysis
+
+ The onion key is used to authenticate a relay to a client when the
+ client is building a circuit through that relay. The only reason to
+ limit its lifetime is to limit the impact if an attacker steals an
+ onion key without being detected.
+
+ If an attacker steals an onion key and is detected, the relay can
+ issue a new onion key ahead of schedule, with little disruption.
+
+ But if the onion key theft is _not_ detected, then the attacker
+ can use that onion key to impersonate the relay until clients
+ start using the relay's next key. In order to do so, the
+ attacker must also impersonate the target relay at the link
+ layer: either by stealing the relay's link keys, which rotate
+ more frequently, or by compromising the previous relay in the
+ circuit.
+
+ Therefore, onion key rotation provides a small amount of
+ protection only against an attacker who can compromise relay keys
+ very intermittently, and who controls only a small portion of the
+ network. Against an attacker who can steal keys regularly it
+ does little, and an attacker who controls a lot of the network
+ can already mount other attacks.
+
+5. Proposal
+
+ I propose that we move the default onion key rotation interval
+ from 7 days to 28 days, as follows.
+
+ There should be a new consensus parameter, "onion-key-rotation-days",
+ measuring the key lifetime in days. Its minimum should be 1, its
+ maximum should be 90, and its default should be 28.
+
+ There should also be a new consensus parameter,
+ "onion-key-grace-period-days", measuring the interval for which
+ older onion keys should still be accepted. Its minimum should be
+ 1, its maximum should be onion-key-rotation-days, and its default
+ should be 7.
+
+ Every relay should list each onion key it generates for
+ onion-key-rotation-days days after generating it, and then
+ replace it. Relays should continue to accept their most recent
+ previous onion key for an additional onion-key-grace-period-days
+ days after it is replaced.
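A minimal sketch of how the two consensus parameters might be clamped and applied. It assumes that the grace period (not the rotation interval) governs how long a replaced key is still accepted, as the definition of "onion-key-grace-period-days" suggests. The class, function, and state names are hypothetical and are not taken from the Tor source.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


@dataclass(frozen=True)
class OnionKeyParams:
    rotation_days: int = 28
    grace_period_days: int = 7

    @classmethod
    def from_consensus(cls, params):
        """Read both parameters from a dict of consensus params, applying
        the minimum, maximum, and default values given in the proposal."""
        rotation = clamp(int(params.get("onion-key-rotation-days", 28)), 1, 90)
        # The grace period may not exceed the rotation interval.
        grace = clamp(int(params.get("onion-key-grace-period-days", 7)), 1, rotation)
        return cls(rotation, grace)


def key_state(generated_at, now, params):
    """Classify an onion key by its age in seconds (hypothetical states)."""
    age = now - generated_at
    if age < 0:
        raise ValueError("key was generated in the future")
    if age < params.rotation_days * DAY:
        return "listed"    # still the key published in the descriptor
    if age < (params.rotation_days + params.grace_period_days) * DAY:
        return "accepted"  # replaced, but still honored for old circuits
    return "expired"
```

For example, with the defaults a key generated at time 0 is "listed" through day 27, "accepted" from day 28 through day 34, and "expired" from day 35 on.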
+
diff --git a/proposals/275-md-published-time-is-silly.txt b/proposals/275-md-published-time-is-silly.txt
new file mode 100644
index 0000000..b23e747
--- /dev/null
+++ b/proposals/275-md-published-time-is-silly.txt
@@ -0,0 +1,119 @@
+Filename: 275-md-published-time-is-silly.txt
+Title: Stop including meaningful "published" time in microdescriptor consensus
+Author: Nick Mathewson
+Created: 20-Feb-2017
+Status: Open
+Target: 0.3.1.x-alpha
+
+1. Overview
+
+ This document proposes that, in order to limit the bandwidth needed
+ for networkstatus diffs, we remove the "published" part of the "r" lines
+ in microdescriptor consensuses.
+
+ The more extreme, compatibility-breaking version of this idea will
+ reduce ed-style consensus diff download volume by approximately 55-75%. A
+ less-extreme interim version would still reduce volume by
+ approximately 5-6%.
+
+2. Motivation
+
+ The current microdescriptor consensus "r" line format is:
+ r Nickname Identity Published IP ORPort DirPort
+ as in:
+ r moria1 lpXfw1/+uGEym58asExGOXAgzjE 2017-01-10 07:59:25 \
+ 128.31.0.34 9101 9131
+
+ As I'll show below, there's not much use for the "Published" part
+ of these lines. By omitting them or replacing them with
+ something more compressible, we can save space.
+
+ What's more, changes in the Published field are one of the most
+ frequent changes between successive networkstatus consensus
+ documents. If we were to remove this field, then networkstatus diffs
+ (see proposal 140) would be smaller.
+
+3. Compatibility notes
+
+ Above I've talked about "removing" the published field. But of
+ course, doing this would make all existing consensus consumers
+ stop parsing the consensus successfully.
+
+ Instead, let's look at how this field is used currently in Tor,
+ and see if we can replace the value with something else.
+
+ * Published is used in the voting process to decide which
+ descriptor should be considered. But that is taken from
+ vote networkstatus documents, not consensuses.
+
+ * Published is used in mark_my_descriptor_dirty_if_too_old()
+ to decide whether to upload a new router descriptor. If the
+ published time in the consensus is more than 18 hours in the
+ past, we upload a new descriptor. (Relays are potentially
+ looking at the microdesc consensus now, since #6769 was
+ merged in 0.3.0.1-alpha.) Relays have plenty of other ways
+ to notice that they should upload new descriptors.
+
+ * Published is used in client_would_use_router() to decide
+ whether a routerstatus is one that we might possibly use.
+ We say that a routerstatus is not usable if its published
+ time is more than OLD_ROUTER_DESC_MAX_AGE (5 days) in the
+ past, or if it is not at least
+ TestingEstimatedDescriptorPropagationTime (10 minutes) in
+ the future. [***] Note that this is the only case where anything
+ is rejected because it comes from the future.
+
+ * client_would_use_router() decides whether we should
+ download a router descriptor (not a microdescriptor)
+ in routerlist.c
+
+ * client_would_use_router() is used from
+ count_usable_descriptors() to decide which relays are
+ potentially usable, thereby forming the denominator of
+ our "have descriptors / usable relays" fraction.
+
+ So we have fairly limited constraints on which Published values
+ we can safely advertise with today's Tor implementations. If we
+ advertise anything more than 10 minutes in the future,
+ client_would_use_router() will consider routerstatuses unusable.
+ If we advertise anything more than 18 hours in the past, relays
+ will upload their descriptors far too often.
+
+4. Proposal
+
+ Immediately, in 0.2.9.x-stable (our LTS release series), we
+ should stop caring about published_on dates in the future. This
+ is a two-line change.
+
+ As an interim solution: We should add a new consensus method number
+ that changes the process by which Published fields in consensuses are
+ generated. It should set all Published fields in the consensus to
+ the same value. This value should rotate every 15 hours: take the
+ consensus valid-after time and round it down to the nearest multiple
+ of 15 hours since the epoch.
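The interim rule amounts to integer rounding on the Unix timestamp of the valid-after time. A minimal sketch, assuming the valid-after time is available as a timezone-aware datetime; the function name is hypothetical.

```python
from datetime import datetime, timezone

ROTATION_SECONDS = 15 * 60 * 60  # 15 hours


def interim_published_time(valid_after):
    """Round a consensus valid-after time down to the nearest multiple
    of 15 hours since the Unix epoch."""
    if valid_after.tzinfo is None:
        # A naive datetime would be interpreted in local time.
        raise ValueError("valid_after must be timezone-aware")
    ts = int(valid_after.timestamp())
    rounded = ts - (ts % ROTATION_SECONDS)
    return datetime.fromtimestamp(rounded, tz=timezone.utc)


if __name__ == "__main__":
    va = datetime(2017, 1, 10, 7, 0, 0, tzinfo=timezone.utc)
    print(interim_published_time(va).isoformat())
```

Because 15 hours does not divide 24 hours evenly, the rotation boundaries drift across times of day rather than landing on the same hours each day.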
+
+ As a longer-term solution: Once all Tor versions earlier than 0.2.9.x
+ are obsolete (in mid 2018), we can update with a new consensus
+ method, and set the published_on date to some safe time in the
+ future.
+
+5. Analysis
+
+ To consider the impact on consensus diffs: I analyzed consensus
+ changes over the month of January 2017, using scripts at [1].
+
+ With the interim solution in place, compressed diff sizes fell by
+ 2-7% at all measured intervals except 12 hours, where they increased
+ by about 4%. Savings of 5-6% were most typical.
+
+ With the longer-term solution in place, and all published times held
+ constant permanently, the compressed diff sizes were uniformly at
+ least 56% smaller.
+
+ With this in mind, I think we might want to only plan to support the
+ longer-term solution.
+
+ [1] https://github.com/nmathewson/consensus-diff-analysis
+
+
+
diff --git a/proposals/276-lower-bw-granularity.txt b/proposals/276-lower-bw-granularity.txt
new file mode 100644
index 0000000..4d3735c
--- /dev/null
+++ b/proposals/276-lower-bw-granularity.txt
@@ -0,0 +1,70 @@
+Filename: 276-lower-bw-granularity.txt
+Title: Report bandwidth with lower granularity in consensus documents
+Author: Nick Mathewson
+Created: 20-Feb-2017
+Status: Open
+Target: 0.3.1.x-alpha
+
+1. Overview
+
+ This document proposes that, in order to limit the bandwidth needed for
+ networkstatus diffs, we lower the granularity with which bandwidth is
+ reported in consensus documents.
+
+ Making this change will reduce the total compressed ed-style diff download
+ volume by around 10%.
+
+2. Motivation
+
+ Consensus documents currently report bandwidth values as the median
+ of the measured bandwidth values in the votes. (Or as the median of
+ all votes' values if there are not enough measurements.) And when
+ voting, in turn, authorities simply report whatever measured value
+ they most recently encountered, clipped to 3 significant base-10
+ figures.
+
+ This means that, from one consensus to the next, these weights vary
+ often and with little significance: a large fraction of bandwidth
+ transitions are under 2% in magnitude.
+
+ As we begin to use consensus diffs, each change will take space to
+ transmit. So lowering the number of changes will lower client
+ bandwidth requirements significantly.
+
+3. Proposal
+
+ I propose that we round the bandwidth values as they are placed in
+ the votes to no more than two significant digits. In addition, for
+ values beginning with decimal "2" through "4", we should round the
+ first two digits to the nearest multiple of 2. For values beginning
+ with decimal "5" through "9", we should round to the nearest multiple
+ of 5.
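One way to read the rounding rule above, as a minimal sketch. Two choices here are assumptions, since the proposal does not specify them: values below 10 are left unchanged, and ties round upward. The function name is hypothetical.

```python
def round_bandwidth(value):
    """Round a non-negative integer bandwidth value to at most two
    significant digits, with a coarser step for larger leading digits."""
    if value < 0:
        raise ValueError("bandwidth must be non-negative")
    if value < 10:
        return value  # assumption: single-digit values are left alone

    digits = str(value)
    leading = int(digits[0])
    # Place value of the second significant digit.
    scale = 10 ** (len(digits) - 2)

    if leading == 1:
        step = 1   # 10, 11, ..., 19
    elif leading <= 4:
        step = 2   # 20, 22, ..., 48
    else:
        step = 5   # 50, 55, ..., 95

    unit = step * scale
    # Integer round-to-nearest multiple of unit; ties go up (assumption).
    return ((value + unit // 2) // unit) * unit


if __name__ == "__main__":
    for v in (1234, 2345, 5678):
        print(v, "->", round_bandwidth(v))
```

Under this reading, 1234 becomes 1200, 2345 becomes 2400, and 5678 becomes 5500. The largest relative change occurs just above a decade or step boundary (for example 105 becoming 110), which stays under the 5% bound claimed in the analysis.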
+
+ This change does not require a consensus method; it will take effect
+ once enough authorities have upgraded.
+
+4. Analysis
+
+ The rounding proposed above will not round any value by more than
+ 5%, so the overall impact on bandwidth balancing should be small.
+
+ In order to assess the bandwidth savings of this approach, I
+ smoothed the January 2017 consensus documents' Bandwidth fields,
+ using scripts from [1]. I found that if clients download
+ consensus diffs once an hour, they can expect 11-13% mean savings
+ after xz or gz compression. For two-hour intervals, the savings
+ is 8-10%; for three-hour or four-hour intervals, the savings is
+ only 6-8%. After that point, we start seeing diminishing returns,
+ with only 1-2% savings on a 72-hour interval's diff.
+
+ [1] https://github.com/nmathewson/consensus-diff-analysis
+
+5. Open questions:
+
+ Is there a greedier smoothing algorithm that would produce better
+ results?
+
+ Is there any reason to think this amount of smoothing would not
+ be safe?
+
+ Would a time-aware smoothing mechanism work better?
diff --git a/proposals/277-detect-id-sharing.txt b/proposals/277-detect-id-sharing.txt
new file mode 100644
index 0000000..dee7f6e
--- /dev/null
+++ b/proposals/277-detect-id-sharing.txt
@@ -0,0 +1,59 @@
+Filename: 277-detect-id-sharing.txt
+Title: Detect multiple relay instances running with same ID.
+Author: Nick Mathewson
+Created: 20-Feb-2017
+Status: Open
+Target: 0.3.??
+
+1. Overview
+
+ This document proposes that we detect multiple relay instances running
+ with the same ID, and block them all, or block all but one of each.
+
+2. Motivation
+
+ While analyzing microdescriptor and relay status transitions (see
+ proposal XXXX), I found that something like 16/10631 router
+ identities from January 2017 were apparently shared by two or
+ more relays, based on their excessive number of onion key
+ transitions. This is probably accidental; and if intentional,
+ it's probably not achieving whatever the relay operators
+ intended.
+
+ Sharing identities causes all the relays in question to "flip" back
+ and forth onto the network, depending on which one uploaded its
+ descriptor most recently. One relay's address will be listed; and
+ so will that relay's onion key. Routers connected to one of the
+ other relays will believe its identity, but be suspicious of its
+ address. Attempts to extend to the relay will fail because of the
+ incorrect onion key. No more than one of the relays' bandwidths will
+ actually get significant use.
+
+ So clearly, it would be best to prevent this.
+
+3. Proposal 1: relay-side detection
+
+ Relays should themselves try to detect whether another relay is using
+ its identity. If a relay, while running, finds that it is listed in
+ a fresh consensus using an onion key other than its current or
+ previous onion key, it should tell its operator about the problem.
+
+ (This proposal borrows from Mike Perry's ideas related to key theft
+ detection.)
+
+4. Proposal 2: offline detection
+
+ Any relay that has a large number of onion-key transitions over time,
+ but only a small number of distinct onion keys, is probably two or
+ more relays in conflict with one another.
+
+ In this case, the operators can be contacted, or the relay
+ blacklisted.
+
+ We could build support for blacklisting all but one of the addresses,
+ but it's probably best to treat this as a misconfiguration serious
+ enough that it needs to be resolved.
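The offline heuristic can be sketched as a count over the onion keys seen for one identity in successive consensuses. The proposal gives no thresholds, so the defaults below are purely hypothetical placeholders, as is the function name.

```python
def looks_shared(onion_keys, min_transitions=8, max_distinct=4):
    """Flag an identity whose onion key changes often but keeps cycling
    through a small set of values.

    onion_keys: the onion key listed for one relay identity in each
    successive consensus, oldest first. Thresholds are hypothetical.
    """
    # A transition is any change of key between adjacent consensuses.
    transitions = sum(
        1 for prev, cur in zip(onion_keys, onion_keys[1:]) if prev != cur
    )
    distinct = len(set(onion_keys))
    return transitions >= min_transitions and distinct <= max_distinct


if __name__ == "__main__":
    flapping = ["keyA", "keyB"] * 6          # two relays taking turns
    normal = ["keyA"] * 5 + ["keyB"] * 5     # one ordinary rotation
    print(looks_shared(flapping), looks_shared(normal))
```

A relay that rotates normally shows few transitions, and one that rotates very often shows many distinct keys; only the combination of many transitions and few distinct keys matches the pattern described in this section.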
+
+
+
+