[or-cvs] r9415: Make a new directory for specification proposals, and move s (in tor/trunk/doc: . spec spec/proposals)
nickm at seul.org
Fri Jan 26 05:50:44 UTC 2007
Author: nickm
Date: 2007-01-26 00:50:40 -0500 (Fri, 26 Jan 2007)
New Revision: 9415
Added:
tor/trunk/doc/spec/dir-spec-v1.txt
tor/trunk/doc/spec/proposals/
tor/trunk/doc/spec/proposals/100-tor-spec-udp.txt
tor/trunk/doc/spec/proposals/101-dir-voting.txt
Removed:
tor/trunk/doc/dir-spec-v1.txt
tor/trunk/doc/dir-voting.txt
tor/trunk/doc/tor-spec-udp.txt
Log:
Make a new directory for specification proposals, and move some proposals there. Also, move dir-spec-v1.txt to spec.
Deleted: tor/trunk/doc/dir-spec-v1.txt
===================================================================
--- tor/trunk/doc/dir-spec-v1.txt 2007-01-26 05:20:26 UTC (rev 9414)
+++ tor/trunk/doc/dir-spec-v1.txt 2007-01-26 05:50:40 UTC (rev 9415)
@@ -1,315 +0,0 @@
-$Id$
-
- Tor Protocol Specification
-
- Roger Dingledine
- Nick Mathewson
-
-0. Preliminaries
-
- THIS SPECIFICATION IS OBSOLETE.
-
- This document specifies the Tor directory protocol as used in version
- 0.1.0.x and earlier. See dir-spec.txt for a current version.
-
-1. Basic operation
-
- There is a small number of directory authorities, and a larger number of
- caches. Clients and servers know public keys for the directory authorities.
- Tor servers periodically upload self-signed "router descriptors" to the
- directory authorities. Each authority publishes a self-signed "directory"
- (containing all the router descriptors it knows, and a statement on which
- are running) and a self-signed "running routers" document containing only
- the statement on which routers are running.
-
- All Tors periodically download these documents, downloading the directory
- less frequently than they do the "running routers" document. Clients
- preferentially download from caches rather than authorities.
-
-1.1. Document format
-
- Router descriptors, directories, and running-routers documents all obey the
- following lightweight extensible information format.
-
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by zero or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
-
- More formally:
-
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword ::= KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS ::= (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
-
- The BeginLine and EndLine of an Object must use the same keyword.
-
- When interpreting a Document, software MUST reject any document containing a
- KeywordLine that starts with a keyword it doesn't recognize.
-
- The "opt" keyword is reserved for non-critical future extensions. All
- implementations MUST ignore any item of the form "opt keyword ....." when
- they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
- as synonymous with "keyword ......" when keyword is recognized.
-
-2. Router descriptor format.
-
- Every router descriptor MUST start with a "router" Item; MUST end with a
- "router-signature" Item and an extra NL; and MUST contain exactly one
- instance of each of the following Items: "published" "onion-key" "link-key"
- "signing-key" "bandwidth". Additionally, a router descriptor MAY contain
- any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
- Other than "router" and "router-signature", the items may appear in any
- order.
-
- The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort
-
- Indicates the beginning of a router descriptor. "address"
- must be an IPv4 address in dotted-quad format. The last
- three numbers indicate the TCP ports at which this OR exposes
- functionality. ORPort is a port at which this OR accepts TLS
- connections for the main OR protocol; SocksPort is deprecated and
- should always be 0; and DirPort is the port at which this OR accepts
- directory-related HTTP connections. If any port is not supported,
- the value 0 is given instead of a port number.
-
- "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
-
- Estimated bandwidth for this router, in bytes per second. The
- "average" bandwidth is the volume per second that the OR is willing
- to sustain over long periods; the "burst" bandwidth is the volume
- that the OR is willing to sustain in very short intervals. The
- "observed" value is an estimate of the capacity this server can
- handle. The server remembers the maximum sustained output bandwidth
- over any ten-second period in the past day, and likewise the maximum
- sustained input. The "observed" value is the lesser of these two numbers.
-
- "platform" string
-
- A human-readable string describing the system on which this OR is
- running. This MAY include the operating system, and SHOULD include
- the name and version of the software implementing the Tor protocol.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- The time, in GMT, when this descriptor was generated.
-
- "fingerprint"
-
- A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key, encoded
- in hex, with a single space after every 4 characters) for this router's
- identity key. A descriptor is considered invalid (and MUST be
- rejected) if the fingerprint line does not match the public key.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "hibernating" 0|1
-
- If the value is 1, then the Tor server was hibernating when the
- descriptor was published, and shouldn't be used to build circuits.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "uptime"
-
- The number of seconds that this OR process has been running.
-
- "onion-key" NL a public key in PEM format
-
- This key is used to encrypt EXTEND cells for this OR. The key MUST
- be accepted for at least XXXX hours after any new key is published in
- a subsequent descriptor.
-
- "signing-key" NL a public key in PEM format
-
- The OR's long-term identity key.
-
- "accept" exitpattern
- "reject" exitpattern
-
- These lines, in order, describe the rules that an OR follows when
- deciding whether to allow a new stream to a given address. The
- 'exitpattern' syntax is described below.
-
- "router-signature" NL Signature NL
-
- The "SIGNATURE" object contains a signature of the PKCS1-padded
- hash of the entire router descriptor, taken from the beginning of the
- "router" line, through the newline after the "router-signature" line.
- The router descriptor is invalid unless the signature is performed
- with the router's identity key.
-
- "contact" info NL
-
- Describes a way to contact the server's administrator, preferably
- including an email address and a PGP key fingerprint.
-
- "family" names NL
-
- 'Names' is a whitespace-separated list of server nicknames. If two ORs
- list one another in their "family" entries, then OPs should treat them
- as a single OR for the purpose of path selection.
-
- For example, if node A's descriptor contains "family B", and node B's
- descriptor contains "family A", then node A and node B should never
- be used on the same circuit.
-
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
-
- Declare how much bandwidth the OR has used recently. Usage is divided
- into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
- the end of the most recent interval. The numbers are the number of
- bytes used in the most recent intervals, ordered from oldest to newest.
-
- [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
-2.1. Nonterminals in router descriptors
-
- nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
-
- exitpattern ::= addrspec ":" portspec
- portspec ::= "*" | port | port "-" port
- port ::= an integer between 1 and 65535, inclusive.
- addrspec ::= "*" | ip4spec | ip6spec
- ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
- ip4 ::= an IPv4 address in dotted-quad format
- ip4mask ::= an IPv4 mask in dotted-quad format
- num_ip4_bits ::= an integer between 0 and 32
- ip6spec ::= ip6 | ip6 "/" num_ip6_bits
- ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128
-
- Ports are required; if they are not included in the router
- line, they must appear in the "ports" lines.
-
-3. Directory format
-
- A Directory begins with a "signed-directory" item, followed by one each of
- the following, in any order: "recommended-software", "published",
- "router-status", "dir-signing-key". It may include any number of "opt"
- items. After these items, a directory includes any number of router
- descriptors, and a single "directory-signature" item.
-
- "signed-directory"
-
- Indicates the start of a directory.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- The time at which this directory was generated and signed, in GMT.
-
- "dir-signing-key"
-
- The key used to sign this directory; see "signing-key" for format.
-
- "recommended-software" comma-separated-version-list
-
- A list of which versions of which implementations are currently
- believed to be secure and compatible with the network.
-
- "running-routers" whitespace-separated-list
-
- A description of which routers are currently believed to be up or
- down. Every entry consists of an optional "!", followed by either an
- OR's nickname, or "$" followed by a hexadecimal encoding of the hash
- of an OR's identity key. If the "!" is included, the router is
- believed not to be running; otherwise, it is believed to be running.
- If a router's nickname is given, exactly one router of that nickname
- will appear in the directory, and that router is "approved" by the
- directory server. If a hashed identity key is given, that OR is not
- "approved". [XXXX The 'running-routers' line is only provided for
- backward compatibility. New code should parse 'router-status'
- instead.]
-
- "router-status" whitespace-separated-list
-
- A description of which routers are currently believed to be up or
- down, and which are verified or unverified. Contains one entry for
- every router that the directory server knows. Each entry is of the
- format:
-
- !name=$digest [Verified router, currently not live.]
- name=$digest [Verified router, currently live.]
- !$digest [Unverified router, currently not live.]
- or $digest [Unverified router, currently live.]
-
- (where 'name' is the router's nickname and 'digest' is a hexadecimal
- encoding of the hash of the router's identity key).
-
- When parsing this line, clients should only mark a router as
- 'verified' if its nickname AND digest match the one provided.
-
- "directory-signature" nickname-of-dirserver NL Signature
-
- The signature is computed by computing the digest of the
- directory, from the characters "signed-directory", through the newline
- after "directory-signature". This digest is then padded with PKCS.1,
- and signed with the directory server's signing key.
-
- If software encounters an unrecognized keyword in a single router descriptor,
- it MUST reject only that router descriptor, and continue using the
- others. Because this mechanism is used to add 'critical' extensions to
- future versions of the router descriptor format, implementations should treat
- it as a normal occurrence and not, for example, report it to the user as an
- error. [Versions of Tor prior to 0.1.1 did this.]
-
- If software encounters an unrecognized keyword in the directory header,
- it SHOULD reject the entire directory.
-
-4. Network-status descriptor
-
- A "network-status" (a.k.a. "running-routers") document is a truncated
- directory that contains only the current status of a list of nodes, not
- their actual descriptors. It contains exactly one of each of the following
- entries.
-
- "network-status"
-
- Must appear first.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- (see section 3 above)
-
- "router-status" list
-
- (see section 3 above)
-
- "directory-signature" NL signature
-
- (see section 3 above)
-
-5. Behavior of a directory server
-
- lists nodes that are connected currently
- speaks HTTP on a socket, spits out directory on request
-
- Directory servers listen on a certain port (the DirPort), and speak a
- limited version of HTTP 1.0. Clients send either GET or POST commands.
- The basic interactions are:
- "%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
- command, url, content-length, host.
- Get "/tor/" to fetch a full directory.
- Get "/tor/dir.z" to fetch a compressed full directory.
- Get "/tor/running-routers" to fetch a network-status descriptor.
- Post "/tor/" to post a server descriptor, with the body of the
- request containing the descriptor.
-
- "host" is used to specify the address:port of the dirserver, so
- the request can survive going through HTTP proxies.
-
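The "router-status" entry format described in section 3 of the deleted
dir-spec-v1.txt can be sketched in Python as follows. This is only an
illustration, not Tor's implementation: the names RouterStatus and
parse_router_status are hypothetical, and because the spec text above does
not fix the digest length, any non-empty hex string is accepted as a digest.
The nickname limit (1 to 19 alphanumeric characters) comes from section 2.1.

```python
import re
from typing import List, NamedTuple, Optional

# Hypothetical pattern for one whitespace-separated entry:
#   [!][name=]$digest
# Assumption: digest length is not checked, only that it is hexadecimal.
_ENTRY = re.compile(r"^(!?)(?:([A-Za-z0-9]{1,19})=)?\$([0-9A-Fa-f]+)$")


class RouterStatus(NamedTuple):
    nickname: Optional[str]  # None when the entry is "$digest" only
    digest: str              # hex digest of the identity key, upper-cased
    verified: bool           # True when a "name=" part is present
    live: bool               # False when the entry starts with "!"


def parse_router_status(args: str) -> List[RouterStatus]:
    """Parse the argument part of a "router-status" line."""
    entries = []
    for token in args.split():
        match = _ENTRY.match(token)
        if match is None:
            raise ValueError(f"malformed router-status entry: {token!r}")
        bang, name, digest = match.groups()
        entries.append(
            RouterStatus(
                nickname=name,
                digest=digest.upper(),
                verified=name is not None,
                live=(bang == ""),
            )
        )
    return entries
```

A caller would still have to apply the rule stated in the spec: mark a router
as verified only if both the nickname and the digest match the router's own
descriptor.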
Deleted: tor/trunk/doc/dir-voting.txt
===================================================================
--- tor/trunk/doc/dir-voting.txt 2007-01-26 05:20:26 UTC (rev 9414)
+++ tor/trunk/doc/dir-voting.txt 2007-01-26 05:50:40 UTC (rev 9415)
@@ -1,388 +0,0 @@
-$Id: /tor/branches/eventdns/doc/dir-spec.txt 9469 2006-11-01T23:56:30.179423Z nickm $
-
- Voting on the Tor Directory System
-
-0. Scope and preliminaries
-
- This document describes a consensus voting scheme for Tor directories.
- Once it's accepted, it should be merged with dir-spec.txt. Some
- preliminaries for authority and caching support should be done during
- the 0.1.2.x series; the main deployment should come during the 0.1.3.x
- series.
-
-0.1. Goals and motivation: voting.
-
- The current directory system relies on clients downloading separate
- network status statements from the caches signed by each directory.
- Clients download a new statement every 30 minutes or so, choosing to
- replace the oldest statement they currently have.
-
- This creates a partitioning problem: different clients have different
- "most recent" networkstatus sources, and different versions of each
- (since authorities change their statements often).
-
- It also creates a scaling problem: most of the downloaded networkstatus
- are probably quite similar, and the redundancy grows as we add more
- authorities.
-
- So if we have clients only download a single multiply signed consensus
- network status statement, we can:
- - Save bandwidth.
- - Reduce client partitioning
- - Reduce client-side and cache-side storage
- - Simplify client-side voting code (by moving voting away from the
- client)
-
- We should try to do this without:
- - Assuming that client-side or cache-side clocks are more correct
- than we assume now.
- - Assuming that authority clocks are perfectly correct.
- - Degrading badly if a few authorities die or are offline for a bit.
-
- We do not have to perform well if:
- - No clique of more than half the authorities can agree about who
- the authorities are.
-
-1. The idea.
-
- Instead of publishing a network status whenever something changes,
- each authority instead publishes a fresh network status only once per
- "period" (say, 60 minutes). Authorities either upload this network
- status (or "vote") to every other authority, or download every other
- authority's "vote" (see 3.1 below for discussion on push vs pull).
-
- After an authority has (or has become convinced that it won't be able to
- get) every other authority's vote, it deterministically computes a
- consensus networkstatus, and signs it. Authorities download (or are
- uploaded; see 3.1) one another's signatures, and form a multiply signed
- consensus. This multiply-signed consensus is what caches cache and what
- clients download.
-
- If an authority is down, authorities vote based on what they *can*
- download/get uploaded.
-
- If an authority is "a little" down and only some authorities can reach
- it, authorities try to get its info from other authorities.
-
- If an authority computes the vote wrong, its signature isn't included on
- the consensus.
-
- Clients use a consensus if it is "trusted": signed by more than half the
- authorities they recognize. If clients can't find any such consensus,
- they use the most recent trusted consensus they have. If they don't
- have any trusted consensus, they warn the user and refuse to operate
- (and if DirServers is not the default, beg the user to adapt the list
- of authorities).
-
-2. Details.
-
-2.1. Vote specifications
-
- Votes in v2.1 are similar to v2 network status documents. We add these
- fields to the preamble:
-
- "vote-status" -- the word "vote".
-
- "valid-until" -- the time when this authority expects to publish its
- next vote.
-
- "known-flags" -- a space-separated list of flags that will sometimes
- be included on "s" lines later in the vote.
-
- "dir-source" -- as before, except the "hostname" part MUST be the
- authority's nickname, which MUST be unique among authorities, and
- MUST match the nickname in the "directory-signature" entry.
-
- Authorities SHOULD cache their most recently generated votes so they
- can persist them across restarts. Authorities SHOULD NOT generate
- another document until valid-until has passed.
-
- Router entries in the vote MUST be sorted in ascending order by router
- identity digest. The flags in "s" lines MUST appear in alphabetical
- order.
-
- Votes SHOULD be synchronized to half-hour publication intervals (one
- hour? XXX say more; be more precise.)
-
- XXXX some way to request older networkstatus docs?
-
-2.2. Consensus directory specifications
-
- Consensuses are like v2.1 votes, except for the following fields:
-
- "vote-status" -- the word "consensus".
-
- "published" is the latest of all the published times on the votes.
-
- "valid-until" is the earliest of all the valid-until times on the
- votes.
-
- "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
- are included for each authority that contributed to the vote.
-
- "vote-digest" for each authority that contributed to the vote,
- calculated as for the digest in the signature on the vote. [XXX
- re-English this sentence]
-
- "client-versions" and "server-versions" are sorted in ascending
- order based on version-spec.txt.
-
- "dir-options" and "known-flags" are not included.
-[XXX really? why not list the ones that are used in the consensus?
-For example, right now BadExit is in use, but no servers would be
-labelled BadExit, and it's still worth knowing that it was considered
-by the authorities. -RD]
-
- The fields MUST occur in the following order:
- "network-status-version"
- "vote-status"
- "published"
- "valid-until"
- For each authority, sorted in ascending order of nickname, case-
- insensitively:
- "dir-source", "fingerprint", "contact", "dir-signing-key",
- "vote-digest".
- "client-versions"
- "server-versions"
-
- The signatures at the end of the document appear as multiple instances
- of directory-signature, sorted in ascending order by nickname,
- case-insensitively.
-
- A router entry should be included in the result if it is included by more
- than half of the authorities (total authorities, not just those whose votes
- we have). A router entry has a flag set if it is included by more than
- half of the authorities who care about that flag. [XXXX this creates an
- incentive for attackers to DOS authorities whose votes they don't like.
- Can we remember what flags people set the last time we saw them? -NM]
- [Which 'we' are we talking here? The end-users never learn which
- authority sets which flags. So you're thinking the authorities
- should record the last vote they saw from each authority and if it's
- within a week or so, count all the flags that it advertised as 'no'
- votes? Plausible. -RD]
-
- The signature hash covers from the "network-status-version" line through
- the characters "directory-signature" in the first "directory-signature"
- line.
-
- Consensus directories SHOULD be rejected if they are not signed by more
- than half of the known authorities.
-
-2.2.1. Detached signatures
-
- Assuming full connectivity, every authority should compute and sign the
- same consensus directory in each period. Therefore, it isn't necessary to
- download the consensus computed by each authority; instead, the authorities
- only push/fetch each others' signatures. A "detached signature" document
- contains a single "consensus-digest" entry and one or more
- directory-signature entries. [XXXX specify more.]
-
-2.3. URLs and timelines
-
-2.3.1. URLs and timeline used for agreement
-
- An authority SHOULD publish its vote immediately at the start of each voting
- period. It does this by making it available at
- http://<hostname>/tor/status-vote/current/authority.z
- and sending it in an HTTP POST request to each other authority at the URL
- http://<hostname>/tor/post/vote
-
- If, N minutes after the voting period has begun, an authority does not have
- a current statement from another authority, the first authority retrieves
- the other's statement.
-
- Once an authority has a vote from another authority, it makes it available
- at
- http://<hostname>/tor/status-vote/current/<fp>.z
- where <fp> is the fingerprint of the other authority's identity key.
-
- The consensus network status, along with as many signatures as the server
- currently knows, should be available at
- http://<hostname>/tor/status-vote/current/consensus.z
- All of the detached signatures it knows for consensus status should be
- available at:
- http://<hostname>/tor/status-vote/current/consensus-signatures.z
-
- Once an authority has computed and signed a consensus network status, it
- should send its detached signature to each other authority in an HTTP POST
- request to the URL:
- http://<hostname>/tor/post/consensus-signature
-
-
- [XXXX Store votes to disk.]
-
-2.3.2. Serving a consensus directory
-
- Once the authority is done getting signatures on the consensus directory,
- it should serve it from:
- http://<hostname>/tor/status/consensus.z
-
- Caches SHOULD download consensus directories from an authority and serve
- them from the same URL.
-
-2.3.3. Timeline and synchronization
-
- [XXXX]
-
-2.4. Distributing routerdescs between authorities
-
- Consensus will be more meaningful if authorities take steps to make sure
- that they all have the same set of descriptors _before_ the voting
- starts. This is safe, since all descriptors are self-certified and
- timestamped: it's always okay to replace a signed descriptor with a more
- recent one signed by the same identity.
-
- In the long run, we might want some kind of sophisticated process here.
- For now, since authorities already download one another's networkstatus
- documents and use them to determine what descriptors to download from one
- another, we can rely on this existing mechanism to keep authorities up to
- date.
-
- [We should do a thorough read-through of dir-spec again to make sure
- that the authorities converge on which descriptor to "prefer" for
- each router. Right now the decision happens at the client, which is
- no longer the right place for it. -RD]
-
-3. Questions and concerns
-
-3.1. Push or pull?
-
- The URLs above define a push mechanism for publishing votes and consensus
- signatures via HTTP POST requests, and a pull mechanism for downloading
- these documents via HTTP GET requests. As specified, every authority will
- post to every other. The "download if no copy has been received" mechanism
- exists only as a fallback.
-
-3.2. Dropping "opt".
-
- The "opt" keyword in Tor's directory formats was originally intended to
- mean, "it is okay to ignore this entry if you don't understand it"; the
- default behavior has been "discard a routerdesc if it contains entries you
- don't recognize."
-
- But so far, every new flag we have added has been marked 'opt'. It would
- probably make sense to change the default behavior to "ignore unrecognized
- fields", and add the statement that clients SHOULD ignore fields they don't
- recognize. As a meta-principle, we should say that clients and servers
- MUST NOT have to understand new fields in order to use directory documents
- correctly.
-
- Of course, this will make it impossible to say, "The format has changed a
- lot; discard this quietly if you don't understand it." We could do that by
- adding a version field.
-
-3.3. Multilevel keys.
-
- Replacing a directory authority's identity key in the event of a compromise
- would be tremendously annoying. We'd need to tell every client to switch
- their configuration, or update to a new version with an uploaded list. So
- long as some weren't upgraded, they'd be at risk from whoever had
- compromised the key.
-
- With this in mind, it's a shame that our current protocol forces us to
- store identity keys unencrypted in RAM. We need some kind of signing key
- stored unencrypted, since we need to generate new descriptors/directories
- and rotate link and onion keys regularly. (And since, of course, we can't
- ask server operators to be on-hand to enter a passphrase every time we
- want to rotate keys or sign a descriptor.)
-
- The obvious solution seems to be to have a signing-only key that lives
- indefinitely (months or longer) and signs descriptors and link keys, and a
- separate identity key that's used to sign the signing key. Tor servers
- could run in one of several modes:
- 1. Identity key stored encrypted. You need to pick a passphrase when
- you enable this mode, and re-enter this passphrase every time you
- rotate the signing key.
- 1'. Identity key stored separate. You save your identity key to a
- floppy, and use the floppy when you need to rotate the signing key.
- 2. All keys stored unencrypted. In this case, we might not want to even
- *have* a separate signing key. (We'll need to support no-separate-
- signing-key mode anyway to keep old servers working.)
- 3. All keys stored encrypted. You need to enter a passphrase to start
- Tor.
- (Of course, we might not want to implement all of these.)
-
- Case 1 is probably most usable and secure, if we assume that people don't
- forget their passphrases or lose their floppies. We could mitigate this a
- bit by encouraging people to PGP-encrypt their passphrases to themselves,
- or keep a cleartext copy of their secret key secret-split into a few
- pieces, or something like that.
-
- Migration presents another difficulty, especially with the authorities. If
- we use the current set of identity keys as the new identity keys, we're in
- the position of having sensitive keys that have been stored on
- media-of-dubious-encryption up to now. Also, we need to keep old clients
- (who will expect descriptors to be signed by the identity keys they know
- and love, and who will not understand signing keys) happy.
-
- I'd enumerate designs here, but I'm hoping that somebody will come up with
- a better one, so I'll try not to prejudice them with more ideas yet.
-
- Oh, and of course, we'll want to make sure that the keys are
- cross-certified. :)
-
- Ideas? -NM
-
-3.4. Long and short descriptors
-
- Some of the costliest fields in the current directory protocol are ones
- that no client actually uses. In particular, the "read-history" and
- "write-history" fields are used only by the authorities for monitoring the
- status of the network. If we took them out, the size of a compressed list
- of all the routers would fall by about 60%. (No other disposable field
- would save more than 2%.)
-
- One possible solution here is that routers should generate and upload a
- short-form and long-form descriptor. Only the short-form descriptor should
- ever be used by anybody for routing. The long-form descriptor should be
- used only for analytics and other tools. (If we allowed people to route with
- long descriptors, we'd have to ensure that they stayed in sync with the
- short ones somehow.) We can ensure that the short descriptors are used by
- only recommending those in the network statuses.
-
- Another possible solution would be to drop these fields from descriptors,
- and have them uploaded as a part of a separate "bandwidth report" to the
- authorities. This could help prevent the mistake of using long descriptors
- in the place of short ones.
-
- Thoughts? -NM
-
-3.5. Compression
-
- Gzip would be easier to work with than zlib; bzip2 would result in smaller
- data lengths. [Concretely, we're looking at about 10-15% space savings at
- the expense of 3-5x longer compression time for using bzip2.] Doing
- on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
- Pre-compressing status documents in multiple formats would force us to use
- more memory to hold them.
-
-4. Migration
-
- For directory voting:
- * It would be cool if caches could get ready to download consensus
- status docs, verify enough signatures, and serve them now. That way
- once stuff works all we need to do is upgrade the authorities. Caches
- don't need to verify the correctness of the format so long as it's
- signed (or maybe multisigned?). We need to make sure that caches back
- off very quickly from downloading consensus docs until they're
- actually implemented.
-
- For dropping the "opt" requirement:
- * stopped requiring it as of 0.1.2.5-alpha. Stop generating it once
- earlier formats are obsolete.
-
- For multilevel keys:
- * no idea
-
- For long/short descriptors:
- * In 0.1.2.x:
- * Authorities should accept both, now, and silently drop short
- descriptors.
- * Routers should upload both once authorities accept them.
- * There should be a "long descriptor" url and the current "normal" URL.
- Authorities should serve long descriptors from both URLs.
- * Once tools that want long descriptors support fetching them from the
- "long descriptor" URL:
- * Have authorities remember short descriptors, and serve them from the
- 'normal' URL.
-
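The two majority rules in the deleted dir-voting.txt (router inclusion in
section 2.2, and the client-side "trusted" test in section 1) can be sketched
in Python as follows. This is a rough illustration under stated assumptions,
not the proposal's algorithm in full: the function names and the dict/set
data shapes are hypothetical, flags and signature verification are left out,
and signers are assumed to have been cryptographically checked already.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def consensus_routers(votes: Dict[str, Set[str]], total_authorities: int) -> List[str]:
    """Return router digests listed by more than half of ALL authorities.

    votes maps an authority nickname to the set of router identity digests
    in its vote. total_authorities counts every known authority, not just
    those whose votes were received, as section 2.2 requires.
    """
    counts = Counter()
    for listed in votes.values():
        counts.update(set(listed))
    # Strict majority, written with integers to avoid rounding issues.
    # Sorted so that every authority computes the same ordering.
    return sorted(d for d, n in counts.items() if 2 * n > total_authorities)


def is_trusted(signers: Iterable[str], recognized: Iterable[str]) -> bool:
    """A consensus is trusted if more than half of the authorities the
    client recognizes have signed it; unknown signers do not count."""
    recognized_set = set(recognized)
    valid = set(signers) & recognized_set
    return 2 * len(valid) > len(recognized_set)
```

Counting against the total number of authorities means that a missing vote
acts like a "no" vote for every router, which is the behavior the -NM and -RD
comments in section 2.2 discuss for flags.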
Copied: tor/trunk/doc/spec/dir-spec-v1.txt (from rev 9409, tor/trunk/doc/dir-spec-v1.txt)
Copied: tor/trunk/doc/spec/proposals/100-tor-spec-udp.txt (from rev 9409, tor/trunk/doc/tor-spec-udp.txt)
Copied: tor/trunk/doc/spec/proposals/101-dir-voting.txt (from rev 9409, tor/trunk/doc/dir-voting.txt)
Deleted: tor/trunk/doc/tor-spec-udp.txt
===================================================================
--- tor/trunk/doc/tor-spec-udp.txt 2007-01-26 05:20:26 UTC (rev 9414)
+++ tor/trunk/doc/tor-spec-udp.txt 2007-01-26 05:50:40 UTC (rev 9415)
@@ -1,414 +0,0 @@
-[This proposed Tor extension has not been implemented yet. It is currently
-in request-for-comments state. -RD]
-
- Tor Unreliable Datagram Extension Proposal
-
- Marc Liberatore
-
-Abstract
-
-Contents
-
-0. Introduction
-
- Tor is a distributed overlay network designed to anonymize low-latency
- TCP-based applications. The current tor specification supports only
- TCP-based traffic. This limitation prevents the use of tor to anonymize
- other important applications, notably voice over IP software. This document
- is a proposal to extend the tor specification to support UDP traffic.
-
- The basic design philosophy of this extension is to add support for
- tunneling unreliable datagrams through tor with as few modifications to the
- protocol as possible. As currently specified, tor cannot directly support
- such tunneling, as connections between nodes are built using transport layer
- security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
- to the operation of most UDP-based application level protocols.
-
- Thus, we propose the addition of links between nodes using datagram
- transport layer security (DTLS). These links allow packets to traverse a
- route through tor quickly, but their unreliable nature requires minor
- changes to the tor protocol. This proposal outlines the necessary
- additions and changes to the tor specification to support UDP traffic.
-
- We note that a separate set of DTLS links between nodes creates a second
- overlay, distinct from that composed of TLS links. This separation and
- resulting decrease in each anonymity set's size will make certain attacks
- easier. However, it is our belief that VoIP support in tor will
- dramatically increase its appeal, and correspondingly, the size of its user
- base, number of deployed nodes, and total traffic relayed. These increases
- should help offset the loss of anonymity that two distinct networks imply.
-
-1. Overview of Tor-UDP and its complications
-
- As described above, this proposal extends the Tor specification to support
- UDP with as few changes as possible. Tor's overlay network is managed
- through TLS based connections; we will re-use this control plane to set up
- and tear down circuits that relay UDP traffic. These circuits will be built atop
- DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
- TLS.
-
- The unreliability of DTLS circuits creates problems for Tor at two levels:
-
- 1. Tor's encryption of the relay layer does not allow independent
- decryption of individual records. If record N is not received, then
- record N+1 will not decrypt correctly, as the counter for AES/CTR is
- maintained implicitly.
-
- 2. Tor's end-to-end integrity checking works under the assumption that
- all RELAY cells are delivered. This assumption is invalid when cells
- are sent over DTLS.
-
- The fix for the first problem is straightforward: add an explicit sequence
- number to each cell. To fix the second problem, we introduce a
- system of nonces and hashes to RELAY packets.
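
To illustrate why an explicit sequence number fixes the first problem, here is a minimal sketch in Python. It is not the proposal's cipher: the keystream is a SHA-256-based stand-in for AES/CTR (the standard library has no AES), and the mapping from sequence number to keystream position (sequence number times payload length) is an assumption, since the proposal does not define it. The names `keystream` and `crypt_cell` are hypothetical.

```python
import hashlib

CELL_PAYLOAD_LEN = 507  # payload size taken from the cell format in section 3


def keystream(key: bytes, offset: int, length: int) -> bytes:
    """Return `length` keystream bytes starting at byte `offset`.

    Stand-in for AES/CTR: block i of the stream is SHA-256(key || i).
    """
    block = 32  # SHA-256 output size in bytes
    first = offset // block
    last = (offset + length - 1) // block
    stream = b"".join(
        hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        for i in range(first, last + 1)
    )
    start = offset - first * block
    return stream[start:start + length]


def crypt_cell(key: bytes, seq: int, payload: bytes) -> bytes:
    """Encrypt or decrypt one payload; in counter mode both are the same XOR."""
    # Assumption: cell number `seq` starts at keystream byte seq * 507.
    ks = keystream(key, seq * CELL_PAYLOAD_LEN, len(payload))
    return bytes(a ^ b for a, b in zip(payload, ks))
```

Because the keystream position is derived from the sequence number carried in the cell, a receiver that never sees cell N can still decrypt cell N+1 by calling `crypt_cell(key, N + 1, ciphertext)`.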
-
- In the following sections, we mirror the layout of the Tor Protocol
- Specification, presenting the necessary modifications to the Tor protocol as
- a series of deltas.
-
-2. Connections
-
- Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
- corresponding TLS links, as all control messages are sent over TLS. All
- implementations MUST support the DTLS ciphersuite "[TODO]".
-
- DTLS connections are formed using the same protocol as TLS connections.
- This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
- as detailed in section 4.6.
-
- Once a paired TLS/DTLS connection is established, the two sides send cells
- to one another. All but two types of cells are sent over TLS links. RELAY
- cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
- below, are sent over DTLS links. [Should all cells still be 512 bytes long?
- Perhaps upon completion of a preliminary implementation, we should do a
- performance evaluation for some class of UDP traffic, such as VoIP. - ML]
- Cells may be sent embedded in TLS or DTLS records of any size or divided
- across such records. The framing of these records MUST NOT leak any more
- information than the above differentiation on the basis of cell type. [I am
- uncomfortable with this leakage, but don't see any simple, elegant way
- around it. -ML]
-
- As with TLS connections, DTLS connections are not permanent.
-
-3. Cell format
-
- Each cell contains the following fields:
-
- CircID [2 bytes]
- Command [1 byte]
- Sequence Number [2 bytes]
- Payload (padded with 0 bytes) [507 bytes]
- [Total size: 512 bytes]
-
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 6.2)
- 1 -- CREATE (Create a circuit) (See Sec 4)
- 2 -- CREATED (Acknowledge create) (See Sec 4)
- 3 -- RELAY (End-to-end data) (See Sec 5)
- 4 -- DESTROY (Stop using a circuit) (See Sec 4)
- 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
- 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
- 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
- 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
- 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
- 10 -- CREATED_FAST_UDP (UDP circuit created, no PK) (See Sec 4)
-
- The sequence number allows for AES/CTR decryption of RELAY cells
- independently of one another; this functionality is required to support
- cells sent over DTLS. The sequence number is described in more detail in
- section 4.5.
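
The cell layout above can be sketched with Python's `struct` module. Big-endian (network) byte order is an assumption here, as this section does not state it, and the helper names `pack_cell` and `unpack_cell` are hypothetical.

```python
import struct

CELL_LEN = 512
PAYLOAD_LEN = 507
# CircID [2 bytes], Command [1 byte], Sequence Number [2 bytes], Payload [507 bytes]
CELL_FORMAT = ">HBH507s"

CREATE_UDP = 7  # command value from the table above


def pack_cell(circ_id: int, command: int, seq: int, payload: bytes = b"") -> bytes:
    """Build one 512-byte cell; short payloads are padded with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    # The 's' format code pads with NUL bytes up to the declared width.
    return struct.pack(CELL_FORMAT, circ_id, command, seq, payload)


def unpack_cell(cell: bytes):
    """Split a cell into (circ_id, command, seq, padded_payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are exactly 512 bytes long")
    return struct.unpack(CELL_FORMAT, cell)
```

The field widths add up to 2 + 1 + 2 + 507 = 512 bytes, matching the stated total size.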
-
- [Should the sequence number only appear in RELAY packets? The overhead is
- small, and I'm hesitant to force more code paths on the implementor. -ML]
- [There's already a separate relay header that has other material in it,
- so it wouldn't be the end of the world to move it there if it's
- appropriate. -RD]
-
- [Having separate commands for UDP circuits seems necessary, unless we can
- assume a flag day event for a large number of tor nodes. -ML]
-
-4. Circuit management
-
-4.2. Setting circuit keys
-
- Keys are set up for UDP circuits in the same fashion as for TCP circuits.
- Each UDP circuit shares keys with its corresponding TCP circuit.
-
- [If the keys are used for both TCP and UDP connections, how does it
- work to mix sequence-number-less cells with sequence-numbered cells --
- how do you know you have the encryption order right? -RD]
-
-4.3. Creating circuits
-
- UDP circuits are created in the same way as TCP circuits, using the *_UDP cells as
- appropriate.
-
-4.4. Tearing down circuits
-
- UDP circuits are torn down in the same way as TCP circuits, using the *_UDP cells as
- appropriate.
-
-4.5. Routing relay cells
-
- When an OR receives a RELAY cell, it checks the cell's circID and
- determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the RELAY cell.
-
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the payload
- with AES/CTR, as follows:
- 'Forward' relay cell (same direction as CREATE):
- Use Kf as key; decrypt, using sequence number to synchronize
- ciphertext and keystream.
- 'Back' relay cell (opposite direction from CREATE):
- Use Kb as key; encrypt, using sequence number to synchronize
- ciphertext and keystream.
- Note that in counter mode, decrypt and encrypt are the same operation.
- [Since the sequence number is only 2 bytes, what do you do when it
- rolls over? -RD]
-
- Each stream encrypted by a Kf or Kb has a corresponding unique state,
- captured by a sequence number; the originator of each such stream chooses
- the initial sequence number randomly, and increments it only with RELAY
- cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
- there's no need for counting bytes directly. Right? - ML]
- [I believe this is true. You'll find out for sure when you try to
- build it. ;) -RD]
-
- The OR then decides whether it recognizes the relay cell, by
- inspecting the payload as described in section 5.1 below. If the OR
- recognizes the cell, it processes the contents of the relay cell.
- Otherwise, it passes the decrypted relay cell along the circuit if
- the circuit continues. If the OR at the end of the circuit
- encounters an unrecognized relay cell, an error has occurred: the OR
- sends a DESTROY cell to tear down the circuit.
-
- When a relay cell arrives at an OP, the OP decrypts the payload
- with AES/CTR as follows:
- OP receives data cell:
- For I=N...1,
- Decrypt with Kb_I, using the sequence number as above. If the
- payload is recognized (see section 5.1), then stop and process
- the payload.
-
- For more information, see section 5 below.
-
-4.6. CREATE_UDP and CREATED_UDP cells
-
- Users set up UDP circuits incrementally. The procedure is similar to that
- for TCP circuits, as described in section 4.1. In addition to the TLS
- connection to the first node, the OP also attempts to open a DTLS
- connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
- payload in the same format as a CREATE cell. To extend a UDP circuit past
- the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
- instructs the last node in the circuit to send a CREATE_UDP cell to extend
- the circuit.
-
- The relay payload for an EXTEND_UDP relay cell consists of:
- Address [4 bytes]
- TCP port [2 bytes]
- UDP port [2 bytes]
- Onion skin [186 bytes]
- Identity fingerprint [20 bytes]
-
- The address field and ports denote the IPv4 address and ports of the next OR
- in the circuit.
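
As an illustration of the EXTEND_UDP relay payload, the following sketch packs the five fields in the order listed. Big-endian port encoding is an assumption, and `pack_extend_udp` is a hypothetical helper name.

```python
import ipaddress
import struct

# Address [4], TCP port [2], UDP port [2], Onion skin [186], Identity fingerprint [20]
EXTEND_UDP_FORMAT = ">4sHH186s20s"


def pack_extend_udp(address: str, tcp_port: int, udp_port: int,
                    onion_skin: bytes, fingerprint: bytes) -> bytes:
    """Build the 214-byte relay payload of an EXTEND_UDP cell."""
    if len(onion_skin) != 186:
        raise ValueError("onion skin must be 186 bytes")
    if len(fingerprint) != 20:
        raise ValueError("identity fingerprint must be 20 bytes")
    packed_addr = ipaddress.IPv4Address(address).packed  # 4 bytes
    return struct.pack(EXTEND_UDP_FORMAT, packed_addr, tcp_port, udp_port,
                       onion_skin, fingerprint)
```

The only difference from a plain EXTEND payload is the extra two-byte UDP port, which tells the extending node where to open the DTLS connection.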
-
- The payload for a CREATED_UDP cell or the relay payload for a
- RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
- RELAY_EXTENDED cell. Both circuits are established using the same key.
-
- Note that the existence of a UDP circuit implies the
- existence of a corresponding TCP circuit, sharing keys, sequence numbers,
- and any other relevant state.
-
-4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
-
- As above, the OP must successfully connect using DTLS before attempting to
- send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
- section 4.1.1.
-
-5. Application connections and stream management
-
-5.1. Relay cells
-
- Within a circuit, the OP and the exit node use the contents of RELAY cells
- to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
- across circuits. End-to-end commands and UDP packets can be initiated by
- either edge; streams are initiated by the OP.
-
- The payload of each unencrypted RELAY cell consists of:
- Relay command [1 byte]
- 'Recognized' [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- Data [496 bytes]
-
- The relay commands are:
- 1 -- RELAY_BEGIN [forward]
- 2 -- RELAY_DATA [forward or backward]
- 3 -- RELAY_END [forward or backward]
- 4 -- RELAY_CONNECTED [backward]
- 5 -- RELAY_SENDME [forward or backward]
- 6 -- RELAY_EXTEND [forward]
- 7 -- RELAY_EXTENDED [backward]
- 8 -- RELAY_TRUNCATE [forward]
- 9 -- RELAY_TRUNCATED [backward]
- 10 -- RELAY_DROP [forward or backward]
- 11 -- RELAY_RESOLVE [forward]
- 12 -- RELAY_RESOLVED [backward]
- 13 -- RELAY_BEGIN_UDP [forward]
- 14 -- RELAY_DATA_UDP [forward or backward]
- 15 -- RELAY_EXTEND_UDP [forward]
- 16 -- RELAY_EXTENDED_UDP [backward]
- 17 -- RELAY_DROP_UDP [forward or backward]
-
- Commands labelled as "forward" must only be sent by the originator
- of the circuit. Commands labelled as "backward" must only be sent by
- other nodes in the circuit back to the originator. Commands marked
- as either can be sent either by the originator or other nodes.
-
- The 'recognized' field in any unencrypted relay payload is always set to
- zero.
-
- The 'digest' field can have two meanings. For all cells sent over TLS
- connections (that is, all commands and all non-UDP RELAY data), it is
- computed as the first four bytes of the running SHA-1 digest of all the
- bytes that have been sent reliably and have been destined for this hop of
- the circuit or originated from this hop of the circuit, seeded from Df or Db
- respectively (obtained in section 4.2 above), and including this RELAY
- cell's entire payload (taken with the digest field set to zero). Cells sent
- over DTLS connections do not affect this running digest. Each cell sent
- over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
- set to the SHA-1 digest of the current RELAY cell's entire payload, with the
- digest field set to zero. Coupled with a randomly-chosen streamID, this
- provides per-cell integrity checking on UDP cells.
- [If you drop malformed UDP relay cells but don't close the circuit,
- then this 8 bytes of digest is not as strong as what we get in the
- TCP-circuit side. Is this a problem? -RD]
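
The per-cell digest for cells sent over DTLS might look like the sketch below. Two points are assumptions rather than part of the proposal: the SHA-1 output is truncated to its first four bytes so that it fits the four-byte digest field, and the padded data length is passed in as `data_len` rather than hard-coded. The function names are hypothetical.

```python
import hashlib
import struct

# Relay command [1], 'Recognized' [2], StreamID [2], Digest [4], Length [2]
RELAY_HEADER = ">BHH4sH"
RELAY_DATA_UDP = 14  # relay command value from the table above


def build_udp_relay_payload(command: int, stream_id: int,
                            data: bytes, data_len: int) -> bytes:
    """Build an unencrypted relay payload with a per-cell digest."""
    if stream_id == 0:
        raise ValueError("UDP tunnels use a non-zero streamID")
    if len(data) > data_len:
        raise ValueError("data does not fit in one cell")
    body = data.ljust(data_len, b"\x00")  # pad with NUL bytes
    zeroed = struct.pack(RELAY_HEADER, command, 0, stream_id,
                         b"\x00" * 4, len(data)) + body
    # Assumption: keep only the first four bytes of the SHA-1 digest.
    digest = hashlib.sha1(zeroed).digest()[:4]
    return struct.pack(RELAY_HEADER, command, 0, stream_id,
                       digest, len(data)) + body


def check_udp_relay_payload(payload: bytes) -> bool:
    """Return True if 'recognized' is zero and the per-cell digest matches."""
    _cmd, recognized, _sid, digest, _length = struct.unpack_from(RELAY_HEADER, payload)
    # The digest field occupies bytes 5..8; recompute with it set to zero.
    zeroed = payload[:5] + b"\x00" * 4 + payload[9:]
    return recognized == 0 and hashlib.sha1(zeroed).digest()[:4] == digest
```

Unlike the running digest used on TLS links, this check depends only on the cell itself, so a lost cell does not invalidate the cells that follow it.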
-
- When the 'recognized' field of a RELAY cell is zero, and the digest
- is correct, the cell is considered "recognized" for the purposes of
- decryption (see section 4.5 above).
-
- (The digest does not include any bytes from relay cells that do
- not start or end at this hop of the circuit. That is, it does not
- include forwarded data. Therefore if 'recognized' is zero but the
- digest does not match, the running digest at that node should
- not be updated, and the cell should be forwarded on.)
-
- All RELAY cells pertaining to the same tunneled TCP stream have the
- same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
- cells that affect the entire circuit rather than a particular
- stream use a StreamID of zero.
-
- All RELAY cells pertaining to the same UDP tunnel have the same streamID.
- This streamID is chosen randomly by the OP, but cannot be zero.
-
- The 'Length' field of a relay cell contains the number of bytes in
- the relay payload which contain real payload data. The remainder of
- the payload is padded with NUL bytes.
-
- If the RELAY cell is recognized but the relay command is not
- understood, the cell must be dropped and ignored. Its contents
- still count with respect to the digests, though. [Before
- 0.1.1.10, Tor closed circuits when it received an unknown relay
- command. Perhaps this will be more forward-compatible. -RD]
-
-5.2.1. Opening UDP tunnels and transferring data
-
- To open a new anonymized UDP connection, the OP chooses an open
- circuit to an exit that may be able to connect to the destination
- address, selects a random streamID not yet used on that circuit,
- and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
- and port of the destination host. The payload format is:
-
- ADDRESS | ':' | PORT | [00]
-
- where ADDRESS can be a DNS hostname, or an IPv4 address in
- dotted-quad format, or an IPv6 address surrounded by square brackets;
- and where PORT is encoded in decimal.
-
- [What is the [00] for? -NM]
- [It's so the payload is easy to parse out with string funcs -RD]
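
A small sketch of encoding and parsing the RELAY_BEGIN_UDP payload follows. ASCII encoding is an assumption, and the function names are hypothetical.

```python
def encode_begin_udp(address: str, port: int) -> bytes:
    """Encode ADDRESS ':' PORT followed by a terminating zero byte."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin_udp(payload: bytes):
    """Return (address, port) from a NUL-terminated, possibly padded payload."""
    text = payload.split(b"\x00", 1)[0].decode("ascii")
    # Split on the last colon so bracketed IPv6 addresses stay intact.
    address, _, port = text.rpartition(":")
    return address, int(port)
```

The trailing zero byte lets the receiver cut the string at the first NUL and ignore the cell's padding.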
-
- Upon receiving this cell, the exit node resolves the address as necessary.
- If the address cannot be resolved, the exit node replies with a RELAY_END
- cell. (See 5.4 below.) Otherwise, the exit node replies with a
- RELAY_CONNECTED cell, whose payload is in one of the following formats:
- The IPv4 address to which the connection was made [4 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- or
- Four zero-valued octets [4 octets]
- An address type (6) [1 octet]
- The IPv6 address to which the connection was made [16 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
- field. No version of Tor currently generates the IPv6 format.]
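
The two RELAY_CONNECTED payload formats can be told apart by their first four octets, as in this sketch. It assumes big-endian TTL encoding and does not handle the older payloads that omit the TTL field. `parse_connected` is a hypothetical name.

```python
import ipaddress
import struct


def parse_connected(payload: bytes):
    """Return (address, ttl_seconds) from a RELAY_CONNECTED payload."""
    if payload[:4] != b"\x00" * 4:
        # IPv4 form: address [4 octets], TTL [4 octets]
        addr, ttl = struct.unpack_from(">4sI", payload)
        return ipaddress.IPv4Address(addr), ttl
    # IPv6 form: four zero octets, address type (6), address [16 octets], TTL [4 octets]
    addr_type, addr, ttl = struct.unpack_from(">B16sI", payload, 4)
    if addr_type != 6:
        raise ValueError("unknown address type")
    return ipaddress.IPv6Address(addr), ttl
```

One consequence of this layout is that the IPv4 address 0.0.0.0 cannot be represented in the first form, because four zero octets mark the IPv6 form.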
-
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package UDP data in RELAY_DATA_UDP cells, and upon receiving such
- cells, echo their contents to the corresponding socket.
- RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
-
- RELAY_DROP_UDP cells are long-range dummies; upon receiving such
- a cell, the OR or OP must drop it.
-
-5.3. Closing streams
-
- UDP tunnels are closed in a fashion corresponding to TCP connections.
-
-6. Flow Control
-
- UDP streams are not subject to flow control.
-
-7.2. Router descriptor format.
-
-The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort UDPPort
-
- Indicates the beginning of a router descriptor. "address" must be
- an IPv4 address in dotted-quad format. The last four numbers
- indicate the ports at which this OR exposes
- functionality. ORPort is a port at which this OR accepts TLS
- connections for the main OR protocol; SocksPort is deprecated and
- should always be 0; DirPort is the port at which this OR accepts
- directory-related HTTP connections; and UDPPort is a port at which
- this OR accepts DTLS connections for UDP data. If any port is not
- supported, the value 0 is given instead of a port number.
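
For illustration, a parser for the extended "router" line could look like the sketch below. The `RouterLine` type and `parse_router_line` function are hypothetical, and the nickname and address in the usage note are made up.

```python
import ipaddress
from typing import NamedTuple


class RouterLine(NamedTuple):
    nickname: str
    address: str
    or_port: int
    socks_port: int
    dir_port: int
    udp_port: int  # 0 means the OR does not accept DTLS connections


def parse_router_line(line: str) -> RouterLine:
    """Parse: "router" nickname address ORPort SocksPort DirPort UDPPort."""
    parts = line.split()
    if len(parts) != 7 or parts[0] != "router":
        raise ValueError("not a router line with a UDPPort field")
    nickname, address = parts[1], parts[2]
    ipaddress.IPv4Address(address)  # raises ValueError if not dotted-quad IPv4
    ports = [int(p) for p in parts[3:]]
    if any(not 0 <= p <= 65535 for p in ports):
        raise ValueError("port out of range")
    return RouterLine(nickname, address, *ports)
```

For example, `parse_router_line("router example 192.0.2.10 9001 0 9030 9002")` yields a record whose `udp_port` is 9002. A descriptor in the older six-token form is rejected by this sketch, which a real parser would likely need to tolerate.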
-
-Other sections:
-
-What changes need to happen to each node's exit policy to support this? -RD
-
-Switching to UDP means managing the queues of incoming packets better,
-so we don't miss packets. How does this interact with doing large public
-key operations (handshakes) in the same thread?
-
-========================================================================
-COMMENTS
-========================================================================
-
-[16 May 2006]
-
-I don't favor this approach; it makes packet traffic partitioned from
-stream traffic end-to-end. The architecture I'd like to see is:
-
- A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
- TCP/TLS for firewall penetration or something. (This also gives us an
- upgrade path for routing through legacy servers.)
-
- B Stream traffic is handled with end-to-end per-stream acks/naks and
- retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
- a cell isn't retransmitted.
-
-We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
-done so, B is more or less inevitable, and we can support end-to-end UDP
-traffic "for free".
-
-(Also, there are some details that this draft spec doesn't address. For
-example, what happens when a UDP packet doesn't fit in a single cell?)
-
--NM