RFC/proposal for Thandy changes
Nick Mathewson
nickm at freehaven.net
Sun Jan 2 06:14:40 UTC 2011
On Sun, Oct 17, 2010 at 10:46 PM, Justin Samuel <js at justinsamuel.com> wrote:
> Hi all,
Hi, and sorry about the delay! I think I like most of it, dislike a
couple of details, and have some points where I don't get it. More
detail follows.
[lines re-wrapped]
>
> 0. Proposed Thandy Changes
> ==========================
>
> This is a set of proposals that includes a section of simple changes
> that can be considered on their own (Section 1) as well as a more
> fundamental Thandy restructuring proposal (Section 2).
>
> This isn't meant to be at the level of detail needed for a spec and
> subsequent implementation. This is to get feedback and promote
> discussion. It's not an official proposal at this point but more of a
> request for comment.
Okay. I'm going to omit all comments of the form "Could be okay, but
needs more detail", then. :)
> A few relevant documents for reference:
>
> * Thandy spec:
> https://gitweb.torproject.org/thandy.git/blob_plain/HEAD:/specs/thandy-spec.txt
> * TUF spec: https://www.updateframework.com/browser/specs/tuf-spec.txt
> * High-level differences between Thandy and TUF:
> https://www.updateframework.com/wiki/ThandyDifferences
> * Paper on TUF: http://www.freehaven.net/~arma/tuf-ccs2010.pdf
>
> 1. Individual Thandy Changes
> ============================
>
> These are changes that could be made to Thandy without major overhaul
> and can be considered separately of the restructuring proposal (Section
> 2).
>
> 1.1. Multiple File Hashes
> -------------------------
>
> Make all file hashes be a set of (algorithm, digest) pairs rather than a
> single digest of a predefined algorithm. Thus, instead of describing a
> file's hash in metadata with:
>
> "hash" : 349dceb3de2db82e363c3d73063f031c56c5aac5
>
> It would be described as:
>
> "hash" : ["sha1" : 349dceb3de2db82e363c3d73063f031c56c5aac5,
> "sha256" : 95eaa1682a99fba24b26c94499b545...747d7759ba845c8b5c]
(To be pedantic, what you describe is NOT well-formed. You'd need to
say
    "hash" : [ ("sha1", "349dceb3db...") ,
               ("sha256", "95eaa...") ]
if you mean a list of (algorithm, digest) pairs, but
    "hash" : { "sha1" : "349dceb3db...",
               "sha256" : "95eaa..." }
if you mean a map from algorithm names to digest values.)
> Note that it could still be allowed to list only one digest. This just
> allows the ability to use multiple hashes. It's up to the client
> implementation to determine which are checked.
I like this approach of adding support for more hash algorithms. We'd need
to say more about interoperability, however, since unless the party
verifying the hash and the party generating it have at least one hash
algorithm in common, the hash can't get verified.
Also, if we've got multiple hashes that not everybody can verify, we
create the possibility of making documents that some people will accept
but others will not. For example, suppose that when the digests D1 and
D2 are both included, client C1 only checks D1, but client C2 only
checks D2. A corrupt signer can then make an update that C1 will reject
but C2 will accept. Worse, if the C1 implementation is relatively
unpopular among developers and code signers, then the bad digest might
not get noticed until the update was widely distributed.
To solve the first problem (compatibility), let's define at least one
reasonably good hash algorithm that all verifiers must recognize, and
that all formats must include. I suggest SHA256 for now, plus SHA3
when it is finalized.
Solving the second problem (not everybody verifying the same digests or
accepting the same updates) is harder. It's not possible in general to
avoid it, so long as there is ever a digest algorithm that some client
knows about and others don't. We can, however, avoid the worst attacks
here if we stipulate:
- When verifying hashes on relatively small inputs, all parties
should check that they match ALL of the provided hashes that they
recognize. In other words, if you're checking hashes on anything
smaller than a raw binary file, and more than one hash is given
that you know how to calculate, you should calculate and check them
all.
- Code that builds bundles, releases, or whatever, should check all
possible hashes.
- Implementors SHOULD NOT add client support for any digest algorithm
to a client without also getting it widely deployed among parties
that generate it.
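Roughly, the first rule plus the "one mandatory algorithm" idea could
look like the Python sketch below. The names (REQUIRED, KNOWN,
check_digests) are made up for illustration, and I'm assuming the map
form of "hash" described above; none of this is from the Thandy spec.

```python
import hashlib

# Hypothetical names throughout; nothing here is from the Thandy spec.
REQUIRED = "sha256"   # the one algorithm every verifier must support
KNOWN = {"sha1": hashlib.sha1, "sha256": hashlib.sha256}

def check_digests(data: bytes, hashes: dict) -> bool:
    """Return True only if the mandatory digest is present and every
    digest we know how to compute matches the data."""
    if REQUIRED not in hashes:
        return False              # the format must include the common algorithm
    for name, expected in hashes.items():
        algo = KNOWN.get(name)
        if algo is None:
            continue              # unknown algorithm: we cannot check it
        if algo(data).hexdigest() != expected.lower():
            return False          # any recognized mismatch rejects the input
    return True
```

Note that a client following this rule still silently skips digests it
doesn't recognize, which is exactly why the third rule matters.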
The issue as a whole isn't so bad, since any party that could
> 1.2. Refer to Keys by Their ID when Delegating
> ----------------------------------------------
>
> In the Thandy key list file (the "root metadata"), the full keys are
> listed each time they are referenced. This may decrease the human
> readability of the key list.
>
> An alternative approach is to use a separate section in the file that
> defines the keys that will be used in the rest of this metadata file,
> list them with their ID (a hash of the canonical format of the key), and
> then refer to them later by this ID. This is similar to how signatures
> are already done in Thandy: the ID of the key is listed along with the
> signature.
>
> An implementation of this needs to check for ID collisions when reading
> keys from metadata. It's fine to see the same key specified with the
> same ID, but a different key with the same ID as one that has been seen
> indicates something wrong. (Note that the implementation would always
> check that the specified IDs are correct for the corresponding key, even
> without collisions.)
Sounds fine to me.
I wasn't originally envisioning the same key ever being used for
multiple roles. (The spec says, "Separate keys should be used for
different people and different roles.") But there's no inherent reason
that one person couldn't use one key to sign everything they do: if two
private keys would have been stored in the same user's keystore, any
compromise of one would probably compromise the other.
We _should_ probably require that the same key never be used for two
different types of role. For instance, a root key should never sign
packages, a timestamp key should never sign anything but the timestamp,
and so on.
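A minimal sketch of the checks involved, in Python. Everything here is
an assumption for illustration: the KeyRegistry class, the dict
representation of a key, and especially key_id(), which uses sorted-key
JSON as a stand-in for a real canonical encoding.

```python
import hashlib
import json

def key_id(key: dict) -> str:
    # Stand-in for "hash of the canonical format of the key". Sorted-key
    # JSON only approximates a real canonical encoding.
    encoded = json.dumps(key, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()

class KeyRegistry:
    """Hypothetical helper: tracks keys by ID and the one type of role
    each key is allowed to sign for."""

    def __init__(self):
        self.keys = {}        # key ID -> key
        self.role_type = {}   # key ID -> role type ("root", "timestamp", ...)

    def add(self, claimed_id: str, key: dict, role_type: str) -> None:
        # Always check that the listed ID is right for the key.
        if key_id(key) != claimed_id:
            raise ValueError("listed ID does not match key")
        # Same ID, different key: only reachable on a hash collision,
        # but cheap to check.
        seen = self.keys.get(claimed_id)
        if seen is not None and seen != key:
            raise ValueError("ID collision: different key with same ID")
        # Refuse to let one key serve two different types of role.
        if self.role_type.setdefault(claimed_id, role_type) != role_type:
            raise ValueError("key used for two different types of role")
        self.keys[claimed_id] = key
```

Listing the same key twice for the same type of role is accepted; the
other cases raise.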
> 1.3. Indicate Signature Thresholds
> ----------------------------------
>
> The current Thandy spec isn't clear about how multiple keys are
> specified for a role. There also doesn't appear to be a way to specify a
> threshold that is less than the total number of keys. The method of
> specifying multiple keys should be made clear and the ability to
> indicate the number of signatures of those keys that are required should
> be added. (There's an example of how this can be specified in metadata
> in Section 2.2.)
Sounds fine.
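One detail worth pinning down when this gets specified: the threshold
should count distinct trusted keys, not signatures. A sketch, with
made-up names, where the caller supplies the actual signature check as
verify(key_id, sig):

```python
def meets_threshold(signatures, role_keys, threshold, verify):
    """signatures: list of (key_id, sig) pairs found on a document.
    role_keys: set of key IDs trusted for the role.
    verify(key_id, sig) -> bool is supplied by the caller (hypothetical).
    """
    valid = set()
    for kid, sig in signatures:
        if kid in role_keys and verify(kid, sig):
            valid.add(kid)     # a set, so a repeated key counts only once
    return len(valid) >= threshold
```

With this, two good signatures from the same key do not satisfy a
threshold of 2, and signatures from keys outside the role are ignored.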
> 1.4. Add a 'Release' Role
> -------------------------
>
> Thandy currently lists the hashes of all other metadata in the timestamp
> file. There are certain attacks that could be mitigated if the
> Timestamp role signed a separate Release role's metadata that listed the
> hashes of all other metadata files. The idea here is that an attacker
> who compromises only the Timestamp role cannot present clients with a
> mix-and-match of signed metadata files that were available from the
> repository at different times. The separation helps because the
> Timestamp role has a higher likelihood of key compromise because the
> keys are used in an automated fashion, whereas the Release role would
> not be used in an automated fashion.
>
> Though the idea of a metadata mix-and-match attack is in general
> something worth keeping in mind, it may be the case that Thandy isn't at
> much risk because bundles serve a similar role of grouping together
> package versions in a way that attackers can't cause the clients to use
> an unintended combination of package versions. The risk to Thandy
> depends on whether packagers ever replace a package version rather than
> increment it (they aren't supposed to ever replace a version) and
> whether Thandy bundles always specify exact package versions rather than
> minimum/maximum package versions or package version ranges.
Sounds okay to me.
> 2. Thandy Restructuring Proposal
> ================================
>
> Primary goal: Keep Thandy's concepts of bundles and packages but overlay
> them on top of the generic 'targets' approach of TUF.
I don't get this as a goal. Obviously, you're not advocating that we
should use TUF's 'targets' approach for its own sake: you're advocating
that we use it in order to get some concrete benefit that it provides.
_That benefit_ is the real goal, not mere use of TUF's approach for its
own sake. (I know this sounds like nitpicking, but unless we're clear
about what actual benefits a change is meant to provide, it's harder to
evaluate it.)
If I had to guess, I would say that the real goal is probably something
like, "separate Thandy's notion of packages and bundles from Thandy's
notion of authenticated downloads."
> Note: This proposal is not advocating using/maintaining/relying on TUF
> as a separate project. That depends on factors such as the future of TUF
> according to the current TUF maintainers, whether Python is an
> appropriate choice for Windows clients, etc.
>
> 2.1 Approach
> ------------
>
> Two separate layers:
>
> 1. An authentication layer that downloads and authenticates opaque
> 'target' files according to metadata it understands that lists
> hashes and sizes of the target files. This layer doesn't understand
> what bundles and packages are.
>
> 2. A decision/installation layer that uses the authentication layer to
> download bundle/package info and associated files. This layer
> doesn't know the details of the authentication mechanisms or roles;
> it gets files from the authentication layer that the authentication
> layer has already authenticated.
>
> * Note that the update decision and installation code are probably
> separate, but for the sake of this proposal all that matters is
> that the Thandy authentication layer is logically separate from
> the rest of Thandy.
Hm. It would help to know what exactly the interface to layer 1 should
be. I'm guessing it's something like, "Update the metadata", "Tell me
what files are available", "Download the following files".
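To make my guess concrete, here is the kind of interface I have in
mind, as a Python sketch. The class and method names are mine, not
from the proposal or from TUF, and the dict-backed implementation is a
toy that checks hashes and sizes but no signatures.

```python
from abc import ABC, abstractmethod
import hashlib

class AuthLayer(ABC):
    """Hypothetical layer-1 interface; the names are guesses."""

    @abstractmethod
    def update_metadata(self) -> None:
        """Fetch and verify the latest signed metadata."""

    @abstractmethod
    def list_targets(self) -> list:
        """Return the paths of all targets the verified metadata lists."""

    @abstractmethod
    def fetch(self, path: str) -> bytes:
        """Return one target, but only if its hash and size check out."""

class DictAuthLayer(AuthLayer):
    """Toy implementation for illustration; signatures are NOT checked."""

    def __init__(self, metadata, store):
        self._pending = metadata   # path -> (sha256 hex digest, size)
        self._store = store        # path -> bytes, standing in for a mirror
        self._trusted = {}

    def update_metadata(self):
        # A real layer would verify timestamp/release/targets signatures here.
        self._trusted = dict(self._pending)

    def list_targets(self):
        return sorted(self._trusted)

    def fetch(self, path):
        digest, size = self._trusted[path]   # KeyError if the path is unlisted
        data = self._store[path]
        if len(data) != size or hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("target does not match metadata: " + path)
        return data
```

The point is that the decision layer only ever sees these three calls,
and never sees keys, roles, or signatures.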
> For the authentication layer, we start with the following roles (the
> same as TUF uses):
>
> * Root
> o Root of trust for the entire PKI. Indicates through signed
> metadata which keys are trusted for the Release, Targets,
> Timestamp, and Mirror roles.
>
> * Timestamp
> o Signs a frequently regenerated timestamp file with a short
> expiration indicating the most recent release metadata.
>
> * Release
> o Signs the release metadata which lists the hashes and sizes of all
> other metadata files (other than the timestamp file). Note that
> bundleinfo and pkginfo are not considered metadata at the
> authentication layer.
>
> * Targets
> o Signs a metadata file that lists the hashes and sizes of target
> files: the files that the decision layer ultimately wants to
> obtain.
>
> o Can delegate to sub-roles the responsibility for providing target
> files from specific paths on the repository (e.g. Role A is
> trusted to provide files from the /targets/role_a/ directory).
It sounds like you're combining the roles of signing code (which the
targets key can do, and delegates) with the role of deciding who can
sign code. Is that wise? Nowhere else in the Thandy design is this
done.
In practice, I'd assume that the Targets role should be pretty much
*only* used for delegation. But in that case, what's the benefit of
separating this from the root role?
> * Mirror
> o Signs a metadata file that lists the locations and details of
> repository mirrors.
>
> From here we use delegation by the Targets role to create the roles for
> bundlers and packagers. The top-level Targets role delegates a separate
> role for each bundle and each package.
>
> The targets role hierarchy looks like this (with many more bundle and
> package roles):
>
> Root
> `-- Targets
> |-- bundles/tor-browser-stable
> |-- bundles/tor-browser-beta
> `-- pkgs/openssl
>
> Each bundle version and package version that bundlers and packagers
> released has a separate bundleinfo and pkginfo file, respectively. These
> bundleinfo and pkginfo files are opaque to the authentication layer: it
> considers them target files like any other. However, the decision layer
> understands the contents of these files and uses them to make subsequent
> download and installation decisions (with the downloads always being
> done through the authentication layer).
>
> 2.2. Repository Structure
> -------------------------
>
> Top-level metadata files are:
>
> /meta/root.txt
> /meta/release.txt
> /meta/timestamp.txt
> /meta/targets.txt
> /meta/mirrors.txt
>
> The /meta/targets.txt file would include a delegations section such as:
>
> delegations : {
> keys : {
> 'ABC...' : { details },
> '123...' : { details },
> ...
> },
> roles : {
> 'bundles/tor-browser-stable' : {
> keys : ['ABC...', '123...'],
> threshold : 2,
> paths : ['bundles/tor-browser-stable/**'],
> },
> 'pkgs/openssl' : {
> keys : ['DEF...', '456...'],
> threshold : 2,
> paths : ['pkgs/openssl/**'],
> },
> ...
> }
> }
To be clear, are you proposing that *every* role be able to delegate
itself in its particular file, or that a single level of delegation
exist in the targets.txt file?
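If it's the single-level reading, working out which role is trusted
for a path stays simple. A sketch in Python, assuming that a trailing
'/**' means "everything below this directory" (my guess; the proposal
doesn't define the pattern syntax) and using made-up function names:

```python
def path_matches(pattern: str, path: str) -> bool:
    # Only the trailing "/**" form from the example is handled. I'm
    # assuming it means "anything beneath this directory"; any other
    # pattern has to match the path exactly.
    if pattern.endswith("/**"):
        prefix = pattern[:-2]              # keeps the trailing "/"
        return path.startswith(prefix) and len(path) > len(prefix)
    return path == pattern

def roles_for(path: str, delegations: dict) -> list:
    """Return the names of the delegated roles trusted for path."""
    return [name
            for name, role in delegations["roles"].items()
            if any(path_matches(p, path) for p in role["paths"])]
```

Keeping the trailing slash in the prefix matters: without it,
'bundles/tor-browser-stable-evil/...' would match the
tor-browser-stable role.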
> The above would mean that the top-level Targets role had delegated a
> role whose full name would be targets/bundles/tor-browser-stable (as it
> is delegated by the targets role, the prepended targets/ is implicit in
> the delegated role's name). This role for the tor-browser-stable bundle
> would be trusted for the specified paths relative to the repository's
> targets/ directory. Thus, a specific version's bundleinfo file created
> by the bundler could be placed on the repository at, for example:
>
> /targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.bundleinfo
>
> (Note that this bundle role is trusted for all targets files matching
> the path 'bundles/tor-browser-stable/**' under the repository's targets/
> directory, as specified when this role was created through the above
> delegation.)
>
> The bundle maintainer would sign a metadata file listing the hash and
> size of this bundleinfo. This metadata would be placed on the repository
> at:
>
> /meta/targets/bundles/tor-browser-stable/win32/0.1/tor-browser-stable_win32_0.1.txt
>
> (Note that the basename of these files isn't crucial to this aspect of
> the design. They don't need to repeat the path info, though that's
> probably helpful for humans.)
>
> More generally, the metadata location is:
>
> /meta/ROLE_NAME/[ANY_PATH/]ANY_NAME.txt
>
> Packages are similar to bundles with the difference that there are one
> or more target files in addition to the pkginfo file. A package
> maintainer may supply the following files to be placed on the
> repository:
>
> /targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.pkginfo
> /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll
> /targets/pkgs/openssl/win32/0.9.8m/ssleay32.dll
>
> The hashes and sizes of these files are listed in metadata signed by the
> targets/pkgs/openssl role (that is, the openssl package maintainer's
> role). This metadata would be placed on the repository at:
>
> /meta/targets/pkgs/openssl/win32/0.9.8m/openssl_win32_0.9.8m.txt
So to see if I have it right:
- Every target file corresponds to exactly one target metadata file,
though any target metadata file can in principle correspond to one
or more target files.
- It is trivial, given a target metadata file, to learn which target
files it authenticates. It is not trivial, given a target file, to
learn which target metadata file authenticates it; ideally, it will
be in a corresponding location in the metadata. (Is this required?)
- Both layers of the updater (the authentication layer and the
decision layer) need to be able to verify hashes and signatures.
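If the first two points are right, a client that wants to go from a
target file to its metadata needs a reverse index. A sketch of building
one in Python, with a hypothetical build_index() that also enforces the
"exactly one metadata file per target" reading:

```python
def build_index(metadata_files: dict) -> dict:
    """metadata_files: metadata path -> list of target paths it lists.
    Returns target path -> metadata path, refusing any target that is
    listed by more than one metadata file."""
    index = {}
    for meta_path, targets in metadata_files.items():
        for target in targets:
            if target in index:
                raise ValueError(
                    f"{target} is listed by both {index[target]} and {meta_path}")
            index[target] = meta_path
    return index
```

This only works if the client has already seen every metadata file,
which is why a required path correspondence would be nicer.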
> 2.3. Update Procedure
> ---------------------
>
> The update procedure is:
>
> * The decision layer uses the authentication layer to retrieve a list
> of all available bundleinfo files.
> o Implementation: the decision layer asks the authentication layer
> for a list of all available metadata file paths/names. The
> authentication layer obtains this information from the release
> metadata.
> * Looking at the paths/names of available bundleinfo files, the
> decision layer identifies whether there is a newer version of a
> bundle it is interested in.
> o Implementation: the bundle names, OS, arch, and bundle version are
> all contained in paths of the available bundle metadata files.
This seems to add a requirement that you can do a mapping from bundle
name to bundle version. Specifying string-to-version mappings in a
reliable way can be really nasty. Sure you want to do that?
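To illustrate the nastiness, here is one possible way to do it in
Python. The path layout is taken from your example path; the ordering
rule in version_key() is purely my assumption, and it already has to
give up on anything it doesn't recognize.

```python
import re

def parse_bundle_path(path: str):
    """Split 'bundles/NAME/OS/VERSION/file' into (name, os, version).
    The layout is assumed from the example path in the proposal."""
    parts = path.split("/")
    if len(parts) != 5 or parts[0] != "bundles":
        raise ValueError("unexpected bundle path: " + path)
    return parts[1], parts[2], parts[3]

def version_key(version: str):
    """One possible ordering: split on dots, then split each piece into
    a number and an optional lowercase-letter suffix. This rule is an
    assumption, not something from the Thandy spec."""
    key = []
    for piece in version.split("."):
        m = re.fullmatch(r"(\d+)([a-z]*)", piece)
        if m is None:
            raise ValueError("cannot order version: " + version)
        key.append((int(m.group(1)), m.group(2)))
    return key
```

Plain string comparison puts "0.10" before "0.9", so some rule like
this is unavoidable, and every client has to implement the same one.
A version such as "1.0-rc1" is simply rejected by this sketch.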
> * The decision layer notices a bundle version in the list that it
> wants and uses the authentication layer to retrieve the bundleinfo
> file for that version.
> * The decision layer reads the contents of the bundleinfo file which
> indicate the necessary package versions and any other info the
> decision layer needs.
> * The decision layer uses the authentication layer to retrieve the
> pkginfo files for each of the package versions that it wants.
> * The decision layer understands the contents of the pkginfo
> files. These files indicate the individual files that are part of
> this version of the package.
> * The decision layer uses the authentication layer to retrieve the
> individual files (e.g. /targets/pkgs/openssl/win32/0.9.8m/libeay32.dll)
> that are needed.
> * The decision layer hands off the relevant installation instructions
> (from the bundleinfo and pkginfo files) and individual package files
> to the code that performs the installation/upgrade.
>
>
> 2.4. bundleinfo and pkginfo
> --------------------------
>
> As the contents of the bundleinfo and pkginfo are opaque to the
> authentication layer, essentially there are two completely separate sets
> of metadata in this design. It would make sense to have them use the
> same format (e.g. Canonical JSON) and be parsed/generated by the same
> code.
This argues for three components, then: the two you described, plus a
generic data-format layer that they both could use.
> The bundleinfo and pkginfo files would contain largely the same
> information as these files do in the current Thandy spec (though they
> wouldn't be directly signed but rather would be described in signed
> authentication-layer metadata).
>
> There are a few reasons it is good to have the bundleinfo/pkginfo be
> opaque to the authentication layer. One reason is that changes to
> bundleinfo/pkginfo fields can be tested independently of the
> authentication layer. Also, non-backwards-compatible changes could be
> made by introducing a new file name such as bundleinfo.v2 which would be
> effectively invisible to legacy clients.
>
> 2.5. Differences with TUF
> -------------------------
>
> The authentication layer's metadata and roles are very similar to the
> current TUF specification. However, there are a few differences.
>
> TUF currently does not allow a single role to directly delegate multiple
> roles deep. In TUF, one would need the following role structure:
>
> Root
> `-- Targets
> |-- bundles
> | `-- tor-browser-stable
> `-- pkgs
> `-- openssl
>
> That is, the Targets role would have to first delegate a bundles role
> which then delegates a tor-browser-stable role.
>
> Relatedly, TUF gives each delegated role the ability to sign a single
> metadata file whose name is exactly the role's name. This may be
> non-ideal for Thandy because bundlers and packagers would need to keep a
> continuously growing metadata file that lists all of the versions that
> they want to be available to clients or, alternatively, delegate
> subroles for each version in order to use separate metadata files for
> each. (Note that this is talking about the authentication layer's
> metadata, not bundleinfo and pkginfo files.)
>
> In contrast, with this proposal, a bundler/packager would sign a
> metadata file that lists only the new target files they are adding to
> the repository. This isn't a case where there's one correct way to do
> things, but my understanding is that Thandy would like old versions to
> remain available within their expiration times and would like
> bundlers/packagers to not have to deal with issues such as accidentally
> removing an old version they didn't mean to remove when generating and
> signing metadata to make a new version available.
>
> [end of proposal]
>
peace & happy new year,
--
Nick