[tor-bugs] #5651 [Metrics Utilities]: Annotation header with descriptor types
Tor Bug Tracker & Wiki
torproject-admin at torproject.org
Thu May 3 13:55:37 UTC 2012
#5651: Annotation header with descriptor types
-------------------------------+--------------------------------------------
Reporter: atagar | Owner:
Type: enhancement | Status: new
Priority: normal | Milestone:
Component: Metrics Utilities | Version:
Keywords: | Parent:
Points: | Actualpoints:
-------------------------------+--------------------------------------------
Comment(by karsten):
Hmm. After reading your last comment a couple of times, especially the
"break backward compatibility" and "tor implementation detail" parts, I'd
like to step back a bit.
What we are trying to achieve here is that whenever we obtain descriptors
---produced by Tor or by a Tor-related service and maybe post-processed by
metrics-db---we want to get hints for parsing them. The four descriptor
sources that come to mind are: 1) request from Tor's control port, 2)
request from a (remote) Tor's directory port, 3) read from a cached-* file
in a (local) Tor data directory, 4) read from a metrics tarball. For
cases 1 to 3 we already have enough context to know what descriptor type
and version to expect: for 1) and 2) we know what we requested from Tor,
so we can use that as a hint to parse what we get returned; for 3), we can
derive what descriptors are contained in a cached-* file from the filename
---it's a hack anyway, because Tor's cached-* files are highly
implementation-specific and we're messing with them at our own risk. Only
for 4) it's not so easy to tell what descriptors are contained in a
tarball, especially if tarballs are extracted, so having annotations in
files contained in metrics tarballs would make sense. But I think we
shouldn't aim for adding these annotations to anything that Tor produces.
We also shouldn't mimic Tor's annotation syntax: let's start metrics
annotations with `"# "` instead of `"@type"`.
Your point of using the version number to decide if a change is backward-
compatible or not is a good one. Ideally, we should distinguish between
changes that add information and are harmless if parsed by an old parser,
and changes that cannot be parsed by an old parser in a reasonable way.
For example: leaving the original bridge nickname in sanitized descriptors
instead of "Unnamed" is a minor change (#5684), replacing server
descriptor digests with correctly calculated ones would be a major change
(#5607). We should have major and minor versions just for that. We could
encode the dir-spec.txt version in the descriptor type name, because a new
dir-spec version is very likely not backward-compatible; and even if it
is, we can easily use the same parser. The rule would be that we wouldn't
parse a descriptor with unknown type or higher major version than ours,
but that we would parse a descriptor with higher minor version, possibly
emitting a warning.
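
To make the rule above concrete, here is a minimal sketch in Python. It is not existing metrics-lib or stem code; the names `Decision`, `SUPPORTED` and `decide` are made up for illustration, and the table contents are placeholders. The ticket doesn't say what to do with a lower major version than ours, so the sketch assumes we'd still parse it.

```python
from enum import Enum


class Decision(Enum):
    PARSE = "parse"
    PARSE_WITH_WARNING = "parse with warning"
    REJECT = "reject"


# Hypothetical table: descriptor types this parser knows, mapped to the
# highest (major, minor) version it supports for each of them.
SUPPORTED = {
    "server-descriptor": (1, 0),
    "extra-info": (1, 0),
}


def decide(type_name, major, minor, supported=SUPPORTED):
    """Apply the proposed rule to one annotated descriptor file."""
    if type_name not in supported:
        # Unknown descriptor type: don't attempt to parse.
        return Decision.REJECT
    our_major, our_minor = supported[type_name]
    if major > our_major:
        # Higher major version: not parseable by us in a reasonable way.
        return Decision.REJECT
    if major == our_major and minor > our_minor:
        # Higher minor version: only adds information, parse but warn.
        return Decision.PARSE_WITH_WARNING
    # Same or older version (older major versions are an assumption here).
    return Decision.PARSE
```

With this table, `decide("server-descriptor", 1, 3)` would parse with a warning, and `decide("server-descriptor", 2, 0)` or any unknown type would be rejected.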
So, how about we add the following descriptor-type annotations to
files contained in metrics tarballs?
- `# server-descriptor 1.0`
- `# extra-info 1.0`
- `# directory 1.0`
- `# network-status-2 1.0`
- `# network-status-consensus-3 1.0`
- `# network-status-vote-3 1.0`
- `# bridge-network-status 1.0`
- `# bridge-server-descriptor 1.0`
- `# bridge-extra-info 1.0`
- `# torperf 1.0`
- `# bridge-pool-assignment 1.0`
- `# gettor 1.0`
- `# tordnsel 1.0`
Whenever something changes in Tor or in a Tor service in a
backward-''in''compatible way, the descriptor type name would change.
Whenever we change something in metrics-db, which happens much more
frequently, the version number would change.
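
A parser could read such an annotation line roughly as sketched below. This is only an illustration of the proposed `# <type> <major>.<minor>` form; the function name `parse_annotation` and the exact pattern (lowercase letters, digits and hyphens in the type name, a single space as separator) are assumptions, not something we have agreed on yet.

```python
import re

# Assumed shape of a metrics annotation line: "# <type> <major>.<minor>"
ANNOTATION = re.compile(
    r"^# (?P<type>[a-z0-9-]+) (?P<major>\d+)\.(?P<minor>\d+)$"
)


def parse_annotation(line):
    """Return (type, major, minor) for an annotation line, else None."""
    match = ANNOTATION.match(line.rstrip("\r\n"))
    if match is None:
        # Not an annotation in the proposed form, e.g. a descriptor line.
        return None
    return (
        match.group("type"),
        int(match.group("major")),
        int(match.group("minor")),
    )
```

For example, `parse_annotation("# network-status-consensus-3 1.0")` would yield the type name `network-status-consensus-3` with major version 1 and minor version 0, while a line in Tor's own `@`-prefixed annotation syntax would not match.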
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/5651#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online