[metrics-bugs] #23285 [Metrics/Metrics website]: Provide an index.json file on Tor Metrics containing stats files
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Aug 21 13:35:58 UTC 2017
#23285: Provide an index.json file on Tor Metrics containing stats files
-----------------------------------------+--------------------------
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: new
Priority: Medium | Milestone:
Component: Metrics/Metrics website | Version:
Severity: Normal | Keywords:
Actual Points: | Parent ID:
Points: | Reviewer:
Sponsor: |
-----------------------------------------+--------------------------
In the past, we have discussed separating the data-aggregating part of
metrics-web from the website part. Here's a plan to make this happen:
- We provide a new index file on Tor Metrics containing all stats files
specified on the [https://metrics.torproject.org/stats.html Statistics]
page, including path, size, and last-modified time. Example (with just a
single file):
{{{
{
  "index_created": "2017-08-21 13:10",
  "path": "https://metrics.torproject.org",
  "directories": [
    {
      "path": "stats",
      "files": [
        {
          "path": "servers.csv",
          "size": 4794794,
          "last_modified": "2017-08-21 00:29"
        }
      ]
    }
  ]
}
}}}
- The new index file will be available under
`https://metrics.torproject.org/index/index.json` (does not exist yet), as
well as in compressed variants (`.gz`, `.xz`, etc.).
- The new file will be written right after each periodic update, which runs
twice per day, as part of
[https://gitweb.torproject.org/metrics-web.git/tree/shared/bin/99-copy-stats-files.sh this script].
- We might even include an `"implementation_version"` field as discussed
in #21414.
- We start using that file by putting a new table at the top of the
[https://metrics.torproject.org/stats.html Statistics] page that lists all
available files together with their size, last update time, and a link to
their specification, like a table of contents. So far, so good, but this
alone is not yet worth the effort. The payoff comes next!
- In the next step we write a little internal downloader that is part of
the website part of metrics-web. That downloader periodically fetches the
`index.json` file to see if there are updates to stats files. If there
are, it downloads these files and stores them locally for rserve to
produce new graphs based on the new data.
- Now we can set up a second metrics-web instance somewhere that has the
sole purpose of aggregating data. We might want to call it
`https://metrics2.torproject.org/` (or some other name, if we can settle
on one). We point the periodic downloader to that host and fetch newly
updated CSV files from there. And we turn off data-aggregating modules on
the actual Tor Metrics website host. (Maybe it's easier to find a smaller
host for the website and move that part, while keeping the data-
aggregating parts in place. Whatever.)
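To make the two halves of the plan more concrete, here is a rough Python sketch of both sides: writing the index after the periodic update, and deciding which files the downloader would need to fetch. This is only an illustration under assumptions. The function names `build_index` and `updated_files` are made up for this example, the choice of Python is arbitrary, and comparing `size` plus `last_modified` is just one possible way to detect changes.

```python
import os
from datetime import datetime, timezone

# Timestamp format used in the example index above.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def build_index(base_url, root_dir, subdirs):
    """Build an index dict for the files in root_dir/<subdir> (hypothetical helper)."""
    directories = []
    for subdir in subdirs:
        dir_path = os.path.join(root_dir, subdir)
        files = []
        for name in sorted(os.listdir(dir_path)):
            full_path = os.path.join(dir_path, name)
            if not os.path.isfile(full_path):
                continue  # only plain files are listed in this sketch
            st = os.stat(full_path)
            mtime = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc)
            files.append({
                "path": name,
                "size": st.st_size,
                "last_modified": mtime.strftime(TIME_FORMAT),
            })
        directories.append({"path": subdir, "files": files})
    return {
        "index_created": datetime.now(timezone.utc).strftime(TIME_FORMAT),
        "path": base_url,
        "directories": directories,
    }


def _flatten(index):
    """Map each file URL in an index to its (size, last_modified) pair."""
    entries = {}
    if not index:
        return entries  # no previous index yet: everything counts as new
    for directory in index.get("directories", []):
        for entry in directory.get("files", []):
            url = "/".join([index["path"], directory["path"], entry["path"]])
            entries[url] = (entry["size"], entry["last_modified"])
    return entries


def updated_files(old_index, new_index):
    """Return sorted URLs of files that are new or changed in new_index."""
    old = _flatten(old_index)
    new = _flatten(new_index)
    return sorted(url for url, meta in new.items() if old.get(url) != meta)
```

The aggregating host would serialize the result of `build_index` with `json.dump` and write it to `index.json`; the downloader would keep the last index it saw, call `updated_files(previous, current)`, and fetch only the returned URLs. One caveat of this approach: with minute-granularity timestamps, a file rewritten within the same minute to the same size would go unnoticed.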
Does this make sense?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23285>