[tor-commits] [metrics-web/master] Add Contributor's guide to the Metrics website.
karsten at torproject.org
karsten at torproject.org
Thu Jan 28 17:00:32 UTC 2016
commit 5611ef4563453ac74ada78299d0972c166b83230
Author: Karsten Loesing <karsten.loesing at gmx.net>
Date: Thu Jan 28 18:00:20 2016 +0100
Add Contributor's guide to the Metrics website.
---
CONTRIB.md | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 89 insertions(+)
diff --git a/CONTRIB.md b/CONTRIB.md
new file mode 100644
index 0000000..4f36afe
--- /dev/null
+++ b/CONTRIB.md
@@ -0,0 +1,89 @@
+Contributor's guide to the Metrics website
+
+Dear contributor to the Metrics website. This guide shall help you
+understand the design decisions behind building the Metrics website and
+give you starting points where you should look to make it bigger and
+better.
+
+First of all, let's talk briefly about the scope of the Metrics website,
+which we'll be calling Metrics in the following.
+
+ - What Metrics is: Metrics is supposed to provide easy access to Tor
+ network data. The typical Metrics user is neither a researcher nor a
+ developer and is just looking for an easy way to learn more about this
+ Tor network they have been hearing about. Metrics is giving them data
+ in visual or tabular form, together with explanations that are easy to
+ understand with as little technical language as possible.
+
+ - What Metrics is not: The typical Tor researcher or Tor developer would
+ probably want to dive deeper into the data to learn even more about the
+ Tor network. But in contrast to the average Metrics user they could
+ just fetch the original data from CollecTor and run their own analysis.
+ Metrics is not trying to be the solution for everyone. If we have to
+ choose, we're aiming for simplicity instead of comprehensiveness.
+
+Now let's take a quick tour of the components that Metrics is made of.
+
+ - Data-processing modules: The bulk of Metrics code is running in the
+ background, invisible to Metrics users. It's the code that takes
+ CollecTor data as input and that produces .csv files that are the basis
+ for graphs and tables on Metrics. There's usually one such module per
+ generated .csv file that focuses on a different aspect of Tor network
+ data. All these modules are periodically executed by the system's cron
+ daemon, independent of user requests to the website part of Metrics.
+ See the modules/ subdirectory for the existing data-processing modules.
+ Note that modules don't have to be written in Java even though that's
+ currently the case for all of them. The only requirement is that
+ there's a shell script to run the module using packages available in
+ Debian stable. The remaining components of Metrics are all related to
+ its website part.
+
+ - Start page: The website part of Metrics is organized into one page per
+ metric, which can be a graph, table, data file, or external link, and
+ the start page to browse available metrics. Each metric has attributes
+ like a descriptive name, one or more tags (relays, bridges, etc.), a
+ type (graph, table, etc.), and a level (basic or advanced). All
+ metrics are defined in `website/etc/metrics.json` and displayed in the
+ table on the start page.
+
+ - Graph pages: The bulk of graph pages consist of graphing methods in
+ `website/rserve/graphs.R` that are written in R and using the ggplot2
+ graphing library. These methods read one or more of the .csv files
+ produced by data-processing modules and produce a graph image as
+ output. Graphs have a few additional attributes in
+ `website/etc/metrics.json` like a description and parameters to
+ customize the graph. As of writing this guide, there's one exception
+ with the bubble graph which is implemented using JavaScript library
+ D3.js and which might soon be generated on the server using Node.js.
+
+ - Table pages: Metrics also provides a few aspects of Tor network data in
+ tabular form with customization options. Like graphs, the data in
+ these tables is provided using R by reading the previously generated
+ .csv files. All relevant R code for generating table data is located
+ in `website/rserve/tables.R`. Again, there are additional attributes
+ in `website/etc/metrics.json` that define what parameters are available
+ to customize table contents and how to format results.
+
+ - Data pages: While most Metrics user are not expected to run their own
+ analyses based on raw Tor network data, some of them might want to look
+ deeper into the data they saw in a graph or table. Metrics provides
+ all pre-aggregated output from its data-processing modules as
+ downloadable .csv files and also documents these file formats in
+ sufficient detail for Metrics users to use them.
+
+ - Link pages: Metrics is not the only game in town, and it's great that
+ other developers take the publicly available Tor network data and
+ visualize it in a different way. Metrics acknowledges these efforts by
+ adding link pages with thumbnails to make it easy for Metrics users to
+ find those external visualizations.
+
+ - About page: Most Metrics users have a basic understanding of how Tor
+ works, most likely from reading the main Tor website. But Metrics
+ should give its users enough explanations to understand where all the
+ Tor network data comes from and how that data is used to learn
+ interesting facts about the Tor network. That's where the About page
+ comes into play. The About page consists of a list of frequently used
+ terms and a second list of frequently asked questions. There could be
+ more documentation, but more text doesn't necessarily mean that users
+ will read more.
+
More information about the tor-commits
mailing list