[tor-commits] [ooni-probe/master] First pass at freshening up the architecture document (#580)
art at torproject.org
Fri Jan 13 12:39:57 UTC 2017
commit 128531b9bc1557d5d36580651bac9caaeee50a95
Author: Arturo Filastò <arturo at filasto.net>
Date: Tue Nov 22 12:37:23 2016 +0000
First pass at freshening up the architecture document (#580)
---
docs/source/architecture.rst | 351 ++++++++++++++++++++++++-------------------
docs/source/conf.py | 2 +-
2 files changed, 201 insertions(+), 152 deletions(-)
diff --git a/docs/source/architecture.rst b/docs/source/architecture.rst
index 5e7d253..921fcbb 100644
--- a/docs/source/architecture.rst
+++ b/docs/source/architecture.rst
@@ -1,32 +1,119 @@
Architecture
============
-The goal of this document is provide an overview of how ooni works, what are
-it's pieces and how they interact with one another.
+Last Updated: 2016-08-01
-Keep in mind that this is the *big picture* and not all of the features and
-compontent detailed here are implemented.
-To get an idea of what is implemented and with what sort of quality see the
-`Implementation status`_ section of this page.
+The purpose of this document is to illustrate the design goals of the various
+components that make up the OONI ecosystem, how they work, and how they relate
+to each other.
-The two main components of ooni are `oonib`_ and `ooniprobe`_.
+The following diagram gives you an idea of how the various OONI components
+are related to each other.
-.. image:: _static/images/ooniprobe-architecture.png
- :width: 700px
+.. graphviz::
-ooniprobe
----------
+ digraph Architecture {
-ooniprobe the client side component of ooni that is responsible for performing
+ subgraph cluster_0 {
+ style=filled;
+ color=lightgrey;
+ node [style=filled,color=white];
+ "ooni-probe";
+ "measurement-kit";
+ label="clients";
+ }
+
+ "ooni-probe" -> "ooni-backend";
+ "measurement-kit" -> "ooni-backend";
+ "ooni-wui" -> "ooni-probe";
+ "lepidopter" -> "ooni-probe";
+ "ooni-backend" -> "ooni-pipeline";
+ "ooni-pipeline" -> "ooni-explorer";
+ }
+
+
+The main software components are the following:
+
+* ooni-probe_: what users interested in contributing measurements will run.
+ It also includes a web based user interface for running measurements and
+ inspecting the results.
+ code repository: `<https://github.com/TheTorProject/ooni-probe>`_
+
+* measurement-kit_: a portable C++ library that implements some ooniprobe
+ tests and is currently being used to port ooniprobe to mobile platforms
+ (Android and iOS).
+ In the future the measurement engine of ooniprobe will be replaced with
+ measurement-kit.
+ code repository: `<https://github.com/measurement-kit/measurement-kit>`_
+
+* ooni-backend_: the software component that measurement clients communicate
+ with to learn the address of where they should submit results, submit results
+ (collector) and run certain tests against (see: `Test Helpers`_).
+ code repository: `<https://github.com/TheTorProject/ooni-backend>`_
+
+* ooni-pipeline_: responsible for taking raw measurement data (from
+ collectors), normalising it, extracting insight from it and preparing it for
+ presentation in the `ooni-explorer`_ interface.
+ code repository: `<https://github.com/TheTorProject/ooni-pipeline>`_
+
+* ooni-explorer_: a web front-end to the measurements collected by the OONI
+ platform. It features a world map view showcasing the countries where we have
+ identified network anomalies.
+ code repository: `<https://github.com/TheTorProject/ooni-explorer>`_
+
+* ooni-wui_: web user interface assets and the implementation of the
+ ooni-probe web interface. Components in here are meant to be re-used across
+ the various software components (ooni-probe, ooni-explorer, net-probe, etc.),
+ though work on this front is not yet complete.
+ code repository: `<https://github.com/TheTorProject/ooni-wui>`_
+
+* lepidopter_: a Raspberry Pi image for running ooniprobe.
+ code repository: `<https://github.com/TheTorProject/lepidopter>`_
+
+* ooni-web_: the canonical ooni.torproject.org website.
+ code repository: `<https://github.com/TheTorProject/ooni-web>`_
+
+
+
+.. _ooni-probe:
+---------------
+
+ooni-probe is the client-side component of OONI that is responsible for performing
measurements on the network to be tested.
-The main design goals for ooniprobe are:
+Originally thought of as a tool to be used by users to investigate network
+anomalies on their own and quickly implement new tests to check for new
+censorship conditions, the focus is now shifting more towards something
+meant to be used in an unattended manner.
+
+As such it's evolving into being a system daemon that is always running on
+a user's machine and automatically performs the network measurements the user
+has instructed it to perform.
+
+Design goals
+.............
+
+The current design goals are:
+
+**Unattended measurement collection**
+
+It should be possible for a user of the system to install it and forget about
+it. This means that it shouldn't be necessary to constantly interact with the tool
+itself.
-Test specification decoupling
-.............................
+Some of the earlier design considerations for ooni-probe were:
-By this I mean that the definition of the test should be as loosely coupled to
-the code that is used for running the test.
+**Test specification decoupling**
+
+This design goal is still largely valid, though as ooni-probe grows into mainly
+an end-user tool, its importance will decrease.
+
+Moreover, given that tests are going to be run on top of measurement-kit_, the
+long-term plan is to implement the testing framework logic in the
+measurement-kit_ scripting language.
+
+The outline of this design goal nonetheless is that the definition of the test
+should be as loosely coupled as possible to the code that runs the test.
This is achieved via what are called **Test Templates**. Test Templates are a high
level interface to the test developer specific to the protocol they are writing
@@ -46,8 +133,7 @@ received, but a developer may with to include inside of their report the
checksum of the content, as is shown in the example in `Writing Tests
<writing_tests.html>`_.
-Support for high concurrency
-............................
+**Support for high concurrency**
By this I mean that we want to be able to scan through big lists as fast as
possible.
@@ -68,68 +154,72 @@ For this purpose we have chosen to use the `Twisted networking framework
If you have an argument for which you believe Twisted is not a good idea, I
would love to know :).
-Notes:
-.. XXX
+Running lots of tests concurrently can reduce their accuracy. The ideal
+strategy for dealing with this would involve adjusting the concurrency
+based on failure rate.
+Currently this is not implemented inside of ooniprobe and instead we use
+a configurable concurrency value that defaults to 3.
-Running lot's of tests concurrently can reduce their accuracy. The strategy
-for dealing with this involves doing proper error handling and adjusting the
-concurrency window over time if the amount of error rates increases.
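As a rough illustration of the failure-rate-based strategy described above (which the text says ooniprobe does not implement), the following sketch halves the concurrency limit when too many measurements in a batch fail and raises it by one otherwise. The class name, the thresholds and the halve/add-one policy are all assumptions made for this example; only the starting value of 3 comes from the text.

```python
class AdaptiveConcurrency(object):
    """Hypothetical helper, not part of ooniprobe: tune a concurrency limit."""

    def __init__(self, initial=3, minimum=1, maximum=10, threshold=0.2):
        self.limit = initial        # 3 mirrors the default mentioned above
        self.minimum = minimum      # assumed lower bound
        self.maximum = maximum      # assumed upper bound
        self.threshold = threshold  # assumed tolerable failure rate

    def update(self, failures, total):
        """Adjust the limit after a batch of `total` measurements."""
        if total == 0:
            return self.limit
        failure_rate = failures / float(total)
        if failure_rate > self.threshold:
            # Back off quickly: halve the limit, never below the minimum.
            self.limit = max(self.minimum, self.limit // 2)
        else:
            # Recover slowly: add one slot, never above the maximum.
            self.limit = min(self.maximum, self.limit + 1)
        return self.limit
```

A caller would run one batch of measurements at the current limit, count the failures, and pass both numbers to `update()` before scheduling the next batch.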
+Implementation details
+......................
-Currently the level of concurrency for tests is implemented inside of
-:class:`ooni.inputunit`_, but we do not expose to the user a way of setting
-this. Such feature will be something that will be controllable via the
-ooniprobe API.
+Below is a high level diagram of how the various modules of ooniprobe
+relate to each other.
-Why Tor Hidden Services?
-........................
+.. graphviz::
-We chose to use Tor Hidden Services as the means of exposing a backend
-reporting system for the following reasons:
+ digraph ooniprobe_impl {
-Easy addressing
-_______________
+ "agent" -> "director";
+ "scheduler" -> "director";
-Using Tor Hidden Service allows us to have a globally unique identifier to be
-passed to the ooni-probe clients. This identifier does not need to change even
-if we decide to migrate the collector backend to a different machine (all we
-have to do is copy the private key to the new box).
+ "director" -> "deck";
-It also allows people to run a collector backend if they do not have a public
-IP address (if they are behing NAT for example).
+ "deck" -> "nettest";
+ "deck" -> "backend_client";
+ "deck" -> "nettests";
+ }
-Security
-________
+ooni-probe is written in Python using the `Twisted networking framework
+<http://twistedmatrix.org>`_.
-Tor Hidden Services give us for free and with little thought end to end
-encryption and authentication. Once the address for the collector has been
-transmitted to the probe you do not need to do any extra authenticatication, because
-the address is self authenticating.
+The two main concepts in ooniprobe are decks and nettests. A nettest is a
+particular network test that is designed to identify one class of anomalies.
-Possible drawbacks
-__________________
+A deck is a collection of one or more nettests and some associated inputs (such
+as a list of URLs).
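To make the relationship between the two concepts concrete, here is a minimal sketch of a deck as a list of nettests paired with their inputs. The `NetTest` and `Deck` classes below are hypothetical stand-ins written for this note and do not reflect ooniprobe's actual classes or file formats.

```python
class NetTest(object):
    """Hypothetical stand-in for a single network test."""

    def __init__(self, name, run):
        self.name = name
        self.run = run  # callable taking one input and returning a result


class Deck(object):
    """Hypothetical deck: nettests plus the inputs each one should process."""

    def __init__(self):
        self.tasks = []  # list of (nettest, inputs) pairs

    def add(self, nettest, inputs=(None,)):
        # A nettest without inputs is run once with None as its input.
        self.tasks.append((nettest, list(inputs)))

    def run(self):
        results = []
        for nettest, inputs in self.tasks:
            for item in inputs:
                results.append((nettest.name, item, nettest.run(item)))
        return results
```

For example, a deck could pair a URL-based nettest with a list of URLs and an input-less nettest with no inputs at all, then run both in sequence.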
-Supporting Tor Hidden Services as the only system for reporting means a
-ooni-probe user is required to have Tor working to be able to submit reports to
-a collector. In some cases this is not possible, because the user is in a
-country where Tor is censored and they do not have any Tor bridges available.
+The director is responsible for starting the measurement and reporting task
+managers, starting tor, looking up the IP address of the probe and in general
+controlling the lifecycle of the application.
-Latency is also a big issue in Tor Hidden Services and this can make the
-reporting process very long especially if the users network is not very good.
+The schedulers are periodic tasks that need to be executed (think cron). Their
+state is tracked on disk (in particular the time of the last successful
+execution).
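The scheduler idea described above can be sketched as a task that stores the time of its last successful run in a small file and only runs again once its interval has elapsed. The class name, the JSON state file and its layout are assumptions made for illustration; they are not ooniprobe's actual on-disk format.

```python
import json
import os
import time


class PeriodicTask(object):
    """Hypothetical cron-like task that remembers its last successful run."""

    def __init__(self, name, interval, state_dir, action):
        self.interval = interval  # seconds between successful runs
        self.action = action      # callable that performs the work
        self.state_path = os.path.join(state_dir, name + ".json")

    def last_run(self):
        try:
            with open(self.state_path) as state_file:
                return json.load(state_file)["last_run"]
        except (IOError, OSError, ValueError, KeyError):
            return None  # never ran, or the state file is unreadable

    def should_run(self, now):
        last = self.last_run()
        return last is None or now - last >= self.interval

    def run_if_due(self, now=None):
        now = time.time() if now is None else now
        if not self.should_run(now):
            return False
        self.action()  # an exception here leaves the stored state untouched
        with open(self.state_path, "w") as state_file:
            json.dump({"last_run": now}, state_file)
        return True
```

Because the timestamp is written only after the action returns, a failed run is retried on the next check, and the state survives a restart of the process.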
-For these reasons we plan to support in the future also non Tor HS based
-reporting to oonib.
-Currently this can easily be achieved by simply using tor2web.org.
+The agent is responsible for starting the director and the schedulers, and for
+exposing the web user interface.
-Standardization
-...............
+.. _measurement-kit:
+--------------------
-.. TODO
+Measurement-kit is a C++ library that implements network measurement primitives
+and some of the ooniprobe tests.
+
+It has been developed with the goal of being able to target mobile platforms
+(Android and iOS), but is growing with the intent of eventually replacing the
+measurement engine of ooniprobe entirely with native code.
+
+There is work in progress to support calling it from python (see:
+`<https://github.com/measurement-kit/measurement-kit/pull/697>`_) and there
+are plans to implement a scripting interface around it to aid the development of
+tests (see: `<https://github.com/measurement-kit/measurement-kit/issues/702>`_).
-oonib
------
+.. _ooni-backend:
+-----------------
This is the backend component of OONI. It is responsible for exposing `test
-helpers`_ and the `report collector`_.
+helpers`_, the `measurement collector`_ and the `bouncer service`_.
Test Helpers
............
@@ -139,120 +229,79 @@ ooniprobes when running tests.
If you would like to see a test helper implemented inside of oonib, that's
great!
-All you have to do is `open a ticket on trac
-<https://trac.torproject.org/projects/tor/newticket?component=Ooni&keywords=oonib_testhelpers%20ooni_wishlist&summary=Add%20support%20for%20PROTOCOL_NAME%20test%20helper>`_.
+All you have to do is `open a ticket on github
+<https://github.com/TheTorProject/ooni-backend/issues/new?title=[new%20test-helper%20request]%20YOUR_TESTHELPER_NAME>`_.
To get an idea of the current implementation status of test helpers see the
`oonib/testhelpers/
-<https://gitweb.torproject.org/ooni-probe.git/tree/HEAD:/oonib/testhelpers>`_
+<https://github.com/TheTorProject/ooni-backend/tree/master/oonib/testhelpers>`_
directory of the ooni-backend git repository.
.. TODO
write up the list of currently implemented test helpers and how to use them.
-Report collector
+Measurement collector
.....................
-.. autoclass:: oonib.report.file_collector.NewReportHandlerFile
- :noindex:
+This is the service to which measurement results are submitted.
+The specification for the API of the measurement collector can be found here:
+`<https://github.com/TheTorProject/ooni-spec/blob/master/oonib.md#20-collector>`_
-An ooniprobe run
-----------------
-
-Here we describe how an ooniprobe run should look like:
-
- 1. If configured to do so ooniprobe will start a connection to the Tor
- network for the purpose of having a known good test channel and for
- having a way of reporting to the backend collector
-
- 2. It will obtain it's IP Address from Tor via the getinfo addr Tor Ctrl port
- request.
-
- 3. If a collect is specified it will connect to the reporting system and get
- a report id that allows them to submit reports to the collector.
-
- 4. If inputs are specified it will slice them up into chunks of request to be
- performed in parallel.
-
- 5. Once every chunk of inputs (called an InputUnit) will have completed the
- report file and/or the collector will be updated.
-
-
-OONIprobe Control Interface
----------------------------
-.. XXX update this section once interface is implemented.
-The ooniprobe client provides a rich and simple JSON-based interface for
-control over HTTP. While the implementation of this interface is currently
-a work in progress, the specification may be found `here <control_interface.rst>`_.
-
-
-Implementation status
----------------------
-
-ooniprobe
-.........
-
-**Reporting**
-
- * To flat YAML file: *alpha*
-
- * To remote httpo backend: *alpha*
-
-**Test templates**
-
- * HTTP test template: *alpha*
-
- * Scapy test template: *alpha*
-
- * DNS test template: *alpha*
-
- * TCP test template: *prototype*
-
-**Tests**
-
-To see the list of implemented tests see:
-https://ooni.torproject.org/docs/#core-ooniprobe-tests
-
-**ooniprobe API**
-
- * Specification: *draft*
-
- * HTTP API: *not implemented*
-
-**ooniprobe HTML5/JS user interface**
+Bouncer service
+................
- Not implemented.
+This is the service that is responsible for informing clients of where they
+should submit their results and of the addresses of the test-helpers they
+require to perform their measurements.
-**ooniprobe build system**
+The specification for the API of the bouncer can be found here:
+`<https://github.com/TheTorProject/ooni-spec/blob/master/oonib.md#40-bouncer>`_
- Not implemented.
+.. _ooni-pipeline:
+------------------
-**ooniprobe command line interface**
+When measurements are submitted to a measurement collector they are then
+processed by the data pipeline.
- Implemented in alpha quality, though needs to be ported to use the HTTP based
- API.
+The measurements are first normalised (to take into account the different data
+formats that ooniprobe has supported over time), then sanitised (to redact
+sensitive information such as private bridge IP addresses) and then put inside of a
+database to be served via the ooni-explorer_.
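The three stages described above (normalise, sanitise, store) can be sketched as a chain of small functions. Everything in this example is an assumption made for illustration: the field names, the legacy `name` key, the address pattern and the use of SQLite as the database are stand-ins, and the real pipeline is built on luigi rather than on plain function calls.

```python
import json
import re
import sqlite3

# Illustrative pattern only: an IPv4 address with a port, e.g. "192.0.2.1:443".
ADDRESS = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}:\d+\b")


def normalise(raw):
    """Map a report onto one schema; the legacy "name" key is made up."""
    return {
        "test_name": raw.get("test_name", raw.get("name")),
        "probe_cc": raw.get("probe_cc", "ZZ"),
        "test_keys": raw.get("test_keys", {}),
    }


def sanitise(measurement):
    """Redact address-like strings from the serialised test keys."""
    cleaned = dict(measurement)
    serialised = json.dumps(measurement["test_keys"], sort_keys=True)
    cleaned["test_keys"] = json.loads(ADDRESS.sub("[REDACTED]", serialised))
    return cleaned


def store(connection, measurement):
    """Insert one processed measurement into a SQLite table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS measurements "
        "(test_name TEXT, probe_cc TEXT, test_keys TEXT)"
    )
    connection.execute(
        "INSERT INTO measurements VALUES (?, ?, ?)",
        (measurement["test_name"], measurement["probe_cc"],
         json.dumps(measurement["test_keys"], sort_keys=True)),
    )


def process(connection, raw):
    """Run the three stages in the order given in the text."""
    store(connection, sanitise(normalise(raw)))
```

Keeping sanitisation as its own step before storage means nothing sensitive reaches the database that the front-end reads from.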
-oonib
-.....
+It is currently written in Python using the `luigi workflow manager
+<https://luigi.readthedocs.org>`_, but that may change in the near future.
+For future plans see: `<https://github.com/TheTorProject/ooni-pipeline/issues/32>`_
-**Collector**
+.. _ooni-explorer:
+------------------
- * collection of YAML reports to flat file: *alpha*
+This is the web interface that is used by end users to inspect measurements
+collected by ooniprobe.
- * collection of pcap reports: *not implemented*
+It is written as a node.js web app (based on the strongloop framework), with
+angular.js and d3.js.
- * association of reports with test helpers: *not implemented*
+.. _ooni-wui:
+-------------
-**Test helpers**
+Web user interface assets and the implementation of the ooni-probe web
+interface. Components in here are meant to be re-used across the various
+software components (ooni-probe, ooni-explorer, net-probe, etc.), though work
+on this front is not yet complete.
- * HTTP Return JSON Helper: *alpha*
+.. _lepidopter:
+---------------
- * DNS Test helper: *prototype*
+A Raspberry Pi image for running ooniprobe.
- * Test Helper - collector mapping: *Not implemented*
+Amongst other things it takes care of automatically updating ooniprobe to the
+latest version and packaging all the dependencies required to run ooniprobe.
- * TCP Test helper: *prototype*
+.. _ooni-web:
+-------------
- * Daphn3 Test helper: *prototype*
+The canonical ooni.torproject.org website.
+It is implemented using `hugo <https://gohugo.io>`_, a Go-based static
+website generator.
diff --git a/docs/source/conf.py b/docs/source/conf.py
index e3c6544..2eba8e4 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -30,7 +30,7 @@ from ooni import __version__ as ooniprobe_version
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['sphinx.ext.todo', 'sphinx.ext.coverage', 'sphinx.ext.pngmath',
-'sphinx.ext.viewcode', 'sphinx.ext.autodoc']
+'sphinx.ext.viewcode', 'sphinx.ext.autodoc', 'sphinx.ext.graphviz']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']