[tor-dev] Integration testing plans, draft, v1
Nick Mathewson
nickm at torproject.org
Mon Dec 1 04:21:11 UTC 2014
Hi! This is the outcome of some discussion I've had with dgoulet, and
some work I've done identifying current problem-points in our use
of integration testing tools.
I'm posting it here for initial feedback, and so I have a URL to link
to in my monthly report. :)
INTEGRATION TEST PLANS FOR TOR
November 2014
1. Goals, non-goals, and scope
This is not a list of all the tests we need; this is just a list of
the kind of tests we can and should run with Chutney.
These tests need to be the kind that a random developer can run on
their own machine and reasonably expect to complete. Longer or more
expensive tests may be okay too, but if a test needs anything
spiffier than a Linux desktop, consider using Shadow instead.
Setting up an environment to run the tests needs to be so easy that
nobody who writes C for Tor is likely to be dissuaded from running
them.
Writing new tests needs to be pretty simple too.
Most tests need to be runnable on all the platforms we support.
We should support load tests. Though doing so is not likely to give
an accurate picture of how the network behaves under load, it's
probably good enough to identify bottlenecks in the code. (David
Goulet has had some success here already for identifying HS
performance issues.)
We should specify our design and interfaces to keep components
loosely coupled and easily replaceable. With that done, we should
avoid over-designing components at first: experience teaches that
only experience can teach what facilities need which features.
2. Architecture
Here are the components. I'm treating them as conceptually
separate, though in practice several of them may get folded into
Chutney.
A. Usage simulator
One or more programs that emulate users and servers on the
internet. They report what succeeded, what failed, and how long
everything took. Right now we're using curl and nc for this.
B. Network manager
This is what Chutney does today. It launches a set of Tor nodes
according to a provided configuration.
C. Testcase scripts
We do this in shell today: launch the network, wait for it to
bootstrap, send some traffic through it, and report success or
failure.
D. Test driver
This part is responsible for determining which testcases to run,
in what order, on what network.
There is no current analogue to this component; we've only got the
one test-network.sh script, and it assumes a single type of network.
One thing to notice here is that testcase scripts need to work with
multiple kinds of network manager configurations. For example, we'd
like to be able to run HTTP-style connectivity tests on small
networks, large networks, heterogeneous networks, dispersed networks,
and so on. We therefore need to make sure that each kind of network
can work with as many tests as possible, so that the work needed to
write the tests doesn't grow quadratically.
The coupling between the components will go as follows:
A. Usage simulations will need to expose their status to test
scripts.
B. The network manager will need to expose information about
available networks and network configurations to the test scripts,
so that the test scripts know how to configure usage simulations to
use them. It will need to expose commands like "wait until
bootstrapped", "check server logs for trouble", etc.
C. Each testcase needs to be able to identify which features it
needs from a network, invoke the network commands it needs, and
invoke usage simulations. It needs to export information about its
running status, and whether it's making progress.
D. The test driver needs to be able to enumerate networks and
testcases and figure out which are compatible with each other, and
which can run locally, and which ones meet the user's requirements.
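To make that coupling concrete, here is one possible shape for the
keyword-tag interface, as a Python sketch. The manifest fields and
file layout are made up for illustration; nothing like this exists
in chutney yet.

    # Hypothetical manifests for a network and a testcase.  Every
    # field name here is invented for illustration.
    network_manifest = {
        "name": "basic-net",
        "provides": ["exit", "hs", "ipv4"],  # features tests may rely on
        "launch": ["./chutney", "start", "networks/basic"],
    }

    testcase_manifest = {
        "name": "http-connectivity",
        "requires": ["exit"],          # keywords the network must provide
        "run": ["./testcases/http_connectivity"],
    }

    def compatible(network, testcase):
        # A testcase can run on any network that provides all of the
        # keywords the testcase requires.
        return set(testcase["requires"]) <= set(network["provides"])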
2.1. Minimal versions of the above:
A. The minimal user tools are an HTTP server and client. Use
appropriate tooling to support generating and receiving hundreds to
thousands of simultaneous requests.
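As a rough sketch of what (A) might look like with nothing but the
Python standard library -- the ports, counts, and URL are arbitrary,
and a real test would point the clients at a Tor SocksPort rather
than connecting directly:

    # Minimal usage simulator: one local HTTP server, many concurrent
    # clients, and a report of successes, failures, and worst-case time.
    import http.server
    import threading
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    def start_server(port=8000):
        server = http.server.HTTPServer(
            ("127.0.0.1", port), http.server.SimpleHTTPRequestHandler)
        threading.Thread(target=server.serve_forever, daemon=True).start()

    def fetch(url):
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                resp.read()
            return (True, time.monotonic() - start)
        except OSError:
            return (False, time.monotonic() - start)

    if __name__ == "__main__":
        start_server()
        with ThreadPoolExecutor(max_workers=100) as pool:
            results = list(pool.map(fetch, ["http://127.0.0.1:8000/"] * 500))
        ok = sum(1 for success, _ in results if success)
        print("%d/%d succeeded; slowest request took %.3fs"
              % (ok, len(results), max(t for _, t in results)))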
B. The minimal network manager is probably chutney. It needs the
ability to export information about networks, to "wait until
bootstrapped", to export information about servers, and so on. Its
network list needs to turn into a database of named networks.
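For instance, "wait until bootstrapped" can be approximated by
polling each node's notice log for Tor's "Bootstrapped 100%" line.
The net/nodes/*/notice.log layout below matches a typical chutney
checkout, but treat it as an assumption:

    # Sketch of a "wait until bootstrapped" primitive.
    import glob
    import time

    def wait_for_bootstrap(pattern="net/nodes/*/notice.log", timeout=60):
        deadline = time.monotonic() + timeout
        pending = set(glob.glob(pattern))
        if not pending:
            raise RuntimeError("no node logs matched %s" % pattern)
        while pending and time.monotonic() < deadline:
            for log in list(pending):
                with open(log, errors="replace") as f:
                    if "Bootstrapped 100" in f.read():
                        pending.discard(log)
            if pending:
                time.sleep(1)
        if pending:
            raise RuntimeError("not bootstrapped: " + ", ".join(sorted(pending)))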
C. Testcases need to be independent, and ideally abstracted. They
shouldn't run in-process with chutney. For starters, they can
duplicate the current functionality of test-network and of dgoulet's
hidden service tests. Probing for features can be keyword-based.
Reporting results can use some logging framework.
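A testcase, then, could be any executable that answers a --describe
query with its keywords and otherwise just runs, reporting through
its exit status and a logging framework. The --describe convention
here is invented, not an existing interface:

    #!/usr/bin/env python3
    # Skeleton for an independent testcase process.
    import json
    import logging
    import sys

    DESCRIPTION = {
        "name": "http-connectivity",
        "requires": ["exit"],   # keywords this test needs from a network
    }

    def run_test():
        logging.basicConfig(level=logging.INFO,
                            format="%(asctime)s %(levelname)s %(message)s")
        logging.info("starting HTTP connectivity check")
        # ... drive the usage simulator against the network here ...
        logging.info("check passed")
        return 0                # nonzero exit status means failure

    if __name__ == "__main__":
        if "--describe" in sys.argv:
            print(json.dumps(DESCRIPTION))
            sys.exit(0)
        sys.exit(run_test())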
D. The test driver can do its initial matching by keyword tags
exported by the other objects. It should treat testcases and
networks as arbitrary subprocesses that it can launch, so that they
can be written in any language.
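Continuing the sketch above, the driver might enumerate both kinds
of executables, query their tags, and run every compatible pair.
The networks.d/ and testcases.d/ directories and the --describe
protocol are, again, assumptions:

    # Hypothetical test driver: match testcases to networks by keyword
    # tags and launch each compatible pair as subprocesses.
    import glob
    import json
    import subprocess

    def describe(path):
        out = subprocess.check_output([path, "--describe"])
        info = json.loads(out.decode("utf-8"))
        info["path"] = path
        return info

    def main():
        networks = [describe(p) for p in glob.glob("networks.d/*")]
        testcases = [describe(p) for p in glob.glob("testcases.d/*")]
        for net in networks:
            for test in testcases:
                if set(test.get("requires", [])) <= set(net.get("provides", [])):
                    print("running %s on %s" % (test["name"], net["name"]))
                    status = subprocess.call(
                        [test["path"], "--network", net["path"]])
                    print("  -> %s" % ("ok" if status == 0 else "FAILED"))

    if __name__ == "__main__":
        main()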
3. A short inventory of use-cases
- Voting
- Controllers
- HS
* IP+RP up/down
* HSDir up/down
* Authenticated HS
- Bad clients
* Wrong handshake
* Wrong descriptor
- Bad relays
* dropping cells/circ/traffic
* Bad TLS
- Pathing
* Does it behave the way we think?
* 4 hops for HS
- Relay
* Up/Down
* Multiple IPs for a single key
* OOM handling
* Scheduling
- Client
* Does traffic go through? (see the sketch after this list)
- For HS and Exit
* DNS testing
- Caches at the exit
* Stream isolation using n SocksPort
* AddressMap
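As one concrete example from this list, a "does traffic go through?"
check needs nothing beyond the standard library if we speak SOCKS5 to
a client's SocksPort ourselves. The port and destination below are
placeholders:

    # Fetch one page through a Tor client's SocksPort (SOCKS5, no auth).
    import socket

    def socks5_connect(socks_addr, dest_host, dest_port):
        s = socket.create_connection(socks_addr, timeout=30)
        s.sendall(b"\x05\x01\x00")           # SOCKS5, one method: no auth
        if s.recv(2) != b"\x05\x00":
            raise IOError("SOCKS5 method negotiation failed")
        host = dest_host.encode("ascii")
        # CONNECT by hostname, so that Tor does the name resolution.
        s.sendall(b"\x05\x01\x00\x03" + bytes([len(host)]) + host
                  + dest_port.to_bytes(2, "big"))
        reply = s.recv(10)
        if len(reply) < 2 or reply[1] != 0x00:
            raise IOError("SOCKS5 connect failed: %r" % reply)
        return s

    if __name__ == "__main__":
        s = socks5_connect(("127.0.0.1", 9050), "example.com", 80)
        s.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
        print("traffic went through" if s.recv(4096).startswith(b"HTTP/")
              else "unexpected reply")
        s.close()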
4. Longer-term
We might look into using "The Internet" (no, not that one. The one
at https://github.com/nsec/the-internet) on Linux to simulate latency
between nodes.
When designing tools and systems, we should do so with an eye to
migrating them into Shadow.
We should refactor or adapt chutney to support using stem, to take
advantage of stem's improved templating and control features, so
we can better inspect running servers. We should probably retain the
original hands-off non-controller chutney design too, to better
detect heisenbugs.
5. Immediate steps
- Turn the above into a work plan.
- Specify initial interfaces, with plan for migration to better ones
once we have more experience.
- Identify current chutney and test-network issues standing in the
way of reliably getting current tests to work.
- Refactor our current integration tests to the above framework.