[tor-dev] Torperf implementation considerations (was: Torperf)
Sathyanarayanan Gunasekaran
gsathya at torproject.org
Tue Sep 24 11:39:00 UTC 2013
Hi.
On Tue, Sep 24, 2013 at 6:03 AM, Karsten Loesing <karsten at torproject.org> wrote:
> On 9/23/13 12:53 AM, Sathyanarayanan Gunasekaran wrote:
>>
>> I don't understand how this will work when users just apt-get install
>> torperf. Ideally if someone writes a good experiment, they should send
>> the patches upstream and get it merged, and then we update torperf to
>> include those tests and then the users just update torperf with their
>> package managers.
>
> I agree with you that this is a rather unusual requirement and that
> adding new experiments to Torperf is the better approach. That's why
> the paragraph said "should" and "ideally". I added your concerns to the
> design document to make this clearer. (Maybe we should mark
> requirements as either "must-do", "should-do", or "could-do"?)
Well, "ideally" implies that we want to do this at some point. Do we?
>
>>> It should be possible to run different experiments with different tor versions or binaries in the same Torperf service instance.
>>
>> I don't think we need this now. I'm totally ok with having users run
>> different torperf instances for different tor versions.
>
> Running multiple Torperf instances has disadvantages that I'm not sure
> how to work around. For example, we want a single web server listening
> on port 80 for all experiments and for providing results.
Oh. I did not mean running multiple torperf instances
*simultaneously*; I just meant sequentially.
> Why do you think it's hard to run different tor versions or binaries in
> the same Torperf service instance?
Then each experiment needs to deal with locating, bootstrapping, and
shutting down Tor. We could just run a torperf test against a
particular tor version; once that's completed, we can run against
another tor version, and so on. I'm not against this idea -- it can be
done. I just don't think it's high priority.
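A sequential driver along those lines might look like this (just a sketch; `run_experiment` and the binary paths are hypothetical stand-ins, not actual Torperf code):

```python
def run_experiment(tor_binary):
    """Bootstrap the given tor binary, run the measurement, shut it down.
    Placeholder: a real implementation would launch tor as a subprocess,
    wait for bootstrap, perform the request, and record timings."""
    return {"tor": tor_binary, "ok": True}

def run_sequentially(tor_binaries):
    # One tor version at a time: start, measure, shut down, then the next.
    return [run_experiment(binary) for binary in tor_binaries]

results = run_sequentially(["/usr/bin/tor", "/opt/tor-0.2.3/bin/tor"])
```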
>>> It might be beneficial to provide a mechanism to download and verify the signature of new tor versions as they are released. The user could specify if they plan to test stable, beta or alpha versions of tor with their Torperf instance.
>>
>> IMHO, torperf should just measure performance, not download Tor or
>> verify signatures. We have good package managers that do that already.
>
> Ah, we don't just want to measure packaged tors. We might also want to
> measure older versions which aren't contained in package repositories
> anymore, and we might want to measure custom branches with performance
> tweaks. Not sure if we actually want to verify signatures of tor versions.
>
> I think we should take Shadow's approach (or something similar). Shadow
> can download a user-defined tor version ('--tor-version'), or it can
> build a local tor path ('--tor-prefix'):
If the user wants to run torperf against tor versions that are not
present in the package managers, then the user should download and
build tor -- not torperf. Once a local binary is present, the user can
run torperf against it with a --tor-prefix option.
> https://github.com/shadow/shadow/blob/master/setup#L109
>
> Do you see any problems with this?
Nope, this is perfectly fine. I just don't want torperf to download,
verify and build tor.
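To illustrate that division of labor, the torperf-facing part could be as small as this (a sketch only; `resolve_tor_binary` and its behavior are my invention, not Torperf's actual interface):

```python
import shutil

def resolve_tor_binary(tor_path=None):
    """Pick the tor binary to measure: a user-supplied path (e.g. a
    locally built branch or an old release) wins; otherwise fall back
    to whatever 'tor' is on PATH. Torperf itself never downloads,
    verifies, or builds tor."""
    if tor_path:
        return tor_path
    found = shutil.which("tor")
    if found is None:
        raise FileNotFoundError(
            "no tor binary found; build one and pass its path")
    return found
```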
>>> A Torperf service instance should be able to accumulate results from its own experiments and remote Torperf service instances
>>
>> Torperf should not accumulate results from remote Torperf service
>> instances. If by "accumulate", you mean read another file from
>> /results which the *user* has downloaded, then yes. Torperf shouldn't
>> *download* result files from remote instances.
>
> Why not? The alternative is to build another tool that downloads result
> files from remote instances. That's what we do right now (see footnote:
> "For reference, the current Torperf produces measurement results which
> are re-formatted by metrics-db and visualized by metrics-web with help
> of metrics-lib. Any change to Torperf triggers subsequent changes to
> the other three codebases, which is suboptimal.")
This could just be a wget script that downloads the results from
another server. I just don't want that to be a part of torperf.
Torperf should just measure performance and display data, IMHO -- not
worry about downloading and aggregating results from another system.
Or maybe we can do this later and change the requirement to "Ideally,
torperf should .."
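If accumulation stays outside Torperf, the in-Torperf part reduces to reading whatever result files the user has placed in /results, however they got there. A sketch (the file layout and JSON structure are made up for illustration):

```python
import json
from pathlib import Path

def load_results(results_dir):
    """Read all JSON result files the user has dropped into results_dir,
    whether produced by this instance or fetched by hand (e.g. wget)
    from a remote Torperf instance."""
    records = []
    for path in sorted(Path(results_dir).glob("*.json")):
        records.extend(json.loads(path.read_text()))
    return records
```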
>>> The new Torperf should come with an easy-to-use library to process its results
>>
>> Torperf results should just be JSON (or similar) files that already
>> have libraries, and we should not invent a new result format or write
>> a library for it.
>
> Yes, that's what I mean. If you understood this differently, can you
> rephrase the paragraph?
"Torperf should store its results in a format that is widely used and
already has libraries(like JSON), so that other applications can use
the results and build on it". Maybe?
>>> request scheduler Start new requests following a previously configured schedule.
>>> request runner Handle a single request from creation over various possible sub states to timeout, failure, or completion.
>>
>> These are experiment specific. Some tests may not even need to do
>> requests. No need for these to be a part of torperf.
>
> I'm thinking how we can reduce code duplication as much as possible.
> The experiments in the design document all make requests, so it would be
> beneficial for them to have Torperf schedule and handle their requests.
> If an experiment doesn't have the notion of request it doesn't have to
> use the request scheduler or runner. But how would such an experiment
> work? Do you have an example?
Nope, I don't have an example. Maybe as I write the tests, I'll have a
better idea about the structure. Ignore this comment for now!
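To make the scheduler/runner split concrete, one possible shape (purely illustrative; the state names and the time handling are assumptions, and a real scheduler would sleep until each configured time rather than filter past times):

```python
def run_request(make_request):
    """Runner: drive a single request from creation to one of the
    terminal states -- timeout, failure, or completion."""
    try:
        make_request()
        return "completion"
    except TimeoutError:
        return "timeout"
    except Exception:
        return "failure"

def schedule_requests(make_request, times, now):
    """Scheduler: start a new request for every configured time that
    has already been reached, following the configured schedule."""
    return [run_request(make_request) for t in times if t <= now]

# Two of the three scheduled times have passed at now=90.
states = schedule_requests(lambda: None, times=[0, 60, 120], now=90)
```

An experiment without the notion of a request would simply not use these two components.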
>>> results database Store request details, retrieve results, periodically delete old results if configured.
>>
>> Not sure if we really need a database. These tests look pretty simple to me.
>
> Rephrased to data store. I still think a database makes sense here, but
> this is not a requirement. As long as we can store, retrieve, and
> periodically delete results, everything's fine.
>
Cool!
> Again, thanks a lot for your input!
>
> Updated PDF:
>
> https://people.torproject.org/~karsten/volatile/torperf2.pdf
Great, thanks!
--Sathya