[metrics-bugs] #33258 [Metrics]: Add CSV file export of graphed data
Tor Bug Tracker & Wiki
blackhole at torproject.org
Mon Mar 30 09:08:30 UTC 2020
#33258: Add CSV file export of graphed data
-----------------------------------------+------------------------------
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: needs_review
Priority: Medium | Milestone:
Component: Metrics | Version:
Severity: Normal | Resolution:
Keywords: metrics-team-roadmap-2020Q1 | Actual Points:
Parent ID: #33327 | Points: 1
Reviewer: anarcat | Sponsor: Sponsor59
-----------------------------------------+------------------------------
Comment (by karsten):
Replying to [comment:7 robgjansen]:
> Karsten, I think all of the changes you have made here are improvements
over the original graphs. I copied the original code from scripts I was
using for Shadow experiments, which is why some of the plots probably
don't make sense anymore outside of Shadow.
>
> > we should expect developers wanting to play with the underlying data
and feeding it into their own tools
>
> This was exactly my sentiment too when I added the OnionPerf plotting; I
thought it would be more useful to start with the code I had been using
for Shadow than starting with nothing. :)

Sounds good! Thanks for taking a look at my patch.

> Also, I ended up copying a lot of the TGen parsing/plotting stuff into a
python package in [https://github.com/shadow/tgen/tree/master/tools the
tgen repository]. I thought it made more sense for the code that plots
tgen logs to coexist and be synchronized with tgen itself. This means we
now have two different tools to parse/plot tgen results, which I think is
fine, but let me know if you don't want "us" to be maintaining two
separate code-bases for this. I don't mind if you completely change the
way OnionPerf is plotting things so that the plots are most useful for
OnionPerf, and we'll make the tgen plots useful for Shadow. Does that
approach make sense?

Good to know. I certainly wouldn't want us to maintain code doing the same
thing in two separate code bases, if we can avoid it. Let's talk about how
much of this code overlaps and how much is different before deciding where
and how to maintain this code in the future. Here are some random
thoughts:
- Having the log-parsing code close to the log-writing code sounds like a
very good idea to me. The alternative is to update the log-parsing code in
OnionPerf every time we update the log-writing code. I'm not exactly sure
how easy it is to share this code; I just see the value in doing so.
- Does Shadow produce any data that is worth visualizing in addition to
tgen logs? If so, do you have plotting code in tgen and in Shadow?
Assuming we keep plotting code in OnionPerf, would it make sense to move
your tgen plotting code to Shadow? Of course, it doesn't hurt us or anyone
to keep it in tgen; I'm just thinking out loud here.
- The patch I attached to this ticket only changes the tgen
visualizations. But this was just the first step. In the next steps I'd
like to add more visualizations that use more data than what we can learn
from tgen logs. For example, I'd like to add filters (#33328) to remove
measurements using certain relays and compare performance characteristics
to the baseline. Basic filters would remove relays by fingerprint (#33260)
using path information obtained from Tor control logs. More sophisticated
filters would incorporate Tor descriptors like consensuses or server
descriptors and allow filtering by relay flag, Tor version, or platform.
My current plan is to extend OnionPerf's analyze mode for these filters
and plot the result using newly added visualization code. I assume you
wouldn't need to do something like this in Shadow, because you'd rather
change the simulation parameters and re-run it, right?
- If we combine parsing/plotting code, we'll have to discuss and agree on
common definitions like Time To First Byte. The current OnionPerf
visualizations plot time from 'command' to 'first_byte', but Tor Metrics
uses 'start' to 'first_byte'. There's certainly value in talking about
these definitions and streamlining them. But there are also costs involved
in doing so.
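
To make the last two points more concrete, here is a minimal sketch of
the two Time To First Byte definitions and of a basic fingerprint filter.
It is only an illustration under stated assumptions: the event names
'command', 'start' and 'first_byte' are the ones mentioned above, but the
dict layout, the 'path' key, the function names and all values are
hypothetical and do not reflect OnionPerf's or tgen's actual data format.

```python
# Hypothetical measurement records. Only the event names 'command', 'start'
# and 'first_byte' come from the discussion; the dict layout, the 'path' key
# and all values are illustrative assumptions.
measurements = [
    {"start": 0.0, "command": 0.5, "first_byte": 1.5,
     "path": ["AAAA", "BBBB", "CCCC"]},
    {"start": 0.0, "command": 0.25, "first_byte": 2.0,
     "path": ["DDDD", "EEEE", "FFFF"]},
]


def ttfb(measurement, origin):
    """Return seconds from the chosen origin event to 'first_byte'."""
    return measurement["first_byte"] - measurement[origin]


def exclude_relays(measurements, fingerprints):
    """Keep only measurements whose path uses none of the given relays."""
    excluded = set(fingerprints)
    return [m for m in measurements if excluded.isdisjoint(m["path"])]


# The same measurements yield different numbers under the two definitions.
ttfb_from_command = [ttfb(m, "command") for m in measurements]
ttfb_from_start = [ttfb(m, "start") for m in measurements]

# Compare a filtered subset against the unfiltered baseline.
filtered = exclude_relays(measurements, ["BBBB"])
ttfb_filtered = [ttfb(m, "start") for m in filtered]
```

Passing the origin event as a parameter keeps both definitions available
side by side, which might be useful while the definitions are still under
discussion.
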
I'm open to collaborating more closely on this visualization code if it
makes sense. What do you think?
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33258#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online