[metrics-bugs] #28799 [Metrics/Website]: Use R.cache to speed up drawing graphs
Tor Bug Tracker & Wiki
blackhole at torproject.org
Fri Dec 14 13:24:23 UTC 2018
#28799: Use R.cache to speed up drawing graphs
-----------------------------+-----------------------------------
Reporter: karsten | Owner: karsten
Type: enhancement | Status: needs_information
Priority: Medium | Milestone:
Component: Metrics/Website | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------+-----------------------------------
Comment (by karsten):
Replying to [comment:2 notirl]:
> This commit looks OK. I'm not sure about the approach though. We had
> talked about using the same CSVs for these graphs as we make available for
> download so that we don't have two different CSVs and it is easier to plot
> custom graphs using our code as a starting point.
>
> For the graphs that I've been making for various requests I've been
> using the readr library which works nicely with the tidyr universe of
> packages. What would the performance impact be of reading the CSVs from a
> ramdisk instead of caching them in R?
That's an interesting idea. A couple of thoughts:
- Where and when would we write the per-graph CSV files that would then
become the starting point for graphs and partial CSV file exports?
- If we use R for this, the code will be rather simple, but we'd still
have an R part in our daily updater, which we're currently trying to make
Java-only.
- We could execute some R code to write per-graph CSV files when
starting Rserve, but we'd have to re-run it whenever the daily updater has
finished. Sounds like it could get messy.
- If we move this code to Java, we might want to look into statistics
libraries to do something similar to what tidyr/dplyr do. The current
approach with Java Collections classes is a bit limited.
- The ramdisk sounds like it would be just as fast as the cache I'm
suggesting. But how would we make sure it always has the most recent data,
including after reboots?
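One possible answer to the freshness question in the last point, sketched in Java purely for illustration: key each cached CSV on the file's modification time and reload whenever it changes. This is not R.cache's API and not code from the ticket; the class name `FreshCsvCache`, its `get` method, and the `loads` counter are hypothetical, and the sketch assumes the daily updater rewrites the CSV file in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: caches CSV lines per path and reloads when the file changes. */
final class FreshCsvCache {

  /** A cached snapshot together with the modification time it was read at. */
  private static final class Entry {
    final FileTime modified;
    final List<String> lines;

    Entry(FileTime modified, List<String> lines) {
      this.modified = modified;
      this.lines = lines;
    }
  }

  private final Map<Path, Entry> entries = new HashMap<>();

  /** Counts real file reads; only here to make cache hits observable. */
  int loads = 0;

  /** Returns the cached lines if the file is unchanged, otherwise rereads it. */
  synchronized List<String> get(Path csv) throws IOException {
    FileTime modified = Files.getLastModifiedTime(csv);
    Entry cached = entries.get(csv);
    if (cached != null && cached.modified.equals(modified)) {
      return cached.lines; // cache hit: file not touched since last read
    }
    List<String> lines = Files.readAllLines(csv);
    loads++;
    entries.put(csv, new Entry(modified, lines));
    return lines;
  }
}
```

With a check like this, the first request after a restart or reboot simply misses the (empty) in-memory cache and reads the file again, so no separate refresh step would be needed after the daily updater finishes. A caveat: on file systems with coarse timestamps, two writes within the same timestamp tick could go unnoticed.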
Happy to discuss this more!
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28799#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online