[tor-bugs] #25100 [Metrics/CollecTor]: Make CollecTor's webstats module use less RAM and CPU time
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Feb 1 08:54:20 UTC 2018
#25100: Make CollecTor's webstats module use less RAM and CPU time
-------------------------------+--------------------------------
Reporter: karsten | Owner: iwakeh
Type: enhancement | Status: needs_revision
Priority: High | Milestone:
Component: Metrics/CollecTor | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------+--------------------------------
Comment (by karsten):
Replying to [comment:7 iwakeh]:
> True, so far we didn't trade memory for time, but got some improvements
that could be picked easily even winning some time here.
> Keeping counts of different sanitized lines in memory could also help
and might be only a small change; I'm looking into this next.
Aha! That sounds very promising, too. Maybe even leave out the date part
from sanitized lines and keep, for each distinct line, a bag of the dates
on which it occurred. Something like `Map<String, Bag<LocalDate>>` (yes, I
know that there's no `Bag` type in Java; time to add Apache Commons
Collections?). And later, when we write sanitized logs, we simply put the
date back in.
> But first, we should make sure that the performance tuning focuses on
the usual scenario (not the rare bulk import) before starting bigger
changes.
>
> 1. The usual import amount will be logs of a few days, not the yearly logs,
right?
> 2. Major bulk imports like the initial one should work, but also appear
very rarely. Correct?
>
> Do you have some reasonable figures as example for each?
Agreed that bulk imports are rare. Still, they may happen. Maybe the
suggestion above resolves this relatively easily.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25100#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online