[metrics-bugs] #32890 [Metrics/CollecTor]: Remember processed files between module runs
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Jan 7 15:18:26 UTC 2020
#32890: Remember processed files between module runs
-----------------------------------+----------------------
      Reporter:  karsten           |     Owner:  karsten
          Type:  defect            |    Status:  assigned
      Priority:  Medium            | Milestone:
     Component:  Metrics/CollecTor |   Version:
      Severity:  Normal            |  Keywords:
 Actual Points:                    | Parent ID:
        Points:                    |  Reviewer:
       Sponsor:                    |
-----------------------------------+----------------------
The three recently added modules that archive Snowflake statistics, bridge
pool assignments, and BridgeDB metrics share a defect: they process every
input file they find, regardless of whether they have processed it before.
This matters because the input files processed by these modules are either
never removed (Snowflake statistics) or only removed manually by the
operator (bridge pool assignments and BridgeDB metrics).
The effect is that non-recent BridgeDB metrics and bridge pool assignments
are put back into the indexed/recent/ directory on the next execution
after they have been deleted from it for being older than 72 hours. The
same would happen with Snowflake statistics after the operator removes
them from the out/ directory.
The fix is to keep a state file that lists the names of previously
processed files and to process only files whose names are not listed
there. This is the same approach as the one taken for bridge descriptor
tarballs.
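The state-file idea can be illustrated with a minimal sketch. It is not
CollecTor code: the function names (load_processed, run_module), the
one-name-per-line state file format, and the assumption that the state file
lives outside the input directory are all hypothetical and chosen only to
show the mechanism.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def load_processed(state_file: Path) -> set[str]:
    """Return file names recorded by earlier runs (empty set on first run)."""
    if not state_file.exists():
        return set()
    return {line for line in state_file.read_text().splitlines() if line}


def run_module(input_dir: Path, state_file: Path,
               process: Callable[[Path], None]) -> list[str]:
    """Process only those input files that are not listed in the state file.

    Returns the names of the files processed in this run.
    Assumption: state_file is located outside input_dir.
    """
    processed = load_processed(state_file)
    newly_processed = []
    for path in sorted(input_dir.iterdir()):
        # Skip directories and anything a previous run already handled.
        if not path.is_file() or path.name in processed:
            continue
        process(path)
        newly_processed.append(path.name)
    # Persist the union so the next run skips these files as well.
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(
        "".join(name + "\n" for name in sorted(processed | set(newly_processed))))
    return newly_processed
```

With this in place, a second run over an unchanged input directory
processes nothing, so files that were already archived and later aged out
of the recent directory are not written there again.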
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32890>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online