[metrics-bugs] #25196 [Metrics/Statistics]: Cut off recent dates from several CSV files
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Mar 7 14:47:45 UTC 2018
#25196: Cut off recent dates from several CSV files
--------------------------------+------------------------------
Reporter: karsten | Owner: karsten
Type: defect | Status: needs_review
Priority: Medium | Milestone:
Component: Metrics/Statistics | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: iwakeh | Sponsor:
--------------------------------+------------------------------
Comment (by iwakeh):
Regarding webstats, our
[https://metrics.torproject.org/web-server-logs.html#n-discarding-non-matching-lines spec]
states in section 4.1:
In addition, log lines are treated differently according to the date
they contain:
During an import process the sanitizer takes all log line dates into
account and determines the reference interval as stretching from the
oldest date to the youngest date encountered. Depending on the reference
interval log lines are not yet processed, if their date is on the edges of
the reference interval, i.e., the date is not at least a day younger than
the older endpoint or the date is only LIMIT days older than the younger
endpoint, where LIMIT is initially set to two, but this might change if
necessary.
If the younger endpoint of the reference interval coincides with the
current system date, the day before is used as the new younger reference
interval endpoint, which ensures that the sanitizer won't publish logs
prematurely, i.e., before there is a chance that they are complete. Thus,
processing of log lines carrying such date is postponed.
All log lines with dates for which the sanitizer already published a
log file are discarded in order to avoid altering published logs.
This means that the newest logs that can be published are those dated two
days before today, i.e., two days before the current system date.
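
To make the date handling quoted above concrete, here is a minimal sketch
in Python. It is not the sanitizer's actual code: the function name
`processable_dates` and the parameters `today` and `published` are
hypothetical, and the exact boundary at the younger endpoint is an
assumption chosen so that the result agrees with the two-day summary
above. The spec wording could also be read as one day stricter.

```python
from datetime import date, timedelta

LIMIT = 2  # days; the quoted spec says this value might change


def processable_dates(line_dates, today, published=frozenset(), limit=LIMIT):
    """Return the sorted dates whose log lines would be processed now.

    Hypothetical helper that illustrates one reading of the quoted spec.
    """
    dates = set(line_dates)
    if not dates:
        return []
    # Reference interval: oldest to youngest date encountered in the import.
    oldest, youngest = min(dates), max(dates)
    # If the younger endpoint is the current system date, use the day before.
    if youngest == today:
        youngest = today - timedelta(days=1)
    # Assumption: keep dates that are at least one day younger than the
    # older endpoint and at least (limit - 1) days older than the adjusted
    # younger endpoint; everything else is postponed.
    lower = oldest + timedelta(days=1)
    upper = youngest - timedelta(days=limit - 1)
    # Dates that already have a published log file are discarded.
    return sorted(d for d in dates if lower <= d <= upper and d not in published)


if __name__ == "__main__":
    today = date(2018, 3, 7)
    seen = [date(2018, 3, day) for day in range(1, 8)]
    for d in processable_dates(seen, today, published={date(2018, 3, 2)}):
        print(d.isoformat())
```

Under these assumptions, an import covering 2018-03-01 through 2018-03-07
on system date 2018-03-07 postpones 03-01 (older edge) as well as 03-06
and 03-07 (younger edge), and skips any date that was already published.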
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25196#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online