[metrics-bugs] #23243 [Metrics/Metrics website]: write a spec for web-server-access log descriptors
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Aug 15 09:38:05 UTC 2017
#23243: write a spec for web-server-access log descriptors
-----------------------------------------+--------------------------
Reporter:      iwakeh                    | Owner:     metrics-team
Type:          enhancement               | Status:    new
Priority:      Medium                    | Milestone:
Component:     Metrics/Metrics website   | Version:
Severity:      Normal                    | Keywords:
Actual Points:                           | Parent ID:
Points:                                  | Reviewer:
Sponsor:                                 |
-----------------------------------------+--------------------------
This document should answer the following questions:
 * What will the raw input data look like?
   - compressed logs
   - log lines with varying dates, even though the file is tagged with a
     single date
   - should only GET requests with 200 responses be expected?
   - size could become huge in the future
   - exact input format (if it is possible to define one)
   - meta-data is provided in paths and filenames
   - ...
 * What will sanitized logs stored on disk look like?
   - cleaned log lines: define the exact format and give examples (as this
     might deviate from the current Python sanitization)
   - meta-data is provided in paths and filenames
   - should files be reassembled, i.e., should a descriptor for a given log
     date contain only log lines of that date?
   - should files be stored on disk compressed (as opposed to other
     descriptors, which are stored uncompressed)?
   - should such logs be stored on disk in reasonably sized chunks (e.g.,
     starting a new chunk once a size of a GB is reached)?
   - ...
Please add more.
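
To make the reassembly question more concrete, here is a minimal Python
sketch of regrouping log lines by the date found in each line rather than
by the date the file is tagged with. It is only an illustration under
stated assumptions, not a proposal for the final format: it assumes
gzip-compressed input and an Apache-style timestamp such as
"[15/Aug/2017:00:00:00 +0000]" in every line, neither of which is decided
in this ticket. The names group_lines_by_date and read_compressed_log are
hypothetical.

```python
import gzip
import re
from collections import defaultdict

# Assumption: each log line carries an Apache-style timestamp such as
# [15/Aug/2017:00:00:00 +0000]. The real input format is still to be defined.
TIMESTAMP_RE = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")

# Explicit month table, so parsing does not depend on the system locale.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def group_lines_by_date(lines):
    """Group log lines by the date in each line (hypothetical helper).

    Returns a dict mapping "YYYY-MM-DD" to the list of lines for that day.
    Lines without a recognizable timestamp are skipped in this sketch; a
    real spec would have to say what happens to them.
    """
    by_date = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue
        day, month_name, year = match.groups()
        month = MONTHS.get(month_name)
        if month is None:
            continue
        by_date[f"{year}-{month:02d}-{day}"].append(line)
    return dict(by_date)


def read_compressed_log(path):
    """Read a gzip-compressed log file and group its lines by date.

    Assumption: input is gzip-compressed text; other compression formats
    would need a different opener.
    """
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return group_lines_by_date(handle)
```

With such a grouping, a file tagged with a single date that contains lines
from two days would yield two keys, and each key could become one
descriptor if the spec decides that files should be reassembled.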
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23243>