[metrics-bugs] #22983 [Metrics/Library]: Add a Descriptor subinterface and implementation for Tor web server logs
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Jan 10 15:10:58 UTC 2018
#22983: Add a Descriptor subinterface and implementation for Tor web server logs
-----------------------------+-----------------------------------
Reporter: iwakeh | Owner: metrics-team
Type: enhancement | Status: needs_review
Priority: High | Milestone: metrics-lib 2.2.0
Component: Metrics/Library | Version:
Severity: Normal | Resolution:
Keywords: metrics-2018 | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-----------------------------+-----------------------------------
Comment (by karsten):
Replying to [comment:57 iwakeh]:
> Replying to [comment:56 karsten]:
> > I started looking at your branch, but it's a pretty big diff again, so
that'll take some time.
>
> You only need to look at the latest four commits. The others are
reviewed already (comment:47).
Oh. That information would have saved some time this morning. That also
explains why I have just one remark on the earlier commits and a couple
more on the first four!
> > Regarding Git history, we're going to squash all these commits (except
for unrelated ones like adding a space or package-info) into a single
commit that adds interfaces, implementations, and tests, right? I'm
asking, because you marked some commits as "fixup!", but not all of them.
Or do you have a separation in mind? If so, which?
> >
>
> Maybe, keep those that you find easier to understand the changes. I
used 'fixup' for making clear that a change is related to an earlier
review discussion.
There are two types of changes here:
1. Changes in the review process that require knowing the branch under
review. Ideally, we'd use "fixup" or "squash" commits for all commits made
during the review process. Alternatively, we can agree that commits will
be squashed prior to merge, which works for me, too.
2. Changes in the merge process that only require knowing master before
the merge. Somebody who didn't follow the review process, including our
future selfs, should see changes where we're not going forth and back
throughout commit series to achieve something that we could do in one
step.
> > By the way, what was the reason for rebasing your branch? It would
have been a tad bit easier to stay on the same branch until we're done
with the review process. Just saying for future reviews.
>
> You mentioned that above already ;-)
> (I think, last year I did the rebase because it seemed better to work on
a branch close to master.)
Maybe I ran into that rebase today when fetching from your repository and
seeing this branch to be force-updated. But it's good to know I mentioned
this before. With a few more samples you'll soon be able to compute my
attention span. :)
Anyway, here's what I found in your branch:
- `AccessLogLineSanVal` contains an explicit mention of `"FTP"`. Does
this mean we're considering FTP log lines to be valid? Why FTP in
particular and no other protocols? Or should we take that out and restrict
ourselves to HTTP? Not feeling strongly about this one.
- Further down below we're comparing `referenceDate` to `extractedDate`
and discarding lines where the two don't match. Not sure if this is
problematic, but I'm at least flagging it here as potential issue. If this
is a non-issue, feel free to ignore this comment.
- A few lines further down we're formatting a date using
`DateTimeFormatter.ofPattern()`. Is that expensive? Would we be able to
create the formatter just once and re-use it here? Not trying to optimize
prematurely, but trying to avoid shooting ourselves in the foot.
- Logging an error in case of catching a `Throwable` might become very
noisy. After all, we're processing third-party input here, and we can
typically proceed without parsing that line. The debug level seemed like
the better choice here.
- In `WebServerAccessLog`, what exactly is the log date? Is it the date
of contained requests, is it the date when the file was written to disk,
or what is it? This is still unclear in the docs.
When these remaining issues are clearer to me, do you mind me editing your
branch and preparing one or more "fixup"/"squash" commits? There are still
a few things that I'd like to document differently or where I'd like to
make the new code more similar to existing code (in the good sense). You
could still go through them and talk me out of making those changes. Sound
okay?
Oh, and here's another remark for future reviews: let's try to reduce the
time for reviewing and for revising a branch to a few days. Neither
reviewing nor revising is fun, but it's even less fun when doing it
several weeks later. I know this is a special case with the holiday break,
and I'm not blaming you any more than I'm blaming myself. Just an idea to
make reviews a little more fun. :)
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/22983#comment:58>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the metrics-bugs
mailing list