[tor-bugs] #24218 [Metrics/Statistics]: Implement new metrics-web module for IPv6 relay statistics
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Dec 6 11:14:19 UTC 2017
#24218: Implement new metrics-web module for IPv6 relay statistics
--------------------------------+------------------------------
Reporter: karsten | Owner: metrics-team
Type: enhancement | Status: needs_review
Priority: Medium | Milestone:
Component: Metrics/Statistics | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
--------------------------------+------------------------------
Changes (by karsten):
* status: new => needs_review
Comment:
So, I rewrote the earlier prototype into a metrics-web module that uses a
PostgreSQL database. Please review
[https://gitweb.torproject.org/karsten/metrics-web.git/log/?h=task-24218 my task-24218 branch].
Here are some first (meta) statistics on how it performs:
- Processed five weeks of descriptors from 2017-11-01 to 2017-12-04,
roughly 500M in XZ-compressed form, plus recent descriptors from the past
three days.
- Processing took ~12 minutes on my laptop.
- The resulting database has a size of ~1G before vacuuming and ~150M
afterwards.
Remaining tasks:
- Add a specification of the CSV file and three new graph pages to Tor
Metrics. I'll take care of this.
- Import the descriptor archive since 2008 somewhere, though not
necessarily on the production system. I can take care of this, but after
the first review round when it's clear whether the database schema can
stay.
- Find a way to test the `Database` class. I briefly tried testing it
with an in-memory HSQLDB database and got it working to some extent. But
we're using a few features that are specific to PostgreSQL and that we'll
have to replace in these tests. The result would be that we're testing
something slightly different that is similar to the PostgreSQL database
but not quite the same. And the code in `Database` looks trivial enough to
not contain major bugs. I think I'd prefer to test the whole code with
real descriptors as input and a real test PostgreSQL database to do the
aggregation. Let's try to find a testing approach that we can later apply
to other modules. (This shouldn't block either review or deployment.)
- Write a specification of the new CSV file according to what we said
we'll do for
[https://trac.torproject.org/projects/tor/wiki/org/sponsors/Sponsor13
Sponsor 13].
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24218#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online