[tor-bugs] #10680 [Analysis]: Obtain attributes of current public bridges
Tor Bug Tracker & Wiki
blackhole at torproject.org
Wed Jan 22 07:41:54 UTC 2014
#10680: Obtain attributes of current public bridges
--------------------------+-----------------
Reporter: sysrqb | Owner:
Type: task | Status: new
Priority: normal | Milestone:
Component: Analysis | Version:
Resolution: | Keywords:
Actual Points: | Parent ID:
Points: |
--------------------------+-----------------
Comment (by karsten):
Replying to [comment:5 sysrqb]:
> Replying to [comment:1 karsten]:
> > I wonder, should we use the output of your script to complement the
[https://metrics.torproject.org/stats.html#servers servers.csv] file
provided by metrics-web? There are a few requirements for that, though:
>
> I think this would be great! I think the only data point we can't supply
is the country of the bridge, but that's not a huge loss.
Right, we'd have to include country codes in sanitized bridge descriptors
for that. But we also don't have country codes of relays these days, so
that's fine. Future work.
> I can definitely adapt the script to produce the necessary info and
output to csv. We might actually want more information, so I might make
this the default output (or produce the csv by providing a command line
option). Another option is to create a script specifically for this and
create a second script that produces a superset of metrics/bridge
attributes.
What additional information would you want to include? Maybe we can
extend the CSV file format? In theory, the format should allow pretty
much everything you'd want to include in a graph.
> > - Allow the script to be run on a periodically updated local
directory containing recent descriptors, for example, by running
`rsync -arz --delete --exclude 'relay-descriptors/votes' metrics.torproject.org::metrics-recent in`.
> > - Remove all bridges that didn't have the `Running` flag in a bridge
network status. Only include server descriptors referenced from bridge
network statuses, and only include extra-info descriptors referenced from
server descriptors.
> > - Add the number of bridges in the EC2 cloud, that is, bridges whose
nickname starts with `"ec2bridger"`.
> > - Produce a bridges.csv output file similar to
[https://metrics.torproject.org/stats.html#servers servers.csv] that can
be merged with a new relays.csv (produced by current metrics-web) into
the new servers.csv.
> >
> None of these should be a problem.
>
> > If you're interested in writing such a script, I'd want to run it on
yatei and write the glue code to include your results on the metrics
website including making new graphs visualizing the new data.
> You'll accept a Python script? :) I can write it in Java, if you prefer,
though.
Python is fine! If you stick to the requirements above with all input
data coming from the rsync'ed directory and all output data being one or
more .csv files, then that's all I need to integrate your script into
metrics-web. Feel free to start hacking on this in a metrics-tasks.git
branch, and we'll move over the result to metrics-web when it's ready.
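
To make the requirements above more concrete, here is a minimal sketch of the counting step only. It assumes bridge network statuses use the usual `r` (router) and `s` (flags) lines, with the nickname as the second field of the `r` line. The function names and the CSV column names (`date`, `bridges`, `ec2bridges`) are hypothetical, not the actual servers.csv format, and the sketch does not yet follow references to server or extra-info descriptors.

```python
import csv
import io


def count_running_bridges(status_text):
    """Return (running, running_ec2) for one bridge network status.

    Assumes each entry has an "r" line whose second field is the
    nickname, followed by an "s" line listing the entry's flags.
    """
    running = 0
    running_ec2 = 0
    nickname = None
    for line in status_text.splitlines():
        if line.startswith("r "):
            parts = line.split()
            nickname = parts[1] if len(parts) > 1 else None
        elif line == "s" or line.startswith("s "):
            flags = line.split()[1:]
            # Only count bridges that have the Running flag.
            if nickname is not None and "Running" in flags:
                running += 1
                # EC2 cloud bridges are recognized by nickname prefix.
                if nickname.startswith("ec2bridger"):
                    running_ec2 += 1
            nickname = None
    return running, running_ec2


def write_bridges_csv(rows, out):
    """Write (date, running, running_ec2) tuples as CSV to a file object.

    The column names are placeholders chosen for this sketch.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "bridges", "ec2bridges"])
    for date, running, running_ec2 in rows:
        writer.writerow([date, running, running_ec2])


if __name__ == "__main__":
    # Made-up status entries, for illustration only.
    sample = (
        "r ec2bridgerabc AAAA BBBB 2014-01-21 12:00:00 10.0.0.1 443 0\n"
        "s Fast Running Valid\n"
        "r Unnamed CCCC DDDD 2014-01-21 12:05:00 10.0.0.2 9001 0\n"
        "s Running Stable Valid\n"
        "r ec2bridgerxyz EEEE FFFF 2014-01-21 12:10:00 10.0.0.3 443 0\n"
        "s Valid\n"
    )
    counts = count_running_bridges(sample)
    buf = io.StringIO()
    write_bridges_csv([("2014-01-21",) + counts], buf)
    print(buf.getvalue(), end="")
```

A real script would read the status files from the rsync'ed `in` directory rather than from an inline string, and would aggregate per day before writing bridges.csv.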
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/10680#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online