[metrics-team] Hello from blackbird
Karsten Loesing
karsten at torproject.org
Thu Apr 4 15:59:24 UTC 2019
On 2019-03-24 20:01, Su Yu wrote:
> Hi Karsten,
Hi blackbird,
> Sorry for the late reply! I needed to learn how to read the consensuses
> (and I still have a lot to learn).
>
> - How does the shorter release cycle affect update behavior?
>
> I have been thinking about this question. With the data you shared, I
> plotted the age of servers vs. the age of their versions:
>
> server_age_version_age.png
> The "Age of server" is computed as the number of days between today and
> the date that the server started running.
Wait, the data I gave you doesn't say anything about when the server
started running. It's the date when relays were listed as running in a
consensus.
> The "Age of version" is the
> field "days_since_first_recommended" from the data table. The left plot
> uses the data of relays and the right is of the bridges.
>
> Overall, it seems that the older servers update more quickly, which is a
> little counter-intuitive. I guess this can be a result of how the
> dataset was constructed?
>
> However, to really address this question, we will need longitudinal data
> of the update behavior, instead of only one snapshot. I think this
> requires parsing the consensus files from multiple time points and
> compiling them. Did you create the above dataset by parsing the
> consensus? If so, would you think this is viable?
The data I gave you is not a snapshot. It's based on parsing all
consensuses back to 2007.
I wonder if this analysis makes more sense if you're parsing consensus
files yourself. Sorry that my data apparently confused you, that was not
my intention.
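In case it helps with parsing consensus files yourself, here is a minimal Python sketch, not a definitive implementation. It reads only the consensus lines quoted further down in this thread (valid-after, server-versions, r, s, v) and counts running relays per Tor version. The function names parse_consensus and running_versions are made up for illustration, and the sketch assumes each "r" entry sits on a single line, as it does in an actual consensus file (my mail client wrapped it below).

```python
from collections import Counter

# Sample built from the consensus lines quoted in this thread.
# In a real consensus the "r" entry is one line; it is joined here.
SAMPLE = """\
valid-after 2019-03-11 14:00:00
server-versions 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s 2019-03-11 11:17:29 67.174.243.193 9001 0
s Running Stable V2Dir Valid
v Tor 0.3.5.7
"""


def parse_consensus(text):
    """Return (valid_after, recommended_versions, relays) from consensus text.

    Hypothetical helper: it handles only the five line types shown above
    and ignores everything else in the document.
    """
    valid_after = None
    recommended = []
    relays = []
    current = None  # the relay entry that following "s"/"v" lines belong to
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        keyword = parts[0]
        if keyword == "valid-after" and len(parts) >= 3:
            valid_after = " ".join(parts[1:3])
        elif keyword == "server-versions" and len(parts) >= 2:
            recommended = parts[1].split(",")
        elif keyword == "r" and len(parts) >= 2:
            current = {"nickname": parts[1], "flags": [], "version": None}
            relays.append(current)
        elif keyword == "s" and current is not None:
            current["flags"] = parts[1:]
        elif (keyword == "v" and current is not None
              and len(parts) >= 3 and parts[1] == "Tor"):
            current["version"] = parts[2]
    return valid_after, recommended, relays


def running_versions(relays):
    """Count relays with the Running flag, grouped by Tor version."""
    return Counter(
        relay["version"]
        for relay in relays
        if "Running" in relay["flags"] and relay["version"]
    )


valid_after, recommended, relays = parse_consensus(SAMPLE)
print(valid_after, dict(running_versions(relays)))
```

Running this over one consensus per day and keeping the valid-after timestamp next to the per-version counts would give you the kind of longitudinal data you asked about.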
> On a separate note: I saw this
> <https://metrics.torproject.org/uncharted-data-flow.html> on the Metrics
> website and found it very interesting. I wonder if there is more work/data
> relevant to this network?
The data is all available on the Tor Metrics website, though it requires
parsing raw data to produce a visualization like that. Note that while
it's certainly a nice visualization, it's not really a priority for us
at the moment to have more of those.
If you'd still like to help out with Tor Metrics tasks, maybe take a
look at Trac for open tickets in the Metrics/* components and at the
relevant Git repositories. If there's something you'd like to work on,
just comment on the ticket.
Thanks!
> Thanks,
> blackbird
All the best,
Karsten
> On Mon, Mar 11, 2019 at 11:00 AM Karsten Loesing
> <karsten at torproject.org> wrote:
>
> Hi,
>
> sorry for the late reply. I guess I was hoping to find time to produce
> some more data for you, but that time has not materialized yet. Let me
> respond to your questions anyway, and maybe you'll be able to produce
> your own data from the original data.
>
> On 2019-03-01 22:27, Su Yu wrote:
> > Sorry! I realized I should've added some caption to the figures.
> >
> > - The two histograms show the distribution of the age of Tor versions.
> > Most people have their Tor between 0 - 250 days old and there is a
> > long-tail distribution.
> > - In the box plot, the boxes span the first quartile to the third
> > quartile of data points, and the line in the center is the median. The
> > upper and lower "whiskers" show the maximum and minimum of the data, and
> > the points above the top whisker are outliers. It appears that there are
> > fewer bridges with very old versions, but the bridges and relays are
> > similar in keeping up with new versions.
> >
> > You're definitely right that the temporal changes are important. I'll
> > focus on this in some follow-up analysis. I have a couple of questions
> > regarding this:
> >
> > 1. What is the "date" column in the csv file you shared, specifically?
>
> It's the date when relays were listed as running in a consensus.
>
> > 2. What's a good way to see the unattended updates data? I can look
> > into it if you could point me in a general direction.
>
> There's no explicit data about unattended updates. My idea was that, if
> a relay updates really soon after a new version comes out, it's likely
> using unattended updates. But we do not know which relays have
> unattended updates configured on their system.
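To make the heuristic quoted above concrete, here is a minimal Python sketch under stated assumptions. It flags a relay as "likely using unattended updates" if it was first seen running a version within a few days of that version's release. The function names, the 3-day threshold, and the dates in the example are all hypothetical; nothing in the data confirms which relays actually have unattended updates configured.

```python
from datetime import date


def days_to_update(released, first_seen_running):
    """Days between a version's release and a relay first running it."""
    return (first_seen_running - released).days


def likely_unattended(released, first_seen_running, max_days=3):
    """Guess whether an update was unattended.

    max_days is an arbitrary, assumed threshold; a quick update may just
    as well come from an attentive operator.
    """
    lag = days_to_update(released, first_seen_running)
    return 0 <= lag <= max_days


# Hypothetical dates, for illustration only.
released = date(2019, 1, 7)
print(likely_unattended(released, date(2019, 1, 8)))
print(likely_unattended(released, date(2019, 2, 20)))
```

The threshold would need tuning against the release calendar, since packages for a new version do not reach all distributions on the same day.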
>
> > 3. It seems many of the potential questions will require new data. I'm
> > happy to work on data generation/cleaning; but is there a good way to
> > share the datasets or figures? They may also be too large for the
> > mailing list.
>
> Figures should be fine on the mailing list. If you have larger datasets,
> can you upload them somewhere and link to them? Otherwise figures and
> descriptions on the mailing list will have to do for now.
>
> So, if you want to work with the original data, you should take a look
> at consensuses here:
>
> https://metrics.torproject.org/collector.html#type-network-status-consensus-3
>
> In particular, here are some lines contained in consensuses that might
> be relevant:
>
> valid-after 2019-03-11 14:00:00
>
> server-versions
> 0.2.9.15,0.2.9.16,0.2.9.17,0.3.4.10,0.3.4.11,0.3.5.7,0.3.5.8,0.4.0.1-alpha,0.4.0.2-alpha
>
> r seele AAoQ1DAR6kkoo19hBAX5K0QztNw qbFrFVLkIeCAYnciYZP5lRs4P1s
> 2019-03-11 11:17:29 67.174.243.193 9001 0
>
> s Running Stable V2Dir Valid
>
> v Tor 0.3.5.7
>
> Consensuses are specified here:
>
> https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
>
> Thanks!
>
> All the best,
> Karsten
>
>
> >
> > Hi Teor - thank you for chiming in! The release schedule page looks
> > awesome. I notice there is not any mention of dev versions - is that
> > information also available somewhere?
> >
> > Thanks!
> >
> > blackbird
> >
> >
> >
> >
> >
> > On Thu, Feb 28, 2019 at 6:43 PM teor <teor at riseup.net> wrote:
> >
> > Hi,
> >
> > On 28 Feb 2019, at 19:00, Karsten Loesing
> > <karsten at torproject.org> wrote:
> >>
> >> On 2019-02-22 23:05, Su Yu wrote:
> >>>
> >>> I did some quick plotting in Jupyter notebook (see below; the figures
> >>> are also attached separately). Regarding the relays vs. bridges
> >>> question that you mentioned, it seems the bridges are better at keeping
> >>> themselves not /too/ outdated, but they're actually not that different
> >>> in keeping up-to-date?
> >>
> >> Thanks for making these graphs. Though it's hard (for me) to interpret
> >> these results.
> >>
> >> One reason might be that these graphs are considering a time frame of
> >> over 1 decade. A lot of things have changed over that time frame:
> >>
> >> - The network has grown a lot over the years, which means that recent
> >> years have a greater weight in those graphs than distant years. This
> >> doesn't have to be a bad thing, it's just probably not intended and
> >> possibly surprising when interpreting the results.
> >>
> >> - Release cycles have changed, with a much shorter cycle in the last
> >> year or two as compared to earlier years. This may skew results
> >> even more.
> >>
> >> If I were to continue this analysis I'd try to look more at changes over
> >> time. Things I'd look at:
> >>
> >> - How does the shorter release cycle affect update behavior? It's
> >> probably useful to look at Tor's change log to get an idea when versions
> >> have been updated, when versions have been sunset, and which versions
> >> have long-term support.
> >
> > We have a summary page for past releases and our release schedule:
> >
> > https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases#Calendar:
> >
> > T
> >
>
>