[tor-project] minutes from the sysadmin meeting

Antoine Beaupré anarcat at torproject.org
Tue Nov 8 15:43:51 UTC 2022


Here's your monthly dose of sysadmin news!

# Roll call: who's there and emergencies

anarcat, gaba, kez, lavamind

# Dashboard review

We did our normal per-user check-in:

 * https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=anarcat
 * https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=kez
 * https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=lavamind

... and briefly reviewed the general dashboards:

 * https://gitlab.torproject.org/tpo/tpa/team/-/boards/117
 * https://gitlab.torproject.org/groups/tpo/web/-/boards
 * https://gitlab.torproject.org/groups/tpo/tpa/-/boards

We need to rethink the web board triage, as mentioned in the last
point of this meeting.

# TPA-RFC-42: 2023 roadmap

Gaba brought up a few items we need to plan for, and schedule:

 * donate page rewrite (kez)
 * sponsor9:
   * self-host Discourse (Q1-Q2, before June 2023)
   * RT and cdr.link evaluation (Q1-Q2, gus): "improve our frontdesk
     tool by exploring the possibility of migrating to a better tool
     that can manage messaging apps with our users"
   * download page changes (kez? currently blocked on nico)
 * weblate transition (CI changes pending, lavamind following up)
 * developer portal (dev.torproject.org), in Hugo, from ura.design
   ([tpo/web/dev#6][])

Those are tasks that either TPA will need to do themselves or assist
other people in. Gaba also went through the work planned for 2023 in
general to see what would affect TPA.

We then discussed anarcat's roadmap proposal ([TPA-RFC-42][]):

 * do the bookworm upgrades, this includes:
   * puppet server 7
   * puppet agent 7
   * plan would be:
     * Q1-Q2: deploy new machines with bookworm
     * Q1-Q4: upgrade existing machines to bookworm
 * email services migration (i.e. execute TPA-RFC-31, still need to
   decide the scope, proposal coming up)
 * possibly retire schleuder (i.e. execute TPA-RFC-41, currently
   waiting for feedback from the community council)
 * complete the cymru migration (i.e. execute TPA-RFC-40)
 * retire gitolite/gitweb (i.e. execute TPA-RFC-36)
 * retire SVN (i.e. execute TPA-RFC-11)
 * monitoring system overhaul (TPA-RFC-33)
 * deploy a Puppet CI
   * e.g. make the Puppet repo public, possibly by removing private content
     and just creating a "graft" to have a new repository without old
     history (as opposed to rewriting the entire history, because then
     we don't know if we have confidential stuff in the old history)
   * there are disagreements on whether or not we should make the
     repository public in the first place, as it's not exactly "state
     of the art" puppet code, which could be embarrassing
   * there's also a concern that CI is of little use as long as we
     don't have actual tests to run, but for now we already have the
     objective of running linting checks on push ([tpo/tpa/team#31226][])
 * plan for summer vacations

[tpo/web/dev#6]: https://gitlab.torproject.org/tpo/web/dev/-/issues/6
[TPA-RFC-42]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40924
[tpo/tpa/team#31226]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/31226
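
The "graft" idea from the Puppet CI item could look roughly like the
sketch below. It is not an agreed TPA procedure, only an illustration
under assumptions: the file names (`site.pp`, `secret.txt`), the
`main` branch name and the helper functions are all hypothetical. The
private content is removed in a normal commit first, then a single
parentless commit is created from the cleaned tree with
`git commit-tree` and pushed to a fresh repository, so none of the old
history travels along.

```python
import subprocess
import tempfile
from pathlib import Path

# Identity and signing are pinned so the sketch does not depend on local config.
GIT = ["git", "-c", "user.name=Example", "-c", "user.email=example@example.invalid",
       "-c", "commit.gpgsign=false"]


def git(repo: Path, *args: str) -> str:
    """Run git inside `repo` and return its trimmed standard output."""
    done = subprocess.run([*GIT, *args], cwd=repo, check=True,
                          capture_output=True, text=True)
    return done.stdout.strip()


def publish_without_history(private: Path, public: Path) -> str:
    """Push one parentless commit carrying the tree of the private HEAD."""
    # commit-tree without any -p option creates a commit that has no parents.
    root = git(private, "commit-tree", "HEAD^{tree}", "-m", "Initial public import")
    subprocess.run([*GIT, "init", "--quiet", "--bare", str(public)], check=True)
    # Only objects reachable from `root` are transferred to the new repository.
    git(private, "push", "--quiet", str(public), f"{root}:refs/heads/main")
    return root


def demo() -> tuple[int, list[str]]:
    """Build a throwaway private repo, publish it, and inspect the result."""
    with tempfile.TemporaryDirectory() as tmp:
        private, public = Path(tmp, "private"), Path(tmp, "public.git")
        private.mkdir()
        git(private, "init", "--quiet")
        (private / "site.pp").write_text("node default {}\n")
        (private / "secret.txt").write_text("hunter2\n")  # hypothetical secret
        git(private, "add", "--all")
        git(private, "commit", "--quiet", "-m", "initial, with a secret")
        # Remove the private content in a regular commit before publishing.
        git(private, "rm", "--quiet", "secret.txt")
        git(private, "commit", "--quiet", "-m", "remove private content")
        publish_without_history(private, public)
        count = int(git(public, "rev-list", "--count", "refs/heads/main"))
        files = git(public, "ls-tree", "-r", "--name-only", "refs/heads/main").split()
        return count, files


if __name__ == "__main__":
    print(demo())
```

The design choice here mirrors the reasoning in the minutes: starting
from the current tree avoids having to audit every old commit for
confidential material, at the cost of losing the public history.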

# Web team organisation

Postponed to next meeting. anarcat will join Gaba's next triage
session with gus to see how that goes.

# Metrics of the month

 * hosts in Puppet: 95, LDAP: 95, Prometheus exporters: 163
 * number of Apache servers monitored: 29, hits per second: 715
 * number of self-hosted nameservers: 6, mail servers: 10
 * pending upgrades: 0, reboots: 4
 * average load: 0.64, memory available: 4.61 TiB/5.74 TiB, running
   processes: 736
 * disk free/total: 32.50 TiB/92.28 TiB
 * bytes sent: 363.66 MB/s, received: 215.11 MB/s
 * planned bullseye upgrades completion date: 2022-11-01
 * [GitLab tickets][]: 175 tickets including...
   * open: 0
   * icebox: 144
   * backlog: 17
   * next: 4
   * doing: 7
   * needs review: 1
   * needs information: 2
   * (closed: 2934)

 [GitLab tickets]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
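
As a quick sanity check on the figures above, here is a small sketch
that recomputes the ticket total and a few ratios. The numbers are
copied from the list; the variable and function names are made up for
illustration and are not part of any TPA tooling.

```python
# Ticket counts per column, copied from the metrics list (closed excluded).
TICKETS = {
    "open": 0,
    "icebox": 144,
    "backlog": 17,
    "next": 4,
    "doing": 7,
    "needs review": 1,
    "needs information": 2,
}
REPORTED_TOTAL = 175


def percent(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


ticket_total = sum(TICKETS.values())
icebox_share = percent(TICKETS["icebox"], ticket_total)
memory_available = percent(4.61, 5.74)   # TiB available / TiB total
disk_free = percent(32.50, 92.28)        # TiB free / TiB total

if __name__ == "__main__":
    print(f"tickets: {ticket_total} (reported: {REPORTED_TOTAL})")
    print(f"icebox share: {icebox_share}%")
    print(f"memory available: {memory_available}%")
    print(f"disk free: {disk_free}%")
```

With these inputs the per-column counts add up to the reported 175,
the icebox holds about 82.3% of tickets, about 80.3% of memory is
available and about 35.2% of disk space is free.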

Upgrade prediction graph lives at:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/bullseye/

The graph is now also available as the main Grafana dashboard. Head to
<https://grafana.torproject.org/>, change the time period to 30 days,
and wait a while for results to render.

# Number of the month: 12

Progress on bullseye upgrades mostly flat-lined at 12 machines since
August. We actually have three *fewer* bullseye servers now, down to 83
from 86.
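
The arithmetic behind those numbers can be sketched as follows. This
assumes that the "12 machines" are the hosts not yet running bullseye
and that the total is the 95 hosts in Puppet from the metrics above;
neither assumption is spelled out in the minutes.

```python
HOSTS_TOTAL = 95       # "hosts in Puppet", from the metrics of the month
BULLSEYE_NOW = 83
BULLSEYE_BEFORE = 86


def upgrade_summary(total: int, now: int, before: int) -> dict:
    """Summarise upgrade progress from host counts."""
    return {
        "remaining": total - now,             # hosts still to upgrade
        "change": now - before,               # negative means fewer upgraded hosts
        "percent_done": round(100 * now / total, 1),
    }


summary = upgrade_summary(HOSTS_TOTAL, BULLSEYE_NOW, BULLSEYE_BEFORE)

if __name__ == "__main__":
    print(summary)
```

Under those assumptions, 12 hosts remain, the bullseye count changed
by -3, and about 87.4% of hosts are upgraded.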

-- 
Antoine Beaupré
torproject.org system administration

