[tor-project] minutes from the sysadmin meeting
Antoine Beaupré
anarcat at torproject.org
Tue Nov 8 15:43:51 UTC 2022
Here's your monthly dose of sysadmin news!
# Roll call: who's there and emergencies
anarcat, gaba, kez, lavamind
# Dashboard review
We did our normal per-user check-in:
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=anarcat
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=kez
* https://gitlab.torproject.org/groups/tpo/-/boards?scope=all&utf8=%E2%9C%93&assignee_username=lavamind
... and briefly reviewed the general dashboards:
* https://gitlab.torproject.org/tpo/tpa/team/-/boards/117
* https://gitlab.torproject.org/groups/tpo/web/-/boards
* https://gitlab.torproject.org/groups/tpo/tpa/-/boards
We need to rethink the web board triage, as mentioned in the last
point of this meeting.
# TPA-RFC-42: 2023 roadmap
Gaba brought up a few items we need to plan for and schedule:
* donate page rewrite (kez)
* sponsor9:
  * self-host discourse (Q1-Q2 < june 2023)
  * RT and cdr.link evaluation (Q1-Q2, gus): "improve our frontdesk
    tool by exploring the possibility of migrating to a better tool
    that can manage messaging apps with our users"
* download page changes (kez? currently blocked on nico)
* weblate transition (CI changes pending, lavamind following up)
* developer portal (dev.torproject.org), in Hugo, from ura.design
([tpo/web/dev#6][])
Those are tasks that TPA will either need to do themselves or assist
other people with. Gaba also went through the work planned for 2023 in
general to see what would affect TPA.
We then discussed anarcat's roadmap proposal ([TPA-RFC-42][]):
* do the bookworm upgrades, this includes:
  * puppet server 7
  * puppet agent 7
  * plan would be:
    * Q1-Q2: deploy new machines with bookworm
    * Q1-Q4: upgrade existing machines to bookworm
* email services migration (i.e. execute TPA-RFC-31, still need to
  decide the scope, proposal coming up)
* possibly retire schleuder (i.e. execute TPA-RFC-41, currently
  waiting for feedback from the community council)
* complete the cymru migration (i.e. execute TPA-RFC-40)
* retire gitolite/gitweb (i.e. execute TPA-RFC-36)
* retire SVN (i.e. execute TPA-RFC-11)
* monitoring system overhaul (TPA-RFC-33)
* deploy a Puppet CI
  * e.g. make the Puppet repo public, possibly by removing private content
    and just creating a "graft" to have a new repository without old
    history (as opposed to rewriting the entire history, because then
    we can't be sure no confidential material remains in the old history)
  * there are disagreements on whether or not we should make the
    repository public in the first place, as it's not exactly "state
    of the art" puppet code, which could be embarrassing
  * there's also a concern that we don't need CI as long as we don't
    have actual tests to run (it is kind of pointless to have CI
    without tests to run...), but for now we already have the
    objective of running linting checks on push ([tpo/tpa/team#31226][])
* plan for summer vacations
[tpo/web/dev#6]: https://gitlab.torproject.org/tpo/web/dev/-/issues/6
[TPA-RFC-42]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40924
[tpo/tpa/team#31226]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/31226
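
The "graft" idea from the Puppet CI item above could look roughly like
the following. This is only a minimal sketch under stated assumptions,
not what TPA actually plans to run: the function names, the example
paths and the `private_prefixes` list are all hypothetical, and it
assumes `git` is installed with a commit identity configured. It copies
the currently tracked files, minus private ones, into a brand new
repository with a single commit, so no old history is carried over.

```python
import shutil
import subprocess
from pathlib import Path


def public_paths(tracked, private_prefixes):
    """Return the tracked paths that are not under any private prefix."""
    def is_private(path):
        return any(
            path == prefix or path.startswith(prefix + "/")
            for prefix in private_prefixes
        )
    return [path for path in tracked if not is_private(path)]


def export_without_history(src_repo, dest_repo, private_prefixes):
    """Copy public files from src_repo into a fresh repository at dest_repo.

    Hypothetical helper: the new repository gets one initial commit and
    none of the history of src_repo.
    """
    src, dest = Path(src_repo), Path(dest_repo)
    # List files tracked in the source repository (NUL-separated output).
    listing = subprocess.run(
        ["git", "-C", str(src), "ls-files", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout
    tracked = [path for path in listing.split("\0") if path]
    # Copy only the files that are not marked private.
    for rel in public_paths(tracked, private_prefixes):
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / rel, target)
    # Create the new repository and record a single commit.
    subprocess.run(["git", "init", str(dest)], check=True)
    subprocess.run(["git", "-C", str(dest), "add", "-A"], check=True)
    subprocess.run(
        ["git", "-C", str(dest), "commit", "-m", "Initial public import"],
        check=True,
    )


# Example with made-up paths: only the secrets directory is filtered out.
example = public_paths(
    ["manifests/site.pp", "hiera/secrets/db.yaml", "modules/ssh/init.pp"],
    ["hiera/secrets"],
)
print(example)
```

Filtering on what is tracked today sidesteps the question of what may be
hiding in old commits, which is the argument made above for a graft over
a full history rewrite.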
# Web team organisation
Postponed to next meeting. anarcat will join Gaba's next triage
session with gus to see how that goes.
# Metrics of the month
* hosts in Puppet: 95, LDAP: 95, Prometheus exporters: 163
* number of Apache servers monitored: 29, hits per second: 715
* number of self-hosted nameservers: 6, mail servers: 10
* pending upgrades: 0, reboots: 4
* average load: 0.64, memory available: 4.61 TiB/5.74 TiB, running
processes: 736
* disk free/total: 32.50 TiB/92.28 TiB
* bytes sent: 363.66 MB/s, received: 215.11 MB/s
* planned bullseye upgrades completion date: 2022-11-01
* [GitLab tickets][]: 175 tickets including...
  * open: 0
  * icebox: 144
  * backlog: 17
  * next: 4
  * doing: 7
  * needs review: 1
  * needs information: 2
  * (closed: 2934)
[GitLab tickets]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
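
As a quick sanity check on the figures above, the short sketch below
turns the memory and disk numbers into percentages and confirms that the
per-state ticket counts add up to the reported total. The values are
copied from the list; the variable and function names are only
illustrative.

```python
# Figures copied from the metrics list above (memory and disk in TiB).
memory_available_tib, memory_total_tib = 4.61, 5.74
disk_free_tib, disk_total_tib = 32.50, 92.28

ticket_total = 175
tickets_by_state = {
    "open": 0,
    "icebox": 144,
    "backlog": 17,
    "next": 4,
    "doing": 7,
    "needs review": 1,
    "needs information": 2,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


memory_available_pct = percent(memory_available_tib, memory_total_tib)
disk_free_pct = percent(disk_free_tib, disk_total_tib)
breakdown_matches_total = sum(tickets_by_state.values()) == ticket_total

print(f"memory available: {memory_available_pct}%")   # 80.3%
print(f"disk free: {disk_free_pct}%")                 # 35.2%
print(f"ticket breakdown matches total: {breakdown_matches_total}")  # True
```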
Upgrade prediction graph lives at:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/bullseye/
It is now also available as the main Grafana dashboard. Head to
<https://grafana.torproject.org/>, change the time period to 30 days,
and wait a while for results to render.
# Number of the month: 12
Progress on bullseye upgrades mostly flat-lined at 12 machines since
August. We actually have three *fewer* bullseye servers now, down to 83
from 86.
--
Antoine Beaupré
torproject.org system administration