[tor-bugs] #33406 [Internal Services/Tor Sysadmin Team]: automate reboots
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Feb 20 23:31:25 UTC 2020
#33406: automate reboots
-------------------------------------------------+-------------------------
Reporter: anarcat | Owner: tpa
Type: project | Status: new
Priority: Low | Milestone:
Component: Internal Services/Tor Sysadmin Team | Version:
Severity: Major | Keywords: tpa-roadmap-march
Actual Points: | Parent ID:
Points: | Reviewer:
Sponsor: |
-------------------------------------------------+-------------------------
In #31957 we worked on automating upgrades, but that's only part of
the problem: we also need to reboot in some situations.
We have various mechanisms to do so right now:
* `tsa-misc/reboot-host` - reboot script for kvm boxes, kind of a mess,
to be removed when we finish the kvm-ganeti migration
* `tsa-misc/reboot-guest` - reboot a single host. kind of a hack, but
useful to reboot a single machine
* `misc/multi-tool/torproject-reboot-simple` - iterate over all hosts
with `rebootPolicy=justdoit` in LDAP and reboot them with
`torproject-reboot-many`
* `misc/multi-tool/torproject-reboot-rotation` - iterate over all hosts
with `rebootPolicy=rotation` in LDAP and reboot them with
`torproject-reboot-many`, with a 30 minute delay between each host
* `ganeti-reboot-cluster` - a tool to reboot the ganeti cluster
There are various problems with all this:
* the `torproject-reboot-*` scripts do not take care of
`rebootPolicy=manual` hosts
* the `ganeti-reboot-cluster` script has been known to fail if a cluster
is unbalanced
* the `ganeti-reboot-cluster` script currently fails when hosts talk to
each other over IPv6 somehow
* we have 5 different ways of performing reboots, we should have just one
script that does it all
* reboot-{host,guest} do not check if hosts need reboot before rebooting
(but the multi-tool does)
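As a rough sketch of what "one script that does it all" could look like
(this is not existing code): the planner below takes hosts with their LDAP
`rebootPolicy` and a needs-reboot flag, skips hosts that do not need a
reboot, schedules `justdoit` hosts right away, staggers `rotation` hosts
30 minutes apart, and lists `manual` hosts for an operator instead of
ignoring them. The `Host` class, the `plan_reboots` function and the host
names are hypothetical; only the policy names and the 30 minute delay come
from the tools listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Delay between two "rotation" hosts, matching the current tooling.
ROTATION_DELAY = timedelta(minutes=30)


@dataclass
class Host:
    name: str
    reboot_policy: str  # "justdoit", "rotation" or "manual"
    needs_reboot: bool


def plan_reboots(hosts, start):
    """Return (schedule, manual).

    schedule is a list of (time, host name) tuples sorted by time;
    manual is the list of host names left for an operator to handle.
    """
    schedule = []
    manual = []
    next_rotation = start
    for host in hosts:
        if not host.needs_reboot:
            continue  # never reboot a host that does not need it
        if host.reboot_policy == "justdoit":
            schedule.append((start, host.name))
        elif host.reboot_policy == "rotation":
            schedule.append((next_rotation, host.name))
            next_rotation += ROTATION_DELAY
        elif host.reboot_policy == "manual":
            manual.append(host.name)
        else:
            raise ValueError(f"unknown rebootPolicy: {host.reboot_policy}")
    # Python's sort is stable, so hosts scheduled for the same time
    # keep their input order.
    schedule.sort(key=lambda entry: entry[0])
    return schedule, manual


if __name__ == "__main__":
    hosts = [
        Host("web-01", "justdoit", True),
        Host("db-01", "rotation", True),
        Host("db-02", "rotation", True),
        Host("mail-01", "manual", True),
    ]
    schedule, manual = plan_reboots(hosts, datetime(2020, 2, 20, 23, 0))
    for when, name in schedule:
        print(when.isoformat(), name)
    print("manual:", ", ".join(manual))
```

Keeping the planning separate from the actual reboot command would let the
same plan drive whichever backend does the reboot.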
In short, this is kind of a mess, and we should refactor it. We should
consider using needrestart, which knows when an individual host needs a
reboot.
I also added a [https://github.com/xneelo/hetzner-needrestart/issues/23
feature request to the needrestart puppet module] to expose its knowledge
as a Puppet fact, so we can use that information from PuppetDB instead of
SSH'ing into each host and calling the dsa-* tools.
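If the module grew such a fact, picking the hosts to reboot could become a
single PuppetDB query. A minimal sketch, assuming a hypothetical boolean
fact named `needrestart_reboot_required` (no such fact exists yet; that is
what the feature request asks for) and a PuppetDB reachable at a
placeholder URL. The `/pdb/query/v4/facts/<name>` endpoint and the
`certname` and `value` fields are part of the PuppetDB v4 query API; the
function names are made up for illustration.

```python
import json
import urllib.request

# Hypothetical fact name: the needrestart module does not expose it today.
FACT_NAME = "needrestart_reboot_required"
# Placeholder PuppetDB address, to be adjusted for a real deployment.
PUPPETDB_URL = "http://localhost:8080"


def hosts_needing_reboot(facts):
    """Return the sorted certnames whose fact value is true.

    `facts` is the decoded JSON list returned by the PuppetDB facts
    endpoint, one dict per host with "certname" and "value" keys.
    """
    return sorted(f["certname"] for f in facts if f["value"] is True)


def fetch_facts(base_url=PUPPETDB_URL, fact=FACT_NAME):
    """Fetch every host's value for one fact from PuppetDB."""
    url = f"{base_url}/pdb/query/v4/facts/{fact}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


if __name__ == "__main__":
    for certname in hosts_needing_reboot(fetch_facts()):
        print(certname)
```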
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33406>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online