[metrics-bugs] #33941 [Internal Services/Tor Sysadmin Team]: Nagios checks for op-??.onionperf.torproject.net
Tor Bug Tracker & Wiki
blackhole at torproject.org
Tue Apr 28 09:51:21 UTC 2020
#33941: Nagios checks for op-??.onionperf.torproject.net
-------------------------------------------------+---------------------
Reporter: karsten | Owner: tpa
Type: task | Status: new
Priority: Medium | Milestone:
Component: Internal Services/Tor Sysadmin Team | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------------------+---------------------
Comment (by karsten):
Okay. This is less about my personal preference for Nagios or against
Prometheus. It's more about getting something very simple deployed for
monitoring our OnionPerf instances in the next few days. If you prefer
Prometheus for doing that, I'm fine with that. I hope that I don't have to
learn much about Prometheus but can treat it as a black box that runs an
application-specific check script and sends me an alert if something's
broken. To be honest, that's also how I treat Nagios. Ultimately, this
should be your decision. I'm just bringing in the soft requirement to have
three running checks for `op-{nl,us,hk}2.onionperf.torproject.net` by the
end of the month. If that's an impossible requirement I'll have to make
new plans about keeping an AWS instance alive that I'd prefer to
terminate.
In the meantime I worked a bit on the log-file-downloading idea and came
up with a slightly optimized plan: each OnionPerf instance could update a
status file once per minute that it makes available via its web server,
and Nagios or Prometheus could process that file and alert if something's
off. That file could simply contain the latest ISO-8601 timestamp when
OnionPerf found itself to be fully operational, like
`2020-04-28T09:31:19Z`. Nagios or Prometheus would then send out an alert
if that file cannot be downloaded or the contained timestamp cannot be
parsed or is older than one hour. How does this sound?
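
A minimal sketch of what such a check could look like, assuming a
Nagios-style plugin that reports its result through its exit code (0 for
OK, 2 for CRITICAL). The function names, the 30-second timeout, and the
status file URL passed on the command line are all assumptions made for
illustration; nothing here is an existing OnionPerf or Nagios interface.

```python
"""Sketch of a freshness check for an OnionPerf status file (assumed layout)."""
import urllib.request
from datetime import datetime, timedelta, timezone

OK, CRITICAL = 0, 2            # conventional Nagios plugin exit codes
MAX_AGE = timedelta(hours=1)   # alert threshold proposed in the comment


def evaluate_status(body, now):
    """Return (exit_code, message) for the contents of the status file.

    `body` is expected to hold a single UTC timestamp such as
    2020-04-28T09:31:19Z; anything else counts as unparseable.
    """
    try:
        stamp = datetime.strptime(body.strip(), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return CRITICAL, "CRITICAL: timestamp cannot be parsed"
    stamp = stamp.replace(tzinfo=timezone.utc)
    age = now - stamp
    if age > MAX_AGE:
        return CRITICAL, f"CRITICAL: last operational at {stamp.isoformat()}"
    return OK, f"OK: last operational at {stamp.isoformat()}"


def check(url, timeout=30):
    """Download the status file and evaluate it against the current time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # A valid file is only 20 bytes long; read a small bounded amount.
            body = resp.read(64).decode("ascii", errors="replace")
    except (OSError, ValueError) as exc:
        return CRITICAL, f"CRITICAL: cannot download {url}: {exc}"
    return evaluate_status(body, datetime.now(timezone.utc))


def main(argv):
    """Print the check result and return the plugin exit code."""
    if len(argv) != 2:
        print("usage: check_onionperf_status URL")
        return CRITICAL
    code, message = check(argv[1])
    print(message)
    return code
```

A wrapper script would end with `sys.exit(main(sys.argv))` and be invoked
once per instance in `op-{nl,us,hk}2.onionperf.torproject.net`, with
whatever path the status file ends up being served under. Splitting out
`evaluate_status` keeps the parsing and age logic testable without any
network access.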
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33941#comment:4>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online