[tor-bugs] #33972 [Internal Services/Tor Sysadmin Team]: Add Nagios check for CollecTor
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Apr 23 15:48:12 UTC 2020
#33972: Add Nagios check for CollecTor
-------------------------------------------------+-------------------------
Reporter: karsten | Owner: tpa
Type: task | Status: needs_review
Priority: Medium | Milestone:
Component: Internal Services/Tor Sysadmin Team | Version:
Severity: Normal | Resolution:
Keywords: | Actual Points:
Parent ID: | Points:
Reviewer: | Sponsor:
-------------------------------------------------+-------------------------
Comment (by anarcat):
Replying to [ticket:33972 karsten]:
> We currently have a metrics-specific Nagios host that we want to shut
> down soon. One of its checks is to see whether CollecTor's files are
> becoming unavailable or stale. This check is not easily transferable to
> Tor's Nagios host, because it depends on a code base that is not being
> maintained anymore and that we do not want to deploy on Tor's Nagios host.
> That's why I rewrote this check in a simple Python script to be deployed
> on Tor's Nagios instance.
>
> Questions:
>
> - anarcat and/or weasel: do you have any concerns about deploying this
> check in Tor's Nagios host alongside the
> [https://gitweb.torproject.org/admin/tor-nagios.git/tree/tor-nagios-checks/checks/tor-check-onionoo Onionoo check]?
I reviewed the code quickly, and it looks reasonable. Assuming performance
is acceptable, this should be fine.
> - irl: do you spot any checks in this Python script that are way off,
> or other checks that are missing?
>
> - atagar, other Python people: do you mind reviewing the Python code
> for general code improvements? The goal is to have a single,
> self-contained, easy-to-read Python script that produces just the data
> we need for Nagios to send out alerts.
I would add "runs fast" to that list of goals. The way Nagios schedules
checks makes it suffer when a single check takes too long. Think "open a
TCP port" rather than "make a full HTTP request that downloads a 3 MB
file" or "... renders a complex report". :) We have some leeway of course,
but if the check can be optimized, that's a definite plus.
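To illustrate the "runs fast" point, here is a minimal sketch of a staleness check that sends a HEAD request with a short timeout instead of downloading the file. It is not the script under review. The function names and the warning and critical thresholds are hypothetical, and it assumes the server sends a Last-Modified header, which has not been verified for CollecTor. The exit codes 0 to 3 follow the usual Nagios plugin convention.

```python
import email.utils
import time
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_age(age_seconds, warn_seconds, crit_seconds):
    """Map a file age in seconds to a Nagios exit code."""
    if age_seconds >= crit_seconds:
        return CRITICAL
    if age_seconds >= warn_seconds:
        return WARNING
    return OK


def last_modified_age(url, timeout=5.0):
    """Return seconds since the URL's Last-Modified time.

    Uses a HEAD request so that no response body is downloaded.
    Assumes the server sends a Last-Modified header.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        header = response.headers.get("Last-Modified")
    if header is None:
        raise ValueError("server sent no Last-Modified header")
    modified = email.utils.parsedate_to_datetime(header).timestamp()
    return time.time() - modified


def check(url, warn_seconds=3 * 3600, crit_seconds=6 * 3600):
    """Return (exit_code, status_line); thresholds are hypothetical."""
    try:
        age = last_modified_age(url)
    except Exception as error:  # timeouts, network errors, missing header
        return UNKNOWN, f"UNKNOWN - {error}"
    code = classify_age(age, warn_seconds, crit_seconds)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[code]
    return code, f"{label} - last modified {age:.0f}s ago"
```

A wrapper script would call check() with the URL to monitor, print the status line, and exit with the returned code. The timeout caps how long a slow server can hold up the Nagios scheduler.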
I would also mention that there's a "nagiosplugin" Python module that
could be used instead of rolling our own plugin logic.
https://pypi.org/project/nagiosplugin/
It might be overkill for this simple plugin, but could be useful if you
want to actually send metrics like age and so on and have them processed
on the other side (which we don't currently do, mind you).
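For context on what "sending metrics" means here: Nagios plugins can append performance data after a "|" on the status line, in the form label=value[UOM];warn;crit. The sketch below formats such a line with the standard library only. It does not use the nagiosplugin module, and the helper name and the example metric are hypothetical.

```python
def format_nagios_line(status, message, perfdata):
    """Build 'STATUS - message | label=value[UOM];warn;crit ...'.

    perfdata maps a metric label to a (value, unit, warn, crit) tuple.
    """
    metrics = " ".join(
        f"{label}={value}{unit};{warn};{crit}"
        for label, (value, unit, warn, crit) in perfdata.items()
    )
    return f"{status} - {message} | {metrics}"


# Hypothetical metric: age of an index file in seconds, with thresholds.
line = format_nagios_line("OK", "index is fresh", {"age": (120, "s", 10800, 21600)})
print(line)
```

Everything after the "|" is ignored for alerting and only matters if something on the receiving side collects it.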
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33972#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online