My services have been available via Tor .onion for around 7 years now, but my monitoring of their availability has always been relatively limited. I did previously have smokeping running reachability tests, but beyond that I've relied on noticing for myself that things weren't right (or on receiving reports that an .onion was misbehaving).
Part of the reason for this is that there's never (to my knowledge) been a good centralised way to monitor the health of a Tor install. Nyx is a fantastic command line tool, but relies on the operator logging into their box: it's akin to relying on top to monitor CPU usage.
I've always figured that it should be possible to monitor the tor daemon more effectively, but never quite got around to doing anything about it.
This week, I decided to take a pop at it, and a quick scan over Tor's control port spec revealed how easy it should be to collect stats.
This documentation details how to use my new Tor Daemon Plugin for Telegraf to collect metrics from a Tor daemon.
The full list of statistics collected can be seen in the plugin's README, but they include:
- bytes_rx: total bytes received by Tor
- bytes_tx: total bytes transmitted by Tor
- uptime: Tor daemon uptime
- version_status: Tor's assessment of whether the installed version is OK to use
- Accounting information: is a quota set? If so, how much is left?
- Reachability test statuses
- Guard node states
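To give a feel for why the control port makes this sort of collection straightforward, here is a minimal sketch of fetching a few of these values with a GETINFO request. It is not the plugin's actual code: the function names are hypothetical, it assumes ControlPort 9051 is enabled with either no authentication or password authentication (cookie authentication is not handled), and the mapping of GETINFO keys such as traffic/read to the plugin's field names such as bytes_rx is my assumption.

```python
import socket


def parse_getinfo_reply(lines):
    """Turn single-line GETINFO reply lines ("250-key=value") into a dict.

    Multi-line ("250+key=") replies are not handled in this sketch.
    """
    stats = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("250-") and "=" in line:
            key, _, value = line[4:].partition("=")
            stats[key] = value
        elif line.startswith("250 "):
            # "250 OK" marks the end of a successful reply
            break
        elif line[:1] in ("4", "5"):
            # 4xx/5xx status codes indicate that the command failed
            raise RuntimeError(f"Tor returned an error: {line}")
    return stats


def query_tor(keys, host="127.0.0.1", port=9051, password=""):
    """Authenticate to the control port and fetch the given GETINFO keys.

    Hypothetical helper: host, port and password are assumptions about
    the local torrc, not values taken from the plugin.
    """
    # Escape backslashes and quotes so the password is a valid quoted string
    quoted = password.replace("\\", "\\\\").replace('"', '\\"')
    with socket.create_connection((host, port), timeout=5) as sock:
        with sock.makefile("r", encoding="ascii", newline="\r\n") as reader:
            sock.sendall(f'AUTHENTICATE "{quoted}"\r\n'.encode("ascii"))
            reply = reader.readline()
            if not reply.startswith("250"):
                raise RuntimeError(f"Authentication failed: {reply.strip()}")

            sock.sendall(("GETINFO " + " ".join(keys) + "\r\n").encode("ascii"))
            lines = []
            for line in reader:
                lines.append(line)
                # A space after the status code marks the final reply line
                if line[3:4] == " ":
                    break
            sock.sendall(b"QUIT\r\n")
    return parse_getinfo_reply(lines)
```

Calling `query_tor(["traffic/read", "traffic/written", "uptime", "status/version/current"])` against a local daemon should return a dict keyed by those names, with every value as a string. Keeping the parsing separate from the socket handling means it can be exercised without a running Tor daemon.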
Although my main focus is on monitoring the availability of my onion services, the plugin can be used to monitor tor relays, bridges and exit nodes too.
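Whichever kind of node is being monitored, the collected values need to reach Telegraf in a format it understands. The sketch below assumes the stats are emitted as InfluxDB line protocol (for example via Telegraf's exec input); the measurement name `tor`, the `role` tag and the helper name are illustrative rather than taken from the plugin, and the escaping is deliberately simplified.

```python
def to_line_protocol(measurement, tags, fields):
    """Format one metric as a line of InfluxDB line protocol (no timestamp).

    Simplified sketch: escapes commas, equals signs and spaces in tag and
    field keys and in tag values, and quotes string field values.
    """
    def esc(value):
        return (
            str(value)
            .replace(",", "\\,")
            .replace("=", "\\=")
            .replace(" ", "\\ ")
        )

    # Tags are sorted by key, as the line protocol docs recommend
    tag_str = "".join(f",{esc(k)}={esc(v)}" for k, v in sorted(tags.items()))

    parts = []
    for key, value in fields.items():
        if isinstance(value, bool):
            # bool must be checked before int, since bool is an int subclass
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = f"{value}i"  # trailing "i" marks an integer field
        elif isinstance(value, float):
            rendered = repr(value)
        else:
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{text}"'
        parts.append(f"{esc(key)}={rendered}")

    return f"{measurement}{tag_str} {','.join(parts)}"
```

For instance, `to_line_protocol("tor", {"role": "onion"}, {"bytes_rx": 1024, "version_status": "recommended"})` yields `tor,role=onion bytes_rx=1024i,version_status="recommended"`. Omitting the timestamp leaves Telegraf to assign the collection time itself.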