Monitoring I2PD with Telegraf
I recently made my site available via I2P as an eepsite, using i2pd as my I2P client.
However, establishing connectivity was only really the first part of that - I needed to be able to monitor i2pd so that I could see when it goes down (or has other issues) rather than waiting for people to complain that the eepsite isn't reachable.
I do the vast majority of my system monitoring using Telegraf. A quick search didn't yield any Telegraf plugins for i2pd, so I created my own Telegraf plugin for i2pd.
The plugin's currently very rough around the edges, but does what I currently need. Because i2pd only exposes its stats as HTML, it relies on scraping the webconsole.
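As a rough illustration of what scraping involves (this is not the plugin's actual code, and the markup in SAMPLE_HTML is an assumed sample rather than a verified copy of what i2pd serves), a minimal Python sketch might pull label/number pairs out of the page like this. The names SAMPLE_HTML, LABEL_VALUE and scrape_stats are hypothetical.

```python
import re

# Assumed sample markup, for illustration only - the real webconsole
# output may be structured differently and can change between releases.
SAMPLE_HTML = (
    "<b>Tunnel creation success rate:</b> 85%<br>\n"
    "<b>Routers:</b> 2312 <b>Floodfills:</b> 1021 <b>LeaseSets:</b> 4<br>\n"
)

# Matches a bold label ending in a colon, followed by an integer.
LABEL_VALUE = re.compile(r"<b>([^<:]+):</b>\s*([0-9]+)")


def scrape_stats(html):
    """Return a dict mapping normalised labels to integer values."""
    stats = {}
    for label, value in LABEL_VALUE.findall(html):
        # "Tunnel creation success rate" -> "tunnel_creation_success_rate"
        key = label.strip().lower().replace(" ", "_")
        stats[key] = int(value)
    return stats


stats = scrape_stats(SAMPLE_HTML)
```

Matching on the visible labels is what makes this kind of collection fragile: if a label is renamed or the markup is restructured, the pattern silently stops matching.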
This documentation details how to set Telegraf up to report stats from i2pd into InfluxDB, so that graphs can be generated in Chronograf.
Installing and Configuring the plugin
I've got i2pd configured to expose its web interface on port 7070 (the default), so I just needed to put a copy of my plugin onto the server (into /usr/local/bin):
git clone git@github.com:bentasker/telegraf-plugins.git
cd telegraf-plugins
cp i2pd-statistics/i2pd-statistics.py /usr/local/bin
I then configured Telegraf to run the plugin by adding the following TOML to /etc/telegraf/telegraf.conf:
[[inputs.exec]]
commands = ["/usr/local/bin/i2pd-statistics.py"]
data_format = "influx"
Then, it was just a case of restarting Telegraf:
systemctl restart telegraf
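For context on the data_format = "influx" setting: the exec input runs the command and parses whatever it prints to stdout as InfluxDB line protocol. The following is a minimal sketch of building such a line in Python, not the plugin's own code. The helper name to_line_protocol is hypothetical, the tag and field names are borrowed from the queries later in this post, and it assumes keys and values contain no spaces or commas (which would otherwise need escaping).

```python
def to_line_protocol(measurement, tags, fields):
    """Build one line: measurement,tag=value field=value (no timestamp)."""
    # Tags are appended to the measurement name, each prefixed by a comma.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))

    field_items = []
    for key, value in sorted(fields.items()):
        if isinstance(value, int):
            # Integer fields carry a trailing "i" in line protocol.
            field_items.append(f"{key}={value}i")
        else:
            field_items.append(f"{key}={float(value)}")

    return f"{measurement}{tag_part} {','.join(field_items)}"


line = to_line_protocol(
    "i2pd",
    {"direction": "inbound", "tunnel_state": "established"},
    {"tunnel_count": 12},
)
print(line)
```

Because the line carries no timestamp, Telegraf assigns the collection time to the metric itself.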
Graphing
Once the data started coming in, I created a dashboard to expose useful statistics.
The Flux query behind the Creation Success rate gauge is pretty straightforward:
from(bucket: "telegraf/autogen")
|> range(start: v.timeRangeStart)
|> filter(fn: (r) => r._measurement == "i2pd")
|> filter(fn: (r) => r.host == v.host)
|> filter(fn: (r) => r._field == "tunnel_creation_success_rate")
|> last()
As is the network throughput query:
from(bucket: "telegraf/autogen")
|> range(start: v.timeRangeStart)
|> filter(fn: (r) => r._measurement == "i2pd")
|> filter(fn: (r) => r.host == v.host)
|> filter(fn: (r) => r._field == "in_avg_bps" or r._field == "out_avg_bps")
|> keep(columns: ["_value","_time","host","_field"])
We can output a similar graph to show the transit throughput, using the field transit_avg_bps (i2pd reports only the outbound transit rate).
We can also show a count over time for Routers, FloodFills and LeaseSets. We don't necessarily need to know what each of these is (although if you're interested, this explains it) to monitor them - what we're looking for is sudden changes.
from(bucket: "telegraf/autogen")
|> range(start: v.timeRangeStart)
|> filter(fn: (r) => r._measurement == "i2pd")
|> filter(fn: (r) => r.host == v.host)
|> filter(fn: (r) => r._field == "routers" or r._field == "floodfills" or r._field == "leasesets")
|> keep(columns: ["_time","_field","_value","host"])
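To make "sudden changes" a little more concrete, here is a small Python sketch that flags points where a count moves by more than a given fraction of the previous value. It isn't part of the plugin or the dashboard: the function name sudden_changes and the 50% threshold are arbitrary assumptions, and the sample counts are made up.

```python
def sudden_changes(values, threshold=0.5):
    """Return indices where a value differs from the previous one
    by more than `threshold` (as a fraction of the previous value)."""
    flagged = []
    for i in range(1, len(values)):
        prev, cur = values[i - 1], values[i]
        if prev == 0:
            # Any move away from zero counts as a change.
            if cur != 0:
                flagged.append(i)
            continue
        if abs(cur - prev) / abs(prev) > threshold:
            flagged.append(i)
    return flagged


# Made-up router counts: steady, then a sharp drop at index 3.
router_counts = [2300, 2310, 2295, 900, 950]
flagged = sudden_changes(router_counts)
```

On a graph the same drop shows up as a cliff in the line, which is usually enough to prompt a closer look at i2pd.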
We can also get information about the state of tunnels over time (state being one of building, established, exploring, expiring or failed).
from(bucket: "telegraf/autogen")
|> range(start: v.timeRangeStart)
|> filter(fn: (r) => r._measurement == "i2pd")
|> filter(fn: (r) => r.host == v.host)
|> filter(fn: (r) => r._field == "tunnel_count" )
|> filter(fn: (r) => r.direction == "inbound" )
|> keep(columns: ["_time","_value","host","tunnel_state"])
The same graph for outbound tunnels is simply a case of changing the filter on r.direction to outbound.
Conclusion
It's not perfect - the plugin is very rough and ready (I am intending to tidy it) and also has to rely on screen-scraping i2pd's webconsole, so a future release of i2pd might well break collection.
But it does allow me, via Telegraf, to keep an eye on the state of i2pd and to spot if the eepsite goes down.