Prometheus Alertmanager and Grafana (especially Grafana!) seem a bit too involved for monitoring my homelab. Prometheus itself is fine: it collects a lot of statistics I don’t care about, but it doesn’t require configuration, so it doesn’t bother me.
Do you know of simpler alternatives?
My goals are relatively simple:
- get a notification when any systemd service fails
- get a notification if there is not much space left on a disk
- get a notification if one of the above can’t be determined (e.g. server down, config error, …)
Seeing graphs with basic system metrics (e.g. CPU/RAM usage) would be nice, but it’s not super-important.
I am a dev so writing a script that checks for whatever I need is way simpler than learning/writing/testing yaml configuration (in fact, I was about to write a script to send heartbeats to something like Uptime Kuma or Tianji before I thought of asking you for a nicer solution).
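For reference, the heartbeat idea could look roughly like the sketch below. It also covers the third goal: if the script stops reporting (server down, broken script), the monitor notices the missing heartbeat and alerts. The URL and token are placeholders, and the `status` and `msg` query parameters are an assumption based on how Uptime Kuma's push monitors are commonly used, so check what your monitor actually expects.

```python
"""Heartbeat sketch: report the result of local checks to a push-style monitor."""
import urllib.parse
import urllib.request

# Assumption: placeholder push URL; replace host and token with your own.
PUSH_URL = "https://kuma.example.org/api/push/YOUR_TOKEN"


def heartbeat_url(base_url, ok, msg):
    """Build the heartbeat URL with a status and a short message as query parameters."""
    query = urllib.parse.urlencode({"status": "up" if ok else "down", "msg": msg})
    return f"{base_url}?{query}"


def send_heartbeat(base_url, ok, msg, timeout=10):
    """Send one heartbeat; network errors raise, so cron can log the failure."""
    with urllib.request.urlopen(heartbeat_url(base_url, ok, msg), timeout=timeout) as resp:
        return resp.status
```

Called from cron every few minutes, e.g. `send_heartbeat(PUSH_URL, True, "all checks passed")` after the checks have run, with `ok=False` and a short message when one of them fails.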
I’m currently using the InfluxDB + Telegraf + Grafana combination to monitor Linux systems and k3s pods. It’s basically the same as Prometheus, but InfluxDB uses a push model, which makes it easier to develop tools for collecting custom time series data.
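To illustrate why the push model is convenient for custom data, here is a minimal sketch that formats one point in InfluxDB line protocol and POSTs it. It assumes InfluxDB 2.x and its `/api/v2/write` endpoint with token auth; the helper names, host, org, and bucket are hypothetical. The formatter only handles float fields and does not escape spaces or commas in names.

```python
"""Push sketch: format a point as InfluxDB line protocol and write it over HTTP."""
import time
import urllib.parse
import urllib.request


def to_line(measurement, tags, fields, ts=None):
    """Return 'measurement,tag=v field=v timestamp' (float fields only, no escaping)."""
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={float(v)}" for k, v in fields.items())
    ts = int(time.time()) if ts is None else ts
    return f"{measurement}{tag_part} {field_part} {ts}"


def push(base_url, org, bucket, token, line, timeout=10):
    """POST one line to the write endpoint (assumes InfluxDB 2.x, second precision)."""
    query = urllib.parse.urlencode({"org": org, "bucket": bucket, "precision": "s"})
    req = urllib.request.Request(
        f"{base_url}/api/v2/write?{query}",
        data=line.encode(),
        headers={"Authorization": f"Token {token}"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Any script that can produce a line like this can feed the database, which is the point of the push model: no exporter or scrape config is needed for a one-off metric.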
For alerts and dashboards, I think Grafana is the simplest and most hassle-free solution available at the moment.
Is there a self hosted OpenTelemetry consumer?
Edit: found better resources
https://linuxhandbook.com/syslog-guide/
https://github.com/linuxserver/docker-syslog-ng
That should be a good place to start. Syslog will do what you want.
I mean, you get a lot of advantages from fluffy pretty systems. But extracting data from df and systemctl and curling it into Telegram is going to be like a 10-line bash script called from a one-line cron job.
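Since OP said they'd rather write code than YAML, here is a rough Python version of that script (Python rather than bash only to keep it easy to test). It is a sketch under assumptions: the function names, the 10% threshold, and the checked paths are made up for illustration, and the bot token and chat id are placeholders you would get from Telegram.

```python
"""Check sketch: failed systemd units and low disk space, reported via Telegram."""
import shutil
import subprocess
import urllib.parse
import urllib.request


def parse_failed(output):
    """Extract unit names (first column) from 'systemctl list-units' plain output."""
    return [line.split()[0] for line in output.splitlines() if line.strip()]


def failed_units():
    """List failed units; raises if systemctl itself fails, so that gets noticed too."""
    out = subprocess.run(
        ["systemctl", "list-units", "--state=failed", "--no-legend", "--plain"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_failed(out)


def low_disks(paths, min_free_percent=10.0):
    """Return descriptions of mount points with less free space than the threshold."""
    low = []
    for path in paths:
        usage = shutil.disk_usage(path)
        free_percent = usage.free / usage.total * 100
        if free_percent < min_free_percent:
            low.append(f"{path} ({free_percent:.0f}% free)")
    return low


def build_report(failed, low):
    """Combine both checks into one message; an empty string means all is well."""
    lines = [f"failed unit: {u}" for u in failed] + [f"low disk: {d}" for d in low]
    return "\n".join(lines)


def send_telegram(token, chat_id, text, timeout=10):
    """Send a message through the Telegram Bot API sendMessage method."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    with urllib.request.urlopen(url, data=data, timeout=timeout) as resp:
        return resp.status


def main(token, chat_id, paths=("/",)):
    """Run both checks and notify only when something is wrong."""
    report = build_report(failed_units(), low_disks(paths))
    if report:
        send_telegram(token, chat_id, report)
```

Run `main(...)` from a cron entry. Note that this only tells you when a check fails; it stays silent if the machine or the script itself dies, so it pairs well with a heartbeat to an external monitor.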
I pump a lot of complicated metrics through Prometheus / Grafana to get graphs and history.
Most of my critical stuff is still in Nagios, and instead of using the standardized Nagios plugins I just query the operating system directly in bash.