Heartbeat for cron, backups and daemons

A classic sysadmin pain: cron “sort of works”, but the newest backup in /var/backups is three weeks old, and you find that out exactly when you need to restore the database.

Instead of an “active” check with Prometheus dashboards and alerts, you can run a passive check with a Heartbeat: cron sends a short ping to Notifly after every successful run, and if the next ping does not arrive in time, Notifly sends a notification.

Step 1. Create a heartbeat in the admin panel

In app.notifly.ru, open Heartbeats → Create.

Settings for a typical hourly backup:

Field	Value
Name	`PostgreSQL backup cron`
Channel	`infra` (any channel you already send alerts through)
Interval (sec)	`3600` (once an hour)
Tolerance (sec)	`300` (5-minute grace period)
Alert text	`PostgreSQL backup did not run, check the server!`
Alert priority	`9` (loud pop-up notification)
Recovery text	`Backup is working again.`

Copy the Ping URL (it looks like https://your-notifly/heartbeat/ping/H...) from the heartbeats table.
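
To make the timing concrete, here is a minimal sketch of the arithmetic. It assumes Notifly treats a heartbeat as missed once the interval plus the tolerance has elapsed since the last ping. The function names are illustrative and not part of Notifly, and whether the real check uses a strict comparison is an assumption.

```shell
# Sketch of the deadline arithmetic (all values in seconds).
# Assumption: deadline = last ping + interval + tolerance.
heartbeat_deadline() {
    # $1 = last ping (epoch seconds), $2 = interval, $3 = tolerance
    echo $(( $1 + $2 + $3 ))
}

heartbeat_overdue() {
    # $1 = current time (epoch seconds); remaining args as for heartbeat_deadline
    now=$1
    shift
    [ "$now" -gt "$(heartbeat_deadline "$@")" ]
}
```

With the settings above, a ping at 12:00 puts the deadline 3600 + 300 = 3900 seconds later, at 13:05; a ping that arrives before then simply moves the deadline forward.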

Step 2. Trigger the ping from cron

/etc/cron.d/pg-backup:

NOTIFLY_PING="https://your-notifly/heartbeat/ping/H..."

# cron has no line continuation: the whole entry must stay on one line
0 * * * * postgres /usr/local/bin/pg-backup.sh && curl -fsS "$NOTIFLY_PING" -o /dev/null

The key idea: the ping is chained with &&, so it is sent only when the script succeeds. If the backup exits with a non-zero code, no ping is sent, and an hour and 5 minutes after the last successful ping the “PostgreSQL backup did not run” alert arrives.
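
If the one-liner grows, the same “ping only on success” logic can live in a small wrapper. The sketch below is one possible shape, not something Notifly ships: the function name and the timeout and retry values are assumptions chosen for illustration.

```shell
#!/bin/sh
# Run a job and ping the heartbeat only if the job exits with status 0.
# Usage: run_with_heartbeat PING_URL COMMAND [ARGS...]
run_with_heartbeat() {
    ping_url=$1
    shift
    "$@"
    job_status=$?
    if [ "$job_status" -eq 0 ]; then
        # Time-box and retry the ping; a failed ping is reported on stderr
        # but does not change the job's exit status.
        curl -fsS --max-time 10 --retry 3 -o /dev/null "$ping_url" \
            || echo "heartbeat ping failed: $ping_url" >&2
    fi
    return "$job_status"
}
```

Saved as a script (for example a hypothetical /usr/local/bin/run-with-heartbeat.sh that ends with the line run_with_heartbeat "$@"), the cron entry becomes: 0 * * * * postgres /usr/local/bin/run-with-heartbeat.sh "$NOTIFLY_PING" /usr/local/bin/pg-backup.sh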

Step 3. Same scheme for a systemd timer

If you have moved from cron to a systemd timer, add a drop-in instead of editing the original unit:

sudo systemctl edit pg-backup.service

[Service]
ExecStartPost=/usr/bin/curl -fsS https://your-notifly/heartbeat/ping/H... -o /dev/null

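With Type=oneshot, ExecStartPost runs only after the main ExecStart command has exited successfully. For other service types such as Type=simple, systemd considers ExecStart successful as soon as the process has started, so make sure the backup unit is a oneshot.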
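
systemctl edit opens an editor, which is awkward in provisioning scripts. As a rough non-interactive equivalent, the drop-in can be written straight into the unit's .d directory. The function name and the heartbeat.conf file name below are made up for this sketch; the target directory is passed in, so it can be tried against a scratch directory first.

```shell
# Write a drop-in that pings the heartbeat after a successful ExecStart.
# Usage: write_heartbeat_dropin SYSTEMD_DIR UNIT PING_URL
write_heartbeat_dropin() {
    dropin_dir="$1/$2.d"
    mkdir -p "$dropin_dir" || return 1
    # heartbeat.conf is an arbitrary name; systemd reads any *.conf in the .d directory.
    printf '[Service]\nExecStartPost=/usr/bin/curl -fsS %s -o /dev/null\n' "$3" \
        > "$dropin_dir/heartbeat.conf"
}
```

Run as root, this would look like write_heartbeat_dropin /etc/systemd/system pg-backup.service "$NOTIFLY_PING", followed by systemctl daemon-reload so that systemd picks up the new file.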

Step 4. Check

Simulate a “failed” backup with a throwaway runtime unit whose ExecStart exits with an error immediately:

sudo tee /run/systemd/system/fake-broken-backup.service >/dev/null <<EOF
[Service]
Type=oneshot
ExecStart=/bin/false
ExecStartPost=/usr/bin/curl -fsS https://your-notifly/heartbeat/ping/H... -o /dev/null
EOF
sudo systemctl daemon-reload

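Start it with sudo systemctl start fake-broken-backup.service: the ping does not go out, and once interval + tolerance seconds have passed since the last successful ping, a push notification arrives (provided no other job pings the same heartbeat in the meantime).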

Use cases

Backups of any databases and filesystems: invaluable when you have to restore in a hurry.
Certificates and Let’s Encrypt renewals via cron: a daily heartbeat with the alert “certbot didn’t run”.
Log rotation: a weekly heartbeat.
Imports/exports between systems: an hourly, daily, or weekly heartbeat.
IoT sensors: the device “calls home” every 5 minutes via curl, and a network outage triggers an alert as soon as the interval plus tolerance runs out.

Pausing during maintenance

If the server is going into planned maintenance and you want to avoid false alerts, pause the heartbeat:

In the admin panel: the ⏸ icon in the heartbeat row;
via API: POST /heartbeat/<id>/pause, then …/resume;
via MCP: ask the assistant “pause backup heartbeat for an hour”.

When maintenance is over, resume the heartbeat, and the countdown restarts from that moment.
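
For scripted maintenance, the pause and resume calls can wrap the maintenance command itself. This is only a sketch under several assumptions: that “…/resume” stands for /heartbeat/<id>/resume, that the base URL is your own instance, and that no authentication is needed (the article does not say; add whatever your instance requires). The function name is illustrative.

```shell
# Pause a heartbeat, run a maintenance command, and always resume afterwards.
# Assumptions: the resume endpoint mirrors the pause one
# (/heartbeat/<id>/resume) and the API needs no auth header.
# Usage: with_heartbeat_paused BASE_URL HEARTBEAT_ID COMMAND [ARGS...]
with_heartbeat_paused() {
    hb_base=$1
    hb_id=$2
    shift 2
    # Do not start maintenance if the pause request failed.
    curl -fsS -X POST -o /dev/null "$hb_base/heartbeat/$hb_id/pause" || return 1
    "$@"
    hb_status=$?
    # Resume even if maintenance failed, so the check is not left paused.
    curl -fsS -X POST -o /dev/null "$hb_base/heartbeat/$hb_id/resume" \
        || echo "warning: could not resume heartbeat $hb_id" >&2
    return "$hb_status"
}
```

Resuming regardless of the command's exit status is a deliberate choice: a heartbeat left paused after a failed maintenance run would hide exactly the kind of outage it is meant to catch.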

Why this is more reliable than “on-error” alerts

On-error alerts (“something broke”) stay silent if cron never ran, if the crontab disappeared, or if the server was shut down. A heartbeat check stays silent only when everything actually works: the server is up, cron fired, the network is reachable, and the script completed successfully. If any link in that chain breaks, you get a notification once the interval plus tolerance runs out.