Skip to content

restic-monitor: Never Miss a Failed Backup

I run restic to back up my servers. One day I noticed a repository hadn’t received a new snapshot in weeks. The backup job had been silently failing, and I had no idea.

That’s the worst kind of failure: everything appears fine until you actually need the backup.

/stv0g/restic-monitor is a small Go tool that checks whether each of your restic repositories has a recent enough snapshot and reports the result to an Uptime Kuma push monitor. Run it on a timer and you’ll know within hours if a backup stops working.

FreeCAD + Sylvac = InstrumentInput
FreeCAD + Sylvac = InstrumentInput.

Backup tools are generally good at telling you when a backup starts failing. What they don’t do is tell you when the backup job itself stops running — cron exits silently, a service gets disabled after an update, a mount point disappears or the machine is simply not running. The backup appears to exist; it just hasn’t been updated in days.

Uptime Kuma’s push monitors flip this around: your service sends a heartbeat on each successful run, and if it stops arriving, the monitor alerts you. restic-monitor sends that heartbeat.

The tool reads a YAML config file listing your repositories and how fresh their snapshots should be.

config.yaml
defaults:
url: "s3:https://s3.example.com/restic-{{ .Name }}"
max_age: 1d
push_url: https://uptime-kuma.example.com/api/push/
extra_env:
AWS_ACCESS_KEY_ID: "your-key-id"
AWS_SECRET_ACCESS_KEY: "your-secret"
repositories:
web-server:
push_token: yDaQM5YUuS3rczSvLsOucs4LqHSRDhu6
db-server:
push_token: ap2F52Qjmup9dxyPcWa1eNt125AukcrV
max_age: 6h

The defaults section reduces repetition — including Go template syntax in url, where {{ .Name }} expands to the repository key. Each repository gets its own push token and max_age.

restic-monitor only needs to read your repositories — it never writes, prunes, or modifies anything. For S3-backed repos the monitoring credentials can be a dedicated IAM user with a minimal policy:

s3-policy.json
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:ListBucket"],
"Resource": ["arn:aws:s3:::your-bucket", "arn:aws:s3:::your-bucket/*"]
}

The default mode calls restic snapshots, which needs the repository password to decrypt the index. Setting RESTIC_MONITOR_FAST=true drops that requirement for S3-backed repositories: instead of invoking restic, it lists objects under snapshots/ directly via the S3 API and uses the object’s last-modified time as a proxy. Only s3:GetObject and s3:ListBucket are needed — no password, and a compromised monitoring host cannot decrypt your backup data. For non-S3 backends it falls back to the normal restic path automatically.

Architecture: VPS backs up to S3; Monitoring Host (containing restic-monitor and Uptime Kuma) reads snapshots from S3 and restic-monitor pushes heartbeat to Uptime Kuma.
restic-monitor and Uptime Kuma run on a monitoring host independent of the backup source and storage..

Because restic-monitor only needs read access, it can run on a completely separate machine — not the server being backed up, and not the machine hosting the backup storage. If the backup machine also runs the health check, a full system failure silences both at once. A third machine makes it a genuine independent audit.

In practice I run restic-monitor on my home server, checking repositories on an object storage provider for data originating on my VPS — three independent failure domains, one status dashboard.

restic-monitor can tell you that snapshots exist and are recent. It cannot tell you whether they are actually recoverable — corrupted or missing data, missing pack files, or a version mismatch all produce valid-looking snapshots that fail at restore time.

Use restic check for integrity verification and periodically do a real restore. Neither is automated here; schedule them separately and actually run them.

The flake ships a ready-made NixOS module that creates a dedicated system user, a oneshot service, and a timer.

flake.nix
{
inputs = {
nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
restic-monitor.url = "git+https://codeberg.org/stv0g/restic-monitor";
};
outputs = { nixpkgs, restic-monitor, ... }: {
nixosConfigurations.my-host = nixpkgs.lib.nixosSystem {
system = "x86_64-linux";
modules = [
restic-monitor.nixosModules.default
{
nixpkgs.overlays = [ restic-monitor.overlays.default ];
services.restic-monitor = {
enable = true;
interval = "*-*-* 06,18:00:00";
# loaded via systemd LoadCredential into a private tmpfs
environmentFile = "/run/secrets/restic-monitor-env";
settings = {
defaults = {
url = "s3:https://s3.example.com/restic-{{ .Name }}";
push_url = "https://uptime-kuma.example.com/api/push/";
};
repositories = {
web-server.push_token = "yDaQM5YUuS3rczSvLsOucs4LqHSRDhu6";
db-server = { push_token = "ap2F52Qjmup9dxyPcWa1eNt125AukcrV"; max_age = "6h"; };
};
};
};
}
];
};
};
}

The environmentFile is loaded via LoadCredential so the credentials file on disk only needs to be readable by root — the secrets are exposed to the service through a private tmpfs, not a plain environment variable visible in systemctl show.

If you find this useful, consider sponsoring on Liberapay.