zigford.org

Sharing linux/windows scripts and tips

Scrubbing my data - BTRFS

April 23, 2020 — Jesse Harris

It's no secret I use BTRFS, so I have a fair amount of data stored on this filesystem. With most popular filesystems, you have no way of knowing whether the data you read back is the same as what was originally written. A few modern filesystems support a function known as scrubbing.


Scrubbing is:

btrfs scrub is used to scrub a btrfs filesystem, which will read all data and metadata blocks from all devices and verify checksums. Automatically repair corrupted blocks if there’s a correct copy available.
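
To make the idea concrete, here is a toy model of a scrub in Python. It is only a sketch of the principle, not how btrfs is implemented: the `checksum` and `scrub` helpers are made up for illustration, blocks are plain byte strings, and `zlib.crc32` stands in for the filesystem's checksum (btrfs defaults to crc32c, a different polynomial).

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in for the per-block checksum a filesystem stores at write time.
    return zlib.crc32(block)


def scrub(copies, checksums):
    """Verify every block of every copy; repair from a good copy if one exists.

    copies    -- list of mirrors, each a list of byte blocks (modified in place)
    checksums -- expected checksum for each block index
    Returns (repaired, unrecoverable) counts of bad block copies.
    """
    repaired = unrecoverable = 0
    for i, expected in enumerate(checksums):
        bad = [c for c in copies if checksum(c[i]) != expected]
        if not bad:
            continue
        # Look for any copy of this block that still matches its checksum.
        good = next((c[i] for c in copies if checksum(c[i]) == expected), None)
        if good is None:
            unrecoverable += len(bad)
            continue
        for c in bad:
            c[i] = good
            repaired += 1
    return repaired, unrecoverable


blocks = [b"family", b"photos", b"videos"]
sums = [checksum(b) for b in blocks]
mirror_a, mirror_b = list(blocks), list(blocks)
mirror_a[1] = b"phot0s"  # silent corruption on one device
print(scrub([mirror_a, mirror_b], sums))  # (1, 0)
```

With a single copy the same corruption would be detected but counted as unrecoverable, which is why a mirror is needed for repair.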

On the most common filesystems [1], data is written and read but never validated. Between writing and reading, the data can sometimes change. This could be due to a media fault, a firmware bug, or user error. (dd to the wrong volume, anyone?)

Before this technology was available, it was always a worry that my most precious stored family photos and videos could at any time become silently corrupted without my knowing!

Now, thanks to BTRFS (or ZFS if you're more into the BSDs), you never have to suffer not knowing about data corruption, and as a bonus you can have it repaired too (if you have mirrored volumes, so that a good copy exists).

My BTRFS Volumes

I currently have the following volumes in BTRFS:

  • 32 GB root SD card of the Raspberry Pi 4 hosting this site
  • 7 TB external spinning disk attached to the RPI4
  • 256 GB SSD boot volume on my main desktop PC
  • 2 TB SSD home volume on my main desktop PC
  • 3 TB + 2 TB external spinning disks connected via USB3 to my main desktop PC
  • 500 GB NVMe LUKS-encrypted volume on the Precision 5510
  • 1 TB external LUKS-encrypted 2.5" USB2 backup disk, sometimes attached to the Precision 5510

Backing them up

All of these systems have their root/home volumes backed up at least once via BTRFS send. The desktop PC also takes daily snapshots with restic to a cloud storage system.

RPI:

Daily snapshots are sent to its locally attached 7 TB volume. Occasionally I btrfs send the 7 TB volume to the 3+2 TB spanned external disks on the PC. I'm not too fussed if I lose the 7 TB volume. It is mainly backup and some unimportant media.

PC:

I wrote a little bash script that takes btrfs snapshots and sends incremental backups: hourly for the last 24 hours, daily for the last month, and monthly until the disk reaches a set percentage full. These go to the 2+3 TB external disks. Restic performs the cloud backups.
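
The retention scheme described above (hourly, then daily, then monthly) can be sketched in Python. This is not the bash script from the post, which isn't shown here; `snapshots_to_keep` and `percent_full` are hypothetical names, "the last month" is approximated as 30 days, and keeping the newest snapshot in each bucket is an assumption.

```python
import shutil
from datetime import datetime, timedelta


def snapshots_to_keep(timestamps, now):
    """Pick which snapshot timestamps survive a thinning pass.

    One per hour for the last 24 hours, one per day for the last
    30 days (assumed meaning of "month"), one per calendar month beyond.
    """
    kept = {}
    # Newest first, so the newest snapshot in each bucket is the one retained.
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        if age < timedelta(hours=24):
            bucket = ("hour", ts.year, ts.month, ts.day, ts.hour)
        elif age < timedelta(days=30):
            bucket = ("day", ts.year, ts.month, ts.day)
        else:
            bucket = ("month", ts.year, ts.month)
        kept.setdefault(bucket, ts)
    return sorted(kept.values())


def percent_full(path):
    # How full the filesystem holding `path` is, as a percentage.
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


now = datetime(2020, 4, 23, 12, 0)
hourly = [now - timedelta(hours=h) for h in range(72)]
print(len(hourly), "snapshots,", len(snapshots_to_keep(hourly, now)), "kept")
```

A pruning loop would then delete the oldest monthly snapshots while `percent_full` on the backup disk stays above whatever threshold is chosen.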

Precision:

The same bash script runs on a systemd timer and sends backups to my external disk when it's connected.
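
The snapshot-and-send step that these backups rely on might look roughly like the following when driven from Python. It is a sketch rather than the script used here: the function names are invented, and the paths in the final line are placeholders. The btrfs subcommands themselves (`subvolume snapshot -r`, `send -p`, `receive`) are the standard ones.

```python
import subprocess


def snapshot_command(subvolume, snapshot_path):
    # -r makes the snapshot read-only, which btrfs send requires.
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot_path]


def send_command(snapshot_path, parent=None):
    cmd = ["btrfs", "send"]
    if parent is not None:
        # With a parent snapshot, only the differences are sent.
        cmd += ["-p", parent]
    return cmd + [snapshot_path]


def receive_command(target_dir):
    return ["btrfs", "receive", target_dir]


def send_snapshot(snapshot_path, target_dir, parent=None):
    """Pipe `btrfs send` into `btrfs receive` (needs root and btrfs on both ends)."""
    sender = subprocess.Popen(send_command(snapshot_path, parent),
                              stdout=subprocess.PIPE)
    receiver = subprocess.Popen(receive_command(target_dir),
                                stdin=sender.stdout)
    sender.stdout.close()  # lets the sender see a broken pipe if receive exits
    if receiver.wait() != 0 or sender.wait() != 0:
        raise RuntimeError("btrfs send/receive failed")


# Placeholder paths, printed only to show the resulting command line.
print(" ".join(send_command("/snapshots/home-new", parent="/snapshots/home-old")))
```

The parent snapshot has to exist on both the source and the target for the incremental send to work, so the previous snapshot should not be deleted until the next one has been received.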

Scrubbing it all

Not long ago I was running my home volumes on mirrored spinning disks. In the modern era these were beginning to feel slow and one drive started failing. BTRFS scrub would fix errors for me then:

screenshot

You can see that a few errors were corrected because another copy of the block was available. The whole 800+ GB of data took 1 hour 44 minutes to scrub. Compare that to today:
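
For a sense of scale, the figures above work out to roughly 128 MB/s. The quick calculation below assumes exactly 800 GB in decimal units, so it is a lower bound given the "800+"; the helper name is made up.

```python
def throughput_mb_per_s(gigabytes, hours, minutes):
    # Decimal units: 1 GB = 1000 MB.
    seconds = hours * 3600 + minutes * 60
    return gigabytes * 1000 / seconds


rate = throughput_mb_per_s(800, 1, 44)
print(f"{rate:.0f} MB/s")  # 128 MB/s
```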

screenshot

In this screenshot, the bottom-left terminal is a scrub of my new 2 TB SSD. Bottom right is the 256 GB boot volume. Top left is a snapshot of a backup of the 7 TB volume on the 2+3 TB spanned disks, and top right is the 7 TB volume itself, connected to the RPI4.
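
For anyone wanting to run scrubs like these from a script, a minimal Python wrapper could look like this. The helper names and the `/mnt/backup` mount point are placeholders, not taken from the systems above. The flags are from btrfs-progs: `-B` keeps the scrub in the foreground, `-d` prints per-device statistics, and `-r` scrubs read-only without repairing.

```python
import subprocess


def scrub_command(mountpoint, per_device=False, read_only=False):
    cmd = ["btrfs", "scrub", "start", "-B"]  # -B: wait for the scrub to finish
    if per_device:
        cmd.append("-d")  # separate statistics for each device
    if read_only:
        cmd.append("-r")  # detect errors only, do not repair
    return cmd + [mountpoint]


def run_scrub(mountpoint, **options):
    """Run a foreground scrub and return its report (needs root)."""
    result = subprocess.run(scrub_command(mountpoint, **options),
                            capture_output=True, text=True)
    return result.returncode, result.stdout


# Placeholder mount point, printed only to show the resulting command line.
print(" ".join(scrub_command("/mnt/backup", per_device=True)))
```

Calling `run_scrub` on a timer and alerting on a non-zero return code would be one way to check all of these volumes regularly.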

Tags: btrfs, backup


  1. Ext2, Ext3, Ext4, XFS, NTFS, APFS, HFS+