BTRFS - how risky?

This article on Ars Technica about running btrfs kinda scares me, as I am running it on my Omnia, which I use as a NAS with a 2-disk RAID.

What is the local experience with btrfs?

BTRFS is also used by some major companies (along with distributions) in the Linux space. SUSE (and the community variant openSUSE) use and support (enterprise-grade support, that is) BTRFS on all their important products.

I personally have been using it not only on the Omnia but also on all my Linux machines without issues, and the snapshotting capabilities (along with copy-on-write) are extremely useful and have saved my skin more than once.

Honestly, for the Turris use case, as the filesystem of a single “disk”, btrfs is certainly suitable, even according to that article. While I like the author and he makes some valid points about btrfs, this article is IMHO written from the perspective of someone quite happy with ZFS.

(As an aside, I have been running btrfs as raid5 for a few years and even replaced a disk in the array, so things can work, but I agree that too little happens automatically, and it is quite stressful to read up on btrfs workflows for dealing with degraded arrays when one has such an array at hand ;))

I’ve been using it for all my data (main system etc.) for many years with no issues – note, though, that I never used raid 5 or 6.

Let’s not forget that Jim Salter, who wrote this piece, is really a ZFS advocate. He has written countless articles on ZFS, OpenZFS, tweaks, strategies etc. ZFS guys tend to look down on BTRFS; rightly or not I don’t know, but there is a trend :)
My own experience with BTRFS is very good. I use it on desktop machines. I like all the subvolume and snapshot features. Maybe the RAID capabilities are terrible, I can’t tell. But the BTRFS wiki is quite clear there…

Synology NASes that have enough memory to handle BTRFS have been using it heavily for a few years now, so if the best NAS supplier relies on it, I would not call it half-finished. Snapshots are an absolutely great feature: they let you not only roll back to a certain point in time but also pull just the files you need out of a particular snapshot. I would definitely go for BTRFS wherever it is possible and available.

I’ve had several problems with btrfs while on Turris OS 3.x. Several times, I’ve lost all my data because the btrfs meta-structures got damaged (if it’s just file corruption, it’s okay, but damage to the core btrfs structures renders the whole fs unusable). Since upgrading to TOS 5.x, I haven’t had any issues of this kind (but I’ve been on 5.x for less than a year).

I have been using BTRFS on Turris for like 3 years now, without issues.

I have also been using BTRFS on my home server (RPi + HDD via USB), and there I lost my BTRFS filesystem. I am not sure what caused the failure, but it was probably a faulty USB-SATA connector.

Since then I have switched my server to amd64, Proxmox and ZFS. Snapshots are a great feature of both BTRFS and ZFS, but they are more flexible on BTRFS. On ZFS I was disappointed when I realised that rollback to old snapshots destroys all snapshots between the snapshot you are rolling back to and now.

With BTRFS, on the other hand, you can freely roll back to any snapshot without destroying any other snapshot, which also means you can “branch” snapshots.
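
To illustrate the difference, here is a toy Python model. It is only a sketch of the behaviour described above, not how either filesystem is implemented; the class and method names are made up for the example.

```python
class LinearSnapshots:
    """Toy model of the ZFS-style behaviour described above:
    rolling back discards every snapshot newer than the target."""

    def __init__(self):
        self.snapshots = []  # ordered oldest to newest

    def snapshot(self, name):
        self.snapshots.append(name)

    def rollback(self, name):
        keep = self.snapshots.index(name) + 1
        self.snapshots = self.snapshots[:keep]  # newer snapshots are lost


class BranchingSnapshots:
    """Toy model of the BTRFS-style behaviour described above:
    snapshots are independent, so a rollback just starts a new branch."""

    def __init__(self):
        self.parent = {}     # snapshot name -> snapshot it descends from
        self.current = None  # snapshot the live state is based on

    def snapshot(self, name):
        self.parent[name] = self.current
        self.current = name

    def rollback(self, name):
        if name not in self.parent:
            raise KeyError(name)
        self.current = name  # nothing is deleted
```

With snapshots "mon", "tue" and "wed", rolling the linear model back to "mon" leaves only "mon". In the branching model all three survive, and a later snapshot "thu" simply records "mon" as its parent.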

I think the root cause of the problems the article describes is BTRFS’s lack of a Merkle tree. Like ZFS, BTRFS uses extensive checksums to ensure that the filesystem structures and data are correct. Unlike ZFS, BTRFS doesn’t have a mechanism to make sure that the data are current. That’s a danger when a drive doesn’t fail cleanly, but drops in and out.

On the other hand, the ZFS Merkle tree must be kept correct. If any hash bits get flipped and then written to disk, then the entire subtree becomes presumed corrupt. Really bad when it’s the root node. That’s one reason why backups are considered most important with ZFS, and also one reason why the especially paranoid insist on ECC RAM.

This is most relevant when dealing with multiple disks and RAID. On a single disk (or MMC flash memory, in the Turris), BTRFS seems safe enough.
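
A toy Python sketch of the idea, under loud assumptions: this is not how either filesystem actually lays out data on disk, and `digest`, `merkle_root` and `demo` are hypothetical names. It only shows why a per-block checksum cannot tell a stale block from a current one, while a hash tree root can.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """SHA-256 stands in for whatever checksum a real filesystem uses."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Hash every block, then hash pairs of hashes until one root remains."""
    level = [digest(b) for b in blocks]
    if not level:
        return digest(b"")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last hash
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_checksum_ok(data: bytes, stored_checksum: bytes) -> bool:
    """Per-block check: does the block match the checksum stored next to it?"""
    return digest(data) == stored_checksum


def demo():
    # The current contents, and the root the filesystem trusts for them.
    current = [b"alpha", b"beta v2", b"gamma"]
    trusted_root = merkle_root(current)

    # A drive that dropped out earlier still holds the old middle block
    # together with the checksum that was written alongside it.
    stale = b"beta v1"
    stale_checksum = digest(stale)

    # The stale block is internally consistent, so the per-block check passes...
    per_block_ok = block_checksum_ok(stale, stale_checksum)
    # ...but the root over the returned blocks no longer matches the trusted one.
    returned = [current[0], stale, current[2]]
    root_ok = merkle_root(returned) == trusted_root
    return per_block_ok, root_ok
```

Calling `demo()` gives `(True, False)`: the old block looks fine on its own, and only the root comparison reveals that it is out of date.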

“What is the local experience with btrfs?”

The Turris Omnia is the only device where I have btrfs.
On production servers I use XFS, GPFS, and sometimes RHEL/Rocky VDO as scratch space.

I fiddled quite a lot with btrfs, on hardware far from reliable, including raid 0 and 1. I once lost a volume and its data, years ago, for an unidentified reason. btrfs has improved a lot in the past years. Each time I have had corruption since then, it was due to poor hardware, cables or power loss, and each time I was able to recover the volume through scrub, balance or repair. In the worst case I was able to mount read-only and extract the data.

Also, it depends on your use case. RAID 1 is not about backup, RAID 1 is about availability! If I recall correctly, with btrfs RAID 1, if you lose drive(s) so that the data redundancy falls below 2, the filesystem remounts read only until the faulty drive is replaced. In your case, that means losing 1 of the 2 disks.

I was using btrfs in a RAID1 configuration for a few months on the Omnia and lost the array for no obvious reason. As someone who used ZFS in production 15 years ago, I was horrified by the current state of the btrfs toolset and immediately switched to mdadm/ext4 for data storage.

That said, the btrfs root drive (no RAID) has had no issues since the Indiegogo days of my Turris Omnia.

I once lost a btrfs filesystem on a single external HDD. I think it was still on Turris OS 3.x.

It was related to a power loss, after which I started a scrub, which was interrupted for some reason (possibly another power loss). Then the FS could not be mounted, and standard recovery utils failed. I think I tried reapplying last transactions, or rolling them back(?), and finding a previous root, without success.
Fortunately, I had a very recent backup (btrfs send | btrfs receive), which I restored.
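
For anyone who wants to set up something similar, a minimal Python sketch of such a backup follows. It only prints the commands (a dry run) instead of running them, since they need root and a real btrfs filesystem. The paths and the `backup_commands` helper are hypothetical; the only things taken as given are that `btrfs send` needs a read-only snapshot and that `btrfs receive` takes the destination directory.

```python
import shlex


def backup_commands(subvolume: str, snapshot: str, destination: str) -> list[str]:
    """Return the shell command lines for a full send/receive backup.

    All three paths are placeholders supplied by the caller.
    """
    # btrfs send only accepts read-only snapshots, hence -r.
    make_snapshot = (
        f"btrfs subvolume snapshot -r {shlex.quote(subvolume)} {shlex.quote(snapshot)}"
    )
    # Stream the snapshot into the destination filesystem.
    transfer = (
        f"btrfs send {shlex.quote(snapshot)} | btrfs receive {shlex.quote(destination)}"
    )
    return [make_snapshot, transfer]


if __name__ == "__main__":
    # Hypothetical example paths; adjust to your own layout.
    for line in backup_commands("/mnt/data", "/mnt/data/.snap-backup", "/mnt/backup"):
        print(line)
```

Restoring is the same pipe in the other direction, from the backup disk to a fresh filesystem.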

I don’t blame btrfs per se. The filesystem was quite old, and my first/only btrfs. I remember experimenting with some deduplication utilities, which might have broken something.

To reduce risk of unclean shutdowns, I would recommend using a UPS, especially if you have “picky” circuit breakers that trip when connecting a more-power-hungry-than-usual device like a crappy lab power supply.
And of course keep your backups up to date, like they should be ;)
