Nine years have passed since I first had enough data to care about. If you’re like me, you’re always evolving your setup. When I look back at the last few years, the biggest shift for me is that I have moved away from paying for data services like Dropbox and Crashplan, and instead pay for local hardware and hardware services. The main driver for this is keeping my data more private, but I’ve found it also makes for a much better setup overall. Syncthing and Nextcloud have become my synchronization backbone, and the Btrfs filesystem the basis for my backups. I have accounts with DigitalOcean and Hetzner to make use of virtual private servers (VPS) and dedicated servers (DS) respectively. My costs have increased from $30 to $70 per month, but I think it’s a small price for the freedom obtained.
Storage Options Today Are Much Better
A few big things have changed in the world of data storage over the last decade:
- Hardware is One-Third Cheaper - As of this article you can buy a shiny new 16 TB drive for $500. That’s only $31 per TB! For non-cutting-edge disk sizes you can get in the mid-20s per terabyte. That’s about 1/3rd of the cost in 2011. In general, hardware is much cheaper, quieter, and more energy efficient, which adds a lot of possibilities for how you approach the problem of storing data.
- Modern File Systems Are Stable - File systems in general are complex and take a long time to mature. ZFS (developed since 2001) and Btrfs (developed since 2007) are no longer newcomers. Both offer modern file system features like copy on write, bit rot protection, and software data profiles that replace hardware RAID controllers.
- Modern File Systems Over Hardware RAID - I distinctly recall the frustration that came with figuring out my first hardware RAID setup. And even when I got it working, it felt like a fragile proprietary niche that I was not properly trained for. Almost all my knowledge came from various online forums instead of formal documents. Contrast that with solutions like Btrfs and ZFS, which have excellent documentation online and in the man pages. In addition, one can find seemingly endless examples of how to set things up. There are probably good reasons a company might want hardware RAID, but in a personal homelab, ZFS/Btrfs are hard to beat.
- Free, Open-Source Sync Software is Really Good - There was no Syncthing or Nextcloud in 2011. Options like these, coupled with cheap hardware and modern file systems, make self-hosted solutions quite easy to put together with a minimal understanding of servers.
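If you want to run the price-per-terabyte numbers for a drive you’re eyeing, the arithmetic from the first bullet is easy to script. This is just a minimal sketch; the function name `cost_per_tb` is my own invention, and the only figures used are the 16 TB and $500 quoted above.

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Return the price per terabyte in US dollars."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


# Figures quoted above: a 16 TB drive for $500.
per_tb = cost_per_tb(500, 16)
print(f"${per_tb:.2f} per TB")  # $31.25 per TB
```

Swap in any price and capacity to compare drives on the same footing.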
Changes I Have Made
This table highlights where I started in 2011 and what I’ve changed since then.
|Requirement||My 2011 Solution||My 2020 Solution|
|Local Storage||4 TB, hardware RAID||20 TB, Btrfs RAID|
|Local Backup||None||20 TB, Btrfs RAID|
|Remote Backup||CrashPlan ($10/mo)||20 TB, Btrfs RAID via Hetzner ($70/mo)|
|Speediness of Remote Backup||1 TB per month||1 TB per day|
|Centralized Sync||Dropbox, 2 people ($20/mo)||Nextcloud via DigitalOcean VPS ($5/mo)|
|Privacy Weak Points||Dropbox||None|
|Data Transmission||All encrypted||All encrypted|
|Data at Rest||Lacking local full disk encryption.||Local & Remote LUKS encryption.|
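To put the "Speediness of Remote Backup" row in perspective, here is a rough sketch of how long a full initial upload would take at each rate. It assumes the rates are sustained, that a month is 30 days, and that the full 20 TB has to be sent; those are simplifications for illustration, not measurements from my setup. The function name is hypothetical.

```python
def days_for_full_backup(data_tb: int, days_per_tb: int) -> int:
    """Days needed to upload data_tb terabytes at a fixed pace."""
    if data_tb < 0 or days_per_tb <= 0:
        raise ValueError("data_tb must be >= 0 and days_per_tb must be > 0")
    return data_tb * days_per_tb


DATA_TB = 20          # size of the 2020 data set in the table
DAYS_PER_TB_2011 = 30  # "1 TB per month", assuming a 30-day month
DAYS_PER_TB_2020 = 1   # "1 TB per day"

print(days_for_full_backup(DATA_TB, DAYS_PER_TB_2011))  # 600
print(days_for_full_backup(DATA_TB, DAYS_PER_TB_2020))  # 20
```

At the old pace, seeding a data set of this size would take well over a year and a half; at the new pace it takes under three weeks.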
Overall, I really like my current approach compared to 2011, even if it costs a little more than what I started with, but allow me to elaborate a bit more on some specific areas.
Ease of Use: Improved
Dropbox used to be easy enough that I’d look past the things that bothered me. In 2018 they announced they would drop support for most Linux file systems. This motivated me to finally consider alternatives. CrashPlan, though it had remained constant, was performing increasingly worse as my data set grew. Additionally, they tied themselves to the more enterprise-oriented OSes, which made it problematic to run on more cutting-edge Linux distributions like Fedora.
This is when I found Syncthing. It is now the shining star for me. It performs well, is super easy to set up, and runs on everything. Most importantly it’s decentralized. So if a node in my Syncthing network goes down, it doesn’t matter. This is hugely important as one of my biggest concerns with hosting my own solution was having a single point of failure. Syncthing completely removes that problem.
I also added Nextcloud into the mix. It’s essentially a 1:1 replacement for core Dropbox features. Of note, it has an iOS app that handles automatic photo uploads. Aside from photos, there are a few documents I like to have easily accessible and shared with my wife, and Nextcloud handles this well. When installing it, I decided to use the snap, which is almost no effort to install and maintain. I decided to give Nextcloud its own VPS. I could save the $5 and consolidate it on the Hetzner dedicated server, but I prefer to have it isolated.
Resiliency: Mostly Improved
Both Dropbox and Crashplan had multiple copies of my data, which was great on the remote end. Locally, I had only my RAID to offer some fault tolerance and no local snapshots or backups. When manipulating large data sets, it would take quite a while (sometimes days) before I had a true backup. If I did have a local failure I was always at risk of losing a day’s work. Plus, when you have a sync lag like that, you start to get into weird situations where the file you’re syncing is changing multiple times. This always seemed to make Dropbox slow down a lot.
My current local setup is much more resilient with two independent servers frequently snapshotting data and sitting on Btrfs RAID. This should cover the problems I’m most likely to have, e.g., I break something locally, or something locally fails. Additionally, I’ve removed the hardware RAID card, which is one less thing that might fail. My remote backup is less resilient with only 1 copy existing, so if my house burns down and multiple disks fail on my remote then I’ve lost everything. The big gain is that if I manipulate a large amount of local data, it only takes minutes before I have both local and remote backups.
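For anyone curious what "frequently snapshotting" can look like in practice, here is a minimal sketch, not my actual scripts. It only builds the `btrfs subvolume snapshot -r` command (a read-only snapshot) and decides which old snapshots could be pruned; it never runs anything. The paths `/data` and `/data/.snapshots`, the timestamp naming scheme, and the keep count are all hypothetical choices for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def snapshot_command(subvolume: str, snapshot_dir: str, when: datetime) -> list[str]:
    """Build (but do not run) a command for a read-only Btrfs snapshot."""
    # Timestamp-based names sort chronologically as plain strings.
    name = when.strftime("%Y-%m-%dT%H%M%S")
    dest = PurePosixPath(snapshot_dir) / name
    return ["btrfs", "subvolume", "snapshot", "-r", subvolume, str(dest)]


def snapshots_to_prune(names: list[str], keep: int) -> list[str]:
    """Return the oldest snapshot names beyond the newest `keep`."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    ordered = sorted(names)
    return ordered[:-keep] if keep else ordered


# Hypothetical paths; adjust to your own subvolume layout.
cmd = snapshot_command("/data", "/data/.snapshots", datetime(2020, 1, 15, 3, 0, 0))
print(" ".join(cmd))
# btrfs subvolume snapshot -r /data /data/.snapshots/2020-01-15T030000

existing = ["2020-01-13T030000", "2020-01-15T030000", "2020-01-14T030000"]
print(snapshots_to_prune(existing, keep=2))  # ['2020-01-13T030000']
```

A scheduler such as cron or a systemd timer could run the resulting command with root privileges; keeping the command building separate from execution makes the logic easy to check before it touches real data.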
Future consideration: If something like Backblaze B2 gets cheaper, I might consider it. Right now, it’s just a bit more expensive than the dedicated server for the amount of data I have. I’d gain more remote resiliency, but lose my dead simple setup. We’ll see.
Security & Privacy: Much Better
I really don’t like undocumented algorithms (or possibly employees/contractors) poring over my data. Dropbox is unwilling to let me turn this sort of thing off, which is really unfortunate. There are similar services like Tresorit, which do allow you to have this sort of privacy, but are much more expensive. With all my data seen only by me, I see privacy as a major improvement.
Security was pretty good as both Dropbox and CrashPlan take appropriate measures to ensure that data isn’t accidentally leaked or lost. Anytime you roll your own solutions you inherently take on some security risk, i.e., Dropbox’s security team is probably better at this than I am. So my security probably got a bit worse, but not in any measurable way. To mitigate this, I avoid installing any extraneous apps for Nextcloud or doing anything too complicated, in hopes that I keep the attack surface small.
Third party services do what they do, so there’s no real added utility beyond their offered service.
One of the nice side effects of having remote servers is that you can use them for other things too. For a tinkerer like me, it’s just a far more fun setup.
Recurring Cost: Doubled
$30/mo for 2x2TB Dropbox and Unlimited CrashPlan backup vs. $70/mo for 25 GB Nextcloud on a DigitalOcean VPS and 20 TB Backup on a Hetzner SX62 DS. Obviously, I’m spending more. Is it worth the additional $40? I think so, but clearly that’s a subjective assessment.
If you started worrying about data more than a decade ago, I’d highly recommend browsing what’s currently available, as there are some really great options to choose from. Then again, if it hasn’t crossed your mind to look, maybe you already found the perfect solution.