From 36abeb13c5a35a37a8714ac117c77f4b21c97a40 Mon Sep 17 00:00:00 2001
From: Kaan Genc
Date: Thu, 10 Mar 2022 00:41:44 -0500
Subject: [PATCH] add "my local data storage setup"

---
 build.boot      |   2 +-
 content/raid.md | 188 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 189 insertions(+), 1 deletion(-)
 create mode 100644 content/raid.md

diff --git a/build.boot b/build.boot
index 31f4271..eeee34d 100644
--- a/build.boot
+++ b/build.boot
@@ -22,7 +22,7 @@
 (sift :to-resource #{#"^extra/(.*)"})
 (sift :to-resource #{#"^CNAME"})
 (garden :styles-var 'site.styles/base :output-to "main.css" :pretty-print (if optimize? false true))
- (cljs :optimizations (if optimize? :advanced :none) :source-map (if optimize? false true))
+ ;(cljs :optimizations (if optimize? :advanced :none) :source-map (if optimize? false true))
 (perun/ttr) ;; Time to read
 (perun/word-count)
 (perun/render :renderer 'site.core/page)
diff --git a/content/raid.md b/content/raid.md
new file mode 100644
index 0000000..96faac7
--- /dev/null
+++ b/content/raid.md
@@ -0,0 +1,188 @@
---
title: My local data storage setup
date: 2022-03-10
---

Recently, I've needed a bit more storage. In the past I've relied on Google
Drive, but if you need a lot of space, Google Drive becomes prohibitively
expensive. The largest option available, 2 TB, runs you $100 a year at the time
of writing. While Google Drive comes with a lot of features, it also comes with
a lot of privacy concerns, and I need more than 2 TB anyway. Another option
would be Backblaze B2 or AWS S3, but the cost is even higher. Just to set a
point of comparison, 16 TB of storage would cost $960 a year with B2 and a
whopping $4000 a year with S3.

Luckily, the cost of storage per GB has been coming down steadily. Large hard
drives are cheap to come by, and while these drives are not incredibly fast,
they are much faster than my internet connection. Hard drives it is, then!

While I could get one very large hard drive, it's generally a better idea to
get multiple smaller ones. Smaller drives often offer a better $/GB rate, and
spreading the data across several drives makes it possible to mitigate the risk
of data loss. After a bit of searching, I settled on these "Seagate Barracuda
Compute 4TB" drives. You can find them on
[Amazon](https://www.amazon.com/gp/product/B07D9C7SQH/) or
[BestBuy](https://www.bestbuy.com/site/seagate-barracuda-4tb-internal-sata-hard-drive-for-desktops/6387158.p?skuId=6387158).

These hard drives are available for $70 each at the time I'm writing this, and
I bought 6 of them. That gets me to around $420, plus a bit more for SATA
cables. Looking at the
[Backblaze Hard Drive Stats](https://www.backblaze.com/blog/backblaze-drive-stats-for-2021/),
I think it's fair to assume these drives will last at least 5 years. Dividing
the cost by the expected lifetime works out to $84 per year, far below what
cloud storage costs! It's of course not as reliable, and it requires
maintenance on my end, but the difference in price is just too large to ignore.

## Setup

I decided to set this all up inside my desktop computer. I have a large case,
so fitting in all the hard drives is not a big problem, and my motherboard
supports 6 SATA drives (in addition to the NVMe drive I'm booting off of). I
also run Linux on my desktop, so I've got all the required software available.

For the software side of things, I decided to go with `mdadm` and `ext4`. There
are other options available, like ZFS (not included in the Linux kernel) or
btrfs (whose raid-5 and raid-6 modes are known to be unreliable), but this was
the setup I found the most comfortable and easiest to understand. The way it
works is that `mdadm` combines the disks and presents them as a single block
device, and `ext4` then formats and uses that block device the same way you
would use any regular drive.
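
To make that concrete, here's roughly what the creation looks like. This is a
minimal sketch rather than the exact commands I ran: the device names and the
`/mnt/storage` mount point are placeholders, and I'm using RAID-6 here just as
an example level (with 6 drives, that leaves two drives' worth of parity). The
ArchLinux wiki linked below covers the details, like partitioning the drives
first.

```sh
# Run as root. Combine the six drives into a single md array
# (raw disks used here for brevity).
mdadm --create /dev/md/storage --level=6 --raid-devices=6 \
    /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf

# Record the array so it gets assembled automatically at boot.
mdadm --detail --scan >> /etc/mdadm.conf

# Format the resulting block device with ext4 and mount it like any other drive.
mkfs.ext4 /dev/md/storage
mount /dev/md/storage /mnt/storage
```
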
### Steps

I was originally planning to write out the steps I followed here, but in truth
I just followed whatever the
[ArchLinux wiki](https://wiki.archlinux.org/title/RAID#Installation) told me,
so I'll just recommend you follow that as well.

The one thing I'll warn you about is that the wiki doesn't clearly note just
how long this process takes. It took almost a week for my array to build, and
until the build is complete the array runs at reduced performance. Be patient
and give it time to finish. As a reminder, you can always check the build
status with `cat /proc/mdstat`.

## Preventative maintenance

Hard drives have a tendency to fail, and because RAID arrays are resilient,
these failures can go unnoticed. You **need** to regularly check that the array
is okay. Unfortunately, while there are quite a few resources online on how to
set up RAID, very few of them talk about how to set up scrubs (full scans that
look for errors) and error monitoring.

For my setup, I decided to use systemd to check the array and report issues. I
set up 2 timers: one that scrubs the RAID array, and one that checks whether
the array has any reported errors. A systemd timer comes in 2 parts, a service
file and a timer file, so here are all the files.

- `array-scrub.service`
  ```toml
  [Unit]
  Description=Scrub the disk array
  After=multi-user.target
  OnFailure=report-failure-email@array-scrub.service

  [Service]
  Type=oneshot
  User=root
  ExecStart=bash -c '/usr/bin/echo check > /sys/block/md127/md/sync_action'

  [Install]
  WantedBy=multi-user.target
  ```
- `array-scrub.timer`
  ```toml
  [Unit]
  Description=Periodically scrub the array.

  [Timer]
  OnCalendar=Sat *-*-* 05:00:00

  [Install]
  WantedBy=timers.target
  ```

The timer above runs the scrub operation, which tells the RAID array to scan
the drives for errors. In my experience the scan takes up to a couple of days
to complete, so I run it once a week.

- `array-report.service`
  ```toml
  [Unit]
  Description=Check raid array errors that were found during a scrub or normal operation and report them.
  After=multi-user.target
  OnFailure=report-failure-email@array-report.service

  [Service]
  Type=oneshot
  ExecStart=/usr/bin/mdadm -D /dev/md127

  [Install]
  WantedBy=multi-user.target
  ```
- `array-report.timer`
  ```toml
  [Unit]
  Description=Periodically report any issues in the array.

  [Timer]
  OnCalendar=daily

  [Install]
  WantedBy=timers.target
  ```

The timer above checks the RAID array status to see whether any errors were
found. It runs much more often (once a day) because the check is instant, and
because RAID can find errors during regular operation even when you are not
actively running a scan.

One detail that's easy to miss: nothing runs until the timers are actually
enabled, so once the files are in place, enable and start both of them with
`systemctl enable --now array-scrub.timer array-report.timer`.

### Error reporting

Another important piece is this line in the service files above:
```toml
OnFailure=report-failure-email@array-report.service
```

The automated checks are of no use if I don't know when something actually
fails.
Luckily, systemd can run a service when another service fails, so I'm
using this hook to report failures to myself. Here's what the service file and
the script it calls look like:

- `report-failure-email@.service`
  ```toml
  [Unit]
  Description=status email for %i to user

  [Service]
  Type=oneshot
  ExecStart=/usr/local/bin/systemd-email address %i
  User=root
  ```
- `/usr/local/bin/systemd-email`
  ```sh
  #!/bin/sh
  # $1 is the destination address, $2 is the name of the unit that failed.

  /usr/bin/sendmail -t <<ERRMAIL
  To: $1
  From: systemd <root@$HOSTNAME>
  Subject: Failure on $2
  Content-Transfer-Encoding: 8bit
  Content-Type: text/plain; charset=UTF-8

  $(systemctl status --lines 100 --no-pager "$2")
  ERRMAIL
  ```

The service just runs this shell script, which is a thin wrapper around
sendmail. The `%i` in the service file is the part after the `@` when the
service is used: you can see that the `OnFailure` hook puts `array-report`
after the `@`, and that name gets passed to the email service and on to the
mail script, so the email says which unit failed.

To send emails, you also need a working `sendmail`. I decided to install
[msmtp](https://wiki.archlinux.org/title/Msmtp) and set it up to send me emails
through my GMail account.

To test that the error reporting works, edit `array-report.service` and change
the `ExecStart` line to `ExecStart=false`. Then run the report service with
`systemctl start array-report.service`. You should get an email letting you
know that the `array-report` service failed, with the last 100 lines of the
service status attached.
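
Since I glossed over the msmtp side above, here's roughly what that setup ends
up looking like. This is a sketch with placeholder values rather than my actual
configuration: the addresses and the password are made up, and with GMail you
will most likely need an app password rather than your normal account password.

```sh
# Run as root. msmtp-mta provides the /usr/bin/sendmail command
# that the script above calls.
pacman -S msmtp msmtp-mta

# System-wide config; this is the one that gets used, since the
# email service runs as root.
tee /etc/msmtprc > /dev/null <<'EOF'
defaults
auth           on
tls            on
tls_trust_file /etc/ssl/certs/ca-certificates.crt

account        gmail
host           smtp.gmail.com
port           587
from           yourname@gmail.com
user           yourname@gmail.com
# For GMail this needs to be an app password, not the account password.
password       your-app-password

account default : gmail
EOF

# The config contains a password, so keep it readable by root only.
chmod 600 /etc/msmtprc
```
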