• 0 Posts
  • 2 Comments
Joined 29 days ago
cake
Cake day: April 7th, 2026

help-circle
  • You have enough failures on each disk to make me suspect an issue with the usb-connected drive bay. I ran into similar issues with a cheap pci-e sata adapter, where little hiccups and latency in the communication layer would cause zfs to take disks offline randomly. Read, write, and checksum errors would slowly accumulate across all of the disks. Switched that machine to a proper enterprise hba, the issues vanished, and the disks are all healthy 3-4 years later.


  • Everything I run, I deploy and manage with ansible.

    When I’m building out the role/playbook for a new service, I make sure to build in any special upgrade tasks it might have and tag them. When it’s time to run infrastructure-wide updates, I can run my single upgrade playbook and pull in the upgrade tasks for everything everywhere - new packages, container images, git releases, and all the service restart steps to load them.

    It’s more work at the beginning to set the role/playbook up properly, but it makes maintaining everything so much nicer (which I think is vital to keep it all fun and manageable).