Add NixOS configuration for john-endesktop ZFS/NFS server
Create configuration to migrate john-endesktop from Arch Linux to NixOS while maintaining existing ZFS pools (media JBOD and swarmvols mirror) and NFS exports for k3s cluster.

Configuration includes:
- ZFS support with automatic pool import
- NFS server exporting both pools to 10.0.0.0/24
- Correct ZFS hostid (007f0101) to resolve hostid warnings
- Btrfs root filesystem on nvme0n1p5 (810GB)
- Comprehensive migration plan with rollback procedures

The migration is designed to be safe with Arch Linux remaining bootable as a fallback until NixOS is verified stable.
423
machines/john-endesktop/MIGRATION_PLAN.md
Normal file
@@ -0,0 +1,423 @@
# Migration Plan: Arch Linux to NixOS on john-endesktop (ZFS/NFS Server)

## Overview

This document outlines the plan to migrate the john-endesktop server from Arch Linux to NixOS while maintaining the existing ZFS pools and NFS exports that serve your k3s cluster.

## Current System State

### Hardware
- **Boot disk**: nvme0n1
  - nvme0n1p3: 1000M EFI partition (UUID: F5C6-D570)
  - nvme0n1p4: 120GB ext4 / (current Arch root)
  - nvme0n1p5: 810GB - **Target for NixOS** (being removed from media pool)
- **Network**: enp0s31f6 @ 10.0.0.43/24 (DHCP)

### ZFS Pools
- **media**: ~3.5TB JBOD pool (2 drives after nvme0n1p5 removal)
  - wwn-0x50014ee2ba653d70-part2
  - ata-WDC_WD20EZBX-00AYRA0_WD-WX62D627X7Z8-part2
  - Contains: /media/media/nix (bind mounted to /nix on Arch)
  - NFS: Shared to 10.0.0.0/24 via ZFS sharenfs property

- **swarmvols**: 928GB mirror pool - **PRODUCTION DATA**
  - wwn-0x5002538f52707e2d-part2
  - wwn-0x5002538f52707e81-part2
  - Contains: iocage jails and k3s persistent volumes
  - NFS: Shared to 10.0.0.0/24 via ZFS sharenfs property
  - Backed up nightly to remote borg

### Services
- NFS server exporting /media and /swarmvols to k3s cluster
- ZFS managing pools with automatic exports via sharenfs property

## Prerequisites

### Before Starting
1. ✅ Ensure nvme0n1p5 removal from media pool is complete
   ```bash
   ssh 10.0.0.43 "zpool status media"
   # Should show no "removing" devices
   ```

2. ✅ Verify recent backups exist
   ```bash
   # Verify swarmvols backup is recent (< 24 hours)
   # Check your borg backup system
   ```

3. ✅ Notify k3s cluster users of planned maintenance window
   - NFS shares will be unavailable during migration
   - Estimate: 30-60 minutes downtime

4. ✅ Build NixOS configuration from your workstation
   ```bash
   cd ~/nixos-configs
   nix build .#nixosConfigurations.john-endesktop.config.system.build.toplevel
   ```
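
The first two prerequisite checks can be scripted. The following is a minimal sketch, not part of the committed configuration: the function names `removal_in_progress` and `backup_is_recent` are hypothetical, the match on a `remove:` line containing "in progress" is an assumption about how `zpool status` reports a running device removal, and `backup_is_recent` relies on GNU `date`.

```bash
#!/usr/bin/env bash

# Reads `zpool status <pool>` output on stdin and succeeds (exit 0) if a
# device removal still appears to be running.
# ASSUMPTION: a running removal is reported on a "remove:" line that
# contains the words "in progress".
removal_in_progress() {
  grep -Eq '^[[:space:]]*remove:.*in progress'
}

# Succeeds if the given timestamp is newer than max_hours (default: 24).
# The timestamp has to come from your own backup tooling; requires GNU date.
backup_is_recent() {
  local ts="$1" max_hours="${2:-24}"
  local backup_epoch now_epoch
  backup_epoch=$(date -d "$ts" +%s) || return 2
  now_epoch=$(date +%s)
  (( now_epoch - backup_epoch < max_hours * 3600 ))
}
```

Possible usage: `ssh 10.0.0.43 "zpool status media" | removal_in_progress && echo "removal still running"`, and `backup_is_recent "<timestamp of the latest archive>" || echo "backup older than 24 hours"`.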

## Migration Steps

### Phase 1: Prepare NixOS Installation Media

1. **Download NixOS minimal ISO**
   ```bash
   wget https://channels.nixos.org/nixos-25.11/latest-nixos-minimal-x86_64-linux.iso
   ```

2. **Create bootable USB**
   ```bash
   # Identify USB device (e.g., /dev/sdb)
   lsblk
   # Write ISO to USB
   sudo dd if=latest-nixos-minimal-x86_64-linux.iso of=/dev/sdX bs=4M status=progress
   sudo sync
   ```

### Phase 2: Backup and Shutdown

1. **On the server, verify ZFS pool status**
   ```bash
   ssh 10.0.0.43 "zpool status"
   ssh 10.0.0.43 "zfs list"
   ```

2. **Export ZFS pools cleanly**
   ```bash
   # If an export fails with "pool is busy", first stop whatever still uses the pool
   # (for example the NFS server, the /nix bind mount, or swap on a zvol)
   ssh 10.0.0.43 "sudo zpool export media"
   ssh 10.0.0.43 "sudo zpool export swarmvols"
   ```

3. **Shutdown Arch Linux**
   ```bash
   ssh 10.0.0.43 "sudo shutdown -h now"
   ```
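
Between steps 2 and 3 it can be worth confirming that both exports actually succeeded before powering off. A minimal sketch, assuming a hypothetical helper named `still_imported` that is fed the output of `zpool list -H -o name`:

```bash
#!/usr/bin/env bash

# Reads `zpool list -H -o name` output on stdin and prints each pool named in
# the arguments that is still imported. Prints nothing when all of them have
# been exported.
still_imported() {
  local imported pool
  imported=$(cat)
  for pool in "$@"; do
    if grep -qxF -- "$pool" <<<"$imported"; then
      printf '%s\n' "$pool"
    fi
  done
}
```

Possible usage: `ssh 10.0.0.43 "zpool list -H -o name" | still_imported media swarmvols`. Empty output means both pools were exported and the shutdown can proceed.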

### Phase 3: Install NixOS

1. **Boot from NixOS USB**
   - Insert USB drive
   - Power on and select USB in boot menu

2. **Connect to network**
   ```bash
   # If DHCP doesn't work automatically:
   sudo systemctl start dhcpcd
   ip a  # Verify you have 10.0.0.43 or another IP
   ```

3. **Enable SSH for remote installation (recommended)**
   ```bash
   # Set password for nixos user
   sudo passwd nixos
   # Start SSH
   sudo systemctl start sshd
   # From your workstation:
   ssh nixos@10.0.0.43
   ```

4. **Format nvme0n1p5 with btrfs**
   ```bash
   # Confirm nvme0n1p5 is the intended 810GB partition, then clear old signatures
   lsblk
   sudo wipefs -a /dev/nvme0n1p5

   # Create btrfs filesystem
   sudo mkfs.btrfs -L nixos /dev/nvme0n1p5

   # Mount and create subvolumes
   sudo mount /dev/nvme0n1p5 /mnt
   sudo btrfs subvolume create /mnt/@
   sudo btrfs subvolume create /mnt/@home
   sudo btrfs subvolume create /mnt/@nix
   sudo btrfs subvolume create /mnt/@log
   sudo umount /mnt

   # Mount root subvolume
   sudo mount -o subvol=@,compress=zstd,noatime /dev/nvme0n1p5 /mnt

   # Create mount points
   sudo mkdir -p /mnt/{boot,home,nix,var/log}

   # Mount other subvolumes
   sudo mount -o subvol=@home,compress=zstd,noatime /dev/nvme0n1p5 /mnt/home
   sudo mount -o subvol=@nix,compress=zstd,noatime /dev/nvme0n1p5 /mnt/nix
   sudo mount -o subvol=@log,compress=zstd,noatime /dev/nvme0n1p5 /mnt/var/log

   # Mount EFI partition
   sudo mount /dev/nvme0n1p3 /mnt/boot
   ```

5. **Import ZFS pools**
   ```bash
   # Import pools (should be visible)
   sudo zpool import

   # Import with force if needed due to hostid
   sudo zpool import -f media
   sudo zpool import -f swarmvols

   # Verify pools are mounted
   zfs list
   ls -la /media /swarmvols
   ```

6. **Generate initial hardware configuration**
   ```bash
   sudo nixos-generate-config --root /mnt
   ```

7. **Get the new root filesystem UUID**
   ```bash
   blkid /dev/nvme0n1p5
   # Note the UUID for updating hardware-configuration.nix
   ```

8. **Copy your NixOS configuration to the server**
   ```bash
   # From your workstation:
   scp -r ~/nixos-configs/machines/john-endesktop/* nixos@10.0.0.43:/tmp/

   # On server:
   sudo mkdir -p /mnt/etc/nixos
   sudo cp /tmp/configuration.nix /mnt/etc/nixos/
   sudo cp /tmp/hardware-configuration.nix /mnt/etc/nixos/

   # Edit hardware-configuration.nix to update the root filesystem UUID
   sudo nano /mnt/etc/nixos/hardware-configuration.nix
   # Change: device = "/dev/disk/by-uuid/CHANGE-THIS-TO-YOUR-UUID";
   # To:     device = "/dev/disk/by-uuid/[UUID from blkid]";
   ```

9. **Install NixOS**
   ```bash
   sudo nixos-install
   # Set root password when prompted

   # Set user password (if configuration.nix does not already define one for johno)
   sudo nixos-enter --root /mnt -c 'passwd johno'
   ```

10. **Reboot into NixOS**
    ```bash
    # Export the pools imported in step 5 so the installed system can import them without -f
    sudo zpool export media swarmvols
    sudo reboot
    # Remove USB drive
    ```
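
The manual edit in step 8 can also be done non-interactively. This is a minimal sketch under stated assumptions: `set_root_uuid` is a hypothetical helper, it expects the placeholder text `CHANGE-THIS-TO-YOUR-UUID` shown in step 8 to be present in the file, and it uses GNU `sed -i`.

```bash
#!/usr/bin/env bash

# Replaces the placeholder UUID in a hardware-configuration.nix with a real one.
# Usage: set_root_uuid <file> <uuid>
set_root_uuid() {
  local file="$1" uuid="$2"
  # Refuse anything that does not look like a filesystem UUID, so an empty or
  # garbled blkid result never ends up in the configuration.
  if [[ ! "$uuid" =~ ^[0-9a-fA-F-]+$ ]]; then
    echo "unexpected UUID: $uuid" >&2
    return 1
  fi
  # Every btrfs subvolume mount shares the same filesystem UUID, so each line
  # carrying the placeholder is rewritten.
  sed -i "s|CHANGE-THIS-TO-YOUR-UUID|$uuid|" "$file"
}
```

Possible usage on the server, run as root because the file lives under /mnt/etc/nixos: `set_root_uuid /mnt/etc/nixos/hardware-configuration.nix "$(blkid -s UUID -o value /dev/nvme0n1p5)"`. Reviewing the file afterwards is still advisable.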

### Phase 4: Post-Installation Verification

1. **Boot into NixOS and verify system**
   ```bash
   ssh johno@10.0.0.43

   # Check NixOS version
   nixos-version

   # Verify hostname
   hostname  # Should be: john-endesktop
   ```

2. **Verify ZFS pools imported correctly**
   ```bash
   zpool status
   zpool list
   zfs list

   # Check for hostid mismatch warnings (should be gone)
   # Verify both pools show ONLINE status
   ```

3. **Verify NFS exports are active**
   ```bash
   sudo exportfs -v
   systemctl status nfs-server

   # Should see /media and /swarmvols exported to 10.0.0.0/24
   ```

4. **Test NFS mount from another machine**
   ```bash
   # From a k3s node or your workstation:
   sudo mount -t nfs 10.0.0.43:/swarmvols /mnt
   ls -la /mnt
   sudo umount /mnt

   sudo mount -t nfs 10.0.0.43:/media /mnt
   ls -la /mnt
   sudo umount /mnt
   ```

5. **Verify ZFS sharenfs properties preserved**
   ```bash
   zfs get sharenfs media
   zfs get sharenfs swarmvols

   # Should show: sec=sys,mountpoint,no_subtree_check,no_root_squash,rw=@10.0.0.0/24
   ```

6. **Check swap device**
   ```bash
   swapon --show
   free -h
   # Should show /dev/zvol/media/swap
   ```
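
The pool and export checks in steps 2 and 3 can be reduced to two small filters. This is a sketch only: `unhealthy_pools` and `missing_exports` are hypothetical helper names, and the second one assumes that `exportfs -v` starts each entry with the exported path.

```bash
#!/usr/bin/env bash

# stdin: output of `zpool list -H -o name,health` (tab-separated, no header).
# Prints the name of every pool whose health is not ONLINE.
unhealthy_pools() {
  awk -F'\t' '$2 != "ONLINE" { print $1 }'
}

# stdin: output of `exportfs -v`.
# Prints each expected path (given as arguments) that is not exported.
missing_exports() {
  local exports path
  exports=$(cat)
  for path in "$@"; do
    # Match the path at the start of a line, followed by whitespace or end of line.
    if ! grep -Eq "^${path}([[:space:]]|\$)" <<<"$exports"; then
      printf '%s\n' "$path"
    fi
  done
}
```

Possible usage on the server: `zpool list -H -o name,health | unhealthy_pools` and `sudo exportfs -v | missing_exports /media /swarmvols`. Both should print nothing on a healthy system. Comparing the output of `hostid` against 007f0101 is another quick way to confirm the configured hostid took effect.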

### Phase 5: Restore k3s Cluster Access

1. **Restart k3s nodes or remount NFS shares**
   ```bash
   # On each k3s node:
   sudo systemctl restart k3s  # or k3s-agent
   ```

2. **Verify k3s pods have access to persistent volumes**
   ```bash
   # On k3s master:
   kubectl get pv
   kubectl get pvc
   # Check that volumes are bound and accessible
   ```
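
To avoid scanning the `kubectl get pv` table by eye, the volume phases can be filtered. A minimal sketch, assuming a hypothetical helper `unbound_pvs` that reads two whitespace-separated columns (volume name and phase):

```bash
#!/usr/bin/env bash

# stdin: output of
#   kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name,PHASE:.status.phase
# Prints every PersistentVolume whose phase is not Bound.
unbound_pvs() {
  awk '$2 != "Bound" { print $1 }'
}
```

Possible usage on the k3s master: `kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | unbound_pvs`. Note that volumes in the Available phase are listed too, which may be expected for volumes that have no claim yet. A Bound phase only shows the claim binding, so an actual read from a pod is still the stronger check.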

## Rollback Plan

If something goes wrong during migration, you can roll back to Arch Linux:

### Quick Rollback (If NixOS won't boot)

1. **Boot from NixOS USB (or Arch USB)**

2. **Import ZFS pools**
   ```bash
   sudo zpool import -f media
   sudo zpool import -f swarmvols
   ```

3. **Start NFS manually (temporary)**
   ```bash
   sudo mkdir -p /media /swarmvols
   sudo systemctl start nfs-server
   sudo exportfs -o rw,sync,no_subtree_check,no_root_squash 10.0.0.0/24:/media
   sudo exportfs -o rw,sync,no_subtree_check,no_root_squash 10.0.0.0/24:/swarmvols
   sudo exportfs -v
   ```
   This will restore k3s cluster access immediately while you diagnose.

4. **Boot back into Arch Linux**
   ```bash
   # Reboot and select nvme0n1p4 (Arch) in GRUB/boot menu
   sudo reboot
   ```

5. **Verify Arch boots and services start**
   ```bash
   ssh johno@10.0.0.43
   zpool status
   systemctl status nfs-server
   ```

### Full Rollback (If needed)

1. **Follow Quick Rollback steps above**

2. **Re-add nvme0n1p5 to media pool (if desired)**
   ```bash
   # Only if you want to restore the original configuration
   sudo zpool add media /dev/nvme0n1p5
   ```

3. **Clean up NixOS partition (only if you skipped step 2)**
   ```bash
   # If you want to reclaim nvme0n1p5 for other uses.
   # Do NOT run this after re-adding the partition to the media pool:
   # media has no redundancy, so wiping a member can destroy the pool.
   sudo wipefs -a /dev/nvme0n1p5
   ```

## Risk Mitigation

### Data Safety
- ✅ **swarmvols** (production): Mirrored + nightly borg backups
- ⚠️ **media** (important): JBOD - no redundancy, but not catastrophic
- ✅ **NixOS install**: Separate partition, doesn't touch ZFS pools
- ✅ **Arch Linux**: Remains bootable on nvme0n1p4 until verified

### Service Continuity
- Downtime: 30-60 minutes expected
- k3s cluster: Will reconnect automatically when NFS returns
- Rollback time: < 10 minutes to restore Arch

### Testing Approach
1. Test NFS exports from NixOS live environment before installation
2. Test single NFS mount from k3s node before full cluster restart
3. Keep Arch Linux boot option until 24-48 hours of stable NixOS operation

## Post-Migration Tasks

After successful migration and 24-48 hours of stable operation:

1. **Update k3s NFS mounts (if needed)**
   - Verify no hardcoded references to old system

2. **Optional: Repurpose Arch partition**
   ```bash
   # After you're confident NixOS is stable
   # You can wipe nvme0n1p4 and repurpose it
   ```

3. **Update documentation**
   - Update infrastructure docs with NixOS configuration
   - Document any deviations from this plan

4. **Consider setting up NixOS remote deployment**
   ```bash
   # From your workstation:
   # (activating as the non-root user johno needs sudo on the target:
   # add --use-remote-sudo, or --sudo on newer nixos-rebuild versions)
   nixos-rebuild switch --target-host johno@10.0.0.43 --flake .#john-endesktop
   ```

## Timeline

- **Preparation**: 1-2 hours (testing config build, downloading ISO)
- **Migration window**: 1-2 hours (installation + verification)
- **Verification period**: 24-48 hours (before removing Arch)
- **Total**: ~3 days from start to declaring success

## Emergency Contacts

- Borg backup location: [Document your borg repo location]
- K3s cluster nodes: [Document your k3s nodes]
- Critical services on k3s: [Document what's running that depends on these NFS shares]

## Checklist

Pre-migration:
- [x] nvme0n1p5 removal from media pool complete
- [ ] Recent backup verified (< 24 hours)
- [ ] Maintenance window scheduled
- [ ] NixOS ISO downloaded
- [ ] Bootable USB created
- [ ] NixOS config builds successfully

During migration:
- [ ] ZFS pools exported
- [ ] Arch Linux shut down cleanly
- [ ] Booted from NixOS USB
- [ ] nvme0n1p5 formatted with btrfs
- [ ] Btrfs subvolumes created
- [ ] ZFS pools imported
- [ ] NixOS installed
- [ ] Root password set

Post-migration:
- [ ] NixOS boots successfully
- [ ] ZFS pools mounted automatically
- [ ] NFS server running
- [ ] NFS exports verified
- [ ] Test mount from k3s node successful
- [ ] k3s cluster reconnected
- [ ] Persistent volumes accessible
- [ ] No hostid warnings in zpool status
- [ ] Arch Linux still bootable (for rollback)

Final verification (after 24-48 hours):
- [ ] All services stable
- [ ] No unexpected issues
- [ ] Performance acceptable
- [ ] Ready to remove Arch partition (optional)
- [ ] Ready to remove /swarmvols/media-backup (optional)