# Migration Plan: Arch Linux to NixOS on john-endesktop (ZFS/NFS Server)

## Overview

This document outlines the plan to migrate the john-endesktop server from Arch Linux to NixOS while maintaining the existing ZFS pools and NFS exports that serve your k3s cluster.

## Current System State

### Hardware

- **Boot disk**: nvme0n1
  - nvme0n1p3: 1000M EFI partition (UUID: F5C6-D570)
  - nvme0n1p4: 120GB ext4 / (current Arch root)
  - nvme0n1p5: 810GB - **Target for NixOS** (being removed from media pool)
- **Network**: enp0s31f6 @ 10.0.0.43/24 (DHCP)

### ZFS Pools

- **media**: ~3.5TB JBOD pool (2 drives after nvme0n1p5 removal)
  - wwn-0x50014ee2ba653d70-part2
  - ata-WDC_WD20EZBX-00AYRA0_WD-WX62D627X7Z8-part2
  - Contains: /media/media/nix (bind mounted to /nix on Arch)
  - NFS: Shared to 10.0.0.0/24 via ZFS sharenfs property

- **swarmvols**: 928GB mirror pool - **PRODUCTION DATA**
  - wwn-0x5002538f52707e2d-part2
  - wwn-0x5002538f52707e81-part2
  - Contains: iocage jails and k3s persistent volumes
  - NFS: Shared to 10.0.0.0/24 via ZFS sharenfs property
  - Backed up nightly to remote borg

### Services

- NFS server exporting /media and /swarmvols to k3s cluster
- ZFS managing pools with automatic exports via sharenfs property

## Prerequisites

### Before Starting

1. ✅ Ensure nvme0n1p5 removal from media pool is complete
   ```bash
   ssh 10.0.0.43 "zpool status media"
   # Should show no "removing" devices
   ```

2. ✅ Verify recent backups exist
   ```bash
   # Verify swarmvols backup is recent (< 24 hours)
   # Check your borg backup system
   ```

3. ✅ Notify k3s cluster users of planned maintenance window
   - NFS shares will be unavailable during migration
   - Estimate: 30-60 minutes downtime

4. ✅ Build NixOS configuration from your workstation
   ```bash
   cd ~/nixos-configs
   nix build .#nixosConfigurations.john-endesktop.config.system.build.toplevel
   ```
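
The removal check in step 1 can be scripted instead of read by eye. The following is a minimal sketch, not a definitive check: `removal_in_progress` is a hypothetical helper, and it assumes that an active removal shows up in `zpool status` as a `remove:` line containing the words "in progress". Confirm that wording against your OpenZFS version before relying on it.

```bash
#!/usr/bin/env bash
# Sketch: decide from `zpool status` text whether a device removal is still running.
# Assumption: an active removal is reported on a "remove:" line that contains
# the words "in progress"; a finished removal reports "completed on ..." instead.
removal_in_progress() {
    # Reads `zpool status <pool>` output on stdin.
    # Exit status 0 means a removal is still running.
    grep -Eq '^[[:space:]]*remove:.*in progress'
}

# Hypothetical usage from the workstation:
#   if ssh 10.0.0.43 "zpool status media" | removal_in_progress; then
#       echo "Removal still running; do not start the migration yet" >&2
#   fi
```

Reading the status text from stdin keeps the helper independent of ssh, so it can be tried against saved output first.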

## Migration Steps

### Phase 1: Prepare NixOS Installation Media

1. **Download NixOS minimal ISO**
   ```bash
   wget https://channels.nixos.org/nixos-25.11/latest-nixos-minimal-x86_64-linux.iso
   ```

2. **Create bootable USB**
   ```bash
   # Identify USB device (e.g., /dev/sdb)
   lsblk
   # Write ISO to USB (replace /dev/sdX with the device identified above)
   sudo dd if=latest-nixos-minimal-x86_64-linux.iso of=/dev/sdX bs=4M status=progress
   sudo sync
   ```
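
Before writing the image, it is worth confirming that the download is intact. The sketch below compares the ISO's SHA-256 digest with an expected value. `verify_sha256` is a hypothetical helper, and the sketch assumes a digest is published alongside the ISO on the channel page; the expected value has to be copied from there by hand.

```bash
#!/usr/bin/env bash
# Sketch: compare a file's SHA-256 digest with an expected value.
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}

# Hypothetical usage; paste the published digest in place of the placeholder:
#   verify_sha256 latest-nixos-minimal-x86_64-linux.iso "<expected digest>"
```

Only run `dd` if the helper reports `OK`.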

### Phase 2: Backup and Shutdown

1. **On the server, verify ZFS pool status**
   ```bash
   ssh 10.0.0.43 "zpool status"
   ssh 10.0.0.43 "zfs list"
   ```

2. **Export ZFS pools cleanly**
   ```bash
   ssh 10.0.0.43 "sudo zpool export media"
   ssh 10.0.0.43 "sudo zpool export swarmvols"
   ```

3. **Shutdown Arch Linux**
   ```bash
   ssh 10.0.0.43 "sudo shutdown -h now"
   ```
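
`zpool export` refuses to export a pool that is still in use, so the exports in step 2 may fail until NFS, swap, and bind mounts on the pools are released. The following sketch orders those steps. It assumes that Arch currently swaps on `/dev/zvol/media/swap` and that `/nix` is the bind mount described under ZFS Pools; `run` and `quiesce_and_export` are hypothetical helper names. It defaults to a dry run that only prints the commands.

```bash
#!/usr/bin/env bash
# Sketch: release everything that keeps the pools busy, then export them.
# Assumptions: swap lives on /dev/zvol/media/swap and /nix is a bind mount
# backed by the media pool. Adjust or drop steps that do not apply.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

quiesce_and_export() {
    # Stop at the first failing step so a busy pool is never half-handled.
    run sudo systemctl stop nfs-server || return    # stop serving clients first
    run sudo swapoff /dev/zvol/media/swap || return # swap on a zvol keeps the pool busy
    run sudo umount /nix || return                  # bind mount from /media/media/nix
    run sudo zpool export media || return
    run sudo zpool export swarmvols || return
}

quiesce_and_export
```

Run it on the server with `DRY_RUN=0` once the printed sequence looks right for your setup.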

### Phase 3: Install NixOS

1. **Boot from NixOS USB**
   - Insert USB drive
   - Power on and select USB in boot menu

2. **Connect to network**
   ```bash
   # If DHCP doesn't work automatically:
   sudo systemctl start dhcpcd
   ip a # Verify you have 10.0.0.43 or another IP
   ```

3. **Enable SSH for remote installation (recommended)**
   ```bash
   # Set password for nixos user
   sudo passwd nixos
   # Start SSH
   sudo systemctl start sshd
   # From your workstation:
   ssh nixos@10.0.0.43
   ```

4. **Format nvme0n1p5 with btrfs**
   ```bash
   # Confirm nvme0n1p5 is no longer in use, then clear old filesystem signatures
   lsblk
   sudo wipefs -a /dev/nvme0n1p5

   # Create btrfs filesystem
   sudo mkfs.btrfs -L nixos /dev/nvme0n1p5

   # Mount and create subvolumes
   sudo mount /dev/nvme0n1p5 /mnt
   sudo btrfs subvolume create /mnt/@
   sudo btrfs subvolume create /mnt/@home
   sudo btrfs subvolume create /mnt/@nix
   sudo btrfs subvolume create /mnt/@log
   sudo umount /mnt

   # Mount root subvolume
   sudo mount -o subvol=@,compress=zstd,noatime /dev/nvme0n1p5 /mnt

   # Create mount points
   sudo mkdir -p /mnt/{boot,home,nix,var/log}

   # Mount other subvolumes
   sudo mount -o subvol=@home,compress=zstd,noatime /dev/nvme0n1p5 /mnt/home
   sudo mount -o subvol=@nix,compress=zstd,noatime /dev/nvme0n1p5 /mnt/nix
   sudo mount -o subvol=@log,compress=zstd,noatime /dev/nvme0n1p5 /mnt/var/log

   # Mount EFI partition
   sudo mount /dev/nvme0n1p3 /mnt/boot
   ```

5. **Import ZFS pools**
   ```bash
   # List pools available for import (both should be visible)
   sudo zpool import

   # Import the pools (-f is only required if the import is refused because of a hostid mismatch)
   sudo zpool import -f media
   sudo zpool import -f swarmvols

   # Verify pools are mounted
   zfs list
   ls -la /media /swarmvols
   ```

6. **Generate initial hardware configuration**
   ```bash
   sudo nixos-generate-config --root /mnt
   ```

7. **Get the new root filesystem UUID**
   ```bash
   blkid /dev/nvme0n1p5
   # Note the UUID for updating hardware-configuration.nix
   ```

8. **Copy your NixOS configuration to the server**
   ```bash
   # From your workstation:
   scp -r ~/nixos-configs/machines/john-endesktop/* nixos@10.0.0.43:/tmp/

   # On server:
   sudo mkdir -p /mnt/etc/nixos
   sudo cp /tmp/configuration.nix /mnt/etc/nixos/
   sudo cp /tmp/hardware-configuration.nix /mnt/etc/nixos/

   # Edit hardware-configuration.nix to update the root filesystem UUID
   sudo nano /mnt/etc/nixos/hardware-configuration.nix
   # Change: device = "/dev/disk/by-uuid/CHANGE-THIS-TO-YOUR-UUID";
   # To: device = "/dev/disk/by-uuid/[UUID from blkid]";
   ```

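9. **Install NixOS**
   ```bash
   # Run the installer once; it prompts for the root password at the end
   sudo nixos-install

   # Alternative: skip the root password prompt if passwords are set in the configuration
   # sudo nixos-install --no-root-passwd

   # Set the user password inside the new system (assumes the johno account used in Phase 4)
   sudo nixos-enter --root /mnt -c 'passwd johno'
   ```
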
10. **Reboot into NixOS**
    ```bash
    # Export the pools so the installed system can import them without -f
    sudo zpool export -a
    sudo reboot
    # Remove USB drive
    ```
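
The manual UUID edit in step 8 can also be done non-interactively, which avoids typos when copying the value from `blkid`. This is a minimal sketch that assumes the file contains the literal placeholder `CHANGE-THIS-TO-YOUR-UUID` shown in step 8 and that GNU `sed` is available (as on the NixOS live system); `set_root_uuid` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: replace the placeholder UUID in hardware-configuration.nix.
# Assumption: the placeholder text is exactly CHANGE-THIS-TO-YOUR-UUID.
set_root_uuid() {
    local config="$1" uuid="$2"
    # Refuse anything that does not look like a filesystem UUID.
    if ! printf '%s\n' "$uuid" | grep -Eq '^[0-9a-fA-F-]+$'; then
        echo "not a UUID: $uuid" >&2
        return 1
    fi
    # Replace every occurrence, since all btrfs subvolume mounts share one UUID.
    sed -i "s|CHANGE-THIS-TO-YOUR-UUID|$uuid|g" "$config"
}

# Hypothetical usage on the live system, before copying the file to /mnt/etc/nixos:
#   uuid="$(sudo blkid -s UUID -o value /dev/nvme0n1p5)"
#   set_root_uuid /tmp/hardware-configuration.nix "$uuid"
```

Running it against the copy in `/tmp` keeps the edit unprivileged; inspect the result before copying it into place.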

### Phase 4: Post-Installation Verification

1. **Boot into NixOS and verify system**
   ```bash
   ssh johno@10.0.0.43

   # Check NixOS version
   nixos-version

   # Verify hostname
   hostname # Should be: john-endesktop
   ```

2. **Verify ZFS pools imported correctly**
   ```bash
   zpool status
   zpool list
   zfs list

   # Check for hostid mismatch warnings (should be gone)
   # Verify both pools show ONLINE status
   ```

3. **Verify NFS exports are active**
   ```bash
   sudo exportfs -v
   systemctl status nfs-server

   # Should see /media and /swarmvols exported to 10.0.0.0/24
   ```

4. **Test NFS mount from another machine**
   ```bash
   # From a k3s node or your workstation:
   sudo mount -t nfs 10.0.0.43:/swarmvols /mnt
   ls -la /mnt
   sudo umount /mnt

   sudo mount -t nfs 10.0.0.43:/media /mnt
   ls -la /mnt
   sudo umount /mnt
   ```

5. **Verify ZFS sharenfs properties preserved**
   ```bash
   zfs get sharenfs media
   zfs get sharenfs swarmvols

   # Should show: sec=sys,mountpoint,no_subtree_check,no_root_squash,rw=@10.0.0.0/24
   ```

6. **Check swap device**
   ```bash
   swapon --show
   free -h
   # Should show /dev/zvol/media/swap
   ```
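
The "both pools show ONLINE" check from step 2 can be reduced to a single exit status, which is convenient when verification is repeated during the 24-48 hour observation period. The sketch below is one possible approach; `unhealthy_pools` is a hypothetical helper that reads the tab-separated output of `zpool list -H -o name,health`.

```bash
#!/usr/bin/env bash
# Sketch: report every pool whose health is not ONLINE.
# Input: output of `zpool list -H -o name,health` (tab-separated, no header).
unhealthy_pools() {
    # Prints one line per unhealthy pool; exits non-zero if any was found.
    awk -F'\t' '$2 != "ONLINE" { print $1 " is " $2; bad = 1 } END { exit bad + 0 }'
}

# Hypothetical usage on the server:
#   zpool list -H -o name,health | unhealthy_pools || echo "Investigate before restarting k3s" >&2
```

A non-zero exit status means at least one pool needs attention before moving on to Phase 5.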

### Phase 5: Restore k3s Cluster Access

1. **Restart k3s nodes or remount NFS shares**
   ```bash
   # On each k3s node:
   sudo systemctl restart k3s # or k3s-agent
   ```

2. **Verify k3s pods have access to persistent volumes**
   ```bash
   # On k3s master:
   kubectl get pv
   kubectl get pvc
   # Check that volumes are bound and accessible
   ```

## Rollback Plan

If something goes wrong during migration, you can roll back to Arch Linux:

### Quick Rollback (If NixOS won't boot)

1. **Boot from NixOS USB (or Arch USB)**

2. **Import ZFS pools**
   ```bash
   sudo zpool import -f media
   sudo zpool import -f swarmvols
   ```

3. **Start NFS manually (temporary)**
   ```bash
   sudo mkdir -p /media /swarmvols
   sudo systemctl start nfs-server
   sudo exportfs -o rw,sync,no_subtree_check,no_root_squash 10.0.0.0/24:/media
   sudo exportfs -o rw,sync,no_subtree_check,no_root_squash 10.0.0.0/24:/swarmvols
   sudo exportfs -v
   ```
   This will restore k3s cluster access immediately while you diagnose.

4. **Boot back into Arch Linux**
   ```bash
   # Reboot and select nvme0n1p4 (Arch) in GRUB/boot menu
   sudo reboot
   ```

5. **Verify Arch boots and services start**
   ```bash
   ssh johno@10.0.0.43
   zpool status
   systemctl status nfs-server
   ```

### Full Rollback (If needed)

1. **Follow Quick Rollback steps above**

2. **Re-add nvme0n1p5 to media pool (if desired)**
   ```bash
   # Only if you want to restore the original configuration
   sudo zpool add media /dev/nvme0n1p5
   ```

3. **Clean up NixOS partition**
   ```bash
   # If you want to reclaim nvme0n1p5 for other uses
   sudo wipefs -a /dev/nvme0n1p5
   ```

## Risk Mitigation

### Data Safety

- ✅ **swarmvols** (production): Mirrored + nightly borg backups
- ⚠️ **media** (important): JBOD - no redundancy, but not catastrophic
- ✅ **NixOS install**: Separate partition, doesn't touch ZFS pools
- ✅ **Arch Linux**: Remains bootable on nvme0n1p4 until verified

### Service Continuity

- Downtime: 30-60 minutes expected
- k3s cluster: Will reconnect automatically when NFS returns
- Rollback time: < 10 minutes to restore Arch

### Testing Approach

1. Test NFS exports from NixOS live environment before installation
2. Test single NFS mount from k3s node before full cluster restart
3. Keep Arch Linux boot option until 24-48 hours of stable NixOS operation

## Post-Migration Tasks

After successful migration and 24-48 hours of stable operation:

1. **Update k3s NFS mounts (if needed)**
   - Verify no hardcoded references to old system

2. **Optional: Repurpose Arch partition**
   ```bash
   # After you're confident NixOS is stable
   # You can wipe nvme0n1p4 and repurpose it
   ```

3. **Update documentation**
   - Update infrastructure docs with NixOS configuration
   - Document any deviations from this plan

4. **Consider setting up NixOS remote deployment**
   ```bash
   # From your workstation:
   nixos-rebuild switch --target-host johno@10.0.0.43 --flake .#john-endesktop
   ```

## Timeline

- **Preparation**: 1-2 hours (testing config build, downloading ISO)
- **Migration window**: 1-2 hours (installation + verification)
- **Verification period**: 24-48 hours (before removing Arch)
- **Total**: ~3 days from start to declaring success

## Emergency Contacts

- Borg backup location: [Document your borg repo location]
- K3s cluster nodes: [Document your k3s nodes]
- Critical services on k3s: [Document what's running that depends on these NFS shares]

## Checklist

Pre-migration:
- [x] nvme0n1p5 removal from media pool complete
- [ ] Recent backup verified (< 24 hours)
- [ ] Maintenance window scheduled
- [ ] NixOS ISO downloaded
- [ ] Bootable USB created
- [ ] NixOS config builds successfully

During migration:
- [ ] ZFS pools exported
- [ ] Arch Linux shutdown cleanly
- [ ] Booted from NixOS USB
- [ ] nvme0n1p5 formatted with btrfs
- [ ] Btrfs subvolumes created
- [ ] ZFS pools imported
- [ ] NixOS installed
- [ ] Root password set

Post-migration:
- [ ] NixOS boots successfully
- [ ] ZFS pools mounted automatically
- [ ] NFS server running
- [ ] NFS exports verified
- [ ] Test mount from k3s node successful
- [ ] k3s cluster reconnected
- [ ] Persistent volumes accessible
- [ ] No hostid warnings in zpool status
- [ ] Arch Linux still bootable (for rollback)

Final verification (after 24-48 hours):
- [ ] All services stable
- [ ] No unexpected issues
- [ ] Performance acceptable
- [ ] Ready to remove Arch partition (optional)
- [ ] Ready to remove /swarmvols/media-backup (optional)