Baremetal

File system setup

Set the disko options for the machine and run:

$(nix --extra-experimental-features "flakes nix-command" build --print-out-paths --no-link -L '.#nixosConfigurations.HOSTNAME.config.system.build.diskoScript')
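
Because the command substitution above executes whatever the build prints, a failed build leaves the shell with nothing sensible to run. The following is a minimal sketch that builds first and only runs the script when the build succeeded. The function names disko_attr and run_disko are hypothetical helpers and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helpers; the nix invocation mirrors the command above.

# Print the flake attribute of the disko script for the host given as $1.
disko_attr() {
  printf '.#nixosConfigurations.%s.config.system.build.diskoScript\n' "$1"
}

# Build the disko script for a host and run it only if the build succeeded.
run_disko() {
  script="$(nix --extra-experimental-features "flakes nix-command" \
    build --print-out-paths --no-link -L "$(disko_attr "$1")")" || return 1
  # Refuse to continue if the build printed no store path.
  [ -n "$script" ] || return 1
  "$script"
}
```

Used as run_disko HOSTNAME from the directory that contains the flake.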

When adding new disks, use the paths under /dev/disk/by-id/ so that the script stays idempotent across reboots; kernel names such as /dev/sda can change between boots.
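
To pick the stable name for a new disk, it can help to list the symlinks under /dev/disk/by-id/ next to the device each one currently resolves to. A small sketch follows. list_by_id is a hypothetical helper name, and its optional directory argument exists only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print "stable-name -> current device" for every
# symlink in the given directory (default: /dev/disk/by-id).
list_by_id() {
  dir="${1:-/dev/disk/by-id}"
  for link in "$dir"/*; do
    # Skip anything that is not a symlink (including an unmatched glob).
    [ -L "$link" ] || continue
    printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
  done
}
```

Calling list_by_id without arguments on the target machine shows which by-id entry belongs to which disk.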

Install new server

  • Copy the nix files from an existing, similar host.
  • Disable all secrets until after the installation is finished.
  • Set the simd.arch option to the output of nix --extra-experimental-features "flakes nix-command" shell nixpkgs#gcc -c gcc -march=native -Q --help=target | grep march and update the comment next to it.
    • If that returns x86_64, look up the processor's ark.intel.com entry with a search engine; the processor model can be found with cat /proc/cpuinfo.
  • Generate networking.hostId with head -c4 /dev/urandom | od -A none -t x4 according to the option's description.
  • Boot live ISO
    • If your ssh key is not baked into the ISO, set a password for the nixos user with passwd to be able to log in over ssh.
  • rsync this directory into the live system.
  • Stop the RAID if one was assembled at boot: mdadm --stop /dev/md126
  • Generate and apply the disk layout with disko (see above).
  • Generate hardware-configuration.nix with sudo nixos-generate-config --root /mnt.
    • If LUKS disks should be decrypted in initrd over ssh, then:
      • Make sure boot.initrd.luks.devices.*.device is set.
      • Set boot.initrd.network.enable to true.
      • Add the required kernel modules, which can be found with lshw -C network (look for driver=), for the network interfaces in initrd to boot.initrd.availableKernelModules.
  • Install the NixOS system with sudo nixos-install --root /mnt --no-channel-copy --flake .#HOSTNAME.
  • After a reboot, add the age key to sops-nix with nix shell nixpkgs#ssh-to-age and ssh-to-age < /etc/ssh/ssh_host_ed25519_key.pub.
  • Add /etc/machine-id and the LUKS password to the sops secrets.
  • Enable and deploy secrets again.
  • Improve the new machine setup by automating the steps that are easy to automate and documenting the others.
  • Commit everything and push.
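
Two of the steps above lend themselves to small helpers. The sketch below generates a value for networking.hostId and extracts the architecture name from the gcc output. gen_host_id and parse_march are hypothetical names, and parse_march assumes that gcc prints a line of the form "-march= <value>".

```shell
#!/bin/sh
# Hypothetical helpers for two of the installation steps.

# Print 8 random hex digits suitable for networking.hostId.
# "-A n" suppresses od's offset column; tr strips the padding.
gen_host_id() {
  head -c4 /dev/urandom | od -A n -t x4 | tr -d ' \n'
  echo
}

# Read the output of "gcc -march=native -Q --help=target" on stdin and
# print the value of the -march= line.
# Assumption: the line looks like "  -march=    <value>".
parse_march() {
  awk '$1 == "-march=" { print $2; exit }'
}
```

For example, nix --extra-experimental-features "flakes nix-command" shell nixpkgs#gcc -c gcc -march=native -Q --help=target | parse_march prints only the value to put into simd.arch.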

Unlocking LUKS-encrypted disks remotely

  • Connect via ssh to the server as root on port 4748, e.g. ssh root@server9.cluster.zentralwerk.org -p 4748. It is recommended to write the following into your ssh_config for each server:
    Host server09-unlock
      Hostname server9.cluster.zentralwerk.org
      IdentityFile ~/.ssh/id_ed25519
      Port 4748
      PubkeyAuthentication yes
      User root
    
  • If the password prompt unexpectedly closes in the initrd shell, or the boot process doesn't start within a few seconds after entering all disk passwords, run: systemctl start default.target
  • You'll find the passwords in the host's sops file, e.g. sops hosts/server10/secrets.yaml
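
Since the ssh_config entry above is repeated for each server, it could be generated instead of typed. A minimal sketch follows; unlock_stanza is a hypothetical helper, and the identity file, port and user are copied from the example entry.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh_config stanza for unlocking one server.
# $1 = alias to use with ssh (e.g. server09-unlock)
# $2 = hostname of the server
unlock_stanza() {
  # The tilde is written literally; ssh expands it when reading the config.
  cat <<EOF
Host $1
  Hostname $2
  IdentityFile ~/.ssh/id_ed25519
  Port 4748
  PubkeyAuthentication yes
  User root
EOF
}
```

For example, unlock_stanza server09-unlock server9.cluster.zentralwerk.org >> ~/.ssh/config appends the entry, after which ssh server09-unlock connects to the initrd.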