How we reduced core unit boot time from hours to minutes
Key Points
- Stopped UEFI linear probe of network boot interfaces
- Declared boot interface early in PXE stage
- Reduced firmware upgrade time ~4h → ~3min
Summary
After a firmware update, boots on Cloudflare's Gen12 core servers stretched from minutes to hours. The root cause was a linear search through every available UEFI network boot interface: each failed interface waited ~5 minutes for a timeout before the firmware tried the next, and the delay compounded across the multiple reboots that a firmware upgrade requires. We fixed this by declaring the correct network boot interface early in the PXE pre‑boot stage, working with vendors to expose programmatic boot-order controls, and improving our iPXE automation. Together, these changes cut fleet-wide firmware upgrade time from ~4 hours to ~3 minutes and brought subsequent boots under a minute.
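To make the compounding concrete, here is a back-of-envelope sketch of the timeout arithmetic. Only the ~5 minute timeout comes from the summary above. The number of failed interfaces per boot and the number of reboots per upgrade are assumptions, chosen because they reproduce the ~20 minute and ~4 hour figures reported in this write-up. They are not measured values.

```python
# Back-of-envelope model of the linear probe cost.
# TIMEOUT_MIN comes from the summary; the two counts below are
# assumptions picked to match the reported ~20 min and ~4 h figures.
TIMEOUT_MIN = 5            # minutes lost per failed interface (from the text)
FAILED_INTERFACES = 4      # assumption: interfaces tried before the right one
REBOOTS_PER_UPGRADE = 12   # assumption: reboots in one firmware upgrade run


def wasted_minutes_per_boot(failed_interfaces: int, timeout_min: int) -> int:
    """Minutes spent waiting on timeouts during a single boot cycle."""
    return failed_interfaces * timeout_min


def wasted_minutes_per_upgrade(reboots: int, per_boot_min: int) -> int:
    """Timeout cost compounded across every reboot of one upgrade run."""
    return reboots * per_boot_min


per_boot = wasted_minutes_per_boot(FAILED_INTERFACES, TIMEOUT_MIN)
per_upgrade = wasted_minutes_per_upgrade(REBOOTS_PER_UPGRADE, per_boot)
print(f"per boot: {per_boot} min, per upgrade: {per_upgrade / 60:.1f} h")
# prints "per boot: 20 min, per upgrade: 4.0 h"
```

The model is linear in both counts, which is why removing the probe entirely (zero failed interfaces) helps far more than shortening the timeout.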
Key Points
- Root cause: UEFI probed network boot interfaces one at a time (HTTPS IPv4 → iPXE → etc.), wasting ~20 minutes per boot cycle and nearly 4 hours per firmware upgrade stack.
- Primary fix: declare the network boot interface order up front in the PXE/pre‑boot stage so the firmware does not perform a linear search.
- Vendor fixes: BIOS/UEFI updates were required to expose the Network Boot settings (lazy‑loaded EFI_IFR_REF3) and to remove an immutable Force Priority token that prevented programmatic changes.
- Automation changes:
  - Reordered boot automation to set boot interface before repeated firmware reboots.
  - Added state validation to detect and reapply settings if a firmware upgrade resets config.
  - Implemented pattern matching for heterogeneous NIC strings (e.g. ".*HTTP.*IPv4.*P1") to select the correct interface without full vendor strings.
  - Added uefi-same-hex flag to avoid expensive show/compare cycles in iPXE and perform a single set when needed.
- Edge cases handled: legacy UEFI versions without boot ordering support, and the persistence problem where settings can be cleared by upgrades (addressed via validation+reapply).
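The pattern-matching point can be illustrated with a minimal Python sketch. The pattern ".*HTTP.*IPv4.*P1" is the one quoted above. The boot entry strings and the `select_boot_interface` helper are hypothetical, made up for illustration, and this is not the automation's actual code.

```python
import re
from typing import List, Optional


def select_boot_interface(pattern: str, entries: List[str]) -> Optional[str]:
    """Return the single boot entry matching `pattern`, or None.

    Refusing to choose when zero or several entries match avoids
    silently booting from the wrong interface.
    """
    regex = re.compile(pattern)
    matches = [entry for entry in entries if regex.search(entry)]
    return matches[0] if len(matches) == 1 else None


# Hypothetical boot entries; real vendor strings will differ.
entries = [
    "UEFI: PXE IPv4 VendorA 25GbE Adapter P1",
    "UEFI: HTTP IPv6 VendorA 25GbE Adapter P1",
    "UEFI: HTTP IPv4 VendorA 25GbE Adapter P1",
    "UEFI: HTTP IPv4 VendorA 25GbE Adapter P2",
]

chosen = select_boot_interface(r".*HTTP.*IPv4.*P1", entries)
print(chosen)  # prints "UEFI: HTTP IPv4 VendorA 25GbE Adapter P1"
```

A loose pattern like this also matches strings such as "P10", so a stricter anchor (for example `P1$`) may be worth considering where port names can collide.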
Practical steps for engineers
- Inspect serial console early in failure cases to detect repeated network boot timeouts.
- Force the correct network boot interface in the PXE/pre‑boot stage before initiating multi‑reboot workflows.
- Collaborate with OEMs to expose programmatic boot-order controls or provide firmware that does not lazy‑load Boot Order fields.
- Use pattern matching for NIC identifiers when vendor strings vary; plan to standardize vendor strings long term.
- Avoid round‑trip read/compare in iPXE by using a checksum/hex comparison flag to decide whether to run set and reboot.
- Add a post‑change validation step that re‑applies config and reboots if firmware upgrades reset settings.
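The last two steps (set only when the value differs, then validate and reapply after an upgrade) can be sketched as a small simulation. `FakeFirmware` and `ensure_boot_interface` are hypothetical stand-ins written for this example. They model the idea in memory and do not call iPXE or any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class FakeFirmware:
    """In-memory stand-in for a server's UEFI settings (hypothetical)."""
    boot_interface: str = "default"
    reboots: int = 0

    def upgrade(self) -> None:
        # Models the persistence problem: an upgrade may reset settings.
        self.boot_interface = "default"
        self.reboots += 1


def ensure_boot_interface(fw: FakeFirmware, desired: str) -> bool:
    """Apply `desired` only if it differs; return True when a set happened."""
    if fw.boot_interface == desired:
        return False  # already correct: skip the set and the reboot
    fw.boot_interface = desired
    fw.reboots += 1  # assumption: a changed setting needs one reboot
    return True


fw = FakeFirmware()
desired = "HTTP IPv4 P1"  # hypothetical interface label

changed_first = ensure_boot_interface(fw, desired)   # initial set
changed_again = ensure_boot_interface(fw, desired)   # no-op, already set
fw.upgrade()                                         # upgrade resets config
changed_after = ensure_boot_interface(fw, desired)   # validation reapplies

print(changed_first, changed_again, changed_after, fw.reboots)
# prints "True False True 3"
```

Making the check idempotent means the validation step can run after every upgrade without adding reboots when the setting survived.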
Outcome
- Firmware upgrade automation: nearly 4 hours → ~3 minutes.
- Subsequent single boot: ~20 minutes → <1 minute.
These changes restored predictable, automated fleet upgrades and eliminated manual intervention during boot/fleet rollouts.