OCPEDGE-2737: Enable aarch64 native KVM support for agent-based installation #1908
fonta-rh wants to merge 5 commits into
Replace hardcoded x86_64 architecture strings with $(uname -m) or
${ARCH} so that agent-based installation scripts work on aarch64
hosts (e.g. AWS Graviton bare metal).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
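A minimal sketch of the substitution described in this commit, assuming the scripts build file names from the architecture string. The helper name `agent_iso_name` and the `agent.<arch>.iso` naming pattern are illustrative assumptions, not code taken from the PR.

```bash
#!/usr/bin/env bash
# Resolve the architecture once; allow an explicit ARCH override.
ARCH="${ARCH:-$(uname -m)}"

# Hypothetical helper: derive the agent ISO name from the architecture
# instead of hardcoding "x86_64".
agent_iso_name() {
  local arch="${1:-$ARCH}"
  echo "agent.${arch}.iso"
}

# Uses the detected (or overridden) host architecture.
agent_iso_name
```

Passing the architecture as an optional argument keeps the helper usable for both the host default and an explicit target.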
The metal3-dev-env baremetalvm.xml.j2 template hardcodes cortex-a57
for all aarch64 VMs. This CPU model only works under QEMU emulation;
native aarch64 KVM (e.g. AWS Graviton bare metal) requires
host-passthrough.
Narrow the CPU-section conditional from `{% if is_aarch64 %}` to
`{% if is_aarch64 and libvirt_domain_type == 'qemu' %}` so that
native KVM falls through to host-passthrough. The other three
is_aarch64 blocks (<os>, <features>, VNC) are unaffected — the sed
targets only the CPU block by matching its adjacent HTML comment line.
Tested on AWS c7g.metal (Graviton3) with OCP 4.22.0-rc.5 — full
fencing-IPI deployment with Pacemaker/STONITH operational.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
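The targeted edit can be sketched as follows. This is an illustration under stated assumptions: the marker comment text, the sample template, and the function name `narrow_cpu_conditional` are hypothetical, and the sketch assumes the comment sits on the line directly before the CPU conditional. The real template and sed expression may differ.

```bash
#!/usr/bin/env bash
# Hypothetical filter: rewrite only the conditional that follows the marker line.
narrow_cpu_conditional() {
  local file="$1"
  local marker='<!-- hypothetical CPU marker -->'   # assumed comment text
  local old='{% if is_aarch64 %}'
  local new="{% if is_aarch64 and libvirt_domain_type == 'qemu' %}"
  # On the marker line, advance to the next line (n) and substitute there only.
  sed "/${marker}/{n;s/${old}/${new}/;}" "$file"
}

# Invented sample template with two is_aarch64 blocks.
tmpl="$(mktemp)"
cat > "$tmpl" <<'EOF'
{% if is_aarch64 %}
<os>aarch64 os settings</os>
{% endif %}
<!-- hypothetical CPU marker -->
{% if is_aarch64 %}
<cpu mode='custom'><model>cortex-a57</model></cpu>
{% else %}
<cpu mode='host-passthrough'/>
{% endif %}
EOF

narrow_cpu_conditional "$tmpl"
```

Anchoring on the adjacent comment rather than on the conditional itself is what leaves the other `is_aarch64` blocks untouched.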
On aarch64 (virt machine type), virt-xml defaults the CDROM bus to USB
when no bus is specified. The virt machine type has no USB controller,
causing "USB is disabled for this domain" errors when attaching the
agent ISO at step 06.
Set target.bus explicitly on all five virt-xml CDROM attachment calls:
sata on x86_64 (q35 default), scsi on aarch64 (matching the bus already
configured by 02_configure_host.sh).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
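A sketch of the bus selection, assuming it is keyed on `uname -m`. The function names are hypothetical, and the `virt-xml` command line is printed rather than executed. It illustrates the shape of an attachment call, not the exact invocation used in `agent/06_agent_create_cluster.sh`.

```bash
#!/usr/bin/env bash
# Pick the CDROM bus for a given architecture (defaults to the host's).
cdrom_bus_for() {
  case "${1:-$(uname -m)}" in
    aarch64) echo "scsi" ;;  # virt machine type: no USB controller
    *)       echo "sata" ;;  # q35 default on x86_64
  esac
}

# Print an illustrative virt-xml call with the bus set explicitly.
print_attach_cmd() {
  local domain="$1" iso="$2" bus
  bus="$(cdrom_bus_for "${3:-}")"
  echo "virt-xml ${domain} --add-device --disk ${iso},device=cdrom,target.bus=${bus}"
}

print_attach_cmd example-master-0 /tmp/agent.iso aarch64
```

Computing the bus once and reusing it keeps all attachment calls consistent.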
The pinned metal3-dev-env hardcodes `linux-amd64` in the go_tarball
template variable. The 01_install_requirements.sh script already
detects the host architecture and passes GOARCH as an Ansible extra
var, but the template ignores it.
Add a sed patch (alongside the existing Ansible version patch) to make
go_tarball use the GOARCH variable, so the correct Go binary is
downloaded on aarch64 hosts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
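A rough sketch of the two pieces involved: mapping `uname -m` output to Go's architecture names, and rewriting the hardcoded suffix so the template reads the variable. The function names, the sample `go_tarball` line, and the `{{ GOARCH }}` Jinja form are assumptions for illustration. The actual variable definition in metal3-dev-env is not shown in this PR text.

```bash
#!/usr/bin/env bash
# Map a kernel architecture name to the Go toolchain's name for it.
goarch_for() {
  case "${1:-$(uname -m)}" in
    x86_64)  echo "amd64" ;;
    aarch64) echo "arm64" ;;
    *)       echo "unsupported"; return 1 ;;
  esac
}

# Hypothetical filter (stdin -> stdout): make the tarball name use GOARCH.
patch_go_tarball() {
  sed 's/linux-amd64/linux-{{ GOARCH }}/'
}

# Invented sample line standing in for the real template variable.
printf '%s\n' 'go_tarball: "go{{ go_version }}.linux-amd64.tar.gz"' | patch_go_tarball
```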
On aarch64 (virt machine type), AAVMF firmware does not reliably honor
boot_order for SCSI CDROM vs virtio disk. After the agent-based
installer writes the OS to disk and reboots, VMs boot back into the
installation ISO instead of from disk, causing
installing-pending-user-action timeout.
Fix: eject CDROM media after VMs have booted from the ISO. The CoreOS
live agent runs entirely in RAM, so the ISO is not needed after boot.
When the agent triggers a reboot after image write, the empty CDROM is
skipped and the VM boots from disk.
Gated on aarch64 only; x86_64 OVMF handles boot_order correctly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
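The gating can be sketched as below. The function name, domain name, and CDROM target (`sda`) are hypothetical, and the sketch only prints the `virsh change-media ... --eject` command instead of running it. How the PR decides that the VMs have finished booting from the ISO is not covered here.

```bash
#!/usr/bin/env bash
# Print the eject command for a domain's CDROM, but only on aarch64.
maybe_eject_cdrom() {
  local arch="$1" domain="$2" target="$3"
  if [[ "$arch" != "aarch64" ]]; then
    return 0  # x86_64 OVMF honors boot_order; nothing to do
  fi
  echo "virsh change-media ${domain} ${target} --eject"
}

maybe_eject_cdrom "$(uname -m)" example-master-0 sda
```

Leaving the drive in place with no media keeps the domain definition unchanged, and the firmware then has nothing to boot from on the CDROM.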
Summary
Enable IPI and agent-based installation (ABI) on native aarch64 KVM hosts (e.g. AWS Graviton bare metal). Three independent fixes that together unblock the full agent path on aarch64 (and IPI while we're at it):
- Narrow the `{% if is_aarch64 %}` conditional in the VM template so native KVM falls through to `host-passthrough` instead of the emulation-only `cortex-a57`
- Replace hardcoded `x86_64` strings with `$(uname -m)`/`${ARCH}` across agent scripts and RHCOS boot image resolution
- Set `target.bus` explicitly on all `virt-xml` CDROM attachment calls (`scsi` on aarch64, `sata` on x86_64) to prevent the `virt` machine type from defaulting to USB, for which it has no controller
Changes
- `02_configure_host.sh`
- `agent/06_agent_create_cluster.sh`: `CDROM_BUS` variable; set `target.bus` on all 5 `virt-xml` CDROM lines
- `agent/common.sh`: `x86_64` → `$(uname -m)`
- `agent/iscsi_utils.sh`
- `agent/07_agent_add_extraworker_nodes.sh`
- `agent/iso_no_registry.sh`
- `agent/01_agent_requirements.sh`
- `rhcos.sh`: `.architectures.x86_64` → host arch
Test plan
- … `x86_64`/`sata` on x86 hosts)
- `02_configure_host.sh`: `<os>`, `<features>`, VNC conditionals still fire correctly (only CPU block narrowed)
Supersedes #1910.
🤖 Generated with Claude Code