Update fw-manager to support firmware upgrades for the new ASIC #26862

Draft
ganglyu wants to merge 2 commits into sonic-net:master from ganglyu:fw_manager_update

ganglyu (Contributor) commented Apr 17, 2026

Why I did it

The existing fw-manager does not support firmware upgrades for SPC6 ASICs.

Work item tracking
  • Microsoft ADO (number only):

How I did it

  1. Extended SpectrumFirmwareManager.run_firmware_update() to load the Spectrum kernel driver via sx-kernel.sh start before invoking mlxfwmanager, and to unload it via sx-kernel.sh stop in a finally block, so the driver is unloaded whether or not the burn succeeds. Loading the driver enables DMA-based flash access, which reduces burn time. Unloading it after the burn lets sx-kernel.service bring the driver up cleanly for syncd.
  2. Changed the mlxfwmanager/flint device argument from self.pci_id (PCI BDF, e.g. 01:00.0) to the fixed MST path /dev/mst/mt53124_pciconf0 (PCI config access). This is required for SPC6 firmware burn per MFT/product specification.
  3. Increased TimeoutSec in mlnx-fw-manager.service from 300 s to 900 s to accommodate SPC6 burn time, which can exceed 10 minutes.
  4. Increased timeout_per_asic in firmware_coordinator.py from 600 s to 900 s to align with the service-level timeout.
  5. Updated unit tests to mock _run_sx_kernel and added test_run_firmware_update_sx_kernel_start_fails_still_burns to verify graceful degradation when the driver load fails.
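
The control flow described in items 1 and 5 can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function signature, the injectable `run` callable, `default_runner`, and the reuse of the 900 s value as a per-command timeout are all assumptions made for the example. Whether the real implementation calls sx-kernel.sh stop after a failed start is also an assumption here. The burn command is passed in by the caller so that no mlxfwmanager flags are guessed.

```python
import subprocess
from typing import Callable, List

# Values quoted in the PR description.
MST_DEVICE = "/dev/mst/mt53124_pciconf0"
TIMEOUT_PER_ASIC_SEC = 900

# Hypothetical runner type: (argv, timeout_sec) -> exit code.
Runner = Callable[[List[str], int], int]


def default_runner(argv: List[str], timeout_sec: int) -> int:
    """Run a command and return its exit code (illustrative helper)."""
    return subprocess.run(argv, timeout=timeout_sec, check=False).returncode


def run_firmware_update(burn_argv: List[str], run: Runner = default_runner) -> bool:
    """Load the driver, burn the firmware, and always unload the driver."""
    try:
        if run(["sx-kernel.sh", "start"], TIMEOUT_PER_ASIC_SEC) != 0:
            # Graceful degradation: keep going without DMA-based flash access.
            print("driver load failed; burning without it")
        return run(burn_argv, TIMEOUT_PER_ASIC_SEC) == 0
    finally:
        # Executes on success, on burn failure, and on exceptions.
        run(["sx-kernel.sh", "stop"], TIMEOUT_PER_ASIC_SEC)
```

Passing the runner as a parameter mirrors the idea of mocking _run_sx_kernel in the unit tests: a fake runner can make the start step fail and then check that the burn and the stop step still happen, in that order.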

How to verify it

  1. Run unit tests:
  2. On an SPC6 device, trigger a firmware upgrade and confirm:
     • The service completes without timing out
     • Logs show sx-kernel.sh start before the burn and sx-kernel.sh stop after it
     • syncd comes up cleanly after reboot

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld (Collaborator)

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

ganglyu added 2 commits April 17, 2026 13:01
Signed-off-by: Geoffrey Lyu <glv@nvidia.com>
Signed-off-by: Geoffrey Lyu <glv@nvidia.com>
ganglyu force-pushed the fw_manager_update branch from ccd5d66 to 124030c on April 17, 2026 05:01
@mssonicbld (Collaborator)

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
