[critical_services] Exclude snmp from tracking list on BMC topology #24084
Merged
Blueve merged 1 commit into sonic-net:master on Apr 21, 2026
snmp is not deployed on BMC image (SNMP runs on the host NOS, not on the management-plane BMC board). Without this, a wide range of tests fail during teardown/sanity with 'All critical services should be fully started!' while the final snapshot shows snmp down and all other services up. Mirrors the existing DPU exclusion pattern a few lines above. Signed-off-by: zitingguo <zitingguo@microsoft.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
lizhijianrd
approved these changes
Apr 21, 2026
Blueve
approved these changes
Apr 21, 2026
Description of PR
Summary:
Exclude `snmp` from the critical-services tracking list when the DUT is running on a BMC topology.

On a SONiC BMC board (and BMC topologies in general) the BMC runs a management-plane SONiC image. SNMP is not deployed on BMC by design; it runs on the host NOS, not on the BMC. As a result, the default `critical_services = [pmon, snmp, database, lldp]` tracking list is incorrect for BMC, and a wide range of tests fail during teardown/sanity with `All critical services should be fully started!`, with the final snapshot being `{'pmon': True, 'snmp': False, 'database': True, 'lldp': True}`.

This PR mirrors the existing DPU pattern (a few lines above, introduced for the same reason on DPU) and removes `snmp` from the tracking list when `'bmc' in topo_type`.

Type of change
Back port request
Approach
What is the motivation for this PR?
Unblock ~20 test failures on the SONiC BMC testbed whose only failure mode is the framework tracking a service (`snmp`) that is not deployed on SONiC BMC. Tests affected (all failed on teardown / critical process check with the same `snmp: False` footprint):

- `dns/static_dns/test_static_dns.py`
- `dns/test_dns_resolv_conf.py`
- `generic_config_updater/test_cacl.py`
- `platform_tests/test_reboot.py`
- `syslog/test_logrotate.py`
- `container_hardening/test_container_hardening.py`
- `db_migrator/test_migrate_dns.py`
- `golden_config_infra/test_config_reload_with_rendered_golden_config.py`
- `override_config_table/test_override_config_table.py`
- `scp/test_scp_copy.py`
- `tacacs/test_authorization.py`

How did you do it?
In `MultiAsicSonicHost.critical_services_tracking_list()`, added a BMC guard right after the existing DPU guard. No behavioral change for non-BMC topologies.
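
The guard itself is not reproduced in this copy of the description. As a rough illustration only, the logic described above might look like the standalone sketch below. The function names `tracking_list_for_topo` and `all_tracked_services_up`, their parameters, and the literal topology strings `"bmc"` and `"t0"` are hypothetical and chosen for this example; the real change lives inside `MultiAsicSonicHost.critical_services_tracking_list()` and may differ in detail.

```python
# Hypothetical standalone sketch; NOT the actual sonic-mgmt implementation.
DEFAULT_CRITICAL_SERVICES = ["pmon", "snmp", "database", "lldp"]


def tracking_list_for_topo(critical_services, topo_type):
    """Return the services to track, dropping snmp on BMC topologies."""
    tracked = list(critical_services)  # copy so the caller's list is untouched
    if "bmc" in topo_type and "snmp" in tracked:
        tracked.remove("snmp")
    return tracked


def all_tracked_services_up(snapshot, tracked):
    """True only if every tracked service is reported as running."""
    return all(snapshot.get(service, False) for service in tracked)


# Snapshot reported by the failing runs on the BMC DUT.
snapshot = {"pmon": True, "snmp": False, "database": True, "lldp": True}

bmc_tracked = tracking_list_for_topo(DEFAULT_CRITICAL_SERVICES, "bmc")
other_tracked = tracking_list_for_topo(DEFAULT_CRITICAL_SERVICES, "t0")

print(bmc_tracked)                                      # ['pmon', 'database', 'lldp']
print(all_tracked_services_up(snapshot, bmc_tracked))   # True
print(all_tracked_services_up(snapshot, other_tracked)) # False
```

With `snmp` removed from the tracked list, the same snapshot that previously failed the check passes, while a non-BMC topology keeps the full default list and is unaffected.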
How did you verify/test it?
Verified on a SONiC BMC DUT (`testbed-host1-komodo-bmc`, Komodo BMC board) by applying the patch locally in the sonic-mgmt container and re-running one of the affected tests, `syslog/test_logrotate.py::test_logrotate_small_size`.

- Without the patch: `Failed: All critical services should be fully started!`, snapshot = `{'pmon': True, 'snmp': False, 'database': True, 'lldp': True}`.
- With the patch: the `critical_services_fully_started` check now passes (the framework-tracked services list becomes `{pmon, database, lldp}`, matching what is actually deployed on BMC); execution proceeds past that check.

Any platform specific information?
Only affects DUTs whose `topo_type` contains `bmc`.

Supported testbed topology if it's a new test case?