
azdev extension add/remove does not invalidate commandIndex.json, causing stale command index and test failures #9716

@FumingZhang

Describe the bug

When multiple extensions are tested sequentially in CI, a stale commandIndex.json can prevent a later extension's commands from ever overriding the core CLI module's commands. As a result, the wrong API version is used and all VCR-recorded scenario tests for that extension fail.

Root Cause
azdev extension add (in azdev/operations/extensions/__init__.py) installs extensions via raw pip install -e, but does not call CommandIndex().invalidate() afterward. The CLI's own az extension add properly invalidates the command index (azure-cli-core/extension/operations.py L352), but azdev bypasses this path entirely.

Similarly, azdev extension remove uses pip uninstall without invalidating the index.
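
A minimal sketch of the fix, under assumptions: both callables below are hypothetical stand-ins, not azdev's actual functions. In azdev, `pip_step` would wrap the existing pip install/uninstall call and `invalidate_index` would be something like `CommandIndex().invalidate` (the call named above; its import path is not shown in this issue).

```python
from typing import Callable, List


def run_pip_then_invalidate(pip_step: Callable[[], None],
                            invalidate_index: Callable[[], None]) -> None:
    """Run an install/uninstall step, then invalidate the command index.

    Both arguments are hypothetical stand-ins for azdev's pip call and
    for CommandIndex().invalidate.
    """
    try:
        pip_step()
    finally:
        # Invalidate even if pip fails part-way, so a half-applied change
        # never leaves a stale index behind.
        invalidate_index()


# Demonstration with stubs that only record the call order.
calls: List[str] = []
run_pip_then_invalidate(lambda: calls.append("pip install -e"),
                        lambda: calls.append("invalidate"))
```

Invalidating in a `finally` block is a design choice for this sketch; invalidating only on success would also address the scenario described in this issue.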

How the Failure Occurs
The CI test runner (test_source.py) tests multiple extensions sequentially:

for pkg_name, ext_path in ALL_TESTS:
    azdev extension add ext_name      # No index invalidation!
    azdev test ... pkg_name
    azdev extension remove ext_name   # No index invalidation!

When the list order is [oracle-database, aks-preview]:

  1. azdev extension add oracle-database: installs oracle-database (no index invalidation)
  2. azdev test oracle-database: during test discovery, az cloud show is invoked, which triggers MainCommandsLoader.load_command_table(). This creates commandIndex.json mapping aks → ['azure.cli.command_modules.acs'] (core module only, since aks-preview is not installed yet)
  3. azdev extension remove oracle-database: removes oracle-database (no index invalidation)
  4. azdev extension add aks-preview: installs aks-preview (no index invalidation)
  5. azdev test aks-preview: az cloud show runs again, but commandIndex.json already exists with a valid version (2.84.0) and cloud profile (latest), so the stale index is reused
  6. Each scenario test calls self.cmd('aks create ...') → DummyCli.invoke() → MainCommandsLoader.load_command_table() → command index maps aks to only ['azure.cli.command_modules.acs'] → aks-preview extension is never loaded → core module's aks_create runs with the GA SDK (azure-mgmt-containerservice, api-version=2026-01-01) instead of the preview SDK (api-version=2026-01-02-preview) → VCR cassette query matcher fails → 262 tests fail
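
The sequence above can be reproduced with a toy model. This is not azure-cli code: `ToyCli`, `run_ci`, and the `invalidate` flag are hypothetical names for illustration, and the real index also checks the CLI version and cloud profile, which this sketch ignores.

```python
from typing import Dict, List, Optional, Set

CORE_AKS = "azure.cli.command_modules.acs"
EXT_AKS = "azext_aks_preview"


class ToyCli:
    """Toy model of the index reuse described above."""

    def __init__(self) -> None:
        self.installed: Set[str] = set()
        # Stands in for commandIndex.json; None means "no index on disk".
        self.index: Optional[Dict[str, List[str]]] = None

    def add_extension(self, name: str, invalidate: bool) -> None:
        self.installed.add(name)
        if invalidate:
            self.index = None

    def remove_extension(self, name: str, invalidate: bool) -> None:
        self.installed.discard(name)
        if invalidate:
            self.index = None

    def load_command_table(self) -> List[str]:
        """Return the loaders that 'aks' maps to, building the index if absent."""
        if self.index is None:
            loaders = [CORE_AKS]
            if "aks-preview" in self.installed:
                loaders.append(EXT_AKS)
            self.index = {"aks": loaders}
        return self.index["aks"]


def run_ci(order: List[str], invalidate: bool) -> List[str]:
    """Return what 'aks' resolves to while aks-preview is under test."""
    cli = ToyCli()
    seen: List[str] = []
    for ext in order:
        cli.add_extension(ext, invalidate)
        loaders = cli.load_command_table()  # triggered during test discovery
        if ext == "aks-preview":
            seen = list(loaders)
        cli.remove_extension(ext, invalidate)
    return seen


stale = run_ci(["oracle-database", "aks-preview"], invalidate=False)
lucky = run_ci(["aks-preview", "oracle-database"], invalidate=False)
fixed = run_ci(["oracle-database", "aks-preview"], invalidate=True)
```

In this model, `stale` contains only the core module, while `lucky` and `fixed` also contain the extension, matching the passing and failing runs described in this issue.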

Why This Is Non-Deterministic
ALL_TESTS is built by iterating os.listdir(SRC_PATH), which returns entries in arbitrary filesystem order. The order can differ between runs/environments:

  • When aks-preview happens to be listed first: it gets tested before any index exists → index is created WITH the extension → tests pass
  • When another extension is listed first: index is created WITHOUT aks-preview → tests fail
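
For reproducing the failure locally, the iteration order can be pinned by sorting the directory listing. The sketch below is an assumption about how that could look, not the actual test_source.py code; `list_extensions_sorted` is a hypothetical helper, and the directory-only filter is also an assumption.

```python
import os
import tempfile
from typing import List


def list_extensions_sorted(src_path: str) -> List[str]:
    """Return the directory names under src_path in a stable, sorted order."""
    return sorted(
        name for name in os.listdir(src_path)
        if os.path.isdir(os.path.join(src_path, name))
    )


# Demonstration on a throwaway directory with the two extensions from this issue.
with tempfile.TemporaryDirectory() as src:
    for name in ("oracle-database", "aks-preview"):
        os.makedirs(os.path.join(src, name))
    ordered = list_extensions_sorted(src)
```

Sorting only makes the outcome repeatable. Any extension that sorts before aks-preview would still build the index first, so the index still needs to be invalidated on add/remove.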

Related command

azdev extension add/remove

Errors

The integration tests queued for various Python versions on the same PR (for example, #9665) are showing inconsistent results; some tests failed while others passed.

FAILED src/aks-preview/azext_aks_preview/tests/latest/test_aks_safeguards.py::AksSafeguardsScenario::test_aks_deployment_safeguards_basic
FAILED src/aks-preview/azext_aks_preview/tests/latest/test_aks_safeguards.py::AksSafeguardsScenario::test_aks_deployment_safeguards_with_pss
== 262 failed, 693 passed, 86 skipped, 6 subtests passed in 412.13s (0:06:52) ==

https://dev.azure.com/azclitools/public/_build/results?buildId=304552&view=logs&j=edd23f8e-0483-53ea-dfe4-85f031ec115a&t=758d86e6-87e1-58dd-c45e-4b2170f268bc&l=65188

[gw0] [ 99%] PASSED src/aks-preview/azext_aks_preview/tests/latest/test_vm_skus.py::TestAksIsVmSkuAvailable::test_zone_restriction_ignored_when_zone_flag_false 
src/aks-preview/azext_aks_preview/tests/latest/test_vm_skus.py::TestAksIsVmSkuAvailable::test_zone_restriction_partial_zones_restricted_returns_true 
[gw0] [100%] PASSED src/aks-preview/azext_aks_preview/tests/latest/test_vm_skus.py::TestAksIsVmSkuAvailable::test_zone_restriction_partial_zones_restricted_returns_true 

- generated xml file: /home/cloudtest/.azdev/env_config/mnt/vss/_work/1/s/env/test_results.xml -
======== 955 passed, 86 skipped, 6 subtests passed in 412.58s (0:06:52) ========

https://dev.azure.com/azclitools/public/_build/results?buildId=304552&view=logs&j=85692d73-7fd1-54cf-e8a2-b651dc664eb4&t=5d21126a-3fd1-538a-4d62-56dc95b84a7a&l=4609

Issue script & Debug output

N/A

Expected behavior

The integration tests queued for various Python versions on the same PR (for example, #9665) should have consistent results.

Environment Summary

N/A

Additional context

No response

Labels

Azure CLI Team: The command of the issue is owned by Azure CLI team
bug: This issue requires a change to an existing behavior in the product in order to be resolved.
