Skip to content

LCORE-1329: Adding new MCP E2E Tests#1294

Open
jrobertboos wants to merge 10 commits intolightspeed-core:mainfrom
jrobertboos:lcore-1329
Open

LCORE-1329: Adding new MCP E2E Tests#1294
jrobertboos wants to merge 10 commits intolightspeed-core:mainfrom
jrobertboos:lcore-1329

Conversation

@jrobertboos
Copy link
Contributor

@jrobertboos jrobertboos commented Mar 9, 2026

Description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Tests

    • Added comprehensive end-to-end coverage for multiple MCP authentication methods (file, Kubernetes, OAuth, client), including success/error scenarios across key endpoints and new test annotations for selective runs.
  • Chores

    • Added new test configurations and a test secret for invalid-token scenarios; enhanced test environment management to clear storage and unregister external toolgroups between scenarios.
  • Documentation

    • Updated end-to-end testing docs to reflect utility module renaming and behavior.

- Introduced new YAML configuration files for MCP authentication methods: file, Kubernetes, and OAuth.
- Updated existing MCP configuration to include new authentication methods.
- Enhanced end-to-end tests to cover scenarios for file-based, Kubernetes, and OAuth authentication.
- Added utility functions for unregistering MCP toolgroups in the testing environment.
- Implemented new step definitions for checking response body content in tests.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 9, 2026

Walkthrough

Adds end-to-end MCP authentication tests and supporting artifacts: new config files (file, Kubernetes, client, OAuth) for library/server modes, expanded feature tests, environment handling for MCP configs, utilities to unregister MCP toolgroups and clear Llama Stack storage, a secret file, and a docker volume mount for the invalid token.

Changes

Cohort / File(s) Summary
Docker configuration
docker-compose.yaml
Added a volume mount exposing host secret at /tmp/invalid-mcp-secret-token into the lightspeed-stack service.
Library-mode configs
tests/e2e/configuration/library-mode/...
tests/e2e/configuration/library-mode/lightspeed-stack-mcp.yaml, ...-mcp-file-auth.yaml, ...-mcp-client-auth.yaml, ...-mcp-kubernetes-auth.yaml, ...-mcp-oauth-auth.yaml, ...-invalid-mcp-file-auth.yaml
Added multiple library-mode YAMLs for MCP auth variants (file, invalid-file, client, kubernetes, oauth); updated base MCP list (removed provider_id, added mcp-kubernetes/mcp-file/mcp-client entries).
Server-mode configs
tests/e2e/configuration/server-mode/...
tests/e2e/configuration/server-mode/lightspeed-stack-mcp.yaml, ...-mcp-file-auth.yaml, ...-mcp-client-auth.yaml, ...-mcp-kubernetes-auth.yaml, ...-mcp-oauth-auth.yaml, ...-invalid-mcp-file-auth.yaml
Added server-mode YAMLs for MCP auth variants; updated base MCP list (removed provider_id, added additional MCP server entries).
E2E feature tests
tests/e2e/features/mcp.feature
Large expansion of MCP test scenarios covering success/failure paths across tools, query, streaming_query for file, Kubernetes, client, and OAuth auth schemes; new config tags added to scenarios.
Test environment & steps
tests/e2e/features/environment.py, tests/e2e/features/steps/common_http.py
Environment: map new MCP config names, resolve config from scenario tags, call unregister_mcp_toolgroups or clear_llama_stack_storage as appropriate, switch config and restart service; After hooks updated. Steps: added negative response-body assertion step.
Llama Stack test utilities
tests/e2e/utils/llama_stack_utils.py
Introduced async helpers to list/unregister toolgroups from Llama Stack and public synchronous wrapper unregister_mcp_toolgroups(); improved error handling and resource cleanup.
General test utilities
tests/e2e/utils/utils.py
Added clear_llama_stack_storage() to remove ~/.llama inside the lightspeed-stack container (no-op in Prow), with timeout handling.
Secrets & docs
tests/e2e/secrets/invalid-mcp-token, docs/e2e_testing.md
Added a file containing the literal invalid-token for negative tests; updated docs to reflect renaming of test utility module to llama_stack_utils.py.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main change as adding new MCP E2E tests, which aligns with the substantial additions of test configurations, feature tests, and test utilities across the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can enforce grammar and style rules using `languagetool`.

Configure the reviews.tools.languagetool setting to enable/disable rules and categories. Refer to the LanguageTool Community to learn more.

- Added a new invalid MCP token file for testing purposes.
- Updated the Docker Compose configuration to mount the invalid MCP token.
- Introduced a new YAML configuration for testing invalid MCP file authentication.
- Enhanced the test scenarios to include checks for invalid MCP file authentication.
- Updated feature files to reflect the new authentication configurations.
…handling

- Added new configurations for invalid, Kubernetes, client, and OAuth MCP authentication methods.
- Updated scenario handling to reference the new configuration paths for Kubernetes, client, and OAuth authentication.
- Enhanced the testing environment to support the new authentication configurations.
@jrobertboos
Copy link
Contributor Author

So currently the following e2e tests fail:

Failing scenarios:
  tests/e2e/features/mcp.feature:11  Check if tools endpoint succeeds when MCP file-based auth token is passed
  tests/e2e/features/mcp.feature:49  Check if tools endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:64  Check if query endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:82  Check if streaming_query endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:101  Check if tools endpoint succeeds when MCP kubernetes auth token is passed
  tests/e2e/features/mcp.feature:142  Check if tools endpoint reports error when MCP kubernetes invalid auth token is passed
  tests/e2e/features/mcp.feature:158  Check if query endpoint reports error when MCP kubernetes invalid auth token is passed
  tests/e2e/features/mcp.feature:177  Check if streaming_query endpoint reports error when MCP kubernetes invalid auth token is passed
  tests/e2e/features/mcp.feature:197  Check if tools endpoint succeeds by skipping when MCP client-provided auth token is omitted

These failing tests can be grouped into the following bugs:
/tools does not handle authorization with static token (path) or kubernetes

  tests/e2e/features/mcp.feature:11  Check if tools endpoint succeeds when MCP file-based auth token is passed
  tests/e2e/features/mcp.feature:49  Check if tools endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:101  Check if tools endpoint succeeds when MCP kubernetes auth token is passed
  tests/e2e/features/mcp.feature:142  Check if tools endpoint reports error when MCP kubernetes invalid auth token is passed

/tools does not handle skipping toolgroup when client token is not provided

  tests/e2e/features/mcp.feature:197  Check if tools endpoint succeeds by skipping when MCP client-provided auth token is omitted

/query and /streaming_query does not catch authorization error with static token (path) or kubernetes

  tests/e2e/features/mcp.feature:64  Check if query endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:82  Check if streaming_query endpoint reports error when MCP file-based invalid auth token is passed
  tests/e2e/features/mcp.feature:158  Check if query endpoint reports error when MCP kubernetes invalid auth token is passed
  tests/e2e/features/mcp.feature:177  Check if streaming_query endpoint reports error when MCP kubernetes invalid auth token is passed

@jrobertboos jrobertboos marked this pull request as ready for review March 12, 2026 20:26
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/e2e/features/mcp.feature (1)

440-447: ⚠️ Potential issue | 🟡 Minor

Use the same OAuth fixture token as the other happy-path cases.

The server-mode query happy path right above this uses Bearer oauth-test-token, but this streaming case switches to Bearer test-token. If those fixtures diverge, this scenario stops covering the same auth path and will fail independently.

🔧 Suggested fix
     And I set the "MCP-HEADERS" header to
     """
-    {"mcp-oauth": {"Authorization": "Bearer test-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}
     """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` around lines 440 - 447, The streaming-query
scenario "Check if streaming_query endpoint succeeds when MCP OAuth auth token
is passed" uses a different fixture token ("Bearer test-token") than the other
happy-path cases; update the MCP-HEADERS block in that Scenario to use the same
fixture token "Bearer oauth-test-token" so it matches the server-mode happy-path
tests and continues to exercise the same OAuth fixture.
🧹 Nitpick comments (3)
tests/e2e/utils/llama_stack_tools.py (2)

35-48: Explicit return None is unnecessary.

Line 45 returns None explicitly, but the function return type is None. A bare return or simply omitting it would be clearer.

✨ Minor cleanup
         if e.status_code == 400 and "not found" in str(e).lower():
-            return None
+            return
         raise
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/utils/llama_stack_tools.py` around lines 35 - 48, The function
_unregister_toolgroup_async contains an explicit "return None" inside the
APIStatusError handler which is redundant given the annotated return type is
None; remove the explicit "return None" (either replace with a bare "return" or
just omit the return entirely) in the block that checks if e.status_code == 400
and "not found" in str(e).lower(), leaving the rest of the error handling
(raising other errors) and the finally block that calls await client.close()
unchanged.

51-65: Multiple client instances created during iteration.

The _unregister_mcp_toolgroups_async function creates a client at line 53 to list toolgroups, then _unregister_toolgroup_async creates a new client for each toolgroup to unregister. This results in N+1 client instances for N toolgroups, each requiring cleanup.

Consider reusing the client instance:

♻️ Proposed refactor to reuse client
 async def _unregister_mcp_toolgroups_async() -> None:
     """Unregister all MCP toolgroups."""
     client = _get_llama_stack_client()
     try:
         toolgroups = await client.toolgroups.list()
         for toolgroup in toolgroups:
             if (
                 toolgroup.identifier
                 and toolgroup.provider_id == "model-context-protocol"
             ):
-                await _unregister_toolgroup_async(toolgroup.identifier)
+                try:
+                    await client.toolgroups.unregister(toolgroup.identifier)
+                except APIStatusError as e:
+                    # 400 "not found": toolgroup already absent, continue
+                    if not (e.status_code == 400 and "not found" in str(e).lower()):
+                        raise
     except APIConnectionError:
         raise
     finally:
         await client.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/utils/llama_stack_tools.py` around lines 51 - 65, The function
_unregister_mcp_toolgroups_async currently creates a client to list toolgroups
but calls _unregister_toolgroup_async for each item which creates its own client
per toolgroup; refactor to reuse a single Llama Stack client by modifying
_unregister_toolgroup_async (or adding a helper like
_unregister_toolgroup_with_client) to accept an existing client instance and use
that to perform the unregister operation, then call that variant from
_unregister_mcp_toolgroups_async to avoid N+1 clients; ensure the shared client
is closed once in the outer function’s finally block and preserve the existing
APIConnectionError rethrow behavior.
tests/e2e/features/mcp.feature (1)

421-428: Drop the unused mcp-client header from this server-mode-only scenario.

This scenario is already gated with @skip-in-library-mode, so it runs against the server-mode OAuth config. tests/e2e/configuration/server-mode/lightspeed-stack-mcp-oauth-auth.yaml only registers mcp-oauth on Lines 21-25, so the extra mcp-client entry does not participate in the test and makes failures harder to attribute.

✂️ Suggested cleanup
     And I set the "MCP-HEADERS" header to
     """
-    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}
     """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` around lines 421 - 428, Remove the unused
"mcp-client" header from the MCP-HEADERS payload in the Scenario "Check if query
endpoint succeeds when MCP OAuth auth token is passed" so the test only sets
{"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}; locate the Scenario
block that sets the "MCP-HEADERS" header and delete the "mcp-client" JSON entry
to match the server-mode MCP OAuth config (registered as mcp-oauth).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 10-12: Replace blanket `@skip` tags that disable tests
unconditionally with the targeted `@skip-in-library-mode` tag so the /tools and
invalid-auth scenarios still run in non-library modes; specifically, for the
Scenario "Check if tools endpoint succeeds when MCP file-based auth token is
passed" and the other scenarios at the same locations (lines referenced in the
review), change the tag from `@skip` to `@skip-in-library-mode` (keep
`@MCPFileAuthConfig` intact) so only library-mode runs are skipped while
preserving these regression checks.

---

Outside diff comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 440-447: The streaming-query scenario "Check if streaming_query
endpoint succeeds when MCP OAuth auth token is passed" uses a different fixture
token ("Bearer test-token") than the other happy-path cases; update the
MCP-HEADERS block in that Scenario to use the same fixture token "Bearer
oauth-test-token" so it matches the server-mode happy-path tests and continues
to exercise the same OAuth fixture.

---

Nitpick comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 421-428: Remove the unused "mcp-client" header from the
MCP-HEADERS payload in the Scenario "Check if query endpoint succeeds when MCP
OAuth auth token is passed" so the test only sets {"mcp-oauth":
{"Authorization": "Bearer oauth-test-token"}}; locate the Scenario block that
sets the "MCP-HEADERS" header and delete the "mcp-client" JSON entry to match
the server-mode MCP OAuth config (registered as mcp-oauth).

In `@tests/e2e/utils/llama_stack_tools.py`:
- Around line 35-48: The function _unregister_toolgroup_async contains an
explicit "return None" inside the APIStatusError handler which is redundant
given the annotated return type is None; remove the explicit "return None"
(either replace with a bare "return" or just omit the return entirely) in the
block that checks if e.status_code == 400 and "not found" in str(e).lower(),
leaving the rest of the error handling (raising other errors) and the finally
block that calls await client.close() unchanged.
- Around line 51-65: The function _unregister_mcp_toolgroups_async currently
creates a client to list toolgroups but calls _unregister_toolgroup_async for
each item which creates its own client per toolgroup; refactor to reuse a single
Llama Stack client by modifying _unregister_toolgroup_async (or adding a helper
like _unregister_toolgroup_with_client) to accept an existing client instance
and use that to perform the unregister operation, then call that variant from
_unregister_mcp_toolgroups_async to avoid N+1 clients; ensure the shared client
is closed once in the outer function’s finally block and preserve the existing
APIConnectionError rethrow behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b5f69e52-f673-4161-9333-abb5b7d2fdbd

📥 Commits

Reviewing files that changed from the base of the PR and between e73778a and 62f8a53.

📒 Files selected for processing (18)
  • docker-compose.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-invalid-mcp-file-auth.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-mcp-client-auth.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-mcp-file-auth.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-mcp-kubernetes-auth.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-mcp-oauth-auth.yaml
  • tests/e2e/configuration/library-mode/lightspeed-stack-mcp.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-invalid-mcp-file-auth.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-mcp-client-auth.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-mcp-file-auth.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-mcp-kubernetes-auth.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-mcp-oauth-auth.yaml
  • tests/e2e/configuration/server-mode/lightspeed-stack-mcp.yaml
  • tests/e2e/features/environment.py
  • tests/e2e/features/mcp.feature
  • tests/e2e/features/steps/common_http.py
  • tests/e2e/secrets/invalid-mcp-token
  • tests/e2e/utils/llama_stack_tools.py

Comment on lines +10 to +12
@skip
@MCPFileAuthConfig
Scenario: Check if tools endpoint succeeds when MCP file-based auth token is passed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't merge these scenarios behind blanket @skip tags.

The targeted @skip-in-library-mode cases are understandable, but these plain @skips disable the exact /tools and invalid-auth paths called out as broken in the PR notes. That leaves the new MCP suite green without actually protecting those regressions in any mode.

Also applies to: 49-51, 65-67, 84-86, 104-106, 146-148, 163-165, 183-185, 204-206

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` around lines 10 - 12, Replace blanket `@skip`
tags that disable tests unconditionally with the targeted `@skip-in-library-mode`
tag so the /tools and invalid-auth scenarios still run in non-library modes;
specifically, for the Scenario "Check if tools endpoint succeeds when MCP
file-based auth token is passed" and the other scenarios at the same locations
(lines referenced in the review), change the tag from `@skip` to
`@skip-in-library-mode` (keep `@MCPFileAuthConfig` intact) so only library-mode runs
are skipped while preserving these regression checks.

Then The status code of the response is 200
And The response should contain following fragments
| Fragments in LLM response |
| Hello |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add And The body of the response does not contain mcp-client

- Renamed `llama_stack_shields.py` to `llama_stack_utils.py` and expanded its functionality to manage both toolgroups and shields.
- Removed the deprecated `llama_stack_tools.py` file.
- Updated E2E test scenarios to utilize the new utility functions for unregistering toolgroups and clearing Llama Stack storage.
- Enhanced feature files to include comments indicating pending fixes for skipped scenarios.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/e2e/features/mcp.feature (2)

441-447: ⚠️ Potential issue | 🟠 Major

Use the same OAuth fixture in the streaming success case.

The earlier @MCPOAuthAuthConfig success scenarios use oauth-test-token and keep mcp-client authorized as well. This one switches to Bearer test-token only, so it is no longer exercising the same auth matrix and can fail for the wrong reason.

Suggested change
     And I set the "MCP-HEADERS" header to
     """
-    {"mcp-oauth": {"Authorization": "Bearer test-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}
     """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` around lines 441 - 447, Replace the ad-hoc
token in the `@MCPOAuthAuthConfig` streaming success scenario with the same
fixture used by the other success cases: set the "MCP-HEADERS" payload to use
"Authorization": "Bearer oauth-test-token" and ensure the same "mcp-client"
authorization entry is present (i.e., mirror the header structure used by the
earlier MCPOAuth success scenarios) so the test exercises the identical auth
matrix.

460-478: ⚠️ Potential issue | 🟠 Major

Keep mcp-client valid in the OAuth negative cases.

Under the same @MCPOAuthAuthConfig, the success path authorizes both mcp-oauth and mcp-client. Omitting mcp-client here broadens the failure surface, so these 401s can come from a different unauthorized toolgroup instead of the invalid OAuth token.

Suggested change
-    {"mcp-oauth": {"Authorization": "Bearer invalid-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer invalid-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}

Apply the same header payload to all three OAuth-invalid scenarios.

Also applies to: 480-501, 503-524

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` around lines 460 - 478, The scenario's
failure is caused by omitting a valid mcp-client header so the 401 may be due to
missing client auth rather than the invalid mcp-oauth token; update the
"MCP-HEADERS" payload in the scenario "Check if tools endpoint reports error
when MCP OAuth invalid auth token is passed" to include the same valid
"mcp-client" entry used in the successful OAuth scenario (alongside "mcp-oauth":
{"Authorization": "Bearer invalid-token"}) so only the OAuth token is invalid;
apply the same change to the other two OAuth-invalid scenarios referenced (the
blocks around lines 480-501 and 503-524) so all negative OAuth tests keep a
valid mcp-client header.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/utils/utils.py`:
- Around line 254-285: The clear_llama_stack_storage function currently ignores
non-zero docker exits and lacks the Prow short-circuit mentioned in the
docstring; update clear_llama_stack_storage to first return immediately when
running under Prow (e.g., check an environment flag such as PROW_JOB_ID or
another repo-standard CI var) and then run each docker exec with strict failure
semantics (use subprocess.run(..., check=True) or inspect
CompletedProcess.returncode and raise subprocess.CalledProcessError on failure)
so any failed removal aborts the test setup instead of silently continuing;
reference clear_llama_stack_storage, _LLAMA_STORAGE_ROOT, the commands list, and
the subprocess.run calls to locate and change the logic.

---

Outside diff comments:
In `@tests/e2e/features/mcp.feature`:
- Around line 441-447: Replace the ad-hoc token in the `@MCPOAuthAuthConfig`
streaming success scenario with the same fixture used by the other success
cases: set the "MCP-HEADERS" payload to use "Authorization": "Bearer
oauth-test-token" and ensure the same "mcp-client" authorization entry is
present (i.e., mirror the header structure used by the earlier MCPOAuth success
scenarios) so the test exercises the identical auth matrix.
- Around line 460-478: The scenario's failure is caused by omitting a valid
mcp-client header so the 401 may be due to missing client auth rather than the
invalid mcp-oauth token; update the "MCP-HEADERS" payload in the scenario "Check
if tools endpoint reports error when MCP OAuth invalid auth token is passed" to
include the same valid "mcp-client" entry used in the successful OAuth scenario
(alongside "mcp-oauth": {"Authorization": "Bearer invalid-token"}) so only the
OAuth token is invalid; apply the same change to the other two OAuth-invalid
scenarios referenced (the blocks around lines 480-501 and 503-524) so all
negative OAuth tests keep a valid mcp-client header.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 81ffe43b-ff9c-4cc2-8150-a3f164e79c99

📥 Commits

Reviewing files that changed from the base of the PR and between 62f8a53 and bcd351f.

📒 Files selected for processing (5)
  • docs/e2e_testing.md
  • tests/e2e/features/environment.py
  • tests/e2e/features/mcp.feature
  • tests/e2e/utils/llama_stack_utils.py
  • tests/e2e/utils/utils.py

Comment on lines +254 to +285
def clear_llama_stack_storage(container_name: str = "lightspeed-stack") -> None:
"""Clear Llama Stack storage in library mode (embedded Llama Stack).

Removes SQLite/KV store databases and file storage contents so that
toolgroups and other persisted state are reset. Used before MCP config
scenarios when not running in server mode (no separate Llama Stack to
unregister toolgroups from). Only runs when using Docker (skipped in Prow).

Parameters:
container_name (str): Docker container name (default "lightspeed-stack").

Returns:
None
"""

storage_root = _LLAMA_STORAGE_ROOT
commands = [
f"rm -f {storage_root}/rag/kv_store.db",
f"rm -f {storage_root}/sql_store.db",
f"rm -rf {storage_root}/files/*",
]
try:
for cmd in commands:
subprocess.run(
["docker", "exec", container_name, "sh", "-c", cmd],
capture_output=True,
text=True,
timeout=10,
check=False,
)
except subprocess.TimeoutExpired as e:
print(f"Warning: Could not clear Llama Stack storage: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make the storage reset fail fast.

This helper now sits in scenario setup, but non-zero docker exec exits are ignored and the Prow short-circuit described in the docstring is missing. That lets stale embedded MCP state leak into the next scenario while the suite proceeds as if cleanup succeeded. Moving executable code directly under the docstring also clears the current pydocstyle D202 failure.

Suggested change
 def clear_llama_stack_storage(container_name: str = "lightspeed-stack") -> None:
     """Clear Llama Stack storage in library mode (embedded Llama Stack).
@@
     Returns:
         None
     """
-
+    if is_prow_environment():
+        return
+
     storage_root = _LLAMA_STORAGE_ROOT
     commands = [
         f"rm -f {storage_root}/rag/kv_store.db",
         f"rm -f {storage_root}/sql_store.db",
         f"rm -rf {storage_root}/files/*",
     ]
-    try:
-        for cmd in commands:
-            subprocess.run(
-                ["docker", "exec", container_name, "sh", "-c", cmd],
-                capture_output=True,
-                text=True,
-                timeout=10,
-                check=False,
-            )
-    except subprocess.TimeoutExpired as e:
-        print(f"Warning: Could not clear Llama Stack storage: {e}")
+    for cmd in commands:
+        subprocess.run(
+            ["docker", "exec", container_name, "sh", "-c", cmd],
+            capture_output=True,
+            text=True,
+            timeout=10,
+            check=True,
+        )
🧰 Tools
🪛 GitHub Actions: Pydocstyle

[error] 255-255: pydocstyle: D202 - No blank lines allowed after function docstring (found 1).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/utils/utils.py` around lines 254 - 285, The
clear_llama_stack_storage function currently ignores non-zero docker exits and
lacks the Prow short-circuit mentioned in the docstring; update
clear_llama_stack_storage to first return immediately when running under Prow
(e.g., check an environment flag such as PROW_JOB_ID or another repo-standard CI
var) and then run each docker exec with strict failure semantics (use
subprocess.run(..., check=True) or inspect CompletedProcess.returncode and raise
subprocess.CalledProcessError on failure) so any failed removal aborts the test
setup instead of silently continuing; reference clear_llama_stack_storage,
_LLAMA_STORAGE_ROOT, the commands list, and the subprocess.run calls to locate
and change the logic.

- Added checks to ensure the response body does not contain 'mcp-client' in MCP feature tests.
- Simplified the `clear_llama_stack_storage` function to remove the entire `~/.llama` directory instead of specific files, improving clarity and efficiency.
- Updated comments to reflect the changes in storage clearing logic.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/features/mcp.feature`:
- Line 417: In the OAuth scenarios in mcp.feature the request headers include
both "mcp-oauth" and "mcp-client", which allows client-token auth to mask OAuth
regressions; update the failing scenarios so the header map only contains
"mcp-oauth": {"Authorization": "Bearer oauth-test-token"} (remove the
"mcp-client" entry) and likewise ensure client-auth scenarios only include
"mcp-client": {"Authorization": "Bearer client-test-token"} so each auth method
is tested in isolation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1377a782-da9d-45ee-a3a9-74e72e555377

📥 Commits

Reviewing files that changed from the base of the PR and between bcd351f and 14a205f.

📒 Files selected for processing (2)
  • tests/e2e/features/mcp.feature
  • tests/e2e/utils/utils.py

And I set the "MCP-HEADERS" header to
"""
{"mcp-oauth": {"Authorization": "Bearer test-token"}}
{"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

OAuth success tests are not isolated from other auth methods.

Passing both mcp-oauth and mcp-client credentials can let these scenarios pass even if OAuth handling regresses, which weakens signal for this suite.

🔧 Suggested fix
-    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}
...
-    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}, "mcp-client": {"Authorization": "Bearer client-test-token"}}
+    {"mcp-oauth": {"Authorization": "Bearer oauth-test-token"}}

Also applies to: 429-429

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/features/mcp.feature` at line 417, In the OAuth scenarios in
mcp.feature the request headers include both "mcp-oauth" and "mcp-client", which
allows client-token auth to mask OAuth regressions; update the failing scenarios
so the header map only contains "mcp-oauth": {"Authorization": "Bearer
oauth-test-token"} (remove the "mcp-client" entry) and likewise ensure
client-auth scenarios only include "mcp-client": {"Authorization": "Bearer
client-test-token"} so each auth method is tested in isolation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants