Integrate gevals integration tests into e2e test suite and release process

## Summary

Follow-up to #586 (Add automated integration tests using mcpchecker/gevals).

The gevals integration tests currently run as a separate workflow (`.github/workflows/gevals.yaml`) with their own Kind cluster setup. This issue tracks integrating these tests into our existing e2e test infrastructure for better resource efficiency and consistency.

## Goals

1. **Integrate into e2e test suite** - Reuse the existing e2e test environment rather than spinning up a separate Kind cluster
2. **Test isolation** - Ensure gevals tests don't interfere with existing e2e tests (setup/teardown, environment state)
3. **On-demand triggering** - Add support for running gevals tests via PR comment (building on #527's `/test-e2e` infrastructure)
4. **Release process documentation** - Document manual execution of the full test suite, including gevals tests

## Tasks

### 1. E2E Integration

- [ ] Refactor gevals tests to run against the existing e2e Kind cluster setup
- [ ] Ensure test isolation:
  - Avoid modifying shared test resources (ConfigMaps, Secrets, MCPServerRegistrations)
  - Use dedicated namespaces or unique resource names if needed
  - Clean up any resources created during gevals tests
- [ ] Consider whether gevals tests should run as part of `happy` or `full` suite, or as a separate `gevals` suite
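
One way to get dedicated namespaces without name collisions is to derive a fresh name per test run. The sketch below is illustrative only: the `gevals` prefix and the `unique_namespace` helper are hypothetical, not something the repo defines. It relies on the Kubernetes rule that namespace names are DNS-1123 labels (lowercase alphanumerics and `-`, at most 63 characters).

```python
import re
import uuid

# Kubernetes namespace names must be valid DNS-1123 labels:
# lowercase alphanumerics or '-', alphanumeric at both ends, max 63 chars.
DNS1123_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")
MAX_LABEL_LEN = 63


def unique_namespace(prefix: str = "gevals") -> str:
    """Return a namespace name unique to this test run.

    The default prefix "gevals" is a hypothetical convention.
    """
    suffix = uuid.uuid4().hex[:8]
    # Trim the prefix so prefix + '-' + suffix fits within 63 characters.
    head = prefix.lower()[: MAX_LABEL_LEN - len(suffix) - 1].rstrip("-")
    name = f"{head}-{suffix}"
    if not DNS1123_LABEL.match(name):
        raise ValueError(f"prefix {prefix!r} does not yield a valid namespace name")
    return name
```

Every resource a gevals test creates could then live in that namespace, so deleting the namespace at teardown removes them together.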

### 2. On-Demand Workflow Updates

Building on #527's `/test-e2e happy|full` command:

- [ ] Add `/test-e2e gevals` as a new suite option (or integrate into `full`)
- [ ] Update `.github/workflows/e2e-on-demand.yaml` to support the new suite
- [ ] Document when to use gevals tests on PRs (e.g., changes to broker, router, tool routing)
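
As a rough illustration of the suite selection, the sketch below parses a PR comment into a suite name. It assumes the command must sit on its own line and that `gevals` is added as a third suite; how #527 actually parses comments may differ, and `parse_suite` is a hypothetical helper.

```python
import re
from typing import Optional

# "happy" and "full" come from the existing command; "gevals" is the proposed addition.
SUITES = ("happy", "full", "gevals")

# Match the command only when it occupies a whole line of the comment.
COMMAND = re.compile(r"^/test-e2e[ \t]+(\S+)[ \t]*$", re.MULTILINE)


def parse_suite(comment_body: str) -> Optional[str]:
    """Return the requested suite, or None if the comment has no valid command."""
    # Comment bodies may use CRLF line endings; normalise before matching.
    body = comment_body.replace("\r\n", "\n")
    match = COMMAND.search(body)
    if match is None:
        return None
    suite = match.group(1).lower()
    return suite if suite in SUITES else None
```

Returning `None` for unknown suites lets the workflow skip the run instead of guessing a default.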

### 3. Release Process Documentation

- [ ] Create or update `RELEASE.md` with a pre-release testing checklist
- [ ] Include instructions for manually running:
  - Standard e2e tests (`make test-e2e`)
  - Gevals integration tests
  - Conformance tests
- [ ] Document any required environment setup (LLM API keys, Ollama fallback)

## Considerations

### Test Environment Isolation

The gevals tests use an LLM agent that interacts with the gateway. Key isolation concerns:

- **Tool state**: Gevals tests invoke tools, so ensure those calls don't conflict with the test servers used by other e2e tests
- **Server registration**: If gevals tests register additional MCP servers, ensure cleanup
- **Authentication state**: Verify credentials used don't interfere with other tests
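
The cleanup concern above can be sketched with a pattern that registers each delete as soon as the matching create succeeds, so teardown still runs when a test fails partway through. This is a minimal sketch using Python's `contextlib.ExitStack`. The resource names are hypothetical, and appending to a list stands in for real Kubernetes API calls so the example runs without a cluster.

```python
from contextlib import ExitStack
from typing import List


def create_tracked(stack: ExitStack, events: List[str], kind: str, name: str) -> None:
    """Record a create and schedule the matching delete on the stack.

    A real test would call the Kubernetes API here; `events` is a stand-in.
    """
    events.append(f"create {kind}/{name}")
    stack.callback(events.append, f"delete {kind}/{name}")


def run_gevals_case(events: List[str], fail: bool = False) -> None:
    """Create resources, run a test body, and always clean up afterwards."""
    with ExitStack() as stack:
        # Hypothetical resource names, for illustration only.
        create_tracked(stack, events, "namespace", "gevals-demo")
        create_tracked(stack, events, "mcpserverregistration", "gevals-server")
        if fail:
            raise RuntimeError("simulated test failure")
        events.append("run tests")
    # Leaving the `with` block runs the callbacks in reverse order of registration.
```

Deletes run last-in, first-out, so the server registration is removed before the namespace that contains it.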

### LLM Dependencies

The gevals workflow falls back to a local Ollama instance when secrets for a paid LLM provider aren't available. For e2e integration:

- Consider whether Ollama setup should be part of standard e2e setup
- Document the model caching behavior for local development
- Ensure CI runners have sufficient resources for Ollama if used
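
The fallback decision itself is simple enough to sketch. The environment variable names below (`LLM_API_KEY`, `OLLAMA_BASE_URL`) are assumptions for illustration; the actual workflow's secret names may differ. The default URL uses Ollama's standard local port, 11434.

```python
from typing import Mapping, Tuple

# Hypothetical variable names; the real workflow may use different ones.
PAID_KEY_VARS = ("LLM_API_KEY",)
OLLAMA_URL_VAR = "OLLAMA_BASE_URL"
OLLAMA_DEFAULT_URL = "http://localhost:11434"  # Ollama's default local address


def select_llm_backend(env: Mapping[str, str]) -> Tuple[str, str]:
    """Pick the hosted LLM when a key is set, otherwise fall back to Ollama.

    Returns ("hosted", <name of the variable holding the key>) or
    ("ollama", <base URL>). The key value itself is never returned.
    """
    for var in PAID_KEY_VARS:
        # Treat empty or whitespace-only values as unset.
        if env.get(var, "").strip():
            return ("hosted", var)
    return ("ollama", env.get(OLLAMA_URL_VAR, OLLAMA_DEFAULT_URL))
```

In a test setup this could be called as `select_llm_backend(os.environ)` (with `import os`), which makes the chosen backend explicit in logs before any model is downloaded.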

## Related

- Follow-up to #586 (gevals PR)
- Builds on #527 (on-demand e2e workflow)
- Related to #452 (original integration tests issue)

## References

- [evals/README.md](https://github.com/Kuadrant/mcp-gateway/blob/main/evals/README.md) - Gevals configuration and usage
- [gevals documentation](https://github.com/genmcp/gevals)
