docs: plan for #304 — add Kubernetes autoscaling and capacity probes#311
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This lands in the `src/vip_tests/workbench/` test suite as a new feature file (or extension to the existing `test_session_capacity.feature`). The test DSL (step definitions) will be in a corresponding `.py` file that uses two client layers:

- A new `src/vip/clients/kubernetes.py` client for read-only cluster queries (node counts, pod scheduling, resource quotas). This wraps either `kubectl` CLI calls or the `kubernetes` Python SDK.

@copilot would the user be responsible for setting up this access? We would just inherit the access?
Yes — VIP inherits the ambient credentials. The kubernetes client will use whatever kubeconfig is already present in the environment (~/.kube/config, the KUBECONFIG env var, or an in-cluster service account token), exactly the same way kubectl does. No credential fields will be stored in vip.toml; the user is responsible for ensuring their environment has the appropriate cluster access configured before running these tests. I've updated the plan to make this explicit (commit 9e29abd).
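
The lookup order described in this reply can be pictured with the stdlib-only sketch below. It is not VIP code: `ambient_credential_source` is a hypothetical helper, and it only reports which credential source appears to be present (roughly mirroring how `kubectl` resolves them). It does not authenticate or validate anything.

```python
import os
from pathlib import Path

# Path where Kubernetes mounts the service account token inside a pod.
IN_CLUSTER_TOKEN = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def ambient_credential_source(env=None, home=None, token_path=IN_CLUSTER_TOKEN):
    """Return a short label for the first credential source found, or None.

    Hypothetical helper for illustration; nothing is read from vip.toml.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    kubeconfig = env.get("KUBECONFIG", "")
    if kubeconfig:
        # KUBECONFIG may hold several paths joined by os.pathsep.
        candidates = [Path(p).expanduser() for p in kubeconfig.split(os.pathsep) if p]
        label = "KUBECONFIG"
    else:
        # Without KUBECONFIG, only the default location is consulted.
        candidates = [home / ".kube" / "config"]
        label = "default kubeconfig"

    for path in candidates:
        if path.is_file():
            return f"{label} ({path})"

    # Pod context: the API server address is injected as an env var and the
    # service account token is mounted as a file.
    if env.get("KUBERNETES_SERVICE_HOST") and Path(token_path).is_file():
        return "in-cluster service account"

    return None
```

A check like this could feed a skip message when no cluster access is configured, which keeps the responsibility for credentials with the user's environment.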

## Open questions

- **UNCONFIRMED:** Should `src/vip/clients/kubernetes.py` use `kubectl` CLI (already a VIP install dependency) or the `kubernetes` Python SDK? SDK has cleaner error handling but adds a dependency. CLI path is simpler if `kubectl` is already available from `vip cluster` commands.

@copilot if we use the SDK do we inherit credentials or do we need to set them up separately? If separately, this is a non-starter
The SDK inherits the same ambient credentials as kubectl — it calls load_kube_config() (reads ~/.kube/config / KUBECONFIG) or load_incluster_config() for in-cluster pod contexts. No separate setup required. Updated the plan to use the SDK and removed that open question (commit 9c5b8f4).
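
A minimal sketch of that fallback, assuming the `kubernetes` package is optional and imported lazily. The function names `load_ambient_config` and `configure_from_sdk` are hypothetical and not taken from the plan; `load_kube_config`, `load_incluster_config`, and `ConfigException` are the SDK's own names. The loaders are passed in as arguments so the ordering can be exercised without a cluster.

```python
def load_ambient_config(load_kube_config, load_incluster_config, config_error):
    """Try the kubeconfig first, then fall back to the in-cluster account.

    Returns a label naming the source that succeeded.
    """
    try:
        load_kube_config()
        return "kubeconfig"
    except config_error:
        # No usable kubeconfig: assume we are running inside a pod.
        load_incluster_config()
        return "in-cluster"


def configure_from_sdk():
    """Wire in the real SDK loaders, importing the SDK only when needed."""
    try:
        from kubernetes import config
    except ImportError as exc:
        raise RuntimeError(
            "the 'kubernetes' package is required for these tests"
        ) from exc
    return load_ambient_config(
        config.load_kube_config,
        config.load_incluster_config,
        config.ConfigException,
    )
```

If both loaders fail, the second exception propagates, so a missing credential setup surfaces as an error from the SDK instead of being silently ignored.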

- **UNCONFIRMED:** Should the "quick succession" scenarios use Python threading, asyncio, or Playwright's multi-context API for concurrent launches? Threading is simplest but may not exercise the race condition reliably.

Resolved — asyncio it is. Updated the plan accordingly (commit 9c5b8f4).
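
As a rough illustration of the asyncio approach, the sketch below starts every launch before awaiting any of them. `launch_session` is a hypothetical stand-in that only sleeps; the real step definitions would drive Workbench, and none of these names come from the plan.

```python
import asyncio


async def launch_session(name, delay=0.01):
    """Hypothetical stand-in for starting one Workbench session."""
    await asyncio.sleep(delay)  # simulates the launch round trip
    return {"name": name, "state": "Active"}


async def launch_in_quick_succession(names):
    """Schedule all launches up front so they overlap, then collect results."""
    tasks = [asyncio.create_task(launch_session(n)) for n in names]
    # gather returns results in the same order as the tasks were passed in.
    return await asyncio.gather(*tasks)


def run_quick_succession(names):
    """Synchronous entry point, e.g. for use from a pytest step."""
    return asyncio.run(launch_in_quick_succession(names))
```

Creating the tasks before the first `await` is what makes the launches overlap; awaiting each coroutine in a loop would run them one after another.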
Co-authored-by: ian-flores <18703558+ian-flores@users.noreply.github.com>

- Should the new scenarios live in a separate `test_session_capacity_k8s.feature` or extend the existing `test_session_capacity.feature` with an `@kubernetes` tag? Separate file keeps the distinction clear.

Resolved — the scenarios will live in a separate test_session_capacity_k8s.feature file. Updated the plan (commit 9c5b8f4).
…ate feature file Co-authored-by: ian-flores <18703558+ian-flores@users.noreply.github.com>

#304)

- Add `WorkbenchKubernetesConfig` dataclass to `config.py` with fields for `namespace`, `node_pool_profiles`, `max_sessions`, `profile_cpu_limit`, and `profile_memory_limit_gib`; wire it into `WorkbenchConfig.from_dict`
- Add `src/vip/clients/kubernetes.py`: read-only `KubernetesClient` using the `kubernetes` SDK with lazy import (`RuntimeError` on missing package); exposes `node_count`, `running_session_pods`, `pod_node_pool`, `resource_quota`, and `pod_resource_limits`
- Add `test_session_capacity_k8s.feature` with six Gherkin scenarios: autoscaler adds a node, session lands after scale-up, session count respects max, quick-succession sessions reach Active, profile routes to correct node pool, and profile enforces CPU/memory limits
- Add step definitions `test_session_capacity_k8s.py` for all six scenarios; each scenario skips cleanly when `[workbench.kubernetes]` is absent
- Move shared `@given("Workbench is accessible and I am logged in")` step to `workbench/conftest.py` so both capacity feature files use it without duplicate registration
- Add `kubernetes_client` session fixture to `src/vip_tests/conftest.py`
- Add `selftests/test_workbench_kubernetes_config.py` covering defaults, `from_dict`, `load_config` integration, and repr
- Document new `[workbench.kubernetes]` block in `vip.toml.example`

Implements the plan from #311

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

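
For orientation, a config dataclass along these lines might look like the sketch below. Only the class name and the five field names come from the commit message; the types, the defaults, the handling of unknown keys, and the `is_configured` property are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class WorkbenchKubernetesConfig:
    # Field names follow the commit message; types and defaults are assumed.
    namespace: str = ""
    # Assumed shape: resource profile name -> node pool name.
    node_pool_profiles: dict = field(default_factory=dict)
    max_sessions: Optional[int] = None
    profile_cpu_limit: Optional[float] = None
    profile_memory_limit_gib: Optional[float] = None

    @classmethod
    def from_dict(cls, data):
        """Build from the [workbench.kubernetes] table, ignoring unknown keys."""
        data = data or {}
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})

    @property
    def is_configured(self):
        """Hypothetical check that tests could use to decide whether to skip."""
        return bool(self.namespace)
```

With a shape like this, an absent `[workbench.kubernetes]` table yields an all-defaults instance, which gives the step definitions a single place to decide whether to skip.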
Pull request created: #334
Closes #304
This plan addresses the request to add Kubernetes-specific session capacity and autoscaling tests to VIP's Workbench test suite. The customer flagged six scenarios around autoscaling boundaries, capacity limits, and resource profile enforcement that aren't covered by the current `test_session_capacity.feature`.

Merging this PR will trigger an implementation PR; comment to iterate on the plan first.

Plan overview
The implementation will:

- Add `test_session_capacity_k8s.feature` with six scenarios for autoscaling, capacity limits, and profile routing
- Add `src/vip/clients/kubernetes.py` for read-only cluster queries (node counts, pod scheduling)
- Add a `[workbench.kubernetes]` config block for node-pool config and resource caps

See the full plan in `thoughts/shared/plans/2026-05-29-issue-304-k8s-autoscaling-probes.md`.