Merged
2 changes: 1 addition & 1 deletion docs/kthena/docs/user-guide/router-routing.md
@@ -294,7 +294,7 @@ spec:
 4. Prefill runs on prefill instance; decode runs on decode instance, with KV state exchanged between them (configure `kvConnector` for nixl/mooncake when needed)
 5. Response returned to client

-**NOTE**: Deploy [ModelServing](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelServing-ds1.5b-pd-disaggragation.yaml) with PD roles first, then apply [ModelServer](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelServer-ds1.5b-pd-disaggragation.yaml) and [ModelRoute](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelRoute-ds1.5b-pd-disaggragation.yaml).
+**NOTE**: Deploy [ModelServing](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelServing-ds1.5b-pd-disaggregation.yaml) with PD roles first, then apply [ModelServer](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelServer-ds1.5b-pd-disaggregation.yaml) and [ModelRoute](https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/ModelRoute-ds1.5b-pd-disaggregation.yaml).

 ---

12 changes: 0 additions & 12 deletions examples/kthena-router/ModelRoute-ds1.5b-pd-disaggragation.yaml

This file was deleted.

11 changes: 11 additions & 0 deletions examples/kthena-router/ModelRoute-ds1.5b-pd-disaggregation.yaml
@@ -0,0 +1,11 @@
+apiVersion: networking.serving.volcano.sh/v1alpha1
+kind: ModelRoute
+metadata:
+  name: deepseek-r1-1-5b-pd-disaggregation
+  namespace: default
+spec:
+  modelName: "deepseek-r1-1-5b-pd-disaggregation"
+  rules:
+  - name: "default"
+    targetModels:
+    - modelServerName: "deepseek-r1-1-5b-pd-disaggregation"
@@ -1,7 +1,7 @@
 apiVersion: networking.serving.volcano.sh/v1alpha1
 kind: ModelServer
 metadata:
-  name: deepseek-r1-1-5b-pd-disaggragation
+  name: deepseek-r1-1-5b-pd-disaggregation
   namespace: default
 spec:
   workloadSelector:
6 changes: 3 additions & 3 deletions test/e2e/router/shared.go
@@ -396,7 +396,7 @@ func TestModelRoutePrefillDecodeDisaggregationShared(t *testing.T, testCtx *rout

 	// Deploy ModelServing
 	t.Log("Deploying ModelServing for PD disaggregation...")
-	modelServing := utils.LoadYAMLFromFile[workloadv1alpha1.ModelServing]("examples/kthena-router/ModelServing-ds1.5b-pd-disaggragation.yaml")
+	modelServing := utils.LoadYAMLFromFile[workloadv1alpha1.ModelServing]("examples/kthena-router/ModelServing-ds1.5b-pd-disaggregation.yaml")
 	modelServing.Namespace = testNamespace
 	createdModelServing, err := testCtx.KthenaClient.WorkloadV1alpha1().ModelServings(testNamespace).Create(ctx, modelServing, metav1.CreateOptions{})
 	require.NoError(t, err, "Failed to create ModelServing")
@@ -417,7 +417,7 @@ func TestModelRoutePrefillDecodeDisaggregationShared(t *testing.T, testCtx *rout

 	// Deploy ModelServer
 	t.Log("Deploying ModelServer for PD disaggregation...")
-	modelServer := utils.LoadYAMLFromFile[networkingv1alpha1.ModelServer]("examples/kthena-router/ModelServer-ds1.5b-pd-disaggragation.yaml")
+	modelServer := utils.LoadYAMLFromFile[networkingv1alpha1.ModelServer]("examples/kthena-router/ModelServer-ds1.5b-pd-disaggregation.yaml")
 	modelServer.Namespace = testNamespace
 	createdModelServer, err := testCtx.KthenaClient.NetworkingV1alpha1().ModelServers(testNamespace).Create(ctx, modelServer, metav1.CreateOptions{})
 	require.NoError(t, err, "Failed to create ModelServer")
@@ -435,7 +435,7 @@ func TestModelRoutePrefillDecodeDisaggregationShared(t *testing.T, testCtx *rout

 	// Deploy ModelRoute
 	t.Log("Deploying ModelRoute for PD disaggregation...")
-	modelRoute := utils.LoadYAMLFromFile[networkingv1alpha1.ModelRoute]("examples/kthena-router/ModelRoute-ds1.5b-pd-disaggragation.yaml")
+	modelRoute := utils.LoadYAMLFromFile[networkingv1alpha1.ModelRoute]("examples/kthena-router/ModelRoute-ds1.5b-pd-disaggregation.yaml")
 	modelRoute.Namespace = testNamespace

 	// Configure ParentRefs if using Gateway API