Validation Checklist
Version
master
Detailed Description
When upgrading a cluster from 26.03 to master, the following error appears:
The Deployment "jobset-controller-manager" is invalid: spec.selector: Invalid value: {"matchLabels":{"app.kubernetes.io/instance":"jobset","app.kubernetes.io/managed-by":"kustomize","app.kubernetes.io/name":"jobset","control-plane":"controller-manager"}}: field is immutable
This is most likely related to #3413.
The manifest that the upgrade tries to apply is the following:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: manager
    app.kubernetes.io/created-by: jobset
    app.kubernetes.io/instance: jobset
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: jobset
    app.kubernetes.io/part-of: jobset
    control-plane: controller-manager
  name: jobset-controller-manager
  namespace: kubeflow-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/instance: jobset
      app.kubernetes.io/managed-by: kustomize
      app.kubernetes.io/name: jobset
      control-plane: controller-manager
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
        traffic.sidecar.istio.io/excludeInboundPorts: "9443"
      labels:
        app.kubernetes.io/instance: jobset
        app.kubernetes.io/managed-by: kustomize
        app.kubernetes.io/name: jobset
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --config=/controller_manager_config.yaml
        - --zap-log-level=2
        command:
        - /manager
        image: us-central1-docker.pkg.dev/k8s-staging-images/jobset/jobset:v0.11.0
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 15
          periodSeconds: 20
        name: manager
        ports:
        - containerPort: 8443
          name: metrics
          protocol: TCP
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /readyz
            port: 8081
          initialDelaySeconds: 5
          periodSeconds: 10
        resources:
          limits:
            memory: 4Gi
          requests:
            cpu: 500m
            memory: 128Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
        volumeMounts:
        - mountPath: /controller_manager_config.yaml
          name: manager-config
          subPath: controller_manager_config.yaml
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      serviceAccountName: jobset-controller-manager
      terminationGracePeriodSeconds: 10
      volumes:
      - configMap:
          name: jobset-manager-config
        name: manager-config
      - name: cert
        secret:
          defaultMode: 420
          secretName: jobset-webhook-server-cert
The version that is currently deployed in the cluster is the following:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2026-03-19T19:22:01Z"
  generation: 1
  labels:
    app.kubernetes.io/component: manager
    app.kubernetes.io/created-by: jobset
    app.kubernetes.io/instance: controller-manager
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: deployment
    app.kubernetes.io/part-of: jobset
    control-plane: controller-manager
  name: jobset-controller-manager
  namespace: kubeflow-system
  resourceVersion: "6427"
  uid: bc81f873-a1a1-4db1-bf07-c8f400f53c6e
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      control-plane: controller-manager
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: manager
        traffic.sidecar.istio.io/excludeInboundPorts: "9443"
      labels:
        control-plane: controller-manager
    spec:
      containers:
      - args:
        - --config=/controller_manager_config.yaml
        - --zap-log-level=2
        command:
        - /manager
        image: registry.k8s.io/jobset/jobset:v0.10.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 20
          successThreshold: 1
          timeoutSeconds: 1
        name: manager
        ports:
        - containerPort: 8443
          name: metrics
          protocol: TCP
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            memory: 4Gi
          requests:
            cpu: 500m
            memory: 128Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /controller_manager_config.yaml
          name: manager-config
          subPath: controller_manager_config.yaml
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: jobset-controller-manager
      serviceAccountName: jobset-controller-manager
      terminationGracePeriodSeconds: 10
      volumes:
      - configMap:
          defaultMode: 420
          name: jobset-manager-config
        name: manager-config
      - name: cert
        secret:
          defaultMode: 420
          secretName: jobset-webhook-server-cert
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2026-03-19T19:27:33Z"
    lastUpdateTime: "2026-03-19T19:27:33Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2026-03-19T19:22:02Z"
    lastUpdateTime: "2026-03-19T19:27:33Z"
    message: ReplicaSet "jobset-controller-manager-d988cfd45" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas:
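To make the mismatch explicit: the two manifests differ in spec.selector.matchLabels, which is the field the API server reports as immutable. Below is a minimal Python sketch of that comparison. The label maps are copied from the two manifests above; the helper name selector_diff is hypothetical and only used for illustration.

```python
# matchLabels of the Deployment currently running in the cluster (26.03).
deployed = {"control-plane": "controller-manager"}

# matchLabels of the Deployment that master tries to apply.
desired = {
    "app.kubernetes.io/instance": "jobset",
    "app.kubernetes.io/managed-by": "kustomize",
    "app.kubernetes.io/name": "jobset",
    "control-plane": "controller-manager",
}


def selector_diff(old, new):
    """Return labels added, removed, and changed between two matchLabels maps."""
    added = {k: v for k, v in new.items() if k not in old}
    removed = {k: v for k, v in old.items() if k not in new}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


added, removed, changed = selector_diff(deployed, desired)
print("added:", sorted(added))
print("removed:", sorted(removed))
print("changed:", sorted(changed))
```

The three app.kubernetes.io/* labels show up as added, and nothing is removed or changed. Any difference in this field is enough for the update to be rejected, so the new manifest cannot be applied on top of the existing Deployment as-is.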
Steps to Reproduce
- Set up a Kubeflow platform deployment with kind on version 26.03
- Check out the latest master branch (at the time of writing, this is commit 46f3142)
- Try to re-deploy via the while [...] command and observe the error mentioned above
Screenshots or Videos (Optional)
No response