`kubectl` installed and configured with authentication to the cluster.
- Dedicated Namespace to run the jobs.
- RWX volume to store the evaluation results (check `volume.yml` for inspiration).
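
The requirements above can be checked from the command line before the first run. The following is a minimal sketch, not part of the tool: `check_requirements` is a hypothetical helper name, and it assumes the results volume is provided as a PersistentVolumeClaim in the same namespace as the jobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of eval-dev-quality): checks the listed
# requirements for the namespace given as the first argument.
check_requirements() {
  local namespace="$1"

  # Fails if kubectl is missing or cannot authenticate against the cluster.
  kubectl cluster-info >/dev/null || return 1

  # Create the dedicated namespace if it does not exist yet.
  kubectl get namespace "${namespace}" >/dev/null 2>&1 \
    || kubectl create namespace "${namespace}" || return 1

  # Confirm that the current user may create Jobs in that namespace.
  kubectl auth can-i create jobs --namespace "${namespace}" || return 1

  # List volume claims (assumption: the results volume is a PVC here);
  # it should show RWX in the ACCESS MODES column.
  kubectl get pvc --namespace "${namespace}"
}
```

Call it as `check_requirements eval-dev-quality`, using the namespace in which the secret is created.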
The Job automatically references a secret called `evaluation-secret`, which is configured to pass its values on as environment variables. The following key must be created:
- `PROVIDER_TOKEN`: contains the API tokens for the different providers, e.g. `openrouter:abcdefgh1234,custom-provider:abcdefgh1234`
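
The example value is a comma-separated list of `provider:token` pairs. A minimal sketch of assembling it in a shell, using the placeholder tokens from the example; the variable names `OPENROUTER_TOKEN` and `CUSTOM_PROVIDER_TOKEN` are illustrative assumptions, not names the tool reads:

```shell
#!/usr/bin/env bash
# Placeholder tokens copied from the example value; replace with real ones.
OPENROUTER_TOKEN="abcdefgh1234"
CUSTOM_PROVIDER_TOKEN="abcdefgh1234"

# Join the "provider:token" pairs with commas, matching the example format.
PROVIDER_TOKEN="openrouter:${OPENROUTER_TOKEN},custom-provider:${CUSTOM_PROVIDER_TOKEN}"

# Print the value for review before storing it in the secret.
echo "${PROVIDER_TOKEN}"
```

The resulting value goes after `PROVIDER_TOKEN=` in the `--from-literal` argument when the secret is created.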
Create the secret, placing the token list after `PROVIDER_TOKEN=`:

    kubectl --namespace eval-dev-quality create secret generic evaluation-secret --from-literal='PROVIDER_TOKEN='

- Define all the models that should run inside the containerized workload with `--model`.
- Define the parameter `--runtime kubernetes` to indicate that jobs should run inside a Kubernetes cluster.
- Define the parameter `--parallel 20` to indicate how many jobs should run simultaneously.
Example:

    eval-dev-quality evaluate --runtime kubernetes --runs 5 --model symflower/symbolic-execution --model symflower/symbolic-execution --model symflower/symbolic-execution --repository golang/plain --parallel 2

This command runs the symflower/symbolic-execution model three times, with 5 runs each, inside containerized workloads on the Kubernetes cluster, and limits parallel execution to 2 containers at a time.
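
Because `--model` is repeated once per model, a longer model list can be easier to maintain in a shell array. The following is a minimal sketch assuming bash; the array names `models` and `cmd` are illustrative, and only flags from the example are used.

```shell
#!/usr/bin/env bash
# Models to evaluate; a model listed several times is passed several times.
models=(
  "symflower/symbolic-execution"
  "symflower/symbolic-execution"
  "symflower/symbolic-execution"
)

# Fixed flags that precede the model list in the example.
cmd=(eval-dev-quality evaluate --runtime kubernetes --runs 5)

# Append one --model flag per array entry.
for model in "${models[@]}"; do
  cmd+=(--model "${model}")
done

# Remaining flags from the example.
cmd+=(--repository golang/plain --parallel 2)

# Print the assembled command for review instead of running it.
printf '%s ' "${cmd[@]}"
echo
```

Once the printed command looks right, it can be executed with `"${cmd[@]}"`.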