[ci] Migrate microbenchmark, benchmark-worker-startup, and rllib compute configs to new schema #62604

Open
sai-miduthuri wants to merge 3 commits into master from sai-miduthuri/upgrade-microbenchmark-compute-configs

@sai-miduthuri
Contributor

Summary

  • Migrate 10 compute config files to the new Anyscale SDK schema (cloud_id -> cloud, head_node_type -> head_node, worker_node_types -> worker_nodes, etc.)
  • Add anyscale_sdk_2026: true flag to 12 test cluster blocks in release_tests.yaml
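The key renames listed in the summary can be illustrated with a minimal sketch. This is not the migration tooling used in this PR: `RENAMES` and `migrate_top_level_keys` are hypothetical names, only the three renames spelled out above are covered (the "etc." is left unexpanded), and the sample values are placeholders.

```python
# Hypothetical helper for illustration; covers only the renames named in the summary.
RENAMES = {
    "cloud_id": "cloud",
    "head_node_type": "head_node",
    "worker_node_types": "worker_nodes",
}


def migrate_top_level_keys(old_config: dict) -> dict:
    """Return a copy of old_config with legacy top-level keys renamed."""
    migrated = {}
    for key, value in old_config.items():
        new_key = RENAMES.get(key, key)  # keys without a mapping pass through unchanged
        if new_key in migrated:
            # Both the legacy and the new spelling were present; refuse to guess.
            raise ValueError(f"duplicate key after rename: {new_key}")
        migrated[new_key] = value
    return migrated


legacy = {
    "cloud_id": "example-cloud",  # placeholder value
    "head_node_type": {"instance_type": "example-instance"},  # placeholder value
    "worker_node_types": [],
}
print(migrate_top_level_keys(legacy))
```

Nested changes, such as the uppercase resource keys mentioned in the review below, are outside the scope of this sketch.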

Config files migrated

  • release/microbenchmark/tpl_64.yaml (AWS, head-only)
  • release/microbenchmark/tpl_64_gce.yaml (GCE, head-only)
  • release/microbenchmark/experimental/compute_t4_gpu.yaml (AWS, head-only GPU)
  • release/microbenchmark/experimental/compute_gpu_2x1_aws.yaml (AWS, head+worker GPU)
  • release/microbenchmark/experimental/compute_a100_gpu.yaml (AWS, head-only GPU)
  • release/microbenchmark/experimental/compute_l4_gpu.yaml (AWS, head-only GPU)
  • release/microbenchmark/experimental/compute_l4_gpu_2x1_aws.yaml (AWS, head+worker GPU)
  • release/benchmark-worker-startup/only_head_node_1gpu_64cpu.yaml (AWS, head-only GPU)
  • release/benchmark-worker-startup/only_head_node_1gpu_64cpu_gce.yaml (GCE, head-only)
  • release/rllib_tests/1gpu_16cpus.yaml (AWS, head-only GPU)

Tests updated with anyscale_sdk_2026: true

  • microbenchmark (base + GCE variation)
  • compiled_graphs
  • compiled_graphs_GPU
  • compiled_graphs_GPU_multinode
  • compiled_graphs_GPU_cu130
  • compiled_graphs_GPU_multinode_cu130
  • rdt_single_node_T4_microbenchmark
  • rdt_single_node_A100_microbenchmark
  • benchmark_worker_startup (base + GCE variation)
  • rllib_learning_tests_pong_appo_torch
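The list above names 10 tests, two of which also have a GCE variation, which is consistent with the 12 cluster blocks mentioned in the summary. A rough way to check such a count is sketched below. The layout is an assumption, not taken from this PR: each test is assumed to be a dict with an optional `cluster` block and an optional `variations` list whose entries may carry their own `cluster` block. `count_flagged_cluster_blocks` is a hypothetical helper.

```python
FLAG = "anyscale_sdk_2026"


def count_flagged_cluster_blocks(tests: list[dict]) -> int:
    """Count cluster blocks (base and per-variation) that set FLAG to true."""
    count = 0
    for test in tests:
        # Assumed layout: one base cluster block plus one per variation.
        blocks = [test.get("cluster", {})]
        blocks += [v.get("cluster", {}) for v in test.get("variations", [])]
        count += sum(1 for block in blocks if block.get(FLAG) is True)
    return count


# Illustrative data only; the real entries in release_tests.yaml carry more fields.
sample_tests = [
    {
        "name": "microbenchmark",
        "cluster": {FLAG: True},
        "variations": [{"cluster": {FLAG: True}}],  # stands in for the GCE variation
    },
    {"name": "compiled_graphs", "cluster": {FLAG: True}},
    {"name": "some_unmigrated_test", "cluster": {}},  # hypothetical test without the flag
]
print(count_flagged_cluster_blocks(sample_tests))
```

In practice the input would come from parsing the YAML file first; that step is omitted to keep the sketch dependency-free.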

Test plan

  • All 10 config files validated against ComputeConfig.from_yaml()
  • CI passes with the new configs

🤖 Generated with Claude Code

…ute configs to new schema

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: sai.miduthuri <[email protected]>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request migrates several release test configuration files to a new schema, involving field renames such as "cloud_id" to "cloud" and "head_node_type" to "head_node", and standardizing resource keys to uppercase. It also adds the "anyscale_sdk_2026" flag to various test definitions. A logical inconsistency was identified in "only_head_node_1gpu_64cpu.yaml" where the resource definition specifies 64 GPUs, contradicting the filename and physical instance capacity.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 2c4bba4.

@ray-gardener ray-gardener bot added the rllib, core, devprod, and release-test labels on Apr 14, 2026
sai-miduthuri and others added 2 commits April 14, 2026 11:11
…ished pattern

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: sai.miduthuri <[email protected]>
The g5.16xlarge has 1 physical GPU but the benchmark script uses
--num_gpus_in_cluster 64 to test resource scheduling with 64 logical
GPU slots.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: sai.miduthuri <[email protected]>