feat(recipes): add GAIE recipe for llama-3-70b vLLM disagg-multi-node by AsadShahid04 · Pull Request #7651 · ai-dynamo/dynamo

AsadShahid04 · 2026-03-26T04:39:51Z

Add a GAIE (Gateway API Inference Extension) recipe for llama-3-70b with vLLM in a disaggregated multi-node configuration. This completes GAIE support across all llama-3-70b vLLM recipe shapes (agg, disagg-single-node, and disagg-multi-node).

Includes:

- deploy.yaml with the Epp (Endpoint Picker) service and separate prefill and decode workers
- http-route.yaml with the HTTPRoute for gateway routing

Fixes #7392

Summary by CodeRabbit

New Features
- Deployed Llama-3-70b model with optimized distributed inference configuration
- Configured HTTP routing and endpoint access for the model deployment

Add a GAIE (Gateway API Inference Extension) recipe for llama-3-70b with vLLM in a disaggregated multi-node configuration. This completes GAIE support across all llama-3-70b vLLM recipe shapes (agg, disagg-single-node, and disagg-multi-node).

Includes:
- deploy.yaml with the Epp (Endpoint Picker) service and separate prefill and decode workers
- http-route.yaml with the HTTPRoute for gateway routing

Fixes ai-dynamo#7392

Signed-off-by: Asad Shahid <[email protected]>

copy-pr-bot · 2026-03-26T04:39:56Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-03-26T04:40:00Z

👋 Hi AsadShahid04! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test GitHub Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

coderabbitai · 2026-03-26T04:44:29Z

Walkthrough

Added two new Kubernetes manifest files for a GAIE recipe supporting disaggregated multi-node deployment of Llama 3 70B with vLLM. The deploy.yaml defines a DynamoGraphDeployment with an endpoint service and separate prefill/decode worker services configured with GPU resources, shared memory, and disaggregation scheduling. The http-route.yaml defines an HTTPRoute for external traffic routing to the deployment.

Changes

| Cohort / File(s) | Summary |
|---|---|
| GAIE Deployment Configuration `recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml` | New `DynamoGraphDeployment` manifest defining three services (`Epp`, `VllmPrefillWorker`, `VllmDecodeWorker`) with GPU allocation (8 per worker), shared memory configuration (80Gi), model cache mounting, disaggregation scheduling profiles, and sidecar frontend components. |
| GAIE Network Routing `recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml` | New `HTTPRoute` manifest routing traffic to `llama3-70b-disagg-pool` `InferencePool` backend with 300s request timeout on path `/`. |
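For orientation, an `HTTPRoute` matching the summary above could look roughly like the sketch below. This is not the file from the PR. Only the pool name, the `/` path, and the 300s request timeout come from the walkthrough; the route name, the Gateway name, the `PathPrefix` match type, and the `InferencePool` API group are assumptions and depend on the installed Gateway API and inference extension versions.

```yaml
# Illustrative sketch only, not the http-route.yaml from this PR.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llama3-70b-disagg-route        # hypothetical route name
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway          # hypothetical Gateway name
  rules:
    - matches:
        - path:
            type: PathPrefix           # assumed match type for path "/"
            value: /
      backendRefs:
        - group: inference.networking.k8s.io   # assumed; older installs may use inference.networking.x-k8s.io
          kind: InferencePool
          name: llama3-70b-disagg-pool          # pool name from the walkthrough
      timeouts:
        request: 300s                  # request timeout from the walkthrough
```

Because `backendRefs` carries no `namespace` here, the backend is resolved in the route's own namespace, so the route and the `InferencePool` would need to live together.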

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Linked Issues check | ⚠️ Warning | The PR adds deploy.yaml and http-route.yaml files as required by issue `#7392`, but misses documentation updates and recipe validation tasks. | Complete the linked issue requirements by adding updates to docs/kubernetes/inference-gateway.md, recipes/README.md, recipes/llama-3-70b/README.md, and GAIE recipe validation/tests. |
| Description check | ❓ Inconclusive | The PR description covers the overview and main changes but lacks complete coverage of all required template sections. | Add a 'Where should the reviewer start?' section identifying key files (deploy.yaml and http-route.yaml) to review, and clarify which issues are fixed. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a GAIE recipe for llama-3-70b vLLM in disaggregated multi-node configuration. |
| Out of Scope Changes check | ✅ Passed | All changes in the PR are directly related to adding the GAIE recipe for disagg-multi-node configuration as specified in issue `#7392`. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml`:
- Around line 1-177: The PR is missing the required documentation and
validation/tests referenced in issue `#7392`; add updates to the Kubernetes
inference gateway doc, the top-level recipes README, and the llama-3-70b recipe
README to document the new GAIE deploy and its parameters, and add GAIE recipe
validation/tests that exercise the new DynamoGraphDeployment named
"llama3-70b-disagg-mn-gaie" and its components (Epp, VllmPrefillWorker,
VllmDecodeWorker) including env/config expectations (e.g., DYN_MODEL_NAME,
DYN_ENFORCE_DISAGG, SERVED_MODEL_NAME, MODEL_PATH, kv-transfer-config and
resource/gpu counts) and ensure test matrix entries cover the new path.

In `@recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml`:
- Around line 16-17: The note and backendRefs usage are inconsistent:
backendRefs.namespace must either be explicitly set where backendRefs are
defined (the backendRefs entries referenced in the file) to the namespace
containing the InferencePool, or the top-level metadata.namespace note must be
updated to explicitly state "this route must be created in the same namespace as
the InferencePool" so omitting backendRefs.namespace is safe; update either the
backendRefs entries to include backendRefs.namespace = <InferencePool-namespace>
(matching your InferencePool resource name/namespace) or change the comment
about metadata.namespace to clearly require same-namespace application for the
route and InferencePool.
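The cross-namespace option from the second comment can be sketched as follows. Under standard Gateway API rules, a backend reference into another namespace needs both an explicit `namespace` on the `backendRefs` entry and a `ReferenceGrant` in the target namespace. The namespaces `routes-ns` and `inference-ns`, the route and Gateway names, the grant name, and the `InferencePool` API group are all hypothetical, and whether a given gateway implementation honors cross-namespace `InferencePool` references is an assumption that should be checked.

```yaml
# Illustrative sketch only; all namespaces and names except the pool name are hypothetical.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: llama3-70b-disagg-route        # hypothetical route name
  namespace: routes-ns                 # hypothetical namespace holding the route
spec:
  parentRefs:
    - name: inference-gateway          # hypothetical Gateway name
  rules:
    - backendRefs:
        - group: inference.networking.k8s.io   # assumed API group
          kind: InferencePool
          name: llama3-70b-disagg-pool
          namespace: inference-ns      # explicit namespace of the InferencePool
---
# The grant lives in the namespace that owns the InferencePool.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-routes-to-inference-pool # hypothetical grant name
  namespace: inference-ns
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: routes-ns
  to:
    - group: inference.networking.k8s.io       # assumed API group
      kind: InferencePool
```

The simpler alternative named in the comment is to keep the route and the pool in one namespace and say so in the manifest's note, in which case neither the `namespace` field nor the grant is needed.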

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0d37f30d-bb25-4d5d-aa69-b92854074c53

📥 Commits

Reviewing files that changed from the base of the PR and between a58bcc3 and 60eeafc.

📒 Files selected for processing (2)

recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml
recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml

Update inference-gateway.md, recipes/README.md, and llama-3-70b/README.md to include the new disagg-multi-node GAIE recipe path. Marks GAIE support as available for the multi-node disaggregated configuration in the recipes compatibility matrix.

Fixes ai-dynamo#7392

Signed-off-by: Asad Shahid <[email protected]>

AsadShahid04 · 2026-03-27T00:40:43Z

Added documentation updates in d6b5574:

- Updated docs/kubernetes/inference-gateway.md to include the disagg-multi-node/gaie/ path alongside the existing agg and disagg-single-node references
- Updated recipes/README.md to mark GAIE support as ✅ for Llama-3-70B disagg-multi-node
- Updated recipes/llama-3-70b/README.md to list all three GAIE-supported configurations

AsadShahid04 requested review from a team as code owners March 26, 2026 04:39

pull-request-size bot added the size/L label Mar 26, 2026

github-actions bot added the feat and external-contribution labels Mar 26, 2026

coderabbitai bot reviewed Mar 26, 2026

recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml (resolved)

recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml (resolved)

github-actions bot added the documentation label Mar 27, 2026

Merge branch 'main' into fix/inf-issue-7392

dccb139

AsadShahid04 wants to merge 3 commits into ai-dynamo:main from AsadShahid04:fix/inf-issue-7392
