feat(recipes): add GAIE recipe for llama-3-70b vLLM disagg-multi-node #7651
AsadShahid04 wants to merge 3 commits into ai-dynamo:main
Add GAIE (Gateway API Inference Extension) recipe for llama-3-70b with vLLM in disaggregated multi-node configuration. This completes GAIE support across all llama-3-70b vLLM recipe shapes (agg, disagg-single-node, and disagg-multi-node).

Includes:
- deploy.yaml with Epp (Endpoint Picker) and dual worker setup (VllmPrefillWorker and VllmDecodeWorker)
- http-route.yaml with the HTTPRoute that routes gateway traffic to the InferencePool

Fixes ai-dynamo#7392

Signed-off-by: Asad Shahid <[email protected]>
Walkthrough

Added two new Kubernetes manifest files for a GAIE recipe supporting disaggregated multi-node deployment of Llama 3 70B with vLLM.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml`:
- Around line 1-177: The PR is missing the required documentation and
validation/tests referenced in issue `#7392`; add updates to the Kubernetes
inference gateway doc, the top-level recipes README, and the llama-3-70b recipe
README to document the new GAIE deploy and its parameters, and add GAIE recipe
validation/tests that exercise the new DynamoGraphDeployment named
"llama3-70b-disagg-mn-gaie" and its components (Epp, VllmPrefillWorker,
VllmDecodeWorker) including env/config expectations (e.g., DYN_MODEL_NAME,
DYN_ENFORCE_DISAGG, SERVED_MODEL_NAME, MODEL_PATH, kv-transfer-config and
resource/gpu counts) and ensure test matrix entries cover the new path.
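A minimal sketch of what the structural part of such a validation could look like, assuming the manifest has already been parsed into a dict and that components are keyed by name under `spec.services` (an assumption about the DynamoGraphDeployment layout, not something this PR confirms). `validate_deployment` is a hypothetical helper name; the env/config and GPU-count expectations listed above are left out because their exact location in the manifest is not shown here.

```python
# Hypothetical structural check for the GAIE recipe manifest.
# Only the deployment name and component names come from the review comment;
# the spec.services layout is an assumption.

EXPECTED_NAME = "llama3-70b-disagg-mn-gaie"
EXPECTED_COMPONENTS = {"Epp", "VllmPrefillWorker", "VllmDecodeWorker"}


def validate_deployment(manifest: dict) -> list:
    """Return a list of problems; an empty list means the check passed."""
    problems = []

    if manifest.get("kind") != "DynamoGraphDeployment":
        problems.append(f"unexpected kind: {manifest.get('kind')!r}")

    name = manifest.get("metadata", {}).get("name")
    if name != EXPECTED_NAME:
        problems.append(f"unexpected metadata.name: {name!r}")

    # Assumption: each component is a key under spec.services.
    services = manifest.get("spec", {}).get("services", {})
    for missing in sorted(EXPECTED_COMPONENTS - set(services)):
        problems.append(f"missing component: {missing}")

    return problems
```

Returning a list of problems rather than raising on the first one lets a test report every mismatch in a single run.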
In `@recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml`:
- Around line 16-17: The note and backendRefs usage are inconsistent:
backendRefs.namespace must either be explicitly set where backendRefs are
defined (the backendRefs entries referenced in the file) to the namespace
containing the InferencePool, or the top-level metadata.namespace note must be
updated to explicitly state "this route must be created in the same namespace as
the InferencePool" so omitting backendRefs.namespace is safe; update either the
backendRefs entries to include backendRefs.namespace = <InferencePool-namespace>
(matching your InferencePool resource name/namespace) or change the comment
about metadata.namespace to clearly require same-namespace application for the
route and InferencePool.
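The namespace rule behind this comment can be sketched as follows: in the Gateway API, a `backendRefs` entry without `namespace` resolves to the namespace of the route itself. The helper names, the `example-pool` backend, and the namespace values below are hypothetical, and the route is assumed to be already parsed into a dict.

```python
# Hypothetical helpers illustrating how backendRefs namespaces resolve.

def backend_namespaces(route: dict, route_namespace: str) -> list:
    """Resolve each backendRef to a (name, namespace) pair.

    A backendRef that omits `namespace` falls back to the namespace
    the route is created in.
    """
    resolved = []
    for rule in route.get("spec", {}).get("rules", []):
        for ref in rule.get("backendRefs", []):
            resolved.append((ref["name"], ref.get("namespace", route_namespace)))
    return resolved


def backends_outside(route: dict, route_namespace: str, pool_namespace: str) -> list:
    """Names of backendRefs that do not resolve to the InferencePool's namespace."""
    return [
        name
        for name, namespace in backend_namespaces(route, route_namespace)
        if namespace != pool_namespace
    ]


# A route whose backendRef omits `namespace`.
example_route = {
    "spec": {"rules": [{"backendRefs": [{"name": "example-pool"}]}]},
}
# Same namespace as the pool: nothing is flagged.
same_ns = backends_outside(example_route, "ns-a", "ns-a")
# Route applied elsewhere: the backendRef no longer points at the pool.
other_ns = backends_outside(example_route, "ns-a", "ns-b")
```

Note that setting `backendRefs.namespace` to a different namespace than the route's generally also requires a ReferenceGrant in the target namespace, so requiring same-namespace application is the simpler of the two fixes.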
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0d37f30d-bb25-4d5d-aa69-b92854074c53
📒 Files selected for processing (2)
- recipes/llama-3-70b/vllm/disagg-multi-node/gaie/deploy.yaml
- recipes/llama-3-70b/vllm/disagg-multi-node/gaie/http-route.yaml
Update inference-gateway.md, recipes/README.md, and llama-3-70b/README.md to include the new disagg-multi-node GAIE recipe path. Marks GAIE support as available for the multi-node disaggregated configuration in the recipes compatibility matrix.

Fixes ai-dynamo#7392

Signed-off-by: Asad Shahid <[email protected]>
Added documentation updates in d6b5574: