feat(ai-proxy): add path_mode option to override.endpoint #12857
base: master
Conversation
Add `override.path_mode` option to control the upstream request path:
- `fixed` (default): use the endpoint path or driver default
- `preserve`: use the original request URI path with query string
- `append`: append the original request URI path to the endpoint path

This allows using a single route for multiple API endpoints (e.g., `/v1/chat/completions` and `/v1/responses`) with the same model, reducing route explosion in multi-model deployments.

Signed-off-by: Sihyeon Jang <[email protected]>
Add test cases for ai-proxy and ai-proxy-multi:
- `path_mode=preserve`: preserves the original request URI path
- `path_mode=append`: appends the request path to the endpoint path
- `path_mode=fixed`: uses the endpoint path (existing behavior)
- query string preservation with `preserve`/`append` modes

Signed-off-by: Sihyeon Jang <[email protected]>
Hi @sihyeonn, thank you for your contribution. Could you please describe your original requirements in detail? What problems did you encounter while using the current version, and why is this change necessary?
Hi @Baoyuantop, I'm running an LLM gateway. Many models expose multiple endpoints from the same base URL:
Currently, I must create separate routes for each endpoint:

```yaml
# Current: 4 routes per model
routes:
  - uri: /v1/chat/completions
    plugins:
      ai-proxy:
        override:
          endpoint: https://api.example.com/models/gpt-4/v1/chat/completions
  - uri: /v1/responses
    plugins:
      ai-proxy:
        override:
          endpoint: https://api.example.com/models/gpt-4/v1/responses
  # ... 2 more routes
```

This creates unnecessary duplication:
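With the proposed `path_mode`, the routes for one model could collapse into a single route. This is a hypothetical sketch based on the option described in this PR; the exact URI matching and final schema may differ:

```yaml
# One route covers every /v1/* endpoint for the model
routes:
  - uri: /v1/*
    plugins:
      ai-proxy:
        override:
          endpoint: https://api.example.com/models/gpt-4
          path_mode: append   # append the original request path to the endpoint path
```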
```lua
if ctx.var.is_args and ctx.var.args and #ctx.var.args > 0 then
    local req_args_tab = core.string.decode_args(ctx.var.args)
    if type(req_args_tab) == "table" then
        core.table.merge(query_params, req_args_tab)
    end
```
It can be extracted into a common function.
It is recommended to use `ctx.var.is_args == "?"` to check.
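Combining both suggestions, the query-string merge could live in a shared helper. A Python sketch of the logic (the function name and signature are hypothetical; the real plugin is Lua and uses `core.string.decode_args`):

```python
from urllib.parse import parse_qsl

def merge_request_query_params(query_params, is_args, args):
    # Nginx sets $is_args to "?" when the request carries a query string
    # and to "" otherwise, hence the exact string comparison below.
    if is_args == "?" and args:
        query_params.update(dict(parse_qsl(args)))
    return query_params
```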
```lua
    end
end
else
    path = (endpoint_path and endpoint_path ~= "" and endpoint_path) or self.path
```
It needs to be confirmed whether the behavior remains consistent when endpoint_path is an empty string.
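The concern comes from Lua truthiness: in Lua only `nil` and `false` are falsy, so an empty string would pass a bare `endpoint_path or self.path` check, which is why the code tests `endpoint_path ~= ""` explicitly. A Python sketch of the intended fallback (function name is hypothetical):

```python
def pick_fixed_path(endpoint_path, default_path):
    # Mirrors the Lua expression
    #   (endpoint_path and endpoint_path ~= "" and endpoint_path) or self.path
    # In Lua, "" must be excluded explicitly; in Python, `or` already
    # treats both "" and None as falsy, so the one-liner suffices.
    return endpoint_path or default_path
```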
apisix/plugins/ai-proxy/schema.lua
```lua
    type = "string",
    description = "To be specified to override the endpoint of the AI Instance",
},
path_mode = {
```
It is recommended to extract it as a common schema variable to avoid duplication.
- Extract merge_request_query_params() to avoid code duplication
- Use ctx.var.is_args == "?" for proper string comparison
- Extract path_mode_schema as a shared variable to avoid duplication
- Clarify empty-string handling in fixed-mode path selection

Signed-off-by: Sihyeon Jang <[email protected]>
@Baoyuantop Thanks for your review. I've addressed all the comments.
Description
Add `override.path_mode` option to `ai-proxy` and `ai-proxy-multi` plugins to control how the upstream request path is determined.

Background (Problem)

Currently, when `override.endpoint` is set, the URL path is fixed to either the endpoint's path or the driver's default path. This causes issues when the same model must serve multiple API endpoints (e.g., `/v1/chat/completions` and `/v1/responses`).

Solution

Add `path_mode` option with three modes:
- `fixed` (default): existing behavior - uses the endpoint path or driver default
- `preserve`: uses the original request URI path (with query string)
- `append`: appends the original request URI path to the endpoint path

This allows using a single route with the same model for multiple API endpoints, reducing route explosion in multi-model deployments.
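The three modes can be sketched as a small decision function. This is an illustrative Python model only; `resolve_upstream_path` and its parameters are invented names, and the actual plugin is implemented in Lua:

```python
from urllib.parse import urlparse

def resolve_upstream_path(mode, endpoint, request_uri, default_path):
    # Illustrative model of the three path_mode values (not the plugin code).
    endpoint_path = urlparse(endpoint).path if endpoint else ""
    if mode == "preserve":
        # forward the client's original URI, query string included
        return request_uri
    if mode == "append":
        # endpoint path first, then the original request path
        return endpoint_path.rstrip("/") + request_uri
    # "fixed" (default): endpoint path if non-empty, else the driver default
    return endpoint_path or default_path
```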
Which issue(s) this PR fixes:
Fixes #
Checklist