Description
Summary
The images(), videos(), and video_stream() handlers in lib/llm/src/http/service/openai.rs have a pre-existing gap: their engine.generate(...).await.map_err(...) closures do not call inflight.mark_error(...), unlike the completions_single, embeddings, chat_completions, and responses handlers which all do.
As a result, when a media endpoint request fails (e.g., with a 503 due to resource exhaustion), requests_total records the default internal error type instead of the correct overload error type, so the two frontend metrics diverge for media rejections.
Proposed Fix
Apply the same pattern used in the other handlers:
  let stream = engine.generate(request).await.map_err(|e| {
      if super::metrics::request_was_rejected(e.as_ref()) {
          state.metrics_clone().inc_rejection(&model, "images");
      }
-     ErrorMessage::from_anyhow(e, "Failed to generate images")
+     let err_response = ErrorMessage::from_anyhow(e, "Failed to generate images");
+     inflight.mark_error(extract_error_type_from_response(&err_response));
+     err_response
  })?;

Apply the same pattern in videos() and video_stream().
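To illustrate why the missing call matters, here is a minimal, self-contained sketch of the failure mode. It is not the project's code: `InflightGuard`, `ErrorType`, `error_type_for_status`, and `handle` are hypothetical stand-ins, and the sketch assumes the guard records its label when dropped and defaults to an internal error type unless `mark_error` is called first.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for a requests_total counter keyed by outcome label.
type Counters = Rc<RefCell<HashMap<&'static str, u64>>>;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ErrorType {
    Internal,
    Overload,
}

impl ErrorType {
    fn label(self) -> &'static str {
        match self {
            ErrorType::Internal => "internal",
            ErrorType::Overload => "overload",
        }
    }
}

// Hypothetical guard (assumption): records one outcome when dropped and
// defaults to Internal unless mark_error or mark_ok was called.
struct InflightGuard {
    counters: Counters,
    error_type: ErrorType,
    ok: bool,
}

impl InflightGuard {
    fn new(counters: Counters) -> Self {
        InflightGuard { counters, error_type: ErrorType::Internal, ok: false }
    }

    fn mark_error(&mut self, error_type: ErrorType) {
        self.error_type = error_type;
    }

    fn mark_ok(&mut self) {
        self.ok = true;
    }
}

impl Drop for InflightGuard {
    fn drop(&mut self) {
        let label = if self.ok { "success" } else { self.error_type.label() };
        *self.counters.borrow_mut().entry(label).or_insert(0) += 1;
    }
}

// Hypothetical mapping from an HTTP status to an error type.
fn error_type_for_status(status: u16) -> ErrorType {
    if status == 503 { ErrorType::Overload } else { ErrorType::Internal }
}

// Simulated handler: `mark` toggles whether the map_err closure calls mark_error.
fn handle(counters: Counters, engine_result: Result<(), u16>, mark: bool) -> Result<(), u16> {
    let mut inflight = InflightGuard::new(counters);
    engine_result.map_err(|status| {
        if mark {
            inflight.mark_error(error_type_for_status(status));
        }
        status
    })?; // early return drops the guard, which records the outcome
    inflight.mark_ok();
    Ok(())
}
```

In this model, calling `handle(counters.clone(), Err(503), false)` increments the `"internal"` label, while the same call with `mark` set to `true` increments `"overload"`. That mirrors the divergence described in the summary, under the stated assumptions about how the guard behaves.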
Context
This gap predates PR #7644 and was acknowledged as out of scope there. Tracked here for a follow-up fix.
Backlink: #7644 (comment)
Requested by: @coderabbitai