fix(api): stop emitting duplicate RUN_ERROR events on streaming failures #2667
Open
pullfrog[bot] wants to merge 1 commit into main
executeRun() was re-throwing errors after already emitting a RUN_ERROR SSE event and persisting error state to the database. The controller's catch block would then emit a second RUN_ERROR event, exactly matching the duplicate error events reported in #2666.

Changes:
- Remove the re-throw from executeRun's catch block so errors are fully handled in the service (emit event + persist to DB) without bubbling to the controller
- Wrap the DB error-cleanup in its own try/catch so a database failure during cleanup cannot suppress the already-emitted RUN_ERROR event
- Replace silent fallback to OpenAI for unsupported providers (bedrock, openrouter) with an explicit error, and add exhaustive switch check

Fixes #2666
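The error-handling shape described above can be sketched as follows. This is a minimal illustration, not the code in v1.service.ts: `RunDeps`, `executeRunSketch`, and the `stream`/`emit`/`persistError` callbacks are hypothetical names introduced here, and the real `executeRun()` signature is not shown in this PR text.

```typescript
// Hypothetical event and dependency shapes, assumed for illustration only.
type RunErrorEvent = { type: "RUN_ERROR"; message: string };

interface RunDeps {
  stream: () => Promise<void>; // the streaming work; may throw
  emit: (event: RunErrorEvent) => void; // writes one SSE event to the client
  persistError: (error: unknown) => Promise<void>; // saves error state to the DB
}

async function executeRunSketch(deps: RunDeps): Promise<void> {
  try {
    await deps.stream();
  } catch (error) {
    // Emit exactly one RUN_ERROR, with an opaque client-facing message.
    deps.emit({ type: "RUN_ERROR", message: "An internal error occurred" });

    // Cleanup gets its own try/catch so a failed DB write is contained here.
    try {
      await deps.persistError(error);
    } catch (cleanupError) {
      console.error("failed to persist run error", cleanupError);
    }

    // No re-throw: the caller's catch block never runs for this failure,
    // so it has no opportunity to emit a second RUN_ERROR.
  }
}
```

With this shape the returned promise resolves even when streaming fails, which is why any test that previously expected a rejection has to change.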
Summary

Fixes #2666

When a streaming run fails, the client receives two RUN_ERROR SSE events instead of one. This PR eliminates the duplicate.

Root cause

executeRun() in v1.service.ts catches errors during streaming and:
- Emits a RUN_ERROR SSE event to the client
- Persists the error state to the database
- Re-throws the error

The controller's catch block then catches the re-thrown error and emits a second RUN_ERROR SSE event. This matches the reporter's observation in #2666.

Changes

apps/api/src/v1/v1.service.ts
- Removed the re-throw from executeRun()'s catch block. The service now fully handles errors (emit SSE event + persist to DB) without bubbling to the controller.
- Wrapped the DB error cleanup in its own try/catch. A database failure during cleanup is contained there, since the RUN_ERROR event has already been sent to the client.

packages/backend/src/services/llm/ai-sdk-client.ts
- bedrock and openrouter are defined in the Provider type but had no factory; they silently fell back to OpenAI, causing confusing LLM failures. Now they throw a clear "not yet supported" error.
- Added an exhaustive switch check so that future Provider additions cause a compile-time error if not handled.

apps/api/src/v1/__tests__/v1.service.test.ts
- Updated the tests that expected executeRun to reject; it now resolves after handling errors internally.

Note on the underlying INTERNAL_ERROR

The duplicate RUN_ERROR is a confirmed code bug fixed in this PR. However, the underlying INTERNAL_ERROR that triggers it is a production environment issue (likely related to provider key configuration, model availability, or external API failures). The generic "An internal error occurred" message is intentionally opaque to avoid leaking internal details to clients; the full error is persisted to the database and Sentry for debugging by the Tambo team.
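For the provider change in ai-sdk-client.ts, an exhaustive switch over a union type might look roughly like the sketch below. The `Provider` union here is an assumption: only the names bedrock, openrouter, and OpenAI appear in this PR text, and the real type may have other members. `describeFactory` and its string return value are hypothetical placeholders for the actual client factories.

```typescript
// Assumed union for illustration; the real Provider type is not shown in the PR text.
type Provider = "openai" | "bedrock" | "openrouter";

function describeFactory(provider: Provider): string {
  switch (provider) {
    case "openai":
      return "openai-factory"; // placeholder standing in for a real client factory
    case "bedrock":
    case "openrouter":
      // Explicit error instead of silently falling back to another provider.
      throw new Error(`Provider "${provider}" is not yet supported`);
    default: {
      // If a new member is added to Provider and not handled above, `provider`
      // is no longer narrowed to `never` here and this line fails to compile.
      const unhandled: never = provider;
      throw new Error(`Unhandled provider: ${String(unhandled)}`);
    }
  }
}
```

The `never` assignment is a common TypeScript idiom for moving the "forgot a case" failure from runtime to compile time. The runtime throw in the default branch only matters if an untyped value reaches the function.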