
refactor: extract google ai logic to windmill-common and use native gemini api in chat proxy#8115

Open
centdix wants to merge 10 commits into main from ai-logic-share

Conversation


centdix (Collaborator) commented Feb 26, 2026

Summary

Extracts Google AI (Gemini) types and conversion logic from the worker into windmill-common, mirroring the existing Bedrock pattern. The API chat proxy now uses the native Gemini API directly instead of routing through Google's OpenAI-compatibility shim (/openai suffix).

Changes

  • windmill-common/src/ai_google.rs (new): shared Gemini types, request/response structs, SSE event types, and conversion functions (openai_messages_to_gemini, openai_tools_to_gemini, parse_gemini_sse_event, parse_data_url, find_gemini_function_name)
  • windmill-api/src/google.rs (new): native Gemini handler for the API chat proxy — converts OpenAI-format requests to Gemini format, calls streamGenerateContent?alt=sse or generateContent, and converts responses back to OpenAI format for the frontend
  • windmill-api/src/ai.rs: route GoogleAI + chat/completions to the new native handler; remove the /openai suffix hack from prepare_request; expose HTTP_CLIENT, KEEPALIVE_INTERVAL_SECS, inject_keepalives as pub(crate)
  • windmill-worker/src/ai/providers/google_ai.rs: import Gemini types from windmill-common instead of defining them locally; worker-specific S3 image handling stays in place
  • windmill-worker/src/ai/sse.rs: remove local Gemini SSE type definitions; GeminiSSEParser.parse_event_data now delegates to parse_gemini_sse_event from common
  • windmill-api/Cargo.toml: add eventsource-stream dependency for HTTP SSE parsing in the new handler
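The message-conversion step above can be sketched roughly as follows. This is an illustrative, text-only sketch under assumed struct and function names (the actual ai_google.rs API differs and uses typed part objects); the role mapping it shows is real Gemini behavior: Gemini accepts only "user" and "model" roles, so "assistant" must be remapped and system messages moved into a separate system instruction.

```rust
// Hypothetical sketch of OpenAI-style -> Gemini-style message conversion.
// Names here are illustrative, not the actual ai_google.rs items.

#[derive(Debug, PartialEq)]
struct OpenAiMessage {
    role: String,   // "system" | "user" | "assistant"
    content: String,
}

#[derive(Debug, PartialEq)]
struct GeminiContent {
    role: String,       // Gemini only accepts "user" and "model"
    parts: Vec<String>, // real Gemini requests use typed Part objects
}

/// Map "assistant" -> "model" and pull system messages out into a
/// separate system-instruction string, as the Gemini API requires.
fn to_gemini(messages: &[OpenAiMessage]) -> (Option<String>, Vec<GeminiContent>) {
    let mut system = None;
    let mut contents = Vec::new();
    for m in messages {
        match m.role.as_str() {
            "system" => system = Some(m.content.clone()),
            "assistant" => contents.push(GeminiContent {
                role: "model".to_string(),
                parts: vec![m.content.clone()],
            }),
            _ => contents.push(GeminiContent {
                role: "user".to_string(),
                parts: vec![m.content.clone()],
            }),
        }
    }
    (system, contents)
}
```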

Test plan

  • Configure a Google AI resource in workspace settings and verify chat completions work (streaming and non-streaming)
  • Verify the AI agent (worker path) still works correctly with Google AI models
  • Verify other providers (OpenAI, Anthropic, Bedrock) are unaffected

Generated with Claude Code

…emini api in chat proxy

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

cloudflare-workers-and-pages bot commented Feb 26, 2026

Deploying windmill with Cloudflare Pages

Latest commit: 50191c4
Status: ✅  Deploy successful!
Preview URL: https://863606f0.windmill.pages.dev
Branch Preview URL: https://ai-logic-share.windmill.pages.dev


centdix and others added 7 commits February 26, 2026 15:31
Extract streaming and non-streaming Gemini response conversion into
shared functions in ai_google so the API proxy and worker use the same
logic instead of duplicating format translation.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Remove duplicate parse_data_url from worker utils, use shared version
  from windmill_common::ai_google in both google_ai and anthropic providers
- Improve error diagnostics in google.rs by including HTTP status code
  in error messages from Gemini API responses
- Change GeminiToolCallEvent::into_extra_content to instance method
  to_extra_content using &self

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@centdix centdix marked this pull request as ready for review March 3, 2026 12:19

claude bot commented Mar 3, 2026

Claude finished @centdix's task in 3m 37s


PR Review: Refactor Google AI Logic

  • Read all changed files and understand the refactoring
  • Review windmill-common/src/ai_google.rs (shared types & conversions)
  • Review windmill-api/src/google.rs (native Gemini handler)
  • Review windmill-api/src/ai.rs (routing changes)
  • Review worker-side changes
  • Check for bugs, security, performance issues
  • Post detailed review with inline comments

Overall Assessment

Well-structured refactoring that cleanly extracts shared Google AI types and conversion logic into windmill-common, following the same pattern used for Bedrock. The move from Google's OpenAI-compatibility shim to the native Gemini API is a good architectural decision — it gives full control over the request/response format and avoids the limitations of the compatibility layer.

What looks good

  • Clean separation of concerns: Shared types, SSE parsing, and format conversion live in windmill-common. The API proxy handler and worker each have their own domain-specific logic.
  • Consistent error handling: Both streaming and non-streaming paths handle API errors consistently with status code checking and body extraction.
  • SSE keepalive injection: Reusing the existing inject_keepalives mechanism for the Gemini streaming proxy prevents connection drops.
  • Well-documented code: Module-level docs, doc comments on public functions, and inline comments explain design decisions clearly.
  • Thought signature support: Proper handling of Gemini 3+ thoughtSignature for multi-turn agent conversations is preserved through the refactoring.
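The SSE handling praised above sits behind a simple framing step that any server-sent-events consumer needs: strip the "data:" prefix from event lines and ignore blank lines and comment lines (which is how SSE keepalives are typically sent). A minimal sketch, with a hypothetical function name; the JSON decoding that parse_gemini_sse_event performs on the payload is omitted:

```rust
// Minimal SSE framing sketch (hypothetical helper name).
// Returns the JSON payload of a "data: ..." line, or None for
// blank lines and ":"-prefixed comment/keepalive lines.

fn sse_data_payload(line: &str) -> Option<&str> {
    let line = line.trim_end();
    // Comment lines (": keepalive") and blank lines carry no event data.
    if line.is_empty() || line.starts_with(':') {
        return None;
    }
    line.strip_prefix("data:")
        .map(|rest| rest.strip_prefix(' ').unwrap_or(rest))
}
```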

Issues found (see inline comments)

  1. Potential bug — double models/ prefix (google.rs:139): The model listing returns IDs like models/gemini-2.5-flash, but the chat/streaming endpoints prepend models/ again when building the URL. This would produce …/models/models/gemini-2.5-flash:streamGenerateContent.

  2. Empty API key fallback (ai.rs:815): unwrap_or("") silently proceeds with an empty key instead of returning a clear error.

  3. Incomplete finish_reason mapping (ai_google.rs:563): Gemini's MAX_TOKENS becomes "max_tokens" instead of OpenAI's expected "length".

  4. Missing annotations in non-streaming response (google.rs:297): Grounding/web search citations are parsed but not included in the OpenAI-format non-streaming response.

  5. Code duplication (ai_google.rs:363-448): openai_messages_to_gemini and the worker's convert_messages_to_gemini share ~80% of logic. Not blocking but worth noting for future maintenance.
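Issues 1 and 3 both admit small, self-contained fixes. The sketch below uses hypothetical function names, not the PR's actual code; the underlying facts are standard: Gemini model resources are addressed as `models/<id>`, and OpenAI clients expect the finish reason `"length"` (not `"max_tokens"`) when generation stops at the token limit.

```rust
// Sketches of possible fixes for issues 1 and 3 (names are illustrative).

/// Issue 1: avoid a double "models/" prefix when the stored model ID
/// already carries one (e.g. "models/gemini-2.5-flash").
fn gemini_model_path(model: &str) -> String {
    format!("models/{}", model.strip_prefix("models/").unwrap_or(model))
}

/// Issue 3: map Gemini finish reasons onto the values OpenAI clients
/// expect, notably MAX_TOKENS -> "length" rather than "max_tokens".
fn map_finish_reason(gemini: &str) -> &'static str {
    match gemini {
        "MAX_TOKENS" => "length",
        "SAFETY" | "RECITATION" => "content_filter",
        _ => "stop", // STOP and anything unrecognized fall back to "stop"
    }
}
```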

Testing instructions

To test the changes:

  • In a workspace's Settings > AI page, configure a Google AI provider by adding a resource with a valid Gemini API key.
  • Open any Script editor (or the workspace AI Chat) and start a conversation using a Google AI / Gemini model.
  • Verify that both streaming responses (real-time token output) and non-streaming responses work correctly.
  • If possible, also test tool calling (e.g., a script with AI agent capabilities).
  • Finally, verify that other providers (OpenAI, Anthropic, Bedrock) still work normally; their code paths should be unaffected by this change.

centdix and others added 2 commits March 3, 2026 16:01
…ht pattern

Replace the worker's `convert_messages_to_gemini` and
`convert_content_to_parts_with_s3` (~130 lines) with the existing
pre-flight pattern: `prepare_messages_for_api` converts S3 objects to
data URLs, then the shared `openai_messages_to_gemini` handles the rest.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
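The data-URL side of this pre-flight pattern can be sketched as below. The function name is hypothetical (the shared helper is parse_data_url, whose exact signature is not shown here), and base64 decoding of the payload, which in practice needs a crate such as `base64`, is omitted; the `data:<mime>;base64,<payload>` structure it splits is the standard data-URL format.

```rust
// Illustrative sketch (hypothetical name): split a base64 data URL
// "data:<mime>;base64,<payload>" into its MIME type and raw base64
// payload. Decoding the payload is left out.

fn split_data_url(url: &str) -> Option<(&str, &str)> {
    let rest = url.strip_prefix("data:")?;
    let (meta, payload) = rest.split_once(',')?;
    let mime = meta.strip_suffix(";base64")?; // only base64-encoded URLs here
    Some((mime, payload))
}
```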
