
refactor: extract google ai logic to windmill-common and use native gemini api in chat proxy#8115

Open
centdix wants to merge 10 commits into main from ai-logic-share

Conversation


centdix (Collaborator) commented Feb 26, 2026

Summary

Extracts Google AI (Gemini) types and conversion logic from the worker into windmill-common, mirroring the existing Bedrock pattern. The API chat proxy now uses the native Gemini API directly instead of routing through Google's OpenAI-compatibility shim (/openai suffix).

Changes

  • windmill-common/src/ai_google.rs (new): shared Gemini types, request/response structs, SSE event types, and conversion functions (openai_messages_to_gemini, openai_tools_to_gemini, parse_gemini_sse_event, parse_data_url, find_gemini_function_name)
  • windmill-api/src/google.rs (new): native Gemini handler for the API chat proxy — converts OpenAI-format requests to Gemini format, calls streamGenerateContent?alt=sse or generateContent, and converts responses back to OpenAI format for the frontend
  • windmill-api/src/ai.rs: route GoogleAI + chat/completions to the new native handler; remove the /openai suffix hack from prepare_request; expose HTTP_CLIENT, KEEPALIVE_INTERVAL_SECS, inject_keepalives as pub(crate)
  • windmill-worker/src/ai/providers/google_ai.rs: import Gemini types from windmill-common instead of defining them locally; worker-specific S3 image handling stays in place
  • windmill-worker/src/ai/sse.rs: remove local Gemini SSE type definitions; GeminiSSEParser.parse_event_data now delegates to parse_gemini_sse_event from common
  • windmill-api/Cargo.toml: add eventsource-stream dependency for HTTP SSE parsing in the new handler
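The message-conversion step above can be sketched roughly as follows. This is an illustrative, text-only sketch under assumed struct and function names (the actual ai_google.rs API differs and uses typed part objects); the role mapping it shows is real Gemini behavior: Gemini accepts only "user" and "model" roles, so "assistant" must be remapped and system messages moved into a separate system instruction.

```rust
// Hypothetical sketch of OpenAI-style -> Gemini-style message conversion.
// Names here are illustrative, not the actual ai_google.rs items.

#[derive(Debug, PartialEq)]
struct OpenAiMessage {
    role: String,   // "system" | "user" | "assistant"
    content: String,
}

#[derive(Debug, PartialEq)]
struct GeminiContent {
    role: String,       // Gemini only accepts "user" and "model"
    parts: Vec<String>, // real Gemini requests use typed Part objects
}

/// Map "assistant" -> "model" and pull system messages out into a
/// separate system-instruction string, as the Gemini API requires.
fn to_gemini(messages: &[OpenAiMessage]) -> (Option<String>, Vec<GeminiContent>) {
    let mut system = None;
    let mut contents = Vec::new();
    for m in messages {
        match m.role.as_str() {
            "system" => system = Some(m.content.clone()),
            "assistant" => contents.push(GeminiContent {
                role: "model".to_string(),
                parts: vec![m.content.clone()],
            }),
            _ => contents.push(GeminiContent {
                role: "user".to_string(),
                parts: vec![m.content.clone()],
            }),
        }
    }
    (system, contents)
}
```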

Test plan

  • Configure a Google AI resource in workspace settings and verify chat completions work (streaming and non-streaming)
  • Verify the AI agent (worker path) still works correctly with Google AI models
  • Verify other providers (OpenAI, Anthropic, Bedrock) are unaffected

Generated with Claude Code

…emini api in chat proxy

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

cloudflare-workers-and-pages bot commented Feb 26, 2026

Deploying windmill with Cloudflare Pages

Latest commit: 50191c4
Status: ✅  Deploy successful!
Preview URL: https://863606f0.windmill.pages.dev
Branch Preview URL: https://ai-logic-share.windmill.pages.dev


centdix and others added 7 commits February 26, 2026 15:31
Extract streaming and non-streaming Gemini response conversion into
shared functions in ai_google so the API proxy and worker use the same
logic instead of duplicating format translation.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Remove duplicate parse_data_url from worker utils, use shared version
  from windmill_common::ai_google in both google_ai and anthropic providers
- Improve error diagnostics in google.rs by including HTTP status code
  in error messages from Gemini API responses
- Change GeminiToolCallEvent::into_extra_content to instance method
  to_extra_content using &self

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@centdix centdix marked this pull request as ready for review March 3, 2026 12:19

claude bot commented Mar 3, 2026

Claude finished @centdix's task in 3m 37s


PR Review: Refactor Google AI Logic

  • Read all changed files and understand the refactoring
  • Review windmill-common/src/ai_google.rs (shared types & conversions)
  • Review windmill-api/src/google.rs (native Gemini handler)
  • Review windmill-api/src/ai.rs (routing changes)
  • Review worker-side changes
  • Check for bugs, security, performance issues
  • Post detailed review with inline comments

Overall Assessment

Well-structured refactoring that cleanly extracts shared Google AI types and conversion logic into windmill-common, following the same pattern used for Bedrock. The move from Google's OpenAI-compatibility shim to the native Gemini API is a good architectural decision — it gives full control over the request/response format and avoids the limitations of the compatibility layer.

What looks good

  • Clean separation of concerns: Shared types, SSE parsing, and format conversion live in windmill-common. The API proxy handler and worker each have their own domain-specific logic.
  • Consistent error handling: Both streaming and non-streaming paths handle API errors consistently with status code checking and body extraction.
  • SSE keepalive injection: Reusing the existing inject_keepalives mechanism for the Gemini streaming proxy prevents connection drops.
  • Well-documented code: Module-level docs, doc comments on public functions, and inline comments explain design decisions clearly.
  • Thought signature support: Proper handling of Gemini 3+ thoughtSignature for multi-turn agent conversations is preserved through the refactoring.
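The SSE handling praised above sits behind a simple framing step that any server-sent-events consumer needs: strip the "data:" prefix from event lines and ignore blank lines and comment lines (which is how SSE keepalives are typically sent). A minimal sketch, with a hypothetical function name; the JSON decoding that parse_gemini_sse_event performs on the payload is omitted:

```rust
// Minimal SSE framing sketch (hypothetical helper name).
// Returns the JSON payload of a "data: ..." line, or None for
// blank lines and ":"-prefixed comment/keepalive lines.

fn sse_data_payload(line: &str) -> Option<&str> {
    let line = line.trim_end();
    // Comment lines (": keepalive") and blank lines carry no event data.
    if line.is_empty() || line.starts_with(':') {
        return None;
    }
    line.strip_prefix("data:")
        .map(|rest| rest.strip_prefix(' ').unwrap_or(rest))
}
```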

Issues found (see inline comments)

  1. Potential bug — double models/ prefix (google.rs:139): The model listing returns IDs like models/gemini-2.5-flash, but the chat/streaming endpoints prepend models/ again when building the URL. This would produce …/models/models/gemini-2.5-flash:streamGenerateContent.

  2. Empty API key fallback (ai.rs:815): unwrap_or("") silently proceeds with an empty key instead of returning a clear error.

  3. Incomplete finish_reason mapping (ai_google.rs:563): Gemini's MAX_TOKENS becomes "max_tokens" instead of OpenAI's expected "length".

  4. Missing annotations in non-streaming response (google.rs:297): Grounding/web search citations are parsed but not included in the OpenAI-format non-streaming response.

  5. Code duplication (ai_google.rs:363-448): openai_messages_to_gemini and the worker's convert_messages_to_gemini share ~80% of logic. Not blocking but worth noting for future maintenance.
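Issues 1 and 3 both admit small, self-contained fixes. The sketch below uses hypothetical function names, not the PR's actual code; the underlying facts are standard: Gemini model resources are addressed as `models/<id>`, and OpenAI clients expect the finish reason `"length"` (not `"max_tokens"`) when generation stops at the token limit.

```rust
// Sketches of possible fixes for issues 1 and 3 (names are illustrative).

/// Issue 1: avoid a double "models/" prefix when the stored model ID
/// already carries one (e.g. "models/gemini-2.5-flash").
fn gemini_model_path(model: &str) -> String {
    format!("models/{}", model.strip_prefix("models/").unwrap_or(model))
}

/// Issue 3: map Gemini finish reasons onto the values OpenAI clients
/// expect, notably MAX_TOKENS -> "length" rather than "max_tokens".
fn map_finish_reason(gemini: &str) -> &'static str {
    match gemini {
        "MAX_TOKENS" => "length",
        "SAFETY" | "RECITATION" => "content_filter",
        _ => "stop", // STOP and anything unrecognized fall back to "stop"
    }
}
```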

Testing instructions

To test the changes:

  • In a workspace's Settings > AI page, configure a Google AI provider by adding a resource with a valid Gemini API key.
  • Open any Script editor (or the workspace AI Chat) and start a conversation using a Google AI / Gemini model.
  • Verify that both streaming responses (real-time token output) and non-streaming responses work correctly.
  • If possible, also test tool calling (e.g., a script with AI agent capabilities).
  • Finally, verify that other providers (OpenAI, Anthropic, Bedrock) still work normally; their code paths should be unaffected by this change.

centdix and others added 2 commits March 3, 2026 16:01
…ht pattern

Replace the worker's `convert_messages_to_gemini` and
`convert_content_to_parts_with_s3` (~130 lines) with the existing
pre-flight pattern: `prepare_messages_for_api` converts S3 objects to
data URLs, then the shared `openai_messages_to_gemini` handles the rest.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
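The data-URL side of this pre-flight pattern can be sketched as below. The function name is hypothetical (the shared helper is parse_data_url, whose exact signature is not shown here), and base64 decoding of the payload, which in practice needs a crate such as `base64`, is omitted; the `data:<mime>;base64,<payload>` structure it splits is the standard data-URL format.

```rust
// Illustrative sketch (hypothetical name): split a base64 data URL
// "data:<mime>;base64,<payload>" into its MIME type and raw base64
// payload. Decoding the payload is left out.

fn split_data_url(url: &str) -> Option<(&str, &str)> {
    let rest = url.strip_prefix("data:")?;
    let (meta, payload) = rest.split_once(',')?;
    let mime = meta.strip_suffix(";base64")?; // only base64-encoded URLs here
    Some((mime, payload))
}
```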
