Improved web search for PR #286 #310
Open
ayuuuvauuu wants to merge 20 commits into badrisnarayanan:main
- Fix Content-Length byte framing bug in MCP python server
- Add logic to read from ~/.claude/settings.json in MCP python server
- Fix web-search virtual model routing to skip standard validation
- Ensure web-search virtual model injects googleSearch tools into Google Generative AI request format
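The commit does not show the framing fix itself. A common form of this bug is computing Content-Length from the character count of the JSON string instead of the encoded byte count, which breaks as soon as a message contains non-ASCII text. A minimal sketch of byte-accurate framing, assuming an LSP-style header; `write_message` and `read_message` are hypothetical names, not functions from this PR:

```python
import io
import json


def write_message(stream, payload: dict) -> None:
    """Write one framed message; the length header counts encoded bytes."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    stream.write(header + body)
    stream.flush()


def read_message(stream) -> dict:
    """Read one framed message from a binary stream."""
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before the header ended")
        if line in (b"\r\n", b"\n"):
            break  # blank line terminates the header section
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    body = stream.read(content_length)  # read bytes, not characters
    return json.loads(body.decode("utf-8"))
```

Both functions expect binary streams (for example `sys.stdin.buffer` and `sys.stdout.buffer`), so the length is always measured in bytes.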
Replaced manual stdout/stdin framing with mcp.server.stdio_server and switched the hardcoded proxy model from web-search to gem-3-flash
- Reduce max_tokens from 1024 to 256 to force concise responses
- Add strict system prompt instructing model to perform exactly ONE search
- Prohibit conversational filler and multi-step reasoning
- Reduces search time from ~3m down to ~5s and saves quota
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Without this field, the proxy defaults to enabling thinking for gemini-3-flash (gemini-3+ always matches isThinkingModel), causing 3-4 minute delays on simple search queries.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Allow clients to explicitly disable thinking by sending
thinking: { type: "disabled" | "none" }. The proxy previously
ignored this and unconditionally enabled thinking for all
Gemini 3+ and Claude thinking models.
Normal coding requests are unaffected since they never send
this flag.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
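The decision described above can be sketched as follows. The proxy's converter is JavaScript (`request-converter.js`); this Python version is for illustration only, and `resolve_thinking` and its `model_supports_thinking` parameter are hypothetical names rather than code from the PR:

```python
def resolve_thinking(request: dict, model_supports_thinking: bool) -> bool:
    """Return True when thinking should be enabled for this request."""
    thinking = request.get("thinking")
    # An explicit opt-out from the client wins over the model default.
    if isinstance(thinking, dict) and thinking.get("type") in ("disabled", "none"):
        return False
    # No flag sent (the normal coding case): keep the previous behaviour.
    return model_supports_thinking
```

Requests that omit the `thinking` field take the second branch, which is why ordinary coding traffic would be unaffected.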
Replace LLM-based search (gemini-3-flash) with direct web search via ddgs library. The old approach routed queries through the proxy model which answered from its 2024 training data cutoff instead of live results, and took 3-4 minutes per query. Now searches are instant, return live results, and use zero proxy tokens.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Returning traceback.format_exc() in tool responses exposes internal file paths and library versions. Log to stderr for debugging, return only the sanitized error message.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
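A minimal sketch of that pattern, assuming the tool handler is an ordinary callable; `run_tool_safely` is a hypothetical wrapper and the exact message format is an assumption, not the PR's wording:

```python
import sys
import traceback


def run_tool_safely(handler, *args):
    """Call a tool handler; on failure, keep the traceback out of the reply."""
    try:
        return handler(*args)
    except Exception as exc:
        # Full traceback goes to stderr, where only the operator sees it.
        traceback.print_exc(file=sys.stderr)
        # The client gets the exception type and message, nothing more.
        return f"Search failed: {type(exc).__name__}: {exc}"
```

Exception messages can themselves contain paths, so a stricter variant might return a fixed string and rely on the stderr log entirely.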
DDGS().text() is synchronous and blocks the async event loop while waiting on HTTP. Wrap in asyncio.to_thread() to run it in a thread pool executor.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
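The wrapping can be sketched like this. To keep the example self-contained, `blocking_search` is a hypothetical stand-in for the synchronous search call and simply sleeps; `asyncio.to_thread` requires Python 3.9 or later:

```python
import asyncio
import threading
import time


def blocking_search(query: str) -> dict:
    """Stand-in for a synchronous HTTP search call (assumption)."""
    time.sleep(0.1)  # simulates waiting on the network
    return {"query": query, "thread": threading.current_thread().name}


async def search(query: str) -> dict:
    # Runs the blocking call in the default thread pool so the event
    # loop stays free to service other MCP requests in the meantime.
    return await asyncio.to_thread(blocking_search, query)


if __name__ == "__main__":
    print(asyncio.run(search("mcp stdio transport")))
```

The returned thread name shows that the blocking call ran off the event loop's thread.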
The tool description claimed "Google Search" but the implementation uses the ddgs (DuckDuckGo) library.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Cap search queries at 500 characters to prevent abuse or accidental oversized inputs before they reach the DuckDuckGo API.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
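The commit does not say whether over-long queries are rejected or truncated. A minimal sketch assuming rejection; `validate_query` and `MAX_QUERY_CHARS` are hypothetical names, and the empty-query check is an addition not mentioned in the commit:

```python
MAX_QUERY_CHARS = 500  # limit stated in the commit message


def validate_query(query: str) -> str:
    """Return the trimmed query, or raise ValueError if it is unusable."""
    cleaned = query.strip()
    if not cleaned:
        # Assumption: empty queries are rejected too.
        raise ValueError("query must not be empty")
    if len(cleaned) > MAX_QUERY_CHARS:
        raise ValueError(
            f"query exceeds {MAX_QUERY_CHARS} characters (got {len(cleaned)})"
        )
    return cleaned
```

Validating before any network call means an oversized input never leaves the MCP server.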
Use ~= instead of >= to prevent breaking changes from major version bumps while still allowing patch and minor updates.
Co-Authored-By: Claude Opus 4.6 <[email protected]>
badrisnarayanan (Owner) requested changes on Mar 21, 2026 and left a comment:
Thanks for the contribution! The MCP SDK rewrite is cleaner than the raw stdio approach in #286, but there are a few issues:
1. DuckDuckGo is a downgrade from Google Search
PR #286 routed queries through Gemini's Google Search grounding, which gives significantly better search results. Swapping that for DuckDuckGo (ddgs) is a regression in the core feature. The whole point of the web-search model was to leverage Google's search quality via the existing proxy infrastructure.
2. Unrelated changes
- `package-lock.json` version bump (2.5.0 → 2.7.7) and the `acc` bin alias don't belong in this PR
- The thinking disable logic in `request-converter.js` is unrelated to web search; it should be a separate PR if needed
3. No setup documentation
How should users configure Claude Code to use this MCP server? A mcpServers config snippet (for ~/.claude/settings.json) would be needed.
4. No tests
Same gap as #286 — at minimum a test that invokes the search tool and verifies the response format.
Add a Python MCP server that provides Google Search grounding via Gemini through the Antigravity Proxy. Uses the official MCP Python SDK for reliable stdio transport.
- Routes queries through gemini-3-flash with Google Search grounding
- Uses minimal thinking budget (budget_tokens: 1) for fast responses
- Validates query length (max 500 chars)
- Logs errors to stderr to avoid leaking to MCP client
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Verify that Google Search grounding via gemini-3-flash works:
- Basic search returns text content with 200 status
- Minimal thinking budget (budget_tokens: 1) produces valid results
- Response format matches Anthropic Messages API shape
Co-Authored-By: Claude Opus 4.6 <[email protected]>
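The PR's tests are not shown here. As a rough sketch of what a response-shape check might look like, the helper below tests for the top-level fields of an Anthropic Messages API response and at least one text block; `is_messages_response` is a hypothetical name and the exact assertions in the PR may differ:

```python
def is_messages_response(payload: dict) -> bool:
    """Check the minimal shape of an Anthropic Messages API response."""
    if payload.get("type") != "message" or payload.get("role") != "assistant":
        return False
    content = payload.get("content")
    if not isinstance(content, list) or not content:
        return False
    has_text = False
    for block in content:
        if not isinstance(block, dict) or "type" not in block:
            return False
        if block["type"] == "text":
            if not isinstance(block.get("text"), str):
                return False
            has_text = True
    # The search tool is only useful if some text came back.
    return has_text
```

A test would call this on the parsed JSON body after first asserting the HTTP status is 200.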
…laude-proxy into feat/web-search
Add native Google Search grounding to the proxy request converter by
separating tools named `google_search` or `googleSearchRetrieval` from
regular function declarations and converting them to Gemini-native
`{ google_search: {} }` entries. This enables any Anthropic-format
request to activate live search grounding on Gemini models.
Rewrite the web search MCP server to use Google Search grounding through
the proxy instead of DuckDuckGo, returning live results with source URLs.
Revert unrelated changes (package-lock.json version bump, thinking
disable logic in request-converter.js) that were introduced in earlier
iterations.
- Proxy separates grounding tools from functionDeclarations in tools array
- MCP server sends google_search tool + budget_tokens: 1 for fast search
- Updated tests to verify grounding returns live data (4/4 passing)
- Added grounding documentation to CLAUDE.md
- Replaced ddgs dependency with requests in requirements.txt
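The tool separation described in this commit can be sketched as follows. The real converter lives in the proxy's JavaScript; this Python version is illustrative only, `convert_tools` is a hypothetical name, and the mapping of `input_schema` to `parameters` is an assumption about how regular tools are translated:

```python
GROUNDING_TOOL_NAMES = {"google_search", "googleSearchRetrieval"}


def convert_tools(anthropic_tools: list) -> list:
    """Split grounding tools from regular tools and emit Gemini-style entries."""
    function_declarations = []
    wants_grounding = False
    for tool in anthropic_tools:
        if tool.get("name") in GROUNDING_TOOL_NAMES:
            # Grounding tools carry no schema; they only switch search on.
            wants_grounding = True
            continue
        function_declarations.append({
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Assumption: the Anthropic input_schema is passed through as-is.
            "parameters": tool.get(
                "input_schema", {"type": "object", "properties": {}}
            ),
        })
    gemini_tools = []
    if function_declarations:
        gemini_tools.append({"functionDeclarations": function_declarations})
    if wants_grounding:
        gemini_tools.append({"google_search": {}})
    return gemini_tools
```

For a request whose only tool is `{"name": "google_search"}`, the result is `[{"google_search": {}}]`. The commit does not state whether grounding and function declarations may be combined in one request.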
Author

Fixed this for #286.