Add Exa as a search engine option in WebSearchTool #2139

Open
10ishq wants to merge 3 commits into huggingface:main from 10ishq:add-exa-search-engine

@10ishq 10ishq commented Mar 31, 2026

Summary

Adds "exa" as a new engine parameter value for WebSearchTool, following the same pattern as the existing "bing" engine added in #1313.

Changes

  • Added search_exa() method to WebSearchTool class (~35 lines, 1 file)
  • Uses the Exa REST API directly via requests — no new dependencies
  • Returns search results with highlights as descriptions
  • Requires EXA_API_KEY environment variable
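
As a rough illustration of the changes listed above, the key check and the result mapping might look like the sketch below. This is not the code from the PR: `require_exa_key` and `exa_results_to_entries` are hypothetical helper names, and the `results`/`title`/`url`/`highlights` response fields and the `title`/`link`/`description` output keys are assumptions about the Exa payload and the existing engine pattern.

```python
import os


def require_exa_key(env=None) -> str:
    """Return the Exa API key, or raise ValueError if EXA_API_KEY is unset."""
    env = os.environ if env is None else env
    api_key = env.get("EXA_API_KEY")
    if not api_key:
        raise ValueError("EXA_API_KEY environment variable is not set.")
    return api_key


def exa_results_to_entries(payload: dict) -> list:
    """Map an (assumed) Exa response payload to title/link/description dicts."""
    entries = []
    for result in payload.get("results") or []:
        entries.append(
            {
                "title": result.get("title") or "",
                "link": result.get("url") or "",
                # Highlights are joined into one description string.
                "description": " ".join(result.get("highlights") or []),
            }
        )
    return entries
```

Keeping the parsing separate from the HTTP call makes it easy to unit test without mocking the network, which is one possible design choice rather than what the PR necessarily does.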

Usage

from smolagents import WebSearchTool

tool = WebSearchTool(engine="exa")
results = tool("latest AI research papers")
print(results)

Tests

Added 4 unit tests with mocked API calls:

  • test_exa_missing_api_key — raises ValueError when EXA_API_KEY is not set
  • test_exa_search_results — verifies correct parsing and markdown output
  • test_exa_no_results — raises exception on empty results
  • test_exa_empty_highlights — gracefully handles missing highlights

All existing tests continue to pass.

Add 'exa' as a new engine parameter value for WebSearchTool, following
the same pattern as the existing 'bing' engine. Uses the Exa REST API
directly with requests (no extra dependencies beyond what's already
required). Returns search results with highlights as descriptions.

Requires EXA_API_KEY environment variable. Includes x-exa-integration
header for usage tracking.

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

10ishq commented Apr 1, 2026

Hey @albertvillanova — would love your review on this when you get a chance! This follows the same pattern as the Bing engine addition in #1313 (which @aymeric-roucher reviewed). Happy to address any feedback.

@VANDRANKI VANDRANKI left a comment

Clean implementation that follows the existing engine pattern correctly. Two issues worth fixing before merge.

Blocker: missing timeout on requests.post

Every other search method in this file is being updated to add timeout=30 in PR #2140 (currently open). search_exa is missing the same timeout. Without it, a hanging Exa endpoint will block the agent indefinitely with no way to recover. The fix is one line:

response = requests.post(
    "https://api.exa.ai/search",
    headers={...},
    json={...},
    timeout=30,
)
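
To show what the timeout buys the caller: with `requests`, a call without `timeout` can wait indefinitely, while a call with one raises an exception that can be handled. The sketch below avoids a real network call by injecting the HTTP function. `post_with_timeout` and `fake_post` are hypothetical names, the built-in `TimeoutError` stands in for `requests.exceptions.Timeout`, and wrapping the error in `RuntimeError` is an illustrative choice, not something the PR does.

```python
def post_with_timeout(post, url, timeout=30, **kwargs):
    """Call `post` (for example requests.post) with a timeout and report a timeout clearly."""
    try:
        return post(url, timeout=timeout, **kwargs)
    except TimeoutError as exc:
        raise RuntimeError(f"Search request to {url} timed out after {timeout}s") from exc


def fake_post(url, timeout=None, **kwargs):
    """Simulate an endpoint that never answers within the timeout."""
    if timeout is None:
        # A real client would block here forever; the fake refuses instead.
        raise AssertionError("called without a timeout")
    raise TimeoutError("simulated timeout")


try:
    post_with_timeout(fake_post, "https://api.exa.ai/search", timeout=5, json={"query": "x"})
except RuntimeError as err:
    print(err)
```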

Bug: highlights can be None, not just absent

The result description is built with:

"description": " ".join(result.get("highlights", []))

The Exa API returns "highlights": null for results where highlight generation fails or was not requested, rather than omitting the key entirely. result.get("highlights", []) returns None in that case, and " ".join(None) raises TypeError. The safe form is:

"description": " ".join(result.get("highlights") or [])

The or [] coerces both a missing key and an explicit null to an empty list.
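
A small standalone comparison of the two forms, using hypothetical helper names and hand-written result dicts rather than real Exa responses:

```python
def join_highlights_unsafe(result: dict) -> str:
    """The default in .get() only applies when the key is absent."""
    return " ".join(result.get("highlights", []))


def join_highlights_safe(result: dict) -> str:
    """`or []` also replaces an explicit None with an empty list."""
    return " ".join(result.get("highlights") or [])


for result in ({}, {"highlights": None}, {"highlights": ["a", "b"]}):
    try:
        unsafe = repr(join_highlights_unsafe(result))
    except TypeError:
        unsafe = "TypeError"
    print(result, "unsafe:", unsafe, "safe:", repr(join_highlights_safe(result)))
```

Only the middle case differs: the unsafe form raises `TypeError`, while the safe form returns an empty string.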

Note on test coverage

The test suite covers the three main paths cleanly (missing key, successful results, empty results). One edge case worth adding: a result where highlights is null in the response to confirm the or [] fix holds. This would prevent the bug from regressing silently.


Aside from the two issues above, the implementation is straightforward and the x-exa-integration: smolagents header is a nice touch for API attribution.

- Add timeout=30 to requests.post in search_exa() to prevent hanging
- Use 'or []' to safely handle highlights being null vs absent
- Add test_exa_null_highlights to cover the null highlights edge case

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

10ishq commented Apr 10, 2026

Thanks for the thorough review @VANDRANKI! All three points addressed in the latest commit — added timeout=30, fixed the null highlights handling with or [], and added a dedicated test for it.

@VANDRANKI

All three points addressed:

  • timeout=30 added to the requests.post call.
  • highlights or [] handles the null case correctly.
  • Dedicated test for highlights: None covers the edge case.

LGTM.


10ishq commented Apr 13, 2026

Hey @VANDRANKI, thanks so much for the review! Can you please approve and merge? That'd be great.

Thanks again for your time.


@VANDRANKI VANDRANKI left a comment

All points addressed. LGTM.

@VANDRANKI

Approved. I'm not a maintainer on this repo so I can't merge directly — a maintainer will need to do that. You can tag @aymeric-roucher or another maintainer to get eyes on it.


10ishq commented Apr 14, 2026

Hey @aymeric-roucher I would really appreciate it if you could help merge the PR.

Thanks.


10ishq commented Apr 17, 2026

Gentle bump — this has been reviewed and approved, just needs a maintainer merge. @julien-c @merveenoyan would either of you be able to take a look? It's a small addition (~35 lines) following the existing Bing engine pattern from #1313. Thanks!

@VANDRANKI

Adding a note as the reviewer: the implementation is clean and all feedback was addressed promptly. Follows the Bing engine pattern closely, includes proper timeout handling, null-safe highlights, and 5 tests covering the edge cases. Ready to merge from a code standpoint.

Use getattr(self, 'timeout', 30) instead of hardcoded timeout=30
so search_exa respects self.timeout when the configurable timeout
parameter is added to WebSearchTool (see huggingface#2198).

Co-Authored-By: Tanishq Jaiswal <tanishq.jaiswal97@gmail.com>

10ishq commented Apr 23, 2026

Updated the timeout handling based on feedback from #2198: search_exa now uses getattr(self, "timeout", 30), so it will respect a configurable self.timeout when that lands, while defaulting to 30s in the meantime.
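
The `getattr` fallback can be tried in isolation. The classes below are hypothetical stand-ins, not `WebSearchTool` itself:

```python
class ToolWithoutTimeout:
    """Stand-in for a tool that does not define `timeout` yet."""

    def effective_timeout(self):
        # Falls back to 30 only when the attribute is absent.
        return getattr(self, "timeout", 30)


class ToolWithTimeout(ToolWithoutTimeout):
    """Stand-in for a tool once a configurable timeout exists."""

    def __init__(self, timeout):
        self.timeout = timeout


print(ToolWithoutTimeout().effective_timeout())
print(ToolWithTimeout(5).effective_timeout())
```

One caveat with this form: the default applies only when the attribute is missing, so an instance with `timeout = None` would pass `None` through, which `requests` treats as no timeout at all.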

@albertvillanova would really appreciate a quick look when you have a moment — it's a small, self-contained addition with tests and an approval already in place. Happy to address any concerns. Thanks!


10ishq commented Apr 24, 2026

Hey @julien-c thanks so much for your review!

Would be amazing if you or @VANDRANKI were able to merge this.
