Skip to content

docs: update detection core with tips for using Gemini integration#1925

Merged
Borda merged 5 commits intoroboflow:developfrom
tberends:docs/update-detection-gemini
Feb 4, 2026
Merged

docs: update detection core with tips for using Gemini integration#1925
Borda merged 5 commits intoroboflow:developfrom
tberends:docs/update-detection-gemini

Conversation

@tberends
Copy link
Copy Markdown
Contributor

@tberends tberends commented Aug 2, 2025

Description

On request of @SkalskiP at PR: https://github.com/roboflow/notebooks/pull/384

This PR improves the documentation regarding the ordering of content in requests that combine images with text prompts. Following Google's Gemini API best practices, text prompts are now placed after image parts in the contents array when using a single image with text.

Type of change

  • This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

According to the Gemini API documentation on image prompts, when using a single image with text, the recommended approach is to place the text prompt after the image part in the contents array. This ordering has been shown to produce significantly better results in practice.

In our testing with Process & Instrument Diagrams (P&IDs) using object detection, this reordering led to drastically improved accuracy in bounding box positioning. While the object labels were already accurate, the spatial precision of detected elements improved considerably with the optimized prompt ordering

Docs

  • Docs updated? What were the changes: updated the tips for prompt engineering

@tberends tberends requested a review from SkalskiP as a code owner August 2, 2025 11:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Detections.from_lmm() / Detections.from_vlm() documentation to add Gemini-specific prompt engineering tips, emphasizing best-practice ordering for mixed image+text requests.

Changes:

  • Added a “Prompt engineering” tip block for Gemini 2.0 describing normalized box coordinates and recommended prompt guidance.
  • Added a note for Gemini 2.5 about placing the text prompt after the image part in the contents array for single-image prompts.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72%. Comparing base (0ebab21) to head (47c0773).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff           @@
##           develop   #1925   +/-   ##
=======================================
  Coverage       72%     72%           
=======================================
  Files           61      61           
  Lines         7245    7245           
=======================================
  Hits          5242    5242           
  Misses        2003    2003           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Borda Borda merged commit 5c5bfe5 into roboflow:develop Feb 4, 2026
24 checks passed
@Borda Borda added the documentation Improvements or additions to documentation label Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants