
Support SweRank code re-ranking#338

Open
aaryanshroff wants to merge 19 commits into castorini:main from aaryanshroff:swerank-support

Conversation


@aaryanshroff aaryanshroff commented Feb 4, 2026

Pull Request Checklist

Reference Issue

Please provide a reference to the issue this PR is addressing (# followed by the issue number). If there is no associated issue, write "N/A".

ref:

Checklist Items

Before submitting your pull request, please review these items:

  • Have you followed the contributing guidelines?
  • Have you verified that there are no existing Pull Requests for the same update/change?
  • Have you updated any relevant documentation or added new tests where needed?

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no API changes)
  • Documentation content changes
  • Reproduction logs
  • Other...
    • Description:

Raptors65 and others added 18 commits November 5, 2025 12:07
Add get_content() to the Candidate dataclass to centralize the doc key
lookup logic (text/segment/contents/content/body/passage). Update
_convert_doc_to_prompt_content and _create_prompt_code to use it,
fixing a bug where the CODE path called doc.get("content") and missed
the "contents" key.

Remove get_content() from Candidate and revert _convert_doc_to_prompt_content
to taking doc: Dict directly. Fix the only real bug: the final else
branch now uses doc.get("passage", "") instead of doc["passage"] to avoid
a KeyError on malformed docs.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
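The fix described in the commit message above can be sketched standalone. This is a hypothetical reconstruction, not RankLLM's actual method: the function name, signature, and word-budget default are illustrative assumptions; only the key-lookup order and the doc.get("passage", "") fallback come from the commit message.

```python
from typing import Dict


# Hypothetical sketch of the doc-key fallback chain described above;
# the real _convert_doc_to_prompt_content in RankLLM has more context.
def convert_doc_to_prompt_content(doc: Dict, max_length: int = 300) -> str:
    if "text" in doc:
        content = doc["text"]
    elif "segment" in doc:
        content = doc["segment"]
    elif "contents" in doc:
        content = doc["contents"]
    elif "content" in doc:
        content = doc["content"]
    elif "body" in doc:
        content = doc["body"]
    else:
        # The actual fix: doc.get with a default instead of doc["passage"],
        # so a malformed doc yields "" rather than raising KeyError.
        content = doc.get("passage", "")
    # Truncate to a whitespace-token budget.
    return " ".join(content.split()[:max_length])
```

With this shape, a doc missing every known key degrades gracefully to an empty prompt snippet instead of crashing the rerank run.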
TEMPLATES = files("rank_llm.rerank.prompt_templates")


class RerankType(Enum):
Contributor Author

SweRank uses a different prompt truncation/creation algorithm than RankLLM. This enum lets us choose which one to use based on the task.
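Given that comment, the enum might look like the following sketch. The member names and values here are illustrative assumptions, not the PR's actual definitions; only the existence of a RerankType enum selecting between the two prompt algorithms is from the diff.

```python
from enum import Enum


class RerankType(Enum):
    # Illustrative members: the PR distinguishes RankLLM's original
    # prompt truncation/creation algorithm from SweRank's.
    TEXT = "text"  # RankLLM-style prompt creation (passages)
    CODE = "code"  # SweRank-style prompt creation (code snippets)
```

A caller would then dispatch on the task, e.g. `RerankType("code")` for SweRank-style code re-ranking.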

@@ -0,0 +1,11 @@
method: "singleturn_listwise"
Contributor Author

From Lucas's PR: #323

token_str = " > ".join([f"[{i+1}]" for i in range(current_window_size)])

- _output_token_estimate = len(self._tokenizer.encode(token_str)) - 1
+ _output_token_estimate = len(self._tokenizer.encode(token_str)) + 2
Contributor Author

SweRank output was getting interrupted, leading to nonsense results
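For context, the string whose token count is being estimated can be reproduced standalone (the window size of 4 is an arbitrary example, and the real estimate also depends on the model's tokenizer):

```python
# Rebuild the expected-output permutation string from the diff above
# for an example window of 4 candidates. The reranker budgets output
# tokens from this string's tokenized length, and the PR pads the
# estimate (+2 rather than -1) so generation is not cut off before
# the full ranking is produced.
current_window_size = 4
token_str = " > ".join([f"[{i+1}]" for i in range(current_window_size)])
print(token_str)  # [1] > [2] > [3] > [4]
```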

else:
return re.sub(r"\[(\d+)\]", r"(\1)", s)

def _create_prompt_code(

- return messages, self.get_num_tokens(messages)
+ return prompt, self.get_num_tokens(prompt)

def _create_prompt_text(
Contributor Author

Old / RankLLM-style truncation algorithm
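That "old" truncation style can be sketched generically. Everything here is an assumption for illustration (function name, step size, signature); the idea is just: shrink the per-passage budget until the assembled prompt fits the context window.

```python
from typing import Callable


# Hypothetical sketch of RankLLM-style prompt truncation, not the
# library's actual API: rebuild the prompt with a smaller per-passage
# word budget until it fits within max_tokens.
def fit_prompt(
    make_prompt: Callable[[int], str],
    count_tokens: Callable[[str], int],
    max_tokens: int,
    start_len: int = 300,
) -> str:
    length = start_len
    prompt = make_prompt(length)
    while count_tokens(prompt) > max_tokens and length > 0:
        length -= 10  # assumed shrink step
        prompt = make_prompt(length)
    return prompt
```

SweRank's algorithm (the new path in this PR) builds and truncates the prompt differently, which is why the RerankType switch exists.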

@ronakice ronakice marked this pull request as ready for review February 19, 2026 19:42
@ronakice
Member

@claude review

@claude

claude bot commented Feb 19, 2026

Claude Code is working…

I'll analyze this and get back to you.


@aaryanshroff aaryanshroff changed the title Swerank support Support SweRank code re-ranking Feb 19, 2026
@ronakice
Member

ronakice commented Mar 3, 2026

@codex review

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect it to GitHub.


3 participants