Add codes for LLM4Chem dataset #13

Open
coldchair wants to merge 1 commit into SciEval:develop from coldchair:llm4chem

@coldchair

Add support for the LLM4Chem dataset and its subtasks

This PR adds dataset integration for LLM4Chem (https://github.com/OSU-NLP-Group/LLM4Chem) with support for the following subtasks:

'forward_synthesis',
'retrosynthesis',
'molecule_captioning',
'molecule_generation',
'name_conversion-i2f',
'name_conversion-i2s',
'name_conversion-s2f',
'name_conversion-s2i',
'property_prediction-esol',
'property_prediction-lipo',
'property_prediction-bbbp',
'property_prediction-clintox',
'property_prediction-hiv',
'property_prediction-sider',
'retrosynthesis_uspto50k'

Notes:

  • retrosynthesis_uspto50k comes from otori-bird/retrosynthesis (https://github.com/otori-bird/retrosynthesis) and is integrated with the other retrosynthesis tasks.
  • Any content enclosed in <think></think> tags is removed.
  • Some metrics currently support only top-1; top-k support is not yet implemented (it depends mainly on framework support).
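
For reviewers, the <think></think> removal mentioned above could look roughly like the sketch below. This is only an illustration, not the code in this PR: the helper name `strip_think` and the choice to leave an unclosed `<think>` tag untouched are assumptions.

```python
import re

# Hypothetical helper for illustration; the PR's actual implementation may differ.
# DOTALL lets "." match newlines, so multi-line reasoning blocks are covered.
# The non-greedy ".*?" keeps separate <think> blocks from being merged into one match.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove every <think>...</think> span and trim surrounding whitespace."""
    return _THINK_BLOCK.sub("", text).strip()
```

With this sketch, `strip_think("<think>reasoning</think> CCO")` yields `"CCO"`. An unclosed `<think>` tag is left as-is, which is an assumption; truncated outputs may need different handling.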

@Geniusyingmanji changed the base branch from main to develop on November 17, 2025, 09:24