Merged
gasvn added a commit that referenced this pull request Mar 27, 2026
New scripts (pure stdlib, no scipy needed):
- degrees_of_unsaturation.py: DoU from formula with structural interpretation
- epidemiology.py: R0/herd immunity, NNT, diagnostic stats, Bayesian post-test
- stat_tests.py: chi-square, Fisher exact, linear regression (all pure Python)

5 remaining skills improved:
- clinical-data-integration: ICD-10/SNOMED harmonization reasoning
- electron-microscopy: TEM/cryo-EM/SEM modality selection
- functional-genomics-screens: growth vs reporter screen interpretation
- protein-therapeutic-design: binding surface → modality selection
- multiomic-disease-characterization: cross-layer concordance reasoning

12 bundled scripts total.
V10 test: 5/51 (9.8%), a 67% improvement over V6.
PR #153: 68 commits.
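The quantities named in this commit message (degrees of unsaturation, herd immunity threshold, NNT) follow standard textbook formulas, so a stdlib-only sketch is short. The code below is an illustration under stated assumptions and is not the content of degrees_of_unsaturation.py or epidemiology.py: the function names are hypothetical, the formula parser handles only flat formulas such as C6H6 (no parentheses, charges, or hydrates), and divalent atoms such as O and S are ignored because they do not change the count.

```python
import re
from collections import Counter

HALOGENS = ("F", "Cl", "Br", "I")


def parse_formula(formula: str) -> Counter:
    """Count atoms in a flat molecular formula such as 'C5H5N' (hypothetical helper)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def degrees_of_unsaturation(formula: str) -> float:
    """DoU = (2C + 2 + N - H - X) / 2, where X is the total halogen count."""
    atoms = parse_formula(formula)
    halogens = sum(atoms[e] for e in HALOGENS)
    return (2 * atoms["C"] + 2 + atoms["N"] - atoms["H"] - halogens) / 2


def herd_immunity_threshold(r0: float) -> float:
    """Fraction that must be immune to stop sustained spread: 1 - 1/R0."""
    if r0 <= 1:
        return 0.0  # an outbreak with R0 <= 1 dies out without added immunity
    return 1 - 1 / r0


def number_needed_to_treat(control_event_rate: float, treated_event_rate: float) -> float:
    """NNT = 1 / absolute risk reduction."""
    arr = control_event_rate - treated_event_rate
    if arr <= 0:
        raise ValueError("treatment shows no absolute risk reduction")
    return 1 / arr


if __name__ == "__main__":
    print(degrees_of_unsaturation("C6H6"))   # benzene: one ring plus three C=C
    print(herd_immunity_threshold(4.0))
    print(number_needed_to_treat(0.20, 0.15))
```

A DoU of 4 for benzene corresponds to one ring plus three double bonds, which is the kind of structural interpretation the commit message refers to.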
Skills (114 total):
- Rewrite 80+ skills as reasoning guides (not reference tables)
- Add LOOK UP DON'T GUESS and COMPUTE DON'T DESCRIBE across all skills
- Add new skills: data-wrangling (24 domain API patterns), dataset-discovery, epidemiological-analysis, data-integration-analysis, ecology-biodiversity, inorganic-physical-chemistry, plant-genomics, vaccine-design, stem-cell, lipidomics, non-coding-RNA, aging-senescence
- Add Programmatic Access sections to 6 domain skills (TCGA, GWAS, spatial-transcriptomics, variant-to-mechanism, binder-discovery, clinical-trials)
- Generalize all analysis skills to be data-source-agnostic
- Add progressive disclosure: references/ for specialized domains
- Improve skill descriptions for better triggering

Tools (31 new):
- RGD (4 tools), T3DB toxins, IEDB MHC binding prediction
- 11 scientific calculator tools (DNA translate, molecular formula, equilibrium solver, enzyme kinetics, statistics, etc.)
- AgingCohort_search (28+ longitudinal cohort registry)
- NHANES_download_and_parse (XPT download + parse + age filter)
- DataQuality_assess (missingness, outliers, correlations)
- MetaAnalysis_run (fixed/random effects, I-squared, Q-test)
- 4 dataset discovery tools (re3data, Data.gov, OpenAIRE, DataCite)

Bug fixes:
- Fix 50+ tool name references across skills
- Fix NHANES search (dynamic CDC catalog query, not hardcoded keywords)
- Fix tool return envelopes (Unpaywall, MyGene, HPA, EuropePMC)
- Fix STRING, OpenTargets, ENCODE, Foldseek, STITCH, BridgeDb
- Fix BindingDB test for broken API detection

Router:
- Add MC elimination strategy, batch processing protocol
- Add 20+ bundled computation scripts
- Route to all 114 skills

Version bumped to 1.1.11
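For readers unfamiliar with the statistics that MetaAnalysis_run is described as reporting (fixed/random effects, I-squared, Q-test), the sketch below shows how they relate. It is not the tool's implementation: the function name and return keys are hypothetical, and the random-effects step assumes the DerSimonian-Laird estimator for the between-study variance, which the commit message does not specify.

```python
def pool_effects(effects, variances):
    """Inverse-variance pooling with Cochran's Q, I-squared, and a
    DerSimonian-Laird random-effects estimate (assumed estimator)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect model: weight each study by 1 / variance.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I-squared: share of total variation attributed to heterogeneity, in percent.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance tau^2, truncated at zero.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects model: add tau^2 to every study's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)

    return {"fixed": fixed, "random": random, "q": q, "df": df,
            "i_squared": i_squared, "tau2": tau2}


if __name__ == "__main__":
    print(pool_effects([0.0, 1.0], [0.01, 0.01]))
```

When the studies agree, Q stays near its degrees of freedom, tau^2 truncates to zero, and the random-effects estimate collapses to the fixed-effect one.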
7e744e5 to
c21b6a4
Compare
d33disc added a commit to d33disc/upstream-tooluniverse that referenced this pull request Apr 8, 2026
…31 new tools)

Merges mims-harvard#153 (2bf5198) and version bump to 1.1.11 (4d66869) from upstream. Resolves 11 conflicts while preserving every fork-specific customization.

Upstream additions:
- 31 new tools: RGD, T3DB, IEDB MHC binding, 11 scientific calculators (DNA translate, molecular formula, equilibrium solver, stats, etc), AgingCohort_search, NHANES parser, DataQuality_assess, MetaAnalysis, 4 dataset discovery tools (re3data, Data.gov, OpenAIRE, DataCite)
- Skills rewrite: 114 skills as reasoning guides, 12 new domain skills, new data-wrangling/dataset-discovery/epidemiology/vaccine-design
- Router: MC elimination, batch processing, 20+ computation scripts
- server.json bumped to 1.1.11

Fork customizations preserved:
- semantic_scholar_tool.py DVS-FORK-PATCH block intact (1.05s rate limit for personal-tier S2 API keys + SEMANTIC_SCHOLAR_MIN_INTERVAL override). Upstream-added tldr/fieldsOfStudy fields integrated around the patch block, not through it.
- test_semantic_scholar_tool_resilience.py: upstream's envelope-format assertions merged with fork's 4 DVS-FORK-PATCH rate-limit tests. Net: the fork's previously-broken list-vs-dict tests now pass.
- agentic_tool.py API_KEY_ENV_VARS untouched (still CLAUDE_CLI + OLLAMA only - upstream did not modify this file).
- restful_tool.py MonarchTool: kept fork's _CLOSURE_KEYS stripping (renamed _clean -> _strip_closure for clarity), integrated with upstream's remove_none_and_empty_values + envelope wrapping. Both behaviors compose cleanly.
- SEC EDGAR tools (data JSON + 2 stubs + default_config entry) preserved alongside upstream's new iedb_prediction / popgen / epidemiology / scientific_calculator registry entries.
- skills/tooluniverse symlink to upstream_skills restored (fork PR #20 made upstream_skills the source of truth; upstream's inline SKILL.md rejected).

Data JSON conflicts (8 files):
- admetai_tools: accepted upstream's oneOf smiles parameter schema (accepts string OR array), kept fork's richer test_examples.
- datacite_tools: adopted upstream's DataCiteRESTTool type fix and appended upstream's 2 new tools (DataCite_search_datasets, DataCite_get_dataset). Kept fork's unwrapped return_schemas from PR #17 for the 2 existing tools since cli.py validates against result.get("data"), not the envelope.
- ena_portal, rcsb_advanced_search, wikipathways: kept fork's simplified return_schemas, added upstream's new aliases/label fields.
- iedb_tools: scripted merge preserving fork's antigen_uniprot parameter + shorthand filter alongside upstream's IEDB_search_epitopes alias.
- mgi_tools: kept fork's array return_schema (primary_key/name_key names match the Alliance API response shape).
- unpaywall_tools: kept fork's test_examples for check_oa_status, appended upstream's new Unpaywall_get_full_text_url tool verbatim.

Regenerated derived files:
- src/tooluniverse/tools/*.py (2291 stubs via scripts/build_tools.py)
- src/tooluniverse/_lazy_registry_static.py (584 tool classes, +22 from upstream's new tools)
- src/tooluniverse/data/skills_catalog.json (126 skill entries)
- Ran ruff format on src/tooluniverse/ to reformat regenerated stubs

Validation:
- ruff check . -> clean
- ruff format --check src/tooluniverse/ -> clean
- pytest tests/unit/ tests/integration/ (877 tests) -> 0 failures
- Sanity: DVS-FORK-PATCH sentinels present, API_KEY_ENV_VARS == {CLAUDE_CLI, OLLAMA}, SEC EDGAR tools present, rate-limit patch slept 1.046s under 1.0s threshold.

The pre-existing environmental failure test_compose_tool::test_external_file_tools (nested Claude CLI unavailable) and 3 test_mcp_protocol tests (require .venv/bin on PATH) both fail identically on origin/main and are unrelated to this merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
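The commit message mentions a 1.05s rate limit with a SEMANTIC_SCHOLAR_MIN_INTERVAL override but does not show the patch itself. A minimal sketch of how such a minimum-interval limiter could work is given below. Only the environment variable name and the 1.05s default come from the message; the class, the helper function, and the fallback behaviour for invalid values are assumptions made for illustration, not the DVS-FORK-PATCH code.

```python
import os
import time

DEFAULT_MIN_INTERVAL = 1.05  # seconds, the default quoted in the commit message


def resolve_min_interval(env=None) -> float:
    """Read SEMANTIC_SCHOLAR_MIN_INTERVAL, falling back to the default
    when it is unset, non-numeric, or negative (assumed behaviour)."""
    env = os.environ if env is None else env
    raw = env.get("SEMANTIC_SCHOLAR_MIN_INTERVAL")
    if raw is None:
        return DEFAULT_MIN_INTERVAL
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_MIN_INTERVAL
    return value if value >= 0 else DEFAULT_MIN_INTERVAL


class MinIntervalLimiter:
    """Hypothetical limiter: enforce a minimum gap between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock    # injectable so tests need not really sleep
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        waited = 0.0
        if self._last is not None:
            remaining = self._min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last = self._clock()
        return waited


if __name__ == "__main__":
    limiter = MinIntervalLimiter(resolve_min_interval())
    for _ in range(2):
        limiter.wait()  # the second call sleeps for roughly the configured interval
```

Using a monotonic clock rather than wall-clock time keeps the gap measurement stable if the system clock is adjusted between requests.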
Summary
Major update: 7 new tools, 107 skills with reasoning frameworks, 17+ bundled computation scripts, and comprehensive tool/skill quality improvements.
New Tools (7)
- EuropePMC_get_full_text: retrieve full-text XML for open-access PMC articles
- SemanticScholar_get_paper: now includes TLDR AI summary + fields of study
- Unpaywall_get_full_text_url: extract open-access PDF/HTML URLs from DOIs
- BRENDA_get_enzyme_kinetics: Km, kcat, Ki lookup by EC number via SABIO-RK
- SABIO_RK_search_reactions: enzyme kinetics data search
- COD_search_structures: Crystallography Open Database crystal structure search
- COD_get_structure: crystal structure detail retrieval
Skill Improvements
Key Reasoning Frameworks Added
Bundled Computation Scripts (17)
popgen_calculator, translate_dna, sequence_tools, iv_drip_rate, molecular_formula, fluid_calculations, degrees_of_unsaturation, epidemiology, stat_tests, herd_immunity, radioactive_decay, burn_fluids, equilibrium_solver, env_risk_assessment, enzyme_kinetics, chemistry_facts, biology_facts
Tool Quality Fixes
Version
1.1.10 → 1.1.11
Test plan
tu test): 8/8 passed