MPC working#136

Open
grpinto wants to merge 764 commits into lueckenlab:main from grpinto:main

@grpinto

@grpinto grpinto commented May 4, 2026

Gonçalo Pinto :

  • Going to work on MCPs for Patpy

VladimirShitov and others added 30 commits September 13, 2025 01:32
…entation_tutorial

- Add the tutorials for the GloScope reimplementation
- Fix R implementation of GloScope
benjaminfreyuu and others added 12 commits May 4, 2026 16:55
Adds a markdown skill library (Claude Code per-folder SKILL.md layout) that
mirrors the public API surface (pp / tl / pl / datasets), so coding agents
can use patpy without rediscovering the API from source each session.

Pure additive change: no module behavior is altered. Skills ship with the
wheel via the default hatchling include for files under src/patpy/.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Adds a BioContextAI Registry-compatible MCP server that exposes
dataset search and download from CellxGene Discover so any
MCP-capable agent (Claude Desktop, Cursor, mcp-cli + Ollama, ...)
can fetch single-cell datasets by disease, tissue, or ontology
term. The server complements (instead of duplicating) the existing
MaxMLang/cxg-census-mcp and biocontext-ai/anndata-mcp registry
servers.

- New src/patpy/mcp/ package with a multi-source plugin
  architecture (DataSource protocol) and a CellxGene Discover REST
  source with pagination, retries, 24 h on-disk index cache, and
  streaming download with SHA-256 verification.
- New mcp/ directory with a schema-validated meta.yaml ready to be
  PR'd into biocontext-ai/registry, plus a Dockerfile and
  .dockerignore.
- Documentation in docs/mcp.md (linked from docs/index.md and a
  new section in README.md) covering install, agent config
  snippets, and a chaining recipe with cxg-census-mcp and
  anndata-mcp.
- Test suite under tests/: server tool registration check,
  twelve discover-client unit tests with a hand-rolled fake
  session, and registry-schema validation against the upstream
  schema.json (committed offline as a fixture).

Co-authored-by: Cursor <[email protected]>
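
The "streaming download with SHA-256 verification" step in the commit message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in src/patpy/mcp/: the function name `download_with_sha256`, the `.part` temporary suffix, and the choice to raise `ValueError` on mismatch are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def download_with_sha256(chunks: Iterable[bytes], dest: Path, expected_sha256: str) -> Path:
    """Write chunks to disk while hashing them; keep the file only if the digest matches.

    Hypothetical helper for illustration; not part of patpy.
    """
    digest = hashlib.sha256()
    # Write to a temporary sibling file so a failed download never leaves
    # a partial or corrupt file at the final path.
    tmp = dest.with_name(dest.name + ".part")
    with tmp.open("wb") as fh:
        for chunk in chunks:
            digest.update(chunk)  # hash incrementally, chunk by chunk
            fh.write(chunk)
    if digest.hexdigest() != expected_sha256.lower():
        tmp.unlink()  # discard the unverified payload
        raise ValueError(f"SHA-256 mismatch for {dest.name}")
    tmp.replace(dest)  # rename into place once the digest is verified
    return dest
```

In a real client the `chunks` iterable would typically come from an HTTP response stream (for example `requests.Response.iter_content`), so the file is never held in memory in full.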
@VladimirShitov
Member

Thanks, looks interesting, I'll review the content soon. But the main question is: how do we keep it relevant? Could you add the prompts that you used to generate it, or come up with a way to update the skills when the code changes? It would be cool to have a GitHub action, backed by Claude, that updates the skills on pull requests.

For more effective prompting, point the agent to CHANGELOG.md.

@grpinto
Author

grpinto commented May 5, 2026

Will do it today, thanks Vlad

@VladimirShitov
Member

Claude says that these skills will be available to patpy developers, but not to users.

image

Should we also create and publish a plugin? It looks super easy, but it would be helpful to have a clear workflow for keeping it updated and for what needs to be done on each new package release.

@grpinto
Author

grpinto commented May 5, 2026

Sounds good. I am trying to finish fetching datasets from CellxGene.

grpinto and others added 5 commits May 5, 2026 10:30
…caches

Aligns the mcp/ subproject with biocontext-ai/mcp-server-cookiecutter and
keeps test caches out of git going forward.

mcp/ subproject (cookiecutter alignment):
- Add mcp/mcp.json (BioContextAI Registry client snippet, mirrors
  biocontext-ai-anndata-mcp and MaxMLang-cxg-census-mcp siblings).
- Add .github/workflows/test-patpy-mcp.yaml and build-patpy-mcp.yaml
  so mcp/tests/ runs on every PR (was previously only at release-tag time)
  and a broken mcp/pyproject.toml is caught before tagging. Both are
  path-filtered to mcp/** so they don't run on parent-only changes.
- Enrich mcp/README.md with the cookiecutter's four canonical install
  patterns (uvx PyPI / uvx git / uvx local / pip) plus copyable
  mcp.json snippets, and document the two-file registry submission.

Skills (correct API divergences surfaced by an end-to-end run):
- sample_representation/SKILL.md: Pseudobulk's cell_group_key is a
  required positional arg (not optional); document that explicitly with
  an inline TypeError example, and update the minimal example to pass it.
- supervised_methods/SKILL.md: PULSAR is zero-shot inference using a
  pretrained HuggingFace model on UCE foundation-model embeddings, not
  a trainable classifier on PCA features. Split the "common signature"
  into trainable (MixMIL/PaSCient) vs zero-shot (PULSAR), document
  layer="X_uce", device="cuda", and the HF-model download. Add the
  mixmil + torch-scatter install gotcha.

Repo hygiene:
- Tighten .gitignore: ignore .cache-*/, .venv-*/, /outputs/,
  /run_*.py, /.cellxgene-test-target.json, and /mcp/dist/. Prevents the
  patpy-mcp test cache and any future scratch from being committed.
- Remove already-committed scratch from HEAD: 62.7 MB Slide-seq h5ad,
  9.5 MB CellxGene catalog dump, the test-target sidecar, and the
  /.cellxgene-test-target.json probe file. Historical blobs will be
  evicted from .git/objects via a follow-up git filter-repo run.

AGENTS.md: document the new mcp.json + workflows so future agents see
them in the layout tree.

Co-authored-by: Cursor <[email protected]>
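
The `cell_group_key` correction described in the Skills section of this commit message (a required positional argument, with a TypeError if it is omitted) can be illustrated with a stand-in class. This is only a sketch: `PseudobulkStandIn`, `sample_key`, `layer`, and the column names `donor_id` and `cell_type` are hypothetical placeholders, and the real `Pseudobulk` signature in patpy may differ in everything except that `cell_group_key` is required.

```python
class PseudobulkStandIn:
    """Hypothetical stand-in for illustration; NOT patpy's real Pseudobulk class."""

    def __init__(self, sample_key, cell_group_key, layer=None):
        # cell_group_key has no default, so Python itself rejects calls that omit it.
        self.sample_key = sample_key
        self.cell_group_key = cell_group_key
        self.layer = layer


def missing_argument_error():
    """Return the TypeError message produced when cell_group_key is left out."""
    try:
        PseudobulkStandIn("donor_id")  # cell_group_key omitted on purpose
    except TypeError as exc:
        return str(exc)
    return None


# Passing both keys, as the updated minimal example in the skill is said to do.
rep = PseudobulkStandIn("donor_id", "cell_type")
```

Calling `missing_argument_error()` returns Python's standard "missing 1 required positional argument" message naming `cell_group_key`, which is the failure mode the skill documentation now calls out.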

benjaminfreyuu and others added 4 commits May 6, 2026 00:13
- Pack patpy skill modules (datasets, preprocessing, sample_representation,
  supervised_methods, evaluation, plotting) for export to Claude Code,
  Codex, and BioContext layouts via patpy-export-skills /
  patpy-export-biocontext
- Add patpy.mcp FastMCP server (patpy-skills-mcp console script) exposing
  dataset_summary, preprocess_dataset, build_representation,
  evaluate_representation, run_supervised_prediction, generate_plot,
  simulate_dataset as MCP tools
- Add example client configs under examples/mcp/
- Add tests for skill export bundles and MCP tool wrappers
- Add [mcp] optional dependency group
# Conflicts:
#	.gitignore
#	CHANGELOG.md
#	README.md
#	docs/index.md
#	pyproject.toml
#	src/patpy/skills/SKILL.md
#	src/patpy/skills/sample_representation/SKILL.md
#	src/patpy/skills/supervised_methods/SKILL.md
#	src/patpy/tl/evaluation.py
#	tests/test_evaluation.py
#	tests/test_skills.py
@VladimirShitov
Member

Is this ready for review?

@grpinto
Author

grpinto commented May 12, 2026

@VladimirShitov I am not sure yet. @benjaminfreyuu, did you finish what you wanted to do? I have not refactored the code yet. That still needs to be done: reorganizing things and testing at a larger scale.

@benjaminfreyuu
Contributor

The skills part is working, and the MCP Claude Code integration and BioContextAI support have been added as well.
The Claude Code import is tested. It is not tested for Claude.ai. The BioContextAI addition is not complete yet.
We could either test it thoroughly now or add it later (I mean the MCP addition for the package itself, not Gonçalo's dataset MCP work). Wdyt? @grpinto @VladimirShitov


7 participants