Skip to content

Commit 988fb9c

Browse files
feat: add cog doctor command (#2923)
* feat(doctor): add core types and Check interface * refactor(schema): export tree-sitter helpers for reuse by doctor checks * feat(doctor): add runner with check orchestration and fix flow * feat(doctor): add check registry * feat(doctor): add CLI command with output formatting * feat(doctor): add config parse check * feat(doctor): add deprecated config fields check * feat(doctor): add predict reference check * feat(doctor): add pydantic BaseModel check with auto-fix * feat(doctor): add deprecated imports check with auto-fix * feat(doctor): add Docker availability check * feat(doctor): add Python version check * feat(doctor): add config schema validation check * feat(doctor): add missing type annotations check * fix(doctor): address lint issues in printDoctorResults Replace if-else chain with switch statement (gocritic) and use strings.Join instead of string concatenation in a loop (modernize). * test(doctor): add integration tests for cog doctor command Add 8 txtar integration tests covering: - Clean project (no issues found, exit 0) - Pydantic BaseModel detection (exit 1) - Pydantic BaseModel auto-fix with --fix - Deprecated imports detection (exit 1) - Deprecated imports auto-fix with --fix - Missing predict class reference (exit 1) - Exit code 1 when predict file is missing - Deprecated config fields as warnings (exit 0) * fix(doctor): address review findings — fix correctness bugs, consolidate config loading, improve fix robustness Critical fixes: - HasErrors() now accounts for check internal errors (exit code 1) - hasArbitraryTypesAllowed uses tree-sitter AST instead of string matching (no false positives) - addToCogImport handles missing 'from cog import' line - Config loading consolidated onto CheckContext (eliminates 4x redundant Load calls) High priority fixes: - Respect --file flag instead of hardcoding cog.yaml - Handle 'import pydantic' module-level style in fix - Deprecated imports fix uses tree-sitter re-scan instead of fragile string matching - ConfigPredictRefCheck reuses cached Python files from ctx.PythonFiles - model_config fix preserves other ConfigDict kwargs Medium fixes: - Docker/Python exec commands have 5-second timeout - Multi-line parenthesized imports handled by fix - File permissions preserved (not hardcoded to 0o644) - printDoctorResults counts per-finding consistently * chore: update cli docs Signed-off-by: Mark Phelps <[email protected]> * chore: update llms docs Signed-off-by: Mark Phelps <[email protected]> * Update pkg/doctor/runner.go Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com> Signed-off-by: Mark Phelps <[email protected]> * chore: fix context passing Signed-off-by: Mark Phelps <[email protected]> * fix: remove deprecated name references and orphaned imports in doctor --fix The deprecated imports fix was only removing the import line itself (e.g. 'from cog.types import ExperimentalFeatureWarning') but leaving behind statements that reference the removed symbol (e.g. 'warnings.filterwarnings("ignore", category=ExperimentalFeatureWarning)'). Use tree-sitter AST traversal to: 1. Find and remove statements referencing deprecated names by walking identifier nodes recursively 2. Detect and remove orphaned 'import X' statements where the module is no longer referenced anywhere in the file * chore: pass context Signed-off-by: Mark Phelps <[email protected]> * fix: pass context.Background() in deprecated imports fix tests The CheckContext.ctx field was nil in tests that exercise DeprecatedImportsCheck.Fix(), causing a nil pointer dereference when tree-sitter's ParseCtx was called with a nil context. * fix(doctor): propagate context in env and config checks; fix parser typo - Rename undefined 'content' to exported 'Content' in schema/python/parser.go (fixes build failure from unexported helper rename) - Thread CheckContext.ctx through parser.ParseCtx and exec.CommandContext instead of context.Background() so cancellation propagates: - check_config_predict_ref.go:78 — use ctx.ctx in parser.ParseCtx - check_env_docker.go — derive timeout from ctx.ctx - check_env_python_version.go — derive timeout from ctx.ctx - Update tests to populate CheckContext.ctx * fix(doctor): deterministic map iteration + cancellation short-circuit in Run() - Sort PythonFiles and affectedFiles keys before iteration in DeprecatedImportsCheck and PydanticBaseModelCheck (Check + Fix) so findings order and write order are deterministic across runs. - Check ctx.Err() at the top of the Run() loop so Ctrl-C is respected between checks, even for checks that don't inspect the context themselves. * test(doctor): harden test CheckContext literals + add cancellation tests - Populate ctx.ctx in all test CheckContext{} literals so future changes that introduce ctx.ctx usage won't panic on nil pointer dereference. - Add cancellation-propagation tests for DockerCheck, PythonVersionCheck, and ConfigPredictRefCheck to document intent and prevent regression of the context propagation fix from the previous commit. * feat(doctor): detect aliased pydantic.BaseModel imports inheritsPydanticBaseModel now looks up the actual identifier used in the superclass list against the ImportContext, so aliased imports like from pydantic import BaseModel as PBM class X(PBM): are detected. Also tightens the attribute-access path to check the dotted 'pydantic.BaseModel' via tree-sitter fields instead of raw text, so spacing/formatting differences don't cause false negatives. Adds test coverage for both the aliased import form and the 'import pydantic' + 'pydantic.BaseModel' dotted form. * fix(doctor): handle direct assignment nodes in hasArbitraryTypesAllowed Class body statements in tree-sitter-python can appear as either an expression_statement wrapping an assignment OR as an assignment node directly. The previous code only checked the expression_statement case and would silently skip standalone assignment nodes. Now handles both, matching the pattern used in pkg/schema/python parser.go collectModuleScope. * refactor(doctor): replace line-based pydantic fixer with AST-based edits The previous fixPydanticBaseModel walked the file line-by-line with strings.HasPrefix and strings.Replace to find and rewrite pydantic BaseModel classes. This was fragile on parenthesized multi-line imports, trailing commas, aliased imports, and silently over-rewrote unrelated pydantic classes in the same file. The new implementation is narrow-scope and AST-driven: 1. Parse the file once with tree-sitter. 2. Collect classes that both inherit pydantic.BaseModel AND have arbitrary_types_allowed=True -- only these are rewritten. 3. Emit byte-range edits for: - the superclass reference (BaseModel / PBM / pydantic.BaseModel) - the model_config = ConfigDict(...) assignment (remove the arbitrary_types_allowed keyword arg, or drop the whole line if ConfigDict becomes empty) - rewriting pydantic.ConfigDict to ConfigDict for the touched classes 4. Re-parse and clean up imports: - drop BaseModel and unused ConfigDict from 'from pydantic import ...' - remove 'import pydantic' if no pydantic.* references remain - preserve 'import pydantic' if pydantic.Field/validator/etc. is still used in the file 5. Ensure cog.BaseModel is available, using an aliased import ('BaseModel as CogBaseModel') if another class in the file still needs pydantic.BaseModel under its bare name. Adds tests for: - unrelated pydantic class preserved untouched (uses aliased cog import) - import pydantic preserved when pydantic.Field remains - ConfigDict(arbitrary_types_allowed=True,) with trailing comma * fix(doctor): use cached source for deprecated-imports Fix() removeDeprecatedImportsAST uses the tree from ctx.PythonFiles[path].Tree, which was parsed against ctx.PythonFiles[path].Source. Reading fresh bytes from disk between Check() and Fix() risks a byte-offset mismatch if the file has changed, producing garbage output from Content() extraction. Switch to using pf.Source so the tree and source match by construction. The fix still writes the result to disk using the current file mode from os.Stat. * refactor(doctor): AST-based removal of deprecated imports Replaces removeDeprecatedImportLines (a 112-line state machine that walked source line-by-line with string manipulation and multi-line parenthesized-import tracking) with an AST-driven approach. The new editDropFromImport walks import_from_statement nodes via tree-sitter, collects the imported names (honoring aliased_import and import_list structures), and produces a byte-range edit that rewrites the statement to drop the target names — or removes the whole statement line if no names remain. This is more robust against: - multi-line parenthesized imports (from pydantic import (A, B, C)) - trailing commas and unusual whitespace - aliased imports (from X import Y as Z) - trailing comments and line continuations * chore(doctor): idiomatic cleanups + documentation - Delete the isParseError wrapper; call errors.As directly at both call sites. - Replace splitPredictRef's hand-rolled backward scan with strings.LastIndex; return named (file, class) values instead of a [2]string. - Rename ctxt -> cc in buildCheckContext for clarity (ctx is used for context.Context elsewhere). - Hoist the envCheckTimeout constant to avoid repeating 5 * time.Second in DockerCheck and PythonVersionCheck. - Add comments explaining the Severity ordering convention (smaller = worse) and the rationale for storing context.Context as an unexported field on CheckContext. * fix(cli): prevent duplicate error output from cog doctor runDoctor previously returned fmt.Errorf("doctor found errors") on failure, which cobra then printed to stderr as 'Error: doctor found errors' after printDoctorResults had already displayed the full summary. Set SilenceErrors and SilenceUsage on the doctor command, and return a sentinel errDoctorFoundIssues error purely for non-zero exit. Findings are now reported exactly once, via printDoctorResults. * chore(doctor): release tree-sitter resources via CheckContext.Close() Each *sitter.Tree holds a C-allocated buffer. Tree-sitter's runtime.SetFinalizer eventually releases these via GC, but calling Close() explicitly releases them promptly. Adds a CheckContext.Close() method that walks PythonFiles and closes each cached Tree. Run() now defers cc.Close() so trees are released when a doctor run finishes, which matters if the runner is reused from a long-lived process. * fix(doctor): close tree-sitter parse trees to prevent memory leaks Add defer tree.Close() after every successful ParseCtx() call to free C-allocated memory from tree-sitter. Fixes 6 locations across 3 files: - check_config_predict_ref.go - check_python_deprecated_imports.go (2 locations) - check_python_pydantic_basemodel.go (3 locations) * mark cog doctor as experimental in CLI output and docs * chore: regen llm docs Signed-off-by: Mark Phelps <[email protected]> * fix(doctor): address review findings — consolidate types, fix hardcoded config filename, improve --fix suggestion * ci: force-reinstall nox in lint-python to fix pipx cache issue Same class of problem as Rust components: mise caches ~/.local/share/mise which has the nox install directory, but the nox binary is a wrapper pointing to a uv/pipx-managed venv outside the cached path. On cache hit, mise skips reinstall but the binary is missing. Fix: mise install pipx:nox --force after mise-action. --------- Signed-off-by: Mark Phelps <[email protected]> Signed-off-by: Mark Phelps <[email protected]> Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>
1 parent f41895d commit 988fb9c

31 files changed

Lines changed: 4081 additions & 105 deletions

.github/workflows/ci.yaml

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -34,12 +34,14 @@ env:
3434
# Note: do NOT add rustup/rustup-init to MISE_DISABLE_TOOLS — mise needs
3535
# them to install Rust components (rustfmt, clippy) specified in mise.toml.
3636
#
37-
# mise cache & Rust components:
38-
# mise-action caches ~/.local/share/mise but NOT ~/.rustup. On cache hit,
39-
# mise sees rust as "installed" (symlink exists) and skips reinstall, but
40-
# the actual rustup toolchain + components (rustfmt, clippy) are missing.
41-
# Rust jobs must run `rustup component add` after mise-action to ensure
42-
# components are present regardless of cache state.
37+
# mise cache caveats:
38+
# mise-action caches ~/.local/share/mise but NOT external tool state:
39+
# - ~/.rustup (Rust toolchain + components like rustfmt, clippy)
40+
# - pipx/uv venvs (nox binary is a wrapper pointing to a venv outside mise)
41+
# On cache hit, mise sees tools as "installed" and skips reinstall, but
42+
# the actual binaries are missing. Affected jobs must force-reinstall:
43+
# - Rust jobs: `rustup component add rustfmt clippy`
44+
# - lint-python: `mise install pipx:nox --force`
4345
#
4446
# mise cache keys: each job uses its own key (mise-ci-{job_id}) so parallel
4547
# jobs don't race on a single shared cache entry.
@@ -352,6 +354,8 @@ jobs:
352354
- uses: jdx/mise-action@v4
353355
with:
354356
cache_key_prefix: mise-ci-${{ github.job }}
357+
- name: Ensure nox is installed
358+
run: mise install pipx:nox --force
355359
- name: Lint Python
356360
run: mise run lint:python
357361

docs/cli.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,26 @@ cog build [flags]
6666
--use-cog-base-image Use pre-built Cog base image for faster cold boots (default true)
6767
--use-cuda-base-image string Use Nvidia CUDA base image, 'true' (default) or 'false' (use python base image). False results in a smaller image but may cause problems for non-torch projects (default "auto")
6868
```
69+
## `cog doctor`
70+
71+
Diagnose and fix common issues in your Cog project.
72+
73+
NOTE: cog doctor is experimental. Behavior and checks may change in future versions.
74+
75+
By default, cog doctor reports problems without modifying any files.
76+
Pass --fix to automatically apply safe fixes.
77+
78+
```
79+
cog doctor [flags]
80+
```
81+
82+
**Options**
83+
84+
```
85+
-f, --file string The name of the config file. (default "cog.yaml")
86+
--fix Automatically apply fixes
87+
-h, --help help for doctor
88+
```
6989
## `cog exec`
7090

7191
Execute a command inside a Docker environment defined by cog.yaml.

docs/llms.txt

Lines changed: 20 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
# Test that cog doctor succeeds on a valid project with no issues.
2+
3+
cog doctor
4+
stderr 'experimental'
5+
stderr 'no issues found'
6+
7+
-- cog.yaml --
8+
build:
9+
python_version: "3.12"
10+
predict: "predict.py:Predictor"
11+
12+
-- predict.py --
13+
from cog import BasePredictor
14+
15+
16+
class Predictor(BasePredictor):
17+
def predict(self, text: str) -> str:
18+
return "hello " + text
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
# Test that cog doctor reports deprecated config fields as warnings.
2+
# Warnings do not cause a non-zero exit code; only errors do.
3+
4+
cog doctor
5+
stderr 'python_packages'
6+
stderr 'deprecated'
7+
8+
-- cog.yaml --
9+
build:
10+
python_version: "3.12"
11+
python_packages:
12+
- torch==2.0.0
13+
predict: "predict.py:Predictor"
14+
15+
-- predict.py --
16+
from cog import BasePredictor
17+
18+
19+
class Predictor(BasePredictor):
20+
def predict(self, text: str) -> str:
21+
return "hello " + text
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
# Test that cog doctor detects deprecated imports.
2+
# ExperimentalFeatureWarning was removed in cog 0.17.
3+
4+
! cog doctor
5+
stderr 'ExperimentalFeatureWarning'
6+
stderr 'removed'
7+
8+
-- cog.yaml --
9+
build:
10+
python_version: "3.12"
11+
predict: "predict.py:Predictor"
12+
13+
-- predict.py --
14+
import warnings
15+
from cog import BasePredictor
16+
from cog.types import ExperimentalFeatureWarning
17+
18+
warnings.filterwarnings("ignore", category=ExperimentalFeatureWarning)
19+
20+
21+
class Predictor(BasePredictor):
22+
def predict(self, text: str) -> str:
23+
return "hello " + text
Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,10 @@
1+
# Test that cog doctor exits with code 1 when predict file is missing.
2+
3+
! cog doctor
4+
stderr 'predict.py'
5+
stderr 'not found'
6+
7+
-- cog.yaml --
8+
build:
9+
python_version: "3.12"
10+
predict: "predict.py:Predictor"
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
# Test that cog doctor --fix removes deprecated imports.
2+
3+
# First, doctor should detect the issue
4+
! cog doctor
5+
stderr 'ExperimentalFeatureWarning'
6+
7+
# Fix the issue
8+
cog doctor --fix
9+
stderr 'Fixed'
10+
11+
# Verify the import was removed from the file
12+
exec cat predict.py
13+
! stdout 'ExperimentalFeatureWarning'
14+
! stdout 'cog.types'
15+
16+
# Re-running doctor should now pass
17+
cog doctor
18+
stderr 'no issues found'
19+
20+
-- cog.yaml --
21+
build:
22+
python_version: "3.12"
23+
predict: "predict.py:Predictor"
24+
25+
-- predict.py --
26+
import warnings
27+
from cog import BasePredictor
28+
from cog.types import ExperimentalFeatureWarning
29+
30+
warnings.filterwarnings("ignore", category=ExperimentalFeatureWarning)
31+
32+
33+
class Predictor(BasePredictor):
34+
def predict(self, text: str) -> str:
35+
return "hello " + text
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
# Test that cog doctor --fix rewrites pydantic.BaseModel to cog.BaseModel.
2+
# After fix, re-running cog doctor should pass.
3+
4+
# First, doctor should detect the issue
5+
! cog doctor
6+
stderr 'pydantic.BaseModel'
7+
8+
# Fix the issue
9+
cog doctor --fix
10+
stderr 'Fixed'
11+
12+
# Verify the file was modified: pydantic.BaseModel replaced with cog.BaseModel
13+
exec cat predict.py
14+
stdout 'from cog import BasePredictor, Path, BaseModel'
15+
! stdout 'from pydantic import BaseModel'
16+
! stdout 'ConfigDict'
17+
! stdout 'arbitrary_types_allowed'
18+
19+
# Re-running doctor should now pass
20+
cog doctor
21+
stderr 'no issues found'
22+
23+
-- cog.yaml --
24+
build:
25+
python_version: "3.12"
26+
predict: "predict.py:Predictor"
27+
28+
-- predict.py --
29+
from cog import BasePredictor, Path
30+
from pydantic import BaseModel, ConfigDict
31+
32+
33+
class VoiceCloningOutputs(BaseModel):
34+
model_config = ConfigDict(arbitrary_types_allowed=True)
35+
audio: Path
36+
spectrogram: Path
37+
38+
39+
class Predictor(BasePredictor):
40+
def predict(self, text: str) -> VoiceCloningOutputs:
41+
return VoiceCloningOutputs(audio="a.wav", spectrogram="s.png")
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Test that cog doctor detects when predict ref points to a nonexistent class.
2+
# cog.yaml references DoesNotExist but only Predictor exists in predict.py.
3+
4+
! cog doctor
5+
stderr 'DoesNotExist'
6+
stderr 'not found'
7+
8+
-- cog.yaml --
9+
build:
10+
python_version: "3.12"
11+
predict: "predict.py:DoesNotExist"
12+
13+
-- predict.py --
14+
from cog import BasePredictor
15+
16+
17+
class Predictor(BasePredictor):
18+
def predict(self, text: str) -> str:
19+
return "hello " + text

0 commit comments

Comments
 (0)