refactor(task): unify detection into resolve_task/TaskResolution; fix inspect source#878
Conversation
Single source of truth for model-type / model-id task overrides, unifying the detect_task short-circuit, the (model_type, None) sentinel reverse-lookup, and the model-id default. Not yet wired into the entry points.
…entry points detect_task, _detect_task_and_class_from_config, and resolve_loader_config step 2 now all consult _resolve_task_override first. Fixes: - sam2/sam2-video resolve to mask-generation on every entry point (was feature-extraction via inspect and --model-type) - model-id default (bert-tiny) now applies on the detect path too Removes the now-redundant sentinel block and the detect-path short-circuit.
Replace the config-field heuristic (_TASK_MODALITY_DISAMBIGUATION, which keyed on top-level image_size/patch_size) with the architecture class's main_input_name (pixel_values -> image-feature-extraction; text/audio stay feature-extraction). Fixes AST (carries patch_size but is audio) being misrouted to image-feature-extraction. Uses the arch class from config.architectures, not the resolved/Auto class.
…smatch universal_calib_dataset now degrades to RandomDataset (reading real input specs from the ONNX) when the task-specific dataset either fails to construct or produces none of the model's input tensors. Keeps build from crashing when a backbone stays feature-extraction but the ONNX wants non-text inputs (audio/image/video).
sam2 -> mask-generation via detect_task, resolve_task_and_model_class, and resolve_loader_config(model_type); detect_task agrees with resolve_task_and_model_class for AST (feature-extraction), ViT (image-feature-extraction), and BART-mnli.
…entry class-fixes Review found a regression: the single-real-task branch in _resolve_task_override, now consulted by the build path, forced segformer classification checkpoints to image-segmentation (its MODEL_CLASS_MAPPING entry is a class-fix for the segmentation task, not a default-task declaration). Remove that branch so a default comes only from an explicit (model_type, None) sentinel; otherwise the architecture head decides. segformer is the only single-entry-no-sentinel type. Also label detect_task source as MODEL_TASK_MAPPING for model-id overrides (was mislabeled HF_MODEL_CLASS_MAPPING).
…ID_DEFAULT/error/default paths
…tection # Conflicts: # src/winml/modelkit/eval/tensor_similarity_evaluator.py
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
Code Review
Summary
This PR correctly unifies task detection by replacing scattered, divergent paths with a single resolve_task() 5-stage resolver. The sentinel-based override mechanism for SAM/SAM2, the modality upgrade via main_input_name, and the composite component tagging are all well-designed. Below are the substantive issues I found.
Bug: USER_CLASS path skips modality upgrade (resolution.py line 367–369)
When model_class is provided without --task, the USER_CLASS stage returns opt_task directly as TaskResolution.task — no call to _resolve_task_modality. Example:
resolve_task(vit_config, model_class="ViTModel")
→ TaskResolution(task="feature-extraction", ...) ← should be "image-feature-extraction"
The class docstring says task is always WinML modality-aware. The detection path (Stage 3) applies _resolve_task_modality; the USER_CLASS path does not. This causes TASK_DATASET_MAPPING to route vision models to TextDataset for calibration.
Fix:
`python
resolution.py, after opt_task is set in the USER_CLASS branch:
surfaced = _resolve_task_modality(config, opt_task)
return TaskResolution(surfaced, to_optimum_task(surfaced), resolved, TaskSource.USER_CLASS, None)
`
No existing test covers this combination; it should be added to test_resolve_task.py.
Warning: Stage 1b KeyError propagates uncaught (resolution.py line 417)
python resolved = TasksManager.get_model_class_for_task(opt_task, framework="pt")
This line has no try/except. Stage 2's error handling does not protect it — resolved is set here, before Stage 2 checks if resolved is None. If the first supported task from get_supported_tasks cannot be looked up under framework="pt" (e.g. a timm_wrapper whose classes are under the timm library, not transformers), a raw KeyError escapes. Stage 2's except Exception pattern should be applied here too.
Warning: assert eliminated by -O (resolution.py line 447)
python assert source is not None
Python's -O flag strips assert statements. If an unexpected code path were to leave source = None, this guard disappears in optimised builds. Replace with:
python if source is None: raise RuntimeError("resolve_task: internal invariant violated — source was not set")
Warning: image-feature-extraction not pinned in TASK_SYNONYM_EXTENSIONS (task.py line 222)
to_optimum_task("image-feature-extraction") relies on Optimum's TasksManager.map_from_synonym to collapse it to "feature-extraction". The docstring documents the collapse; the tests verify it works today. But unlike "mask-generation" (which is pinned in TASK_SYNONYM_EXTENSIONS to prevent Optimum from mis-normalising it), "image-feature-extraction" has no fallback. If a future Optimum update changes the mapping, all downstream Optimum API calls will silently receive the unknown task string and fail.
Suggested addition:
python TASK_SYNONYM_EXTENSIONS: dict[str, str] = { "image-feature-extraction": "feature-extraction", # add this "next-sentence-prediction": "text-classification", "mask-generation": "mask-generation", }
Info: audio feature-extraction lands in TextDataset (datasets/__init__.py line 44)
_FEATURE_MODALITY_BY_MAIN_INPUT only upgrades pixel_values → image-feature-extraction. Audio models (input_values, input_features) stay as "feature-extraction" and hit TextDataset. The new RandomDataset fallback handles the mismatch gracefully, but it would help to add a # TODO: add audio-feature-extraction once AudioDataset exists comment at line 44 so this gap is visible.
Info: missing test for USER_CLASS with vision feature-extraction
test_resolve_task.py has no case for resolve_task(vit_config, model_class="ViTModel"). Even if the modality gap is intentional (user is being explicit), the expected behaviour should be pinned by a test so future refactors cannot silently change it.
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
Code Review: fix(task): unify task detection across entry points
The unified resolve_task() 5-stage resolver is a sound design — sentinel-based overrides, the TaskResolution/TaskSource dataclasses, and the composite tagging are all well-structured. The cross-entry-point tests in test_task_entry_point_consistency.py are a good addition.
Five issues below (one bug, three warnings, one info).
⚠️ task.py line 222 — TASK_SYNONYM_EXTENSIONS missing image-feature-extraction entry
to_optimum_task("image-feature-extraction") relies on Optimum's map_from_synonym to collapse this to "feature-extraction". Comments in clip.py and blip.py confirm Optimum knows the synonym today, but unlike "mask-generation" (which is explicitly pinned here to prevent Optimum mis-normalisation), there is no fallback. If Optimum changes the mapping, all downstream Optimum API calls silently receive the unknown string "image-feature-extraction" and fail.
Suggested fix:
TASK_SYNONYM_EXTENSIONS: dict[str, str] = {
"image-feature-extraction": "feature-extraction", # add
"next-sentence-prediction": "text-classification",
"mask-generation": "mask-generation",
}…den Stage 1b + invariant; pin image-feature-extraction synonym; add tests
|
Thanks for the thorough review — all six addressed in 1. USER_CLASS skips modality (bug) — Fixed, with one refinement to your suggested patch. Applying if task is not None:
opt_task = normalize_task(task); surfaced = opt_task # preserve user intent
else:
opt_task = _infer_task_from_architecture(config)
surfaced = _resolve_task_modality(config, opt_task) # inferred -> modality-aware
2. Stage 1b KeyError uncaught — Fixed by removing the eager 3. 4. 5. audio → TextDataset (info) — Added a 6. missing USER_CLASS test — Added Generated with Claude Code |
…the validate-gate cross-modality warning (test_build_validate_task); collapse stays guarded by test_task_boundary
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
Follow-up review after 5d47ce9
All five issues from the previous review have been addressed:
- ✅ Bug: USER_CLASS modality upgrade — fixed (inferred-task branch now calls
_resolve_task_modality). - ✅ Warning: Stage 1b raw
KeyError— fixed (call removed; Stage 2 handles class resolution). - ✅ Warning:
assert source is not Nonestripped by-O— fixed (if/raise RuntimeError). - ✅ Warning:
image-feature-extractionnot pinned inTASK_SYNONYM_EXTENSIONS— fixed. - ✅ Info: Missing test for USER_CLASS + vision feature-extraction — fixed (two new tests added).
One new issue introduced by the fix:
⚠️ USER_CLASS explicit-task branch normalizes away the user's original task string (inconsistent with USER_TASK)
In the explicit-task sub-branch (line 358–361), surfaced = normalize_task(task). The comment says "preserve it verbatim (like USER_TASK)", but the USER_TASK path (line 381–398) stores original = task and returns that, not the normalized form. So the two paths are inconsistent:
# USER_TASK: resolve_task(cfg, task="causal-lm") -> TaskResolution(task="causal-lm", ...)
# USER_CLASS: resolve_task(cfg, task="causal-lm", model_class="X") -> TaskResolution(task="text-generation", ...)The new test test_user_class_with_explicit_task_preserved_verbatim only checks task="feature-extraction" (already canonical, no normalization happens) so it does not catch this divergence.
Fix: use surfaced = task (the original user string) instead of surfaced = opt_task (normalized), keeping opt_task = normalize_task(task) for the Optimum API call only — identical to what USER_TASK does with original vs normalized.
|
Correction on item 4 ( The forward-looking robustness concern is already guarded: Generated with Claude Code |
…im (match USER_TASK), not the normalized alias; add alias test
…e broke the intentional test_task_normalized_with_model_class; USER_TASK/USER_CLASS divergence is by design
…lready documents the modality gap
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
Follow-up review after 275b022
Status of previous open items:
- ✅ Warning (resolution.py:361) — Addressed as an intentional design decision, documented in code. The
test_task_normalized_with_model_classtest pins the normalization behavior. - ✅ Info (test_resolve_task.py:144) — Test renamed to
test_user_class_explicit_task_is_not_modality_upgraded, more accurately describing the behavior.
One new warning found in the current state — see inline comment.
…th inferred branch + USER_TASK The USER_CLASS explicit-task branch surfaced normalize_task(task) directly, collapsing the WinML modality-aware image-feature-extraction to feature-extraction -- which routes a ViT to a text dataset/evaluator (config --task image-feature-extraction --model-class ViTModel -> build/quantize calibration). Re-apply _resolve_task_modality after normalizing so the surfaced task stays modality-aware, matching the USER_CLASS inferred branch and USER_TASK; adding --model-class must not collapse the modality. opt_task (the Optimum class lookup) is unchanged, so alias canonicalization (masked-lm -> fill-mask) and the optimum_task==to_optimum_task(task) invariant still hold; _resolve_task_modality is a no-op for non-feature-extraction tasks. Update test_user_class_explicit_feature_extraction_is_modality_aware (was pinning the collapse) and add test_user_class_explicit_image_feature_extraction_preserved as the regression pin.
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
Fourth-pass review — commit e65968d
Previous warning RESOLVED:
esolution.py:361 — surfaced = _resolve_task_modality(config, opt_task) is now applied in the USER_CLASS explicit-task branch. The image-feature-extraction modality is preserved when combined with --model-class, backed by two new tests ( est_user_class_explicit_feature_extraction_is_modality_aware, est_user_class_explicit_image_feature_extraction_preserved). ✅
One info finding remains — see inline comment on
esolution.py:361.
Open item from previous pass:
- 🔵 Info ( ask.py:226) — TASK_SYNONYM_EXTENSIONS dual role (Optimum translation + build-validator bypass) is a structural design smell. Low priority; recommend a follow-up issue to separate the concerns.
DingmaomaoBJTU
left a comment
There was a problem hiding this comment.
LGTM! All bugs and warnings resolved across 4 review rounds. Two remaining info items (misleading comment on resolution.py:361 and TASK_SYNONYM_EXTENSIONS dual-role coupling) are low-priority and can be tracked as follow-up issues.
What
Unifies WinML's task detection into a single resolver.
detect_task(inspect/eval),_detect_task_and_class_from_config(build via-m), andresolve_loader_configstep 2 (config/build via--model-type) were three implementations of the same decision over the sameMODEL_CLASS_MAPPINGdata, and had drifted. Modality was reconstructed from config field names — a heuristic that misroutes models whose fields aren't modality-exclusive.All three are replaced by a single
resolve_task(config, *, task=None, model_class=None) -> TaskResolutionin a newloader/resolution.py, consumed by every entry point.loader/task.pyis reduced to data tables + boundary utils. Modality derives from the model class'smain_input_name. Addresses #877.API change: legacy
detect_task/resolve_task_and_model_class/_detect_task_and_class_from_configare removed; callers useresolve_task+TaskResolutionfields.resolve_loader_configreturns theTaskResolutionas a 4th tuple element.Why (concrete bugs fixed)
winml inspectrecomputed provenance post-hoc by scanningMODEL_CLASS_MAPPING, so a model detected via Optimum (e.g. TAPEX/BART) was mislabeledHF_MODEL_CLASS_MAPPING. It now reports the resolver's realsource(tasks-manager).ASTModelcarriespatch_size, so the old config-field heuristic upgraded it toimage-feature-extraction→ image dataset → fail.main_input_name(input_values) keeps itfeature-extraction.feature-extractionvia inspect/--model-typebutmask-generationvia-m(different artifacts for the same model). Nowmask-generationeverywhere.load_hf_modelreturned an inconsistent task (Optimum-canonical in theuser_scriptbranch, modality-aware otherwise). Both now return the modality-aware task.feature-extractionbut exports non-text inputs routed to a text dataset and crashed; calibration falls back toRandomDataseton a modality mismatch.How
resolve_task(5 stages): user override → detection (override / no-architectures / TasksManager + fill-mask→seq2seq correction / default) → model class → modality upgrade (detection path only) → composite tag. ReturnsTaskResolution(task, optimum_task, model_class, source, composite)with the invariantoptimum_task == to_optimum_task(task).TaskSourceenum replaces stringly-typed provenance with semantic values (tasks-manager,sentinel-default,model-id-default,wrapped-library,hf-task-default,user-task,user-class) — no internal data-structure names leak intoinspectoutput.resolve_composite+ the seq2seq bridge move fromcommands/config.pyinto the resolver;TaskResolution.compositecarries sub-components sowinml configno-task routing matches explicit--task._resolve_task_override(model-id default →(model_type, None)sentinel reverse-lookup) is the single override source; a default-task override is declared only by an explicit sentinel, not by a lone(model_type, task)class-fix entry (e.g. segformer image-segmentation) — the architecture head decides otherwise.main_input_name(pixel_values→image-feature-extraction; text/audio stayfeature-extraction).--model-typefallback (first ONNX-exportable task) and the timm wrapped-library stage merge into one no-architectures stage inresolve_task.task.pyintoresolution.py. Calibration degrades toRandomDataset(real input specs from the ONNX) on a modality mismatch.Behavior changes
HF_MODEL_CLASS_MAPPINGtasks-manager(real source)ASTModel)image-feature-extractionfeature-extraction--model-type:feature-extraction;-m:mask-generationmask-generationeverywhereimage-segmentationimage-classificationprajjwal1/bert-tiny(inspect)feature-extraction(model-id default)load_hf_model(user_scriptbranch)detect_taskfell back; loader path raisedhf-task-defaultfallbacksourcestringsTasksManager/HF_MODEL_CLASS_MAPPING/ …TaskSourcevaluesOut of scope (deferred)
audio-feature-extractionevaluator + dataset (and the task name itself).