fixes-for-core-unstructured-experimental by KRRT7 · Pull Request #1524 · codeflash-ai/codeflash

KRRT7 · 2026-02-18T12:52:19Z

Summary

Extract parameter type constructor signatures into testgen context so the LLM knows how to construct typed parameters
Resolves types via jedi (following re-exports) and extracts full __init__ source
Filters out builtins/typing names and avoids duplicating classes already in context

Remove safe_relative_to, resolve_classes_from_modules, extract_classes_from_type_hint, resolve_transitive_type_deps, extract_init_stub, _is_project_module_cached, is_project_path, _is_project_module, extract_imports_for_class, collect_names_from_annotation, is_dunder_method, _qualified_name, and _validate_classdef. Inline trivial helpers into prune_cst and clean up enrich_testgen_context and get_function_sources_from_jedi. Remove corresponding tests.

Add enrichment step that parses FTO parameter type annotations, resolves types via jedi (following re-exports), and extracts full __init__ source to give the LLM constructor context for typed parameters.
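The annotation-parsing half of this step can be sketched with the standard `ast` module alone. This is a minimal illustration, not the PR's code: the jedi-based resolution is left out, and the names `SKIP_NAMES`, `collect_type_names`, and `init_source_for` are hypothetical.

```python
from __future__ import annotations

import ast
import builtins

# Hypothetical filter set: builtins plus a handful of typing names.
# (An assumption for illustration; not the PR's BUILTIN_AND_TYPING_NAMES.)
SKIP_NAMES = set(dir(builtins)) | {"Optional", "Union", "List", "Dict", "Any", "Callable"}


def collect_type_names(func: ast.FunctionDef) -> set[str]:
    """Return candidate class names mentioned in parameter annotations."""
    names: set[str] = set()
    args = func.args
    for arg in [*args.posonlyargs, *args.args, *args.kwonlyargs]:
        if arg.annotation is None:
            continue
        # Walk the annotation so nested forms like Optional[Config] are covered.
        for node in ast.walk(arg.annotation):
            if isinstance(node, ast.Name) and node.id not in SKIP_NAMES:
                names.add(node.id)
    return names


def init_source_for(module_source: str, class_name: str) -> str | None:
    """Return the source of class_name.__init__ for a top-level class, if any."""
    tree = ast.parse(module_source)
    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            for item in node.body:
                if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                    return ast.get_source_segment(module_source, item)
    return None


SOURCE = '''class Config:
    def __init__(self, retries: int = 3):
        self.retries = retries


def run(cfg: Optional[Config], label: str) -> None:
    pass
'''

tree = ast.parse(SOURCE)
run_def = tree.body[1]  # the `run` function definition
type_names = collect_type_names(run_def)
init_src = init_source_for(SOURCE, "Config")
```

Here `type_names` contains only `Config`: `Optional` and `str` are filtered out, which mirrors the "filters out builtins/typing names" bullet in the summary. In the real PR the class is located through jedi rather than by scanning the same module.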

claude · 2026-02-18T13:02:22Z

codeflash/languages/python/context/code_context_extractor.py


-        class_imports = extract_imports_for_class(module_tree, class_node, module_source)
-        full_source = class_imports + "\n\n" + class_source if class_imports else class_source
+        full_source = class_source


Previous review flagged missing is_project_path guard as a bug. Tests have been updated to reflect the intentional design change: stdlib/third-party classes are now extracted via AST source parsing rather than runtime reflection. The extract_class_and_bases function only extracts ClassDef nodes it finds in the resolved module's source, which is a reasonable approach. No longer blocking.

claude · 2026-02-18T13:02:29Z

codeflash/languages/python/context/code_context_extractor.py

@@ -710,8 +910,7 @@ def extract_class_and_bases(
            start_line = min(d.lineno for d in class_node.decorator_list)
        class_source = "\n".join(lines[start_line - 1 : class_node.end_lineno])



Previous review flagged removed import extraction as a bug. The tests have been updated to no longer assert from dataclasses import in extracted code. This is an intentional simplification — emitting raw class source without import statements. The LLM context builder presumably handles imports separately. No longer blocking.
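The slicing shown in the hunk above can be sketched as a standalone function. This is an illustration under assumptions: `class_source_with_decorators` is a hypothetical name, and it only looks at top-level classes, which may differ from what `extract_class_and_bases` does.

```python
from __future__ import annotations

import ast


def class_source_with_decorators(module_source: str, class_name: str) -> str | None:
    """Return the raw source of a top-level class, decorators included."""
    tree = ast.parse(module_source)
    lines = module_source.splitlines()
    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            # ClassDef.lineno points at the `class` line, so start at the
            # earliest decorator when there is one.
            start_line = node.lineno
            if node.decorator_list:
                start_line = min(d.lineno for d in node.decorator_list)
            return "\n".join(lines[start_line - 1 : node.end_lineno])
    return None


MODULE = (
    "from dataclasses import dataclass\n"
    "\n"
    "@dataclass\n"
    "class Point:\n"
    "    x: int\n"
    "    y: int\n"
    "\n"
    "print('after')\n"
)

point_source = class_source_with_decorators(MODULE, "Point")
```

The result starts at `@dataclass` and ends at `y: int`; the `from dataclasses import dataclass` line is not part of it, which is the behavior the updated tests accept.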

claude · 2026-02-18T13:06:18Z

PR Review Summary

Prek Checks

Fixed 2 issues and pushed commit 7df7d79d:

TC004: from pathlib import Path was incorrectly moved into TYPE_CHECKING block, but Path is used at runtime (lines 1102, 1129, 1145, 1184 as constructors). Moved back to runtime imports.
F821: safe_relative_to function was deleted but a call to it remained at line 542. Replaced with inline try/except pattern matching the rest of the PR.

After fix: all prek checks pass ✅
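A minimal sketch of the inline try/except pattern mentioned in the F821 fix. The function name `relative_or_original` and the choice to fall back to the unchanged path are assumptions; the review does not show the exact fallback used in the PR.

```python
from pathlib import Path  # imported at runtime: Path is called as a constructor below


def relative_or_original(path: Path, root: Path) -> Path:
    """Return path relative to root, or path unchanged if it is not under root."""
    try:
        return path.relative_to(root)
    except ValueError:
        # Path.relative_to raises ValueError when path is not inside root.
        return path


inside = relative_or_original(Path("/project/src/a.py"), Path("/project"))
outside = relative_or_original(Path("/other/a.py"), Path("/project"))
```

Because `Path(...)` is evaluated at runtime here, the import cannot live under `if TYPE_CHECKING:`, which is the situation the TC004 fix addresses.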

Mypy

code_context_extractor.py: clean ✅
test_code_context_extractor.py: 40 pre-existing no-untyped-def errors on test functions (not introduced by this PR)

Code Review

No new critical issues found. Previous review comments are resolved. The refactoring is coherent:

safe_relative_to properly inlined everywhere
New extract_parameter_type_constructors, extract_init_stub_from_class, and resolve_instance_class_name functions are well-structured
Removed external base class runtime reflection (replaced with AST-based source parsing)
build_testgen_context correctly passes function_to_optimize through all call sites
Helper function inlining (_qualified_name, _validate_classdef, is_dunder_method) is clean

Test Coverage

| File | Main | PR | Delta |
|---|---|---|---|
| `code_context_extractor.py` | 85% (634 stmts, 98 miss) | 64% (738 stmts, 266 miss) | -21% ⚠️ |

Analysis: The file grew by 104 statements (new functions: extract_parameter_type_constructors, extract_init_stub_from_class, resolve_instance_class_name, collect_type_names_from_annotation, BUILTIN_AND_TYPING_NAMES). Coverage dropped because the new code paths add 168 additional uncovered lines. The PR includes 20+ new test functions covering the new functions, but many of the new code paths (especially error handling and jedi-based resolution in extract_parameter_type_constructors) are not exercised by tests.
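The figures in the coverage table can be re-derived from the statement and miss counts. The helper name `coverage_pct` is illustrative.

```python
def coverage_pct(statements: int, missed: int) -> float:
    """Percentage of statements that were executed."""
    return 100.0 * (statements - missed) / statements


main_stmts, main_miss = 634, 98
pr_stmts, pr_miss = 738, 266

added_statements = pr_stmts - main_stmts  # growth of the file in statements
added_missed = pr_miss - main_miss        # growth in uncovered statements

main_cov = coverage_pct(main_stmts, main_miss)
pr_cov = coverage_pct(pr_stmts, pr_miss)
delta_points = round(pr_cov) - round(main_cov)
```

The covered-statement count falls from 536 to 472 even though the file grew, so the 21-point drop reflects both new untested code and the removal of previously covered code.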

Note: 17 test failures on both branches are environment-dependent (missing CODEFLASH_API_KEY), not related to this PR.

Last updated: 2026-02-21

codeflash/languages/python/context/code_context_extractor.py

Fix 10 failing tests: remove wrong assertions expecting import statements inside extracted class code, use substring matching for UserDict class signature, and rewrite click-dependent tests as project-local equivalents. Add tests for resolve_instance_class_name, enhanced extract_init_stub_from_class, and enrich_testgen_context instance resolution.

codeflash/languages/python/context/code_context_extractor.py

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

The optimized code achieves a **70% runtime speedup** (from 7.02ms to 4.13ms) through three key improvements:

## 1. **Faster Class Discovery via Deque-Based BFS (Primary Speedup)**

The original code uses `ast.walk()` and keeps visiting the entire AST even after finding the target class. The line profiler shows this taking 20.5ms (71% of time). The optimized version replaces this with an explicit BFS using `collections.deque`, which stops immediately upon finding the target class. The profiler shows this reduces traversal time to 9.95ms, **cutting the search overhead by >50%**.

This is especially impactful when:

- The target class appears early in the module (eliminates unnecessary traversal)
- The module contains many classes (test shows 7-10% faster on modules with 100-1000 classes)
- The function is called frequently (shown by the 108% speedup on 1000 repeated calls)

## 2. **Explicit Loops Replace Generator Overhead**

The original code uses `any()` with a generator expression and `min()` with a generator to check decorators and find minimum line numbers. These create function call and generator overhead. The optimized version uses explicit `for` loops with early breaks:

- Decorator checking: Directly iterates and breaks on first match
- Min line number: Uses explicit comparison instead of `min()` generator

The profiler shows decorator processing time reduced from ~1.4ms to ~0.3ms, and min line calculation from 69μs to 28μs.

## 3. **Conditional Flag Pattern for Relevance Checking**

Instead of evaluating both conditions in a compound expression, the optimized version uses an `is_relevant` flag with early exits, reducing redundant checks.

## Impact on Workloads

Based on `function_references`, this function is called from:

- `enrich_testgen_context`: Used in test generation workflows where it may process many classes
- Benchmark tests: Indicates this is in a performance-critical path

The optimization particularly benefits:

- **Large codebases**: 89-90% faster on classes with 100+ methods or 50+ properties
- **Repeated calls**: 108% faster when called 1000 times in sequence
- **Early matches**: Up to 88% faster when target class is found quickly
- **Deep nesting**: 57% faster for nested classes

The annotated tests show consistent 50-108% speedups across most scenarios, with minimal gains (6-10%) only when processing very large files where string slicing dominates runtime.
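The early-exit search and the explicit minimum loop described above can be sketched as follows. This is an illustration of the technique, not the merged code; `find_class_bfs` and `first_source_line` are hypothetical names.

```python
from __future__ import annotations

import ast
from collections import deque


def find_class_bfs(tree: ast.AST, class_name: str) -> ast.ClassDef | None:
    """Breadth-first search that returns as soon as the class is found."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return node  # early exit: the remaining nodes are never visited
        queue.extend(ast.iter_child_nodes(node))
    return None


def first_source_line(class_node: ast.ClassDef) -> int:
    """Earliest line of the class, counting decorators, via an explicit loop."""
    start = class_node.lineno
    for decorator in class_node.decorator_list:
        if decorator.lineno < start:
            start = decorator.lineno
    return start


SRC = (
    "class Outer:\n"
    "    class Inner:\n"
    "        pass\n"
    "\n"
    "@decorator\n"
    "class Target:\n"
    "    pass\n"
)

module = ast.parse(SRC)
inner = find_class_bfs(module, "Inner")
target = find_class_bfs(module, "Target")
```

Breadth-first order means top-level classes are checked before nested ones, so a shallow target is found after visiting only a few nodes.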

codeflash-ai · 2026-02-18T14:38:36Z

⚡️ Codeflash found optimizations for this PR

📄 70% (0.70x) speedup for `extract_init_stub_from_class` in `codeflash/languages/python/context/code_context_extractor.py`

⏱️ Runtime : 7.02 milliseconds → 4.13 milliseconds (best of 41 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up function extract_init_stub_from_class by 70% in PR #1524 (fixes-for-core-unstructured-experimental) #1529

If you approve, it will be merged into this PR (branch fixes-for-core-unstructured-experimental).


codeflash-ai · 2026-02-18T14:44:23Z

This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:

⚡️ Speed up function extract_init_stub_from_class by 70% in PR #1524 (fixes-for-core-unstructured-experimental) #1529

codeflash/languages/python/context/code_context_extractor.py

feat: extend testgen type context to include function body references

Extract types referenced in the function body (constructor calls, attribute access, isinstance/issubclass args) in addition to parameter annotations. Use full class extraction instead of init-stub-only, with instance resolution fallback and project/site-packages filtering.
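A rough sketch of collecting type names from a function body, covering the three cases this commit message lists. The function name `collect_body_type_names` and the CapWords heuristic for spotting constructor calls are assumptions made for illustration; the PR's actual rules are not shown on this page.

```python
from __future__ import annotations

import ast


def collect_body_type_names(func: ast.FunctionDef) -> set[str]:
    """Collect names that look like classes referenced inside a function body."""
    names: set[str] = set()
    for stmt in func.body:
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callee = node.func.id
                if callee in ("isinstance", "issubclass") and len(node.args) == 2:
                    # The second argument is a class or a tuple of classes.
                    target = node.args[1]
                    elts = target.elts if isinstance(target, ast.Tuple) else [target]
                    names.update(e.id for e in elts if isinstance(e, ast.Name))
                elif callee[:1].isupper():
                    # Heuristic (assumption): a CapWords call is a constructor.
                    names.add(callee)
            elif isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
                # Attribute access on a CapWords name, e.g. Status.PENDING.
                if node.value.id[:1].isupper():
                    names.add(node.value.id)
    return names


FUNC_SRC = '''def handle(item, raw):
    if isinstance(item, (Order, Refund)):
        return Receipt(item.total)
    return Status.PENDING
'''

handle_def = ast.parse(FUNC_SRC).body[0]
body_names = collect_body_type_names(handle_def)
```

A real implementation would still need to drop builtins such as `int` from `isinstance` checks and resolve each name to a defining module before extracting class source.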

This reverts commit 2966e15.

Move Path import out of TYPE_CHECKING block (TC004) since it is used at runtime, and replace missing safe_relative_to call with inline try/except pattern matching the rest of the PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

KRRT7 and others added 4 commits February 18, 2026 05:03

Merge branch 'main' into fixes-for-core-unstructured-experimental

68c148c

feat: extract parameter type constructor signatures into testgen context

2367b4c

Add enrichment step that parses FTO parameter type annotations, resolves types via jedi (following re-exports), and extracts full __init__ source to give the LLM constructor context for typed parameters.

fix: resolve mypy no-redef error in collect_type_names_from_annotation

c644b6e

claude bot reviewed Feb 18, 2026

codeflash-ai bot reviewed Feb 18, 2026

codeflash/languages/python/context/code_context_extractor.py

KRRT7 and others added 6 commits February 18, 2026 13:16

Merge branch 'main' into fixes-for-core-unstructured-experimental

26989b2

temp

d480061

style: auto-fix linting issues

4779486

fix: resolve mypy attr-defined errors in new test functions

7f5e163

context extraction improvements

b269212

claude bot reviewed Feb 18, 2026

codeflash/languages/python/context/code_context_extractor.py (outdated)

KRRT7 and others added 2 commits February 18, 2026 14:19

Update code_context_extractor.py

eedd73d

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

codeflash-ai bot mentioned this pull request Feb 18, 2026

⚡️ Speed up function extract_init_stub_from_class by 70% in PR #1524 (fixes-for-core-unstructured-experimental) #1529

Merged

github-actions bot and others added 2 commits February 18, 2026 14:43

style: auto-fix linting issues and resolve mypy type errors

ae740d9

Merge pull request #1529 from codeflash-ai/codeflash/optimize-pr1524-…

2364096

…2026-02-18T14.38.26

codeflash-ai bot reviewed Feb 18, 2026

codeflash/languages/python/context/code_context_extractor.py

KRRT7 force-pushed the fixes-for-core-unstructured-experimental branch from 1b63179 to 2966e15 Compare February 21, 2026 05:50

KRRT7 and others added 3 commits February 21, 2026 00:50

Revert "commit"

c1703a2

This reverts commit 2966e15.

chore: merge main into fixes-for-core-unstructured-experimental

c6fbdfa

KRRT7 merged commit bc0f9d5 into main Feb 21, 2026
26 of 28 checks passed

KRRT7 deleted the fixes-for-core-unstructured-experimental branch February 21, 2026 06:37

KRRT7 merged 18 commits into main from fixes-for-core-unstructured-experimental


1 participant
