
⚡️ Speed up method JsxRenderCallTransformer.transform by 9,715% in PR #1561 (add/support_react)#1652

Closed
codeflash-ai[bot] wants to merge 1 commit into add/support_react from codeflash/optimize-pr1561-2026-02-24T21.36.52

Conversation

Contributor

@codeflash-ai codeflash-ai bot commented Feb 24, 2026

⚡️ This pull request contains optimizations for PR #1561

If you approve this dependent PR, these changes will be merged into the original PR branch add/support_react.

This PR will be automatically closed if the original PR is merged.


📄 9,715% (97.15x) speedup for JsxRenderCallTransformer.transform in codeflash/languages/javascript/instrument.py

⏱️ Runtime: 904 milliseconds → 9.21 milliseconds (best of 70 runs)

📝 Explanation and details

Runtime improvement (primary): The optimized version reduces end-to-end transform time from 904 ms to 9.21 ms (~98× faster; the reported 9,715% speedup). Line profiling confirms the hot loop that previously dominated (string-state checks) is now a one-time linear pass instead of being repeated for every regex match.

What changed (concrete optimizations)

  • Precompute string-inside table: instead of calling is_inside_string(code, pos) for every regex match (which re-scanned the prefix repeatedly), the optimized transform builds an inside_flags list of length n+1 in a single forward O(n) pass. Subsequent checks become O(1) array lookups.
  • Avoid extra substring allocation: replaced code[match.start():].index("(") with code.find("(", match.start()) so we don't create a large slice just to find the parenthesis.
  • Single forward pointer in precompute: the incremental j pointer simulates the original is_inside_string scan exactly but only once, preserving behavior while avoiding repeated scans.
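The precompute described above can be sketched as follows. This is an illustrative sketch, not the project's actual code: the function name `build_inside_flags` and the exact quoting rules (single quotes, double quotes, backticks, backslash escapes) are assumptions based on the description in these bullets.

```python
def build_inside_flags(code: str) -> list[bool]:
    """One forward O(n) pass: result[i] is True when the character at
    index i sits inside a '...', "..." or `...` string literal."""
    n = len(code)
    inside = [False] * (n + 1)  # n+1 so a lookup at end-of-input is safe
    quote = None                # the open quote char, or None when outside
    i = 0
    while i < n:
        ch = code[i]
        if quote is None:
            if ch in "'\"`":
                quote = ch      # opening quote: literal contents start at i+1
        elif ch == "\\":
            inside[i] = True
            i += 1              # skip the escaped character (e.g. \" or \\)
            if i < n:
                inside[i] = True
        elif ch == quote:
            inside[i] = True    # closing quote ends the literal
            quote = None
        else:
            inside[i] = True
        i += 1
    return inside
```

With this table built once, each regex match costs a single indexed lookup (`if inside[match.start()]: continue`) instead of re-scanning the prefix of the source for every match.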

Why it is faster (performance reasoning)

  • Complexity reduction: original behavior effectively did repeated scanning of the source to determine string state for each match (O(matches * scan)), which becomes quadratic-like on pathological inputs. The optimized version does a single linear scan over the input (O(n)) plus cheap per-match work, yielding overall O(n + matches) instead of repeated O(n) work per match.
  • Reduced allocation and work: avoiding the render_call_text substring removes memory allocations and copying when matching, lowering both CPU and GC/allocator pressure.
  • The heavy work that remains (parenthesis matching in _find_matching_paren) was already necessary and remains unchanged; the biggest waste (checking "inside string" by rescanning) is eliminated.
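The allocation point in the second bullet can be shown in isolation. This is a hypothetical snippet (the variable names are made up for the example), not the transformer's actual code:

```python
import re

# Some source where the render call sits after a non-trivial prefix.
code = ("const pad = 1;\n" * 3) + "render(<MyComp />);"
match = re.search(r"\brender\b", code)

# Old approach: code[match.start():] copies the entire tail of the source
# just to locate the next "(" relative to the slice.
paren_old = match.start() + code[match.start():].index("(")

# Optimized approach: str.find with a start offset scans in place and
# allocates nothing.
paren_new = code.find("(", match.start())
assert paren_old == paren_new
```

One behavioral difference worth remembering when making this substitution: `str.index` raises ValueError when no parenthesis exists, while `str.find` returns -1, so the caller must check for -1 explicitly.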

Evidence in profiling and tests

  • Original line-profiler shows is_inside_string and repeated scanning accounted for the vast majority of time. Optimized profiling moves that cost to a single precompute loop, and total transform time drops from ~14s (profile aggregate) to ~0.088s.
  • Annotated unit tests show the biggest wins on large inputs: e.g., transforming 1000 render calls goes from ~615 ms to ~4.99 ms (huge improvement). Mixed-content large tests also show thousands-percent improvement.
  • Small inputs: there is a tiny precompute overhead for very small files — some micro-tests show a small increase in latency (single-digit microsecond differences). This is an expected and reasonable trade-off for the large wins on real/hot workloads.

Behavioral and workload impact

  • Behavior-preserving: the precompute loop faithfully reproduces the original string-parsing rules (including escape handling and backticks), so match-skipping semantics are preserved.
  • Hot-path benefit: where this transformer runs on large files or on code with many render(...) calls (the typical hot path in the tests), the change dramatically reduces CPU time and allocation churn.
  • Trade-offs: memory usage rises slightly (O(n) boolean flags), and tiny single-match inputs may see small overhead; this trade-off is acceptable because it removes the dominant repeated work and yields orders-of-magnitude runtime reductions for typical large/hot inputs.

Summary
The optimized code eliminates repeated rescans for string membership by precomputing a single, linear-time "inside string" table and avoids an unnecessary substring allocation when searching for the opening parenthesis. These two targeted changes transform an expensive repeated-O(n) operation into a one-time O(n) cost plus cheap O(1) checks, producing the large runtime improvement observed in profiling and tests.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 21 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 94.8%
🌀 Generated Regression Tests
import inspect  # used to introspect constructor signature of FunctionToOptimize
import re  # used in some assertions to validate transformed outputs

import pytest  # used for our unit tests
from codeflash.discovery.functions_to_optimize import FunctionToOptimize
# import the real classes/functions from the project under test
from codeflash.languages.javascript.instrument import (
    JsxRenderCallTransformer, is_inside_string)

def _make_function_to_optimize(function_name: str, qualified_name: str):
    """
    Construct a real FunctionToOptimize instance by inspecting its constructor
    signature. This helper adapts to different constructor parameter orders/names
    while still using the real class (no mocks or stubs).
    """
    # get the __init__ signature and parameters (skip 'self')
    init_sig = inspect.signature(FunctionToOptimize.__init__)
    params = list(init_sig.parameters.values())[1:]  # skip self

    # If constructor expects named parameters 'function_name' and 'qualified_name',
    # pass them explicitly. Otherwise try positional construction.
    param_names = [p.name for p in params]

    if "function_name" in param_names and "qualified_name" in param_names:
        # provide by name
        kwargs = {"function_name": function_name, "qualified_name": qualified_name}
        # Fill other params with None if they don't have defaults
        for p in params:
            if p.name not in kwargs and p.default is inspect._empty:
                kwargs[p.name] = None
        return FunctionToOptimize(**kwargs)
    else:
        # Fallback: call with positional args if there are at least 2 parameters
        if len(params) >= 2:
            return FunctionToOptimize(function_name, qualified_name)
        # If only one parameter, try to pass a tuple with both values (unlikely but safe)
        if len(params) == 1:
            return FunctionToOptimize((function_name, qualified_name))
        # As a last resort, attempt no-arg construction
        return FunctionToOptimize()

def test_basic_simple_render_replacement():
    # Create a real FunctionToOptimize instance for component 'MyComp'
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    # Create the transformer that will wrap render calls using capturePerf
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # simple code: render(<MyComp />); (no leading whitespace, no await)
    src = "render(<MyComp />);"
    # transform the code
    codeflash_output = transformer.transform(src); out = codeflash_output # 10.1μs -> 13.4μs (24.4% slower)

    # Expect exactly one transformation: render(...) -> codeflash.capturePerf('pkg.MyComp', '1', () => render(...));
    expected = "codeflash.capturePerf('pkg.MyComp', '1', () => render(<MyComp />));"
    assert out == expected

def test_basic_await_and_indentation_preserved():
    # Ensure 'await ' prefix and leading whitespace are preserved in the transformed output
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # include indentation and 'await ' prefix
    src = "  await render(<MyComp prop={42} />);"
    codeflash_output = transformer.transform(src); out = codeflash_output # 11.6μs -> 15.8μs (26.7% slower)
    # the 'await ' prefix and the two-space indentation must survive the wrapping
    assert "codeflash.capturePerf" in out
    assert "await " in out
    assert out.startswith("  ")

def test_basic_multiple_invocations_increment_ids():
    # Multiple render calls should get monotonically increasing invocation ids
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    src = "render(<MyComp />);\nrender(<MyComp />);\nrender(<MyComp />);"
    codeflash_output = transformer.transform(src); out = codeflash_output # 23.7μs -> 25.4μs (6.89% slower)
    # Verify ordering: id 1 occurs before id 2 before id 3
    pos1 = out.find("'1'")
    pos2 = out.find("'2'")
    pos3 = out.find("'3'")
    assert pos1 != -1 and pos1 < pos2 < pos3

def test_skip_inside_string_literals():
    # Rendering-like text inside strings should not be transformed
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # The first occurrence is inside a double-quoted string and must be skipped.
    src = 'const s = "render(<MyComp />)";\nrender(<MyComp />);'
    codeflash_output = transformer.transform(src); out = codeflash_output # 16.7μs -> 18.9μs (11.8% slower)
    # the quoted occurrence stays verbatim; only the real call gets wrapped
    assert 'const s = "render(<MyComp />)";' in out
    assert out.count("codeflash.capturePerf") == 1

def test_skip_if_already_wrapped_nearby():
    # If the render call is already inside a codeflash.capturePerf (within lookback window),
    # the transformer must not double-wrap it.
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # Construct code where a render call is already wrapped and another render follows closely.
    wrapped = "codeflash.capturePerf('pkg.MyComp', '1', () => render(<MyComp />));"
    # Place another render call within 60 chars before its opening -- to trigger the lookback skip
    src = wrapped + "render(<MyComp />);"
    codeflash_output = transformer.transform(src); out = codeflash_output # 23.7μs -> 24.8μs (4.44% slower)
    # the already-wrapped call must not be wrapped a second time
    assert out.count("codeflash.capturePerf") == 1

def test_missing_closing_paren_does_not_raise_and_leaves_code_untouched():
    # If a matching closing parenthesis cannot be found, transformer should skip transformation and not crash.
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # Intentionally broken render call (no closing paren)
    src = "render(<MyComp />"
    # Ensure transform returns without raising and original substring remains
    codeflash_output = transformer.transform(src); out = codeflash_output # 8.77μs -> 11.6μs (24.2% slower)
    assert out == src

def test_render_with_nested_parentheses_and_strings_in_props():
    # Complex JSX props that include parentheses and string literals should not confuse the paren-matching logic
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # prop contains nested parentheses and a string that includes a parenthesis char
    src = "render(<MyComp prop={() => (1 + (2))} label={'paren)here'} />);"
    codeflash_output = transformer.transform(src); out = codeflash_output # 14.8μs -> 21.8μs (32.0% slower)
    assert out.count("codeflash.capturePerf") == 1

def test_non_matching_component_name_is_ignored():
    # Only the exact component name provided in FunctionToOptimize should be matched; others ignored
    fto = _make_function_to_optimize("TargetComp", "pkg.TargetComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # render of OtherComp should remain unchanged
    src = "render(<OtherComp />);\nrender(<TargetComp />);"
    codeflash_output = transformer.transform(src); out = codeflash_output # 13.8μs -> 17.2μs (20.0% slower)
    # OtherComp stays untouched; only TargetComp is wrapped
    assert "render(<OtherComp />);" in out
    assert out.count("codeflash.capturePerf") == 1

def test_is_inside_string_utility_various_quotes_and_escapes():
    # Escaped quotes should not terminate the string early
    s = '"a \\" escaped" rest'
    assert is_inside_string(s, s.index("escaped"))
    assert not is_inside_string(s, s.index("rest"))

def test_large_scale_many_render_calls_performance_and_ids():
    # Generate a large number of render calls to ensure transformer scales and ids increment properly.
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    # Create 1000 lines of identical render calls (each terminated by semicolon)
    N = 1000
    src_lines = ["render(<MyComp />);" for _ in range(N)]
    src = "\n".join(src_lines)

    # Transform the large input
    codeflash_output = transformer.transform(src); out = codeflash_output # 615ms -> 4.99ms (12236% faster)
    assert out.count("codeflash.capturePerf") == N
    assert f"'{N}'" in out  # the last invocation id reaches N

def test_large_scale_mixed_content_up_to_1000_entries():
    # Mixed content: some lines that should not be transformed interleaved with transformable ones.
    fto = _make_function_to_optimize("MyComp", "pkg.MyComp")
    transformer = JsxRenderCallTransformer(fto, "capturePerf")

    N = 500  # half as many transformable lines, interleaved, total 1000-ish size
    parts = []
    for i in range(N):
        # Add one non-matching line, then one matching line
        parts.append(f"const x{i} = {i};")
        parts.append("render(<MyComp />);")
    src = "\n".join(parts)

    codeflash_output = transformer.transform(src); out = codeflash_output # 288ms -> 4.07ms (6980% faster)
    assert out.count("codeflash.capturePerf") == N
    for i in range(N):
        # non-transformable lines must survive untouched
        assert f"const x{i} = {i};" in out
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-pr1561-2026-02-24T21.36.52 and push.


@claude
Contributor

claude bot commented Mar 4, 2026

Closing stale optimization PR.

@claude claude bot closed this Mar 4, 2026
@claude claude bot deleted the codeflash/optimize-pr1561-2026-02-24T21.36.52 branch March 4, 2026 03:21

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash
