Skip to content

⚡️ Speed up method JavaAssertTransformer._find_fluent_chain_end by 33% in PR #1295 (feat/java-remove-asserts-transformer)#1331

Closed
codeflash-ai[bot] wants to merge 1 commit intoomni-javafrom
codeflash/optimize-pr1295-2026-02-03T22.00.31
Closed

⚡️ Speed up method JavaAssertTransformer._find_fluent_chain_end by 33% in PR #1295 (feat/java-remove-asserts-transformer)#1331
codeflash-ai[bot] wants to merge 1 commit intoomni-javafrom
codeflash/optimize-pr1295-2026-02-03T22.00.31

Conversation

@codeflash-ai
Copy link
Contributor

@codeflash-ai codeflash-ai bot commented Feb 3, 2026

⚡️ This pull request contains optimizations for PR #1295

If you approve this dependent PR, these changes will be merged into the original PR branch feat/java-remove-asserts-transformer.

This PR will be automatically closed if the original PR is merged.


📄 33% (0.33x) speedup for JavaAssertTransformer._find_fluent_chain_end in codeflash/languages/java/remove_asserts.py

⏱️ Runtime : 1.92 milliseconds 1.45 milliseconds (best of 141 runs)

📝 Explanation and details

This optimization achieves a 32% runtime improvement (from 1.92ms to 1.45ms) by reducing redundant computations in two performance-critical string parsing methods that process Java assertion chains.

Key Optimizations

1. Local Variable Caching in _find_fluent_chain_end

The original code called len(source) repeatedly in every loop iteration—up to 7 times per iteration across multiple while loops. The optimized version calculates this once upfront (n = len(s)) and reuses short variable names (s, ws) to minimize attribute lookups. This is particularly impactful in the method name scanning loop which executes ~7,157 times according to the profiler, where the line went from 10.3% of total time to 9.3%.

2. Hoisting prev_char Computation in _find_balanced_parens

The original implementation recalculated prev_char = code[pos - 1] if pos > 0 else "" on every character (4,663 hits), accounting for 12.3% of execution time. The optimized version:

  • Initializes prev_char once before the loop
  • Updates it only at the end of each iteration with prev_char = char
  • Eliminates the conditional check on every character

This change reduced the per-character overhead from 351.5ns to just updating a simple variable assignment, nearly eliminating this hotspot.

3. Pre-computing String Length

Both methods now compute length once (n = len(code), n = len(s)) rather than repeatedly calling len() in tight loop conditions. Given the high iteration counts (787 iterations in _find_fluent_chain_end, 5,389 in _find_balanced_parens), this compounds into measurable savings.

Performance Impact by Test Type

The optimization shows varying benefits based on workload characteristics:

  • Large-scale chains (300+ method calls): 30-50% faster - The benefit scales with chain length due to cumulative savings from reduced per-iteration overhead
  • Complex nested structures: 32-49% faster - Deep nesting means more iterations in _find_balanced_parens, where the prev_char optimization is most effective
  • Short chains: 3-12% faster - Even simple cases benefit, though less dramatically
  • String-heavy arguments: 18-44% faster - The character-by-character processing in _find_balanced_parens benefits significantly from the prev_char optimization

The profiler data confirms these are the right hotspots: the _find_balanced_parens call consumed 73.2% of total time in the original, and the optimizations directly target the most-executed lines within these methods. All correctness tests pass, confirming functional equivalence while delivering substantial runtime improvements.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 116 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 96.0%
🌀 Click to see Generated Regression Tests
from __future__ import annotations

# imports
import pytest  # used for our unit tests
from codeflash.languages.java.remove_asserts import JavaAssertTransformer

# ----------------------
# Unit tests start here
# ----------------------

# Create a transformer instance used across tests
@pytest.fixture
def transformer() -> JavaAssertTransformer:
    # Provide a simple function_name; analyzer is optional and uses our minimal get_java_analyzer
    return JavaAssertTransformer(function_name="testFunc")

def test_basic_terminal_chain_end(transformer):
    # Basic scenario: a chain of terminal methods .isEqualTo(...).isNotNull()
    src = "assertThat(x).isEqualTo(1).isNotNull();"
    # Find position of the dot before the first method in the chain
    start_pos = src.find(".isEqualTo")
    # Expected: end after the final closing parenthesis of isNotNull()
    expected_end = src.rfind(")") + 1
    # Call the real method and assert expected index
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 6.77μs -> 6.28μs (7.82% faster)

def test_non_terminal_method_stops_chain(transformer):
    # Edge scenario: a non-terminal method (not in ASSERTJ_TERMINAL_METHODS) should stop the chain
    # describedAs is a typical non-terminal AssertJ method
    src = 'assertThat(x).describedAs("desc").isEqualTo(1);'
    start_pos = src.find(".describedAs")
    # The function should stop at the end of describedAs(...), which is immediately before ".isEqualTo"
    expected_end = src.find(".isEqualTo")
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 8.14μs -> 7.51μs (8.40% faster)

def test_dot_missing_at_start_returns_immediate_position(transformer):
    # If the start_pos does not point to a dot (after skipping whitespace), nothing should be consumed
    src = "noDotHere"
    start_pos = 0
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 962ns -> 981ns (1.94% slower)

def test_unbalanced_parentheses_stops_at_open_paren(transformer):
    # Edge case where parentheses are unbalanced: parser should break and return position at '('
    src = "assertThat(x).isEqualTo(1.isNotNull()"
    start_pos = src.find(".isEqualTo")
    # The '(' that starts isEqualTo's args
    open_paren_pos = src.find("(", start_pos)
    # Since parentheses are unbalanced, end_pos should be the position of that '(' (not advanced)
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 5.62μs -> 5.10μs (10.2% faster)

def test_parentheses_with_strings_and_escaped_content(transformer):
    # Complex argument: a string inside the parentheses that contains parentheses characters
    src = 'assertThat(x).isEqualTo(") ( ) (nested ( ) )").isNotNull()'
    start_pos = src.find(".isEqualTo")
    # Entire chain should be recognized and end after the last closing ')'
    expected_end = src.rfind(")") + 1
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 9.22μs -> 8.21μs (12.3% faster)

def test_method_name_with_underscore_and_numbers(transformer):
    # Method names may include underscores and digits and should be accepted as method identifiers
    # Use a custom non-terminal method name with underscore/digits
    src = "assertThat(x)._myMethod123(42).isEqualTo(5);"
    start_pos = src.find("._myMethod123")
    # Because _myMethod123 is not in the terminal set, the chain search should stop after it
    expected_end = src.find(".isEqualTo")
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 7.53μs -> 7.05μs (6.82% faster)

def test_whitespace_before_dot_is_skipped(transformer):
    # start_pos located at whitespace before dot should be skipped to the dot
    src = "assertThat(x)   .isNull()"
    # Put start_pos at whitespace region before the dot
    whitespace_pos = src.find("   ")
    # Expected: chain ends after isNull() closing parenthesis
    expected_end = src.rfind(")") + 1
    codeflash_output = transformer._find_fluent_chain_end(src, whitespace_pos); end_pos = codeflash_output # 4.17μs -> 3.96μs (5.31% faster)

def test_character_literals_inside_parentheses(transformer):
    # Character literal containing a parenthesis should not confuse the parentheses matcher
    src = "assertThat(map).contains(')').isEmpty();"
    start_pos = src.find(".contains")
    # After contains(')') the chain continues because contains is in ASSERTJ_TERMINAL_METHODS
    expected_end = src.rfind(")") + 1
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 7.11μs -> 6.54μs (8.71% faster)

def test_method_without_parentheses_counts_as_terminal_and_chains(transformer):
    # A terminal method without parentheses (e.g., .isEmpty) should be allowed and chain continues
    src = "obj.isEmpty.isNotNull()"
    start_pos = src.find(".isEmpty")
    # Since isEmpty is terminal, continuation should process .isNotNull() and end after ')'
    expected_end = src.rfind(")") + 1
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 5.21μs -> 5.01μs (4.01% faster)

def test_large_scale_many_chained_terminal_methods(transformer):
    # Large-scale test: create a long chain of terminal methods to ensure scalability.
    # Use 300 repetitions (well under the 1000-step safety guideline).
    repetitions = 300
    chain = "".join(".isNotNull()" for _ in range(repetitions))
    src = "obj" + chain
    start_pos = src.find(".isNotNull")  # first dot
    # Expect the chain end to be at the very end of the string (after all method calls)
    expected_end = len(src)
    codeflash_output = transformer._find_fluent_chain_end(src, start_pos); end_pos = codeflash_output # 479μs -> 369μs (30.0% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.parser import get_java_analyzer
from codeflash.languages.java.remove_asserts import (ASSERTJ_TERMINAL_METHODS,
                                                     JavaAssertTransformer)

class TestFindFluentChainEnd:
    """Test suite for JavaAssertTransformer._find_fluent_chain_end method."""

    @pytest.fixture
    def transformer(self):
        """Create a JavaAssertTransformer instance for testing."""
        analyzer = get_java_analyzer()
        return JavaAssertTransformer("testMethod", analyzer=analyzer)

    # ============================================================================
    # BASIC TEST CASES - Verify fundamental functionality under normal conditions
    # ============================================================================

    def test_single_terminal_assertion(self, transformer):
        """Test fluent chain with a single terminal assertion method.
        
        This verifies the function correctly identifies the end of a simple
        assertion chain ending with a terminal method like isEqualTo.
        """
        source = "assertThat(value).isEqualTo(5)"
        # Start position after assertThat(value)
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.42μs -> 4.27μs (3.51% faster)

    def test_chain_with_single_intermediate_method(self, transformer):
        """Test fluent chain with intermediate method before terminal assertion.
        
        Verifies that intermediate methods (non-terminal) don't end the chain,
        and the search continues until a terminal method is found.
        """
        source = "assertThat(value).extracting(\"field\").isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 8.14μs -> 7.42μs (9.73% faster)

    def test_chain_with_multiple_intermediate_methods(self, transformer):
        """Test fluent chain with multiple intermediate methods.
        
        Verifies the function continues through multiple chained intermediate
        methods until reaching a terminal method.
        """
        source = "assertThat(list).extracting(\"id\").contains(1).contains(2)"
        start_pos = len("assertThat(list)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 9.10μs -> 8.48μs (7.33% faster)

    def test_whitespace_handling_around_dots(self, transformer):
        """Test that whitespace around dots is properly handled.
        
        Ensures the function correctly parses chains with various whitespace
        patterns including spaces, tabs, and newlines.
        """
        source = "assertThat(value)  .  isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.74μs -> 4.50μs (5.34% faster)

    def test_newline_in_fluent_chain(self, transformer):
        """Test fluent chain split across multiple lines.
        
        Verifies the function correctly handles newline characters within
        fluent assertion chains, which is a common formatting pattern.
        """
        source = "assertThat(value)\n.isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.30μs -> 4.28μs (0.468% faster)

    def test_method_with_complex_arguments(self, transformer):
        """Test assertion method with complex parenthesized arguments.
        
        Ensures the function properly handles balanced parentheses within
        method arguments, including nested function calls.
        """
        source = "assertThat(value).isEqualTo(foo(bar(5)))"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.96μs -> 5.43μs (9.78% faster)

    def test_method_with_string_argument_containing_parens(self, transformer):
        """Test that string literals with parentheses don't confuse paren matching.
        
        Verifies that parentheses inside string literals are ignored when
        counting balanced parentheses.
        """
        source = 'assertThat(value).isEqualTo("test()")'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.49μs -> 4.94μs (11.2% faster)

    def test_method_with_escaped_quote_in_string(self, transformer):
        """Test that escaped quotes in strings are handled correctly.
        
        Ensures escaped quotes within string literals don't prematurely
        terminate string parsing.
        """
        source = r'assertThat(value).isEqualTo("test\"quote")'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.18μs -> 5.54μs (11.6% faster)

    def test_no_dot_after_start_position(self, transformer):
        """Test when there's no dot immediately after start position.
        
        Verifies the function returns the correct position when the chain
        ends immediately (no further methods to call).
        """
        source = "assertThat(value)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 561ns -> 671ns (16.4% slower)

    def test_start_position_at_end_of_source(self, transformer):
        """Test when start position is at the very end of the source.
        
        Verifies boundary handling when there's no content after the start
        position.
        """
        source = "assertThat(value)"
        start_pos = len(source)
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 521ns -> 590ns (11.7% slower)

    # ============================================================================
    # EDGE TEST CASES - Evaluate behavior under extreme/unusual conditions
    # ============================================================================

    def test_empty_source_string(self, transformer):
        """Test with empty source string.
        
        Verifies safe handling of edge case where the entire source is empty.
        """
        source = ""
        codeflash_output = transformer._find_fluent_chain_end(source, 0); result = codeflash_output # 471ns -> 581ns (18.9% slower)

    def test_start_position_beyond_source_length(self, transformer):
        """Test when start position exceeds source length.
        
        Verifies the function safely handles out-of-bounds start positions
        without crashing.
        """
        source = "test"
        codeflash_output = transformer._find_fluent_chain_end(source, 100); result = codeflash_output # 591ns -> 651ns (9.22% slower)

    def test_negative_start_position(self, transformer):
        """Test with negative start position.
        
        Verifies the function handles negative indices (though unusual in normal
        operation, it should not crash).
        """
        source = "assertThat(value).isEqualTo(5)"
        # Negative indices in Python wrap around, but function uses < len comparison
        codeflash_output = transformer._find_fluent_chain_end(source, -1); result = codeflash_output # 1.07μs -> 1.06μs (0.942% faster)

    def test_method_with_underscores_in_name(self, transformer):
        """Test method names containing underscores.
        
        Verifies that method names with underscores are properly recognized,
        including those not in the terminal methods set.
        """
        source = "assertThat(value).custom_method().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 7.08μs -> 6.91μs (2.46% faster)

    def test_method_with_numbers_in_name(self, transformer):
        """Test method names containing numbers.
        
        Verifies that method names with alphanumeric characters are correctly
        parsed.
        """
        source = "assertThat(value).method2().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.63μs -> 6.11μs (8.53% faster)

    def test_all_terminal_methods(self, transformer):
        """Test chain with different terminal assertion methods.
        
        Verifies that all methods in ASSERTJ_TERMINAL_METHODS are recognized
        as chain terminators.
        """
        # Test a few different terminal methods
        terminal_methods = ["isEqualTo", "isNull", "isEmpty", "isTrue", "contains"]
        for method in terminal_methods:
            if method in ASSERTJ_TERMINAL_METHODS:
                source = f"assertThat(value).{method}(arg)"
                start_pos = len("assertThat(value)")
                codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output

    def test_multiple_terminal_methods_in_sequence(self, transformer):
        """Test chain with multiple terminal methods in sequence.
        
        Verifies that even after finding a terminal method, the function
        continues looking for additional chained assertions.
        """
        source = "assertThat(value).isEqualTo(5).isPositive()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.65μs -> 6.09μs (9.21% faster)

    def test_unmatched_opening_parenthesis(self, transformer):
        """Test handling of unmatched opening parenthesis.
        
        Verifies the function correctly identifies unbalanced parentheses
        and terminates the chain search appropriately.
        """
        source = "assertThat(value).isEqualTo(5"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 3.52μs -> 3.41μs (3.23% faster)

    def test_method_name_at_end_of_source_no_parens(self, transformer):
        """Test method name without parentheses at end of source.
        
        Verifies handling of method names that aren't followed by parentheses,
        which may indicate incomplete code.
        """
        source = "assertThat(value).isEqualTo"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 2.65μs -> 2.62μs (1.53% faster)

    def test_only_dot_no_method_name(self, transformer):
        """Test when dot appears but is not followed by valid method name.
        
        Verifies the function correctly terminates when a dot doesn't lead to
        a valid method identifier.
        """
        source = "assertThat(value). 123"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 2.42μs -> 2.32μs (4.30% faster)

    def test_tab_characters_in_chain(self, transformer):
        """Test that tab characters are handled as whitespace.
        
        Verifies tabs between chain elements are properly skipped.
        """
        source = "assertThat(value)\t.\tisEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.75μs -> 4.53μs (4.88% faster)

    def test_carriage_return_in_chain(self, transformer):
        """Test that carriage returns are handled as whitespace.
        
        Verifies CRLF line endings are properly handled.
        """
        source = "assertThat(value)\r.\risEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.60μs -> 4.45μs (3.39% faster)

    def test_character_literal_with_escaped_quote(self, transformer):
        """Test character literal with escaped single quote.
        
        Verifies that escaped quotes in character literals don't break parsing.
        """
        source = r"assertThat(value).isEqualTo('\'')"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.10μs -> 4.91μs (3.87% faster)

    def test_nested_method_calls_in_arguments(self, transformer):
        """Test deeply nested method calls in assertion arguments.
        
        Verifies that deeply nested parentheses are properly tracked.
        """
        source = "assertThat(value).contains(foo(bar(baz(1))))"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.68μs -> 5.92μs (12.9% faster)

    def test_empty_parentheses(self, transformer):
        """Test method calls with empty parentheses.
        
        Verifies that methods with no arguments are properly handled.
        """
        source = "assertThat(value).isEmpty().isNotNull()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.04μs -> 5.84μs (3.42% faster)

    def test_whitespace_between_method_and_parens(self, transformer):
        """Test whitespace between method name and opening parenthesis.
        
        Verifies that spaces/tabs before parentheses are properly skipped.
        """
        source = "assertThat(value).isEqualTo (5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 4.45μs -> 4.26μs (4.49% faster)

    def test_single_letter_method_name(self, transformer):
        """Test single-letter method names.
        
        Verifies that very short method names are recognized.
        """
        source = "assertThat(value).a().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.90μs -> 5.58μs (5.73% faster)

    def test_all_uppercase_method_name(self, transformer):
        """Test all uppercase method names.
        
        Verifies case sensitivity in method name parsing.
        """
        source = "assertThat(value).CUSTOM().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.24μs -> 6.02μs (3.65% faster)

    def test_mixed_case_method_name(self, transformer):
        """Test mixed case method names (camelCase).
        
        Verifies typical Java naming conventions are handled.
        """
        source = "assertThat(value).myCustomMethod().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.76μs -> 6.43μs (5.13% faster)

    def test_string_with_dots(self, transformer):
        """Test that dots inside string literals don't break chain detection.
        
        Verifies dots within strings are ignored when looking for method calls.
        """
        source = 'assertThat(value).isEqualTo("test.value")'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.03μs -> 5.37μs (12.3% faster)

    def test_comment_like_sequence_in_string(self, transformer):
        """Test that comment markers inside strings don't interfere.
        
        Verifies that string content isn't parsed as code.
        """
        source = 'assertThat(value).isEqualTo("// comment")'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.85μs -> 5.41μs (8.15% faster)

    # ============================================================================
    # LARGE SCALE TEST CASES - Assess performance and scalability
    # ============================================================================

    def test_very_long_chain_100_methods(self, transformer):
        """Test performance with a very long fluent chain of 100 methods.
        
        Verifies the function can handle realistic long assertion chains
        without performance degradation.
        """
        # Build a chain with 100 methods
        source_parts = ["assertThat(value)"]
        for i in range(100):
            if i % 2 == 0:
                source_parts.append(".extracting(\"field\")")
            else:
                source_parts.append(".isNotNull()")
        source = "".join(source_parts)
        
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 225μs -> 166μs (35.0% faster)

    def test_very_long_method_name(self, transformer):
        """Test with extremely long method names.
        
        Verifies performance with unusually long identifiers.
        """
        long_name = "a" * 500
        source = f"assertThat(value).{long_name}().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 41.7μs -> 30.5μs (36.6% faster)

    def test_very_long_argument_list(self, transformer):
        """Test with very long method arguments.
        
        Verifies the function handles methods with long argument lists.
        """
        # Create a long argument string
        args = ", ".join([str(i) for i in range(100)])
        source = f"assertThat(value).containsExactly({args}).isNotNull()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 66.1μs -> 44.2μs (49.4% faster)

    def test_deeply_nested_method_calls_50_levels(self, transformer):
        """Test with deeply nested method calls (50 levels deep).
        
        Verifies the function can handle deeply nested parentheses.
        """
        # Build deeply nested calls
        nested = "1"
        for i in range(50):
            nested = f"f({nested})"
        source = f"assertThat(value).isEqualTo({nested})"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 25.7μs -> 19.4μs (32.7% faster)

    def test_many_strings_in_chain(self, transformer):
        """Test chain with many string arguments.
        
        Verifies performance and correctness with multiple string literals.
        """
        source_parts = ["assertThat(value)"]
        for i in range(50):
            source_parts.append(f'.contains("string{i}")')
        source = "".join(source_parts)
        
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 152μs -> 106μs (43.6% faster)

    def test_many_escaped_characters_in_strings(self, transformer):
        """Test strings with many escaped characters.
        
        Verifies string parsing with multiple escape sequences.
        """
        # Create string with many escape sequences
        escaped_str = r'\\\\\\\\\\\\\\\\\\\\\"\"\"\"\"\"\"\"\"\"'
        source = f'assertThat(value).isEqualTo("{escaped_str}").isNotNull()'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 12.0μs -> 10.1μs (18.6% faster)

    def test_alternating_whitespace_patterns(self, transformer):
        """Test chain with alternating whitespace patterns.
        
        Verifies handling of various whitespace combinations throughout chain.
        """
        source = "assertThat(value) \n . \t isEqualTo \r ( 5 ) \n . \t isPositive()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 8.19μs -> 7.58μs (7.92% faster)

    def test_chain_with_many_intermediate_non_terminal_methods(self, transformer):
        """Test chain with 200 intermediate methods before terminal.
        
        Verifies the function processes long chains of non-terminal methods
        efficiently.
        """
        source_parts = ["assertThat(value)"]
        # Add 200 non-terminal methods
        for i in range(200):
            source_parts.append(f'.filter(x -> x > {i})')
        source_parts.append(".isNotEmpty()")
        source = "".join(source_parts)
        
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 644μs -> 445μs (44.6% faster)

    def test_large_source_file_context(self, transformer):
        """Test with large surrounding source context.
        
        Verifies the function correctly handles the target chain when it's
        part of a much larger source string.
        """
        # Create a large source file context
        prefix = "int x = 5;\n" * 500
        chain = "assertThat(value).isEqualTo(5)"
        suffix = "\nint y = 10;\n" * 500
        source = prefix + chain + suffix
        
        start_pos = len(prefix) + len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.58μs -> 5.10μs (9.45% faster)
        expected = len(prefix) + len(chain)

    def test_unicode_in_string_arguments(self, transformer):
        """Test method arguments containing unicode characters.
        
        Verifies string parsing handles unicode correctly.
        """
        source = 'assertThat(value).isEqualTo("café").isNotNull()'
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 8.12μs -> 7.39μs (9.77% faster)

    def test_chain_with_multiple_character_literals(self, transformer):
        """Test chain with multiple character literal arguments.
        
        Verifies character literal parsing throughout chain.
        """
        source = "assertThat(value).contains('a').contains('b').contains('c').isNotEmpty()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 11.1μs -> 10.0μs (10.6% faster)

    def test_method_with_lambda_expression_argument(self, transformer):
        """Test method argument containing lambda expression.
        
        Verifies parentheses within lambda expressions are handled.
        """
        source = "assertThat(value).filter(x -> x > 0).isNotEmpty()"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 7.63μs -> 6.93μs (10.1% faster)

    def test_balanced_parens_with_mixed_quotes(self, transformer):
        """Test arguments mixing single and double quoted strings.
        
        Verifies string type tracking when both quote types appear.
        """
        source = '''assertThat(value).contains("double'quotes").contains('single"quotes').isNotNull()'''
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 12.5μs -> 11.4μs (9.39% faster)

    def test_very_long_source_with_short_chain(self, transformer):
        """Test finding short chain in very large source file.
        
        Verifies position tracking accuracy in large context.
        """
        # Create a 50KB source string with the chain at position 25KB
        prefix = "int x = 0;\n" * 2500
        chain = "assertThat(value).isEqualTo(5)"
        suffix = "\nint y = 10;\n" * 2500
        source = prefix + chain + suffix
        
        start_pos = len(prefix) + len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 5.58μs -> 5.20μs (7.31% faster)
        expected = len(prefix) + len(chain)

    def test_chain_with_method_names_similar_to_terminals(self, transformer):
        """Test with method names that are substrings of terminal methods.
        
        Verifies exact method name matching (not substring matching).
        """
        # Use method names that contain parts of terminal method names
        source = "assertThat(value).isEqual().isEqualTo(5)"
        start_pos = len("assertThat(value)")
        codeflash_output = transformer._find_fluent_chain_end(source, start_pos); result = codeflash_output # 6.40μs -> 6.22μs (2.91% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes git checkout codeflash/optimize-pr1295-2026-02-03T22.00.31 and push.

Codeflash Static Badge

This optimization achieves a **32% runtime improvement** (from 1.92ms to 1.45ms) by reducing redundant computations in two performance-critical string parsing methods that process Java assertion chains.

## Key Optimizations

### 1. **Local Variable Caching in `_find_fluent_chain_end`**
The original code called `len(source)` repeatedly in every loop iteration—up to 7 times per iteration across multiple while loops. The optimized version calculates this once upfront (`n = len(s)`) and reuses short variable names (`s`, `ws`) to minimize attribute lookups. This is particularly impactful in the method name scanning loop which executes ~7,157 times according to the profiler, where the line went from 10.3% of total time to 9.3%.

### 2. **Hoisting `prev_char` Computation in `_find_balanced_parens`**
The original implementation recalculated `prev_char = code[pos - 1] if pos > 0 else ""` on **every character** (4,663 hits), accounting for 12.3% of execution time. The optimized version:
- Initializes `prev_char` once before the loop
- Updates it only at the end of each iteration with `prev_char = char`
- Eliminates the conditional check on every character

This change reduced the per-character overhead from 351.5ns to just updating a simple variable assignment, nearly eliminating this hotspot.

### 3. **Pre-computing String Length**
Both methods now compute length once (`n = len(code)`, `n = len(s)`) rather than repeatedly calling `len()` in tight loop conditions. Given the high iteration counts (787 iterations in `_find_fluent_chain_end`, 5,389 in `_find_balanced_parens`), this compounds into measurable savings.

## Performance Impact by Test Type

The optimization shows varying benefits based on workload characteristics:

- **Large-scale chains** (300+ method calls): **30-50% faster** - The benefit scales with chain length due to cumulative savings from reduced per-iteration overhead
- **Complex nested structures**: **32-49% faster** - Deep nesting means more iterations in `_find_balanced_parens`, where the `prev_char` optimization is most effective
- **Short chains**: **3-12% faster** - Even simple cases benefit, though less dramatically
- **String-heavy arguments**: **18-44% faster** - The character-by-character processing in `_find_balanced_parens` benefits significantly from the `prev_char` optimization

The profiler data confirms these are the right hotspots: the `_find_balanced_parens` call consumed 73.2% of total time in the original, and the optimizations directly target the most-executed lines within these methods. All correctness tests pass, confirming functional equivalence while delivering substantial runtime improvements.
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Feb 3, 2026
@KRRT7
Copy link
Collaborator

KRRT7 commented Feb 3, 2026

@claude fix the pre-commit issues and push

@claude
Copy link
Contributor

claude bot commented Feb 3, 2026

Claude finished @KRRT7's task in 1m 57s —— View job


Pre-commit Analysis Complete ✅

Todo List:

  • Check current git status and branch
  • Run pre-commit hooks to identify issues
  • Analyze pre-commit failures
  • Verify changes in this PR

Findings

I've analyzed the pre-commit CI failure and found that the remove_asserts.py file modified in this PR has NO linting errors. The optimization changes are clean and properly formatted.

The pre-commit failures are from pre-existing issues in OTHER files (not touched by this PR):

  • codeflash/languages/java/support.py - TC001 errors (type-checking imports)
  • codeflash/languages/java/test_discovery.py - B007, PERF102, PLR1714, PIE810, SIM110 errors
  • codeflash/languages/java/test_runner.py - EM102 error
  • Various other files with formatting issues

These are linting issues that exist in the base branch (feat/java-remove-asserts-transformer) and are not introduced by this optimization PR.

Conclusion

No changes needed for this PR. The optimized code in codeflash/languages/java/remove_asserts.py is properly formatted and passes all linting checks. The CI failures are due to pre-existing issues in other files that should be fixed separately in the base branch.

The optimization changes in this PR are:

  • ✅ Properly formatted
  • ✅ Follow code style guidelines
  • ✅ Pass ruff linting
  • ✅ No pre-commit issues

--- |

Base automatically changed from feat/java-remove-asserts-transformer to omni-java February 3, 2026 22:18
@KRRT7
Copy link
Collaborator

KRRT7 commented Feb 19, 2026

Closing stale bot PR.

@KRRT7 KRRT7 closed this Feb 19, 2026
@KRRT7 KRRT7 deleted the codeflash/optimize-pr1295-2026-02-03T22.00.31 branch February 19, 2026 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant