⚡️ Speed up method `JavaAssertTransformer._find_hamcrest_assertions` by 39% in PR #1199 (`omni-java`) #1355
Open
codeflash-ai[bot] wants to merge 1 commit into `omni-java`
This optimization achieves a **39% speedup** (6.74ms → 4.86ms) by eliminating repeated regex compilation overhead and streamlining string operations.
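The headline percentage can be reproduced from the two runtimes. The sketch below assumes the speedup is defined as `before / after - 1`, which is consistent with the "39% (0.39x)" figure reported for this PR. The helper names are hypothetical and only serve the illustration.

```python
def speedup_pct(before_ms: float, after_ms: float) -> float:
    """Speedup relative to the new runtime: (before / after - 1) * 100."""
    return (before_ms / after_ms - 1.0) * 100.0


def time_saved_pct(before_ms: float, after_ms: float) -> float:
    """Share of the original runtime that was removed, as a percentage."""
    return (before_ms - after_ms) / before_ms * 100.0


overall = speedup_pct(6.74, 4.86)
saved = time_saved_pct(6.74, 4.86)
high_volume = speedup_pct(3.27, 2.30)  # the 500-assertion test case

print(f"overall: {overall:.1f}% speedup, {saved:.1f}% less runtime")
print(f"500 assertions: {high_volume:.1f}% speedup")
```

Under this definition a 39% speedup corresponds to roughly 28% less wall-clock time, so the two numbers should not be read interchangeably.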
## Key Optimizations
**1. Pre-compiled Regex Patterns (Primary Impact)**
The original code recompiled regex patterns on every method call:
- `_find_hamcrest_assertions`: Compiled `assertThat` pattern per invocation
- `_extract_target_calls`: Compiled function name pattern per invocation (potentially hundreds of times)
- Additional patterns compiled in receiver detection logic
The optimized version moves these to `__init__` as instance attributes (`_hamcrest_pattern`, `_method_pattern`, `_new_class_pattern`, `_ident_pattern`), compiled once per transformer instance. Line profiler shows this eliminated ~1.13ms in `_find_hamcrest_assertions` and ~5.02ms in `_extract_target_calls` - the primary source of the speedup.
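A minimal sketch of the change, not the actual code from `remove_asserts.py`: the class names and the regex below are assumptions chosen for illustration, and only the attribute name `_hamcrest_pattern` comes from the description above.

```python
import re


class PerCallTransformer:
    """Hypothetical 'before' shape: the pattern is built inside the method."""

    def find_assert_that(self, source: str) -> list:
        # re.compile runs on every invocation of this method.
        pattern = re.compile(r"\bassertThat\s*\(")
        return [m.start() for m in pattern.finditer(source)]


class PrecompiledTransformer:
    """Hypothetical 'after' shape: the pattern is built once in __init__."""

    def __init__(self) -> None:
        # Compiled once per instance and reused by every call below.
        self._hamcrest_pattern = re.compile(r"\bassertThat\s*\(")

    def find_assert_that(self, source: str) -> list:
        return [m.start() for m in self._hamcrest_pattern.finditer(source)]


java_source = "assertThat(x, is(1)); foo(); assertThat(y, is(2));"
before = PerCallTransformer().find_assert_that(java_source)
after = PrecompiledTransformer().find_assert_that(java_source)
print(before == after, len(after))
```

Python's `re` module keeps an internal cache of compiled patterns, so a repeated `re.compile` call with the same pattern string is usually a cache lookup rather than a full recompile. Holding the compiled object on the instance removes even that lookup and call overhead from the hot path.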
**2. Reduced String Operations**
- **Receiver extraction**: Replaced `content[receiver_start:method_start].rstrip(".").strip()` with `receiver_text.rstrip().rstrip(".")`. This is still two calls, but they now strip only trailing whitespace and the trailing dot.
- **Conditional extraction**: Added an `if receiver_start < method_start` check to skip string slicing when no receiver exists
- **Single variable assignment**: Introduced `before_dot_content` so the slice is taken once instead of repeatedly
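The two receiver expressions quoted above can be compared side by side. This is a sketch under assumptions: the function names are hypothetical, and returning `None` when there is no receiver is an illustrative choice, not something the PR description confirms.

```python
def receiver_original(content: str, receiver_start: int, method_start: int) -> str:
    # Expression quoted for the original code: strip the trailing dot,
    # then strip whitespace on both sides.
    return content[receiver_start:method_start].rstrip(".").strip()


def receiver_optimized(content: str, receiver_start: int, method_start: int):
    # Guard from the optimized version: skip slicing when there is no receiver.
    if receiver_start >= method_start:
        return None  # assumption: the real code may return something else
    receiver_text = content[receiver_start:method_start]
    # Expression quoted for the optimized code: trailing whitespace first,
    # then the trailing dot.
    return receiver_text.rstrip().rstrip(".")


print(receiver_original("obj.method()", 0, 4))   # slice is "obj."
print(receiver_optimized("obj.method()", 0, 4))  # slice is "obj."
print(receiver_optimized("method()", 0, 0))      # no receiver
```

The two expressions agree on a plain `obj.` slice but are not equivalent in general: the original also removes leading whitespace, while the optimized form leaves it in place. Whether that matters depends on how `receiver_start` is computed in the real code.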
**3. Loop Optimization in `_find_balanced_parens`**
- Pre-computed `code_len = len(code)` outside the loop (avoiding repeated `len()` calls)
- Restructured escape checking: replaced `prev_char = code[pos - 1] if pos > 0 else ""` followed by `prev_char != "\\"` with the direct check `pos == 0 or code[pos - 1] != "\\"`
- Removed the `prev_char` variable altogether, so no temporary string is assigned on each iteration
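The loop changes listed above can be sketched as follows. This is a simplified stand-in, not the real `_find_balanced_parens`: the signature, the return value of `-1` on failure, and the string-literal handling are assumptions made for the example.

```python
def find_balanced_parens(code: str, open_pos: int) -> int:
    """Return the index of the ')' matching the '(' at open_pos, or -1."""
    code_len = len(code)  # computed once, outside the loop
    if open_pos >= code_len or code[open_pos] != "(":
        return -1

    depth = 0
    in_string = False
    pos = open_pos
    while pos < code_len:
        ch = code[pos]
        # Direct escape check, with no intermediate prev_char variable.
        if ch == '"' and (pos == 0 or code[pos - 1] != "\\"):
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    return pos
        pos += 1
    return -1  # ran off the end without closing the opening paren


java_call = 'assertThat(foo(")"), is(1));'
close_pos = find_balanced_parens(java_call, java_call.index("("))
print(java_call[: close_pos + 1])
```

A one-character lookback treats any quote preceded by a backslash as escaped, so in this sketch a literal ending in an escaped backslash (`"\\"`) would be misread. The example only illustrates the loop structure described in the list.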
## Performance Impact by Test Category
**Best improvements** (40-50% faster): Test cases with many regex compilations benefit most - empty sources, non-matching frameworks, and high-volume scenarios (500 assertions: 3.27ms → 2.30ms, 42% faster).
**Moderate improvements** (27-35% faster): Tests that actually parse assertions gain from both the regex and the string-operation changes.
**Consistent gains**: All test cases show improvement, indicating the optimizations benefit both cold starts (regex compilation) and hot paths (repeated parsing).
The optimization is particularly effective for workloads that repeatedly call these methods with the same function names, as the regex patterns are reused across all invocations on the same transformer instance.
⚡️ This pull request contains optimizations for PR #1199

If you approve this dependent PR, these changes will be merged into the original PR branch `omni-java`.

📄 39% (0.39x) speedup for `JavaAssertTransformer._find_hamcrest_assertions` in `codeflash/languages/java/remove_asserts.py`

⏱️ Runtime: 6.74 milliseconds → 4.86 milliseconds (best of 101 runs)
To edit these changes, run `git checkout codeflash/optimize-pr1199-2026-02-04T01.17.36` and push.