
⚡️ Speed up function is_build_output_dir by 141% in PR #1288 (fix/js-test-framework-detection) #1290

Closed
codeflash-ai[bot] wants to merge 1 commit into fix/js-test-framework-detection from codeflash/optimize-pr1288-2026-02-03T06.26.27

@codeflash-ai codeflash-ai bot commented Feb 3, 2026

⚡️ This pull request contains optimizations for PR #1288

If you approve this dependent PR, these changes will be merged into the original PR branch fix/js-test-framework-detection.

This PR will be automatically closed if the original PR is merged.


📄 141% speedup (about 2.4x as fast) for is_build_output_dir in codeflash/setup/detector.py

⏱️ Runtime : 590 microseconds → 244 microseconds (best of 180 runs)

📝 Explanation and details

The optimized code achieves a 141% speedup (from 590μs to 244μs) through three key optimizations:

1. Eliminated repeated set construction
The original code recreated the build_dirs set on every function call. Moving it to a module-level frozenset constant (BUILD_DIRS) eliminates this overhead entirely. The line profiler shows this saved ~160ns per call (5.3% of original runtime).
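A minimal sketch of this first change. The five directory names are an assumption inferred from the generated tests, not copied from detector.py, and the two helper names are hypothetical:

```python
# Assumed contents, inferred from the generated tests (not copied from detector.py).
BUILD_DIRS = frozenset({"build", "dist", "out", ".next", ".nuxt"})


def name_is_build_dir_before(name: str) -> bool:
    # Hypothetical "before" shape: a fresh set is built on every call.
    build_dirs = {"build", "dist", "out", ".next", ".nuxt"}
    return name in build_dirs


def name_is_build_dir_after(name: str) -> bool:
    # Hypothetical "after" shape: the lookup uses the module-level constant,
    # which is built once at import time.
    return name in BUILD_DIRS
```

A frozenset also signals that the collection is not meant to be mutated at runtime, which suits a shared module-level constant.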

2. Replaced expensive string conversion with native Path API
The original used path.as_posix().split("/") which:

  • Converts the Path to a POSIX string representation
  • Allocates a new string
  • Splits it into a list of parts

The optimized version uses path.parts, which returns a tuple of path components directly from pathlib, without building and then splitting an intermediate POSIX string. This alone saved ~1.68μs per call (55.9% of original runtime), making it the single biggest performance gain.
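The two ways of obtaining components can be compared directly. This sketch uses PurePosixPath and PureWindowsPath so the output does not depend on the host OS, which is an illustrative choice; the function under test receives a regular Path.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix_path = PurePosixPath("/home/user/project/build")

# Original approach: render the path as a string, then split it on "/".
split_parts = posix_path.as_posix().split("/")
# Optimized approach: read the components pathlib exposes as a tuple.
tuple_parts = posix_path.parts

print(split_parts)  # ['', 'home', 'user', 'project', 'build']
print(tuple_parts)  # ('/', 'home', 'user', 'project', 'build')

# With Windows path semantics, backslashes are separators, so .parts
# yields the same components that as_posix().split("/") would.
windows_path = PureWindowsPath("project\\build\\output")
print(windows_path.parts)  # ('project', 'build', 'output')

# With POSIX semantics, a backslash is an ordinary character in a name.
print(PurePosixPath("project\\build\\output").parts)  # ('project\\build\\output',)
```

The only difference for an absolute POSIX path is the first element ('' versus '/'). Neither is a build directory name, so a membership check gives the same answer either way.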

3. Early-exit explicit loop vs generator expression
Replacing any(part in BUILD_DIRS for part in parts) with an explicit loop does not add early exit, because any() already short-circuits on the first match. The gain comes from avoiding the creation and resumption of a generator object. The line profiler shows the loop overhead is now distributed across multiple lines but totals less time than the original any() expression.
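A sketch of the two forms side by side, to show that they are behaviorally equivalent and differ only in overhead. The function names are hypothetical and the contents of BUILD_DIRS are assumed from the generated tests:

```python
from pathlib import PurePosixPath

# Assumed contents, inferred from the generated tests.
BUILD_DIRS = frozenset({"build", "dist", "out", ".next", ".nuxt"})


def has_build_part_any(path: PurePosixPath) -> bool:
    # any() stops at the first match, but it drives a generator object.
    return any(part in BUILD_DIRS for part in path.parts)


def has_build_part_loop(path: PurePosixPath) -> bool:
    # Same short-circuit behavior, written as a plain loop with no generator.
    for part in path.parts:
        if part in BUILD_DIRS:
            return True
    return False
```

Both functions return the same result for every input; the choice between them is purely a micro-optimization, which is worth the extra lines only on a hot path.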

Performance characteristics from test results:

  • Shallow paths (2-3 components): ~2-2.5x faster consistently
  • Deep paths without build dirs (100 levels): ~2.5x faster due to avoiding string allocation on every check
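  • Early matches: Up to 8x as fast when a build directory appears first (703% faster, 22.8μs → 2.85μs, for 500 repeated "build" directories). Both versions stop searching at the first match, so the gain most likely comes from skipping the string conversion and split of the whole path before the search starts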
  • Late matches: Still ~2.5x faster, showing the string conversion savings dominate even without early exit

The optimization is particularly effective for codebases with deep directory structures or when scanning many paths, as each path check now uses native tuple operations instead of string manipulation.
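For readers who want to reproduce the comparison locally, here is a rough timing harness. Both function bodies are reconstructions from the description above rather than the code in detector.py, the directory names are assumed from the generated tests, and the helper name compare is hypothetical. Absolute numbers will differ from the figures reported here depending on machine and Python version.

```python
from __future__ import annotations

import timeit
from pathlib import PurePosixPath

# Assumed contents, inferred from the generated tests.
BUILD_DIRS = frozenset({"build", "dist", "out", ".next", ".nuxt"})


def is_build_output_dir_before(path: PurePosixPath) -> bool:
    # Reconstruction of the original as described: new set, string split, any().
    build_dirs = {"build", "dist", "out", ".next", ".nuxt"}
    parts = path.as_posix().split("/")
    return any(part in build_dirs for part in parts)


def is_build_output_dir_after(path: PurePosixPath) -> bool:
    # Reconstruction of the optimized version: constant set, path.parts, loop.
    for part in path.parts:
        if part in BUILD_DIRS:
            return True
    return False


def compare(path: PurePosixPath, repeat: int = 3, number: int = 2000) -> tuple[float, float]:
    """Return the best observed seconds per call for (before, after)."""
    before = min(timeit.repeat(lambda: is_build_output_dir_before(path), repeat=repeat, number=number))
    after = min(timeit.repeat(lambda: is_build_output_dir_after(path), repeat=repeat, number=number))
    return before / number, after / number


if __name__ == "__main__":
    for label, path in [
        ("shallow", PurePosixPath("src/module/build/lib")),
        ("deep, no match", PurePosixPath("/".join(["src"] * 500))),
    ]:
        before, after = compare(path)
        print(f"{label}: {before * 1e6:.2f}μs -> {after * 1e6:.2f}μs")
```

Taking the minimum over several repeats mirrors the "best of N runs" convention used in the report and reduces noise from other processes.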

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 98 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 1 Passed
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
from __future__ import annotations

from pathlib import Path

# imports
import pytest  # used for our unit tests
from codeflash.setup.detector import is_build_output_dir

def test_basic_simple_build_directory():
    # Basic positive case: a path that is exactly the build directory should be detected.
    p = Path("build")
    codeflash_output = is_build_output_dir(p) # 4.71μs -> 2.29μs (105% faster)

def test_basic_nested_build_directory():
    # Basic positive: build directory nested within other directories should be detected.
    p = Path("src/module/build/lib")
    codeflash_output = is_build_output_dir(p) # 5.33μs -> 2.44μs (118% faster)

def test_basic_dist_and_out_variants():
    # Ensure other canonical build directories are recognized.
    codeflash_output = is_build_output_dir(Path("dist")) # 4.60μs -> 2.20μs (109% faster)
    codeflash_output = is_build_output_dir(Path("project/out/bin")) # 3.28μs -> 1.45μs (125% faster)

def test_dot_next_and_dot_nuxt_are_detected():
    # Hidden framework build directories should be detected.
    codeflash_output = is_build_output_dir(Path(".next/static")) # 4.83μs -> 2.39μs (102% faster)
    codeflash_output = is_build_output_dir(Path("site/.nuxt/server")) # 3.15μs -> 1.47μs (114% faster)

def test_negative_similar_names_do_not_match():
    # Ensure substring matches do NOT count: 'builds', 'mybuild', 'build.zip' should not be considered build dirs.
    codeflash_output = is_build_output_dir(Path("builds")) # 4.37μs -> 2.22μs (96.4% faster)
    codeflash_output = is_build_output_dir(Path("mybuild/output")) # 2.81μs -> 1.33μs (111% faster)
    codeflash_output = is_build_output_dir(Path("artifact/build.zip")) # 2.44μs -> 1.04μs (135% faster)

def test_case_sensitivity():
    # The implementation does a direct membership test; it is case-sensitive.
    # 'Build' or 'BUILD' should not match the lowercase 'build' entry.
    codeflash_output = is_build_output_dir(Path("Build")) # 4.15μs -> 2.25μs (84.0% faster)
    codeflash_output = is_build_output_dir(Path("PROJECT/BUILD/bin")) # 2.85μs -> 1.38μs (106% faster)

def test_root_and_current_and_empty_paths():
    # Path('/') splits into ['', ''] and should not match any build dir.
    codeflash_output = is_build_output_dir(Path("/")) # 4.78μs -> 2.28μs (109% faster)
    # Current directory '.' should not be treated as a build dir.
    codeflash_output = is_build_output_dir(Path(".")) # 2.49μs -> 1.12μs (122% faster)
    # An empty-string Path becomes '.' - should be False as well.
    codeflash_output = is_build_output_dir(Path("")) # 1.84μs -> 852ns (116% faster)

def test_parent_traversal_keeps_detection():
    # Paths containing '..' components should still detect a build directory if present.
    p = Path("src/../build/output")
    # The function does not resolve the path, but the 'build' segment is present and should be detected.
    codeflash_output = is_build_output_dir(p) # 5.25μs -> 2.44μs (116% faster)

def test_drive_letter_and_windows_style_paths():
    # Windows-style paths provided as strings should still behave reasonably when converted to POSIX form.
    # Example: 'C:/projects/build/file' contains 'build' as a segment and should be detected.
    p = Path("C:/projects/build/file")
    codeflash_output = is_build_output_dir(p) # 5.21μs -> 2.44μs (114% faster)

def test_redundant_slashes_and_trailing_slash():
    # Multiple slashes or trailing slashes should not prevent detection of build directories.
    p1 = Path("project//build///file")
    p2 = Path("/project/build/")
    codeflash_output = is_build_output_dir(p1) # 5.13μs -> 2.38μs (115% faster)
    codeflash_output = is_build_output_dir(p2) # 3.40μs -> 1.19μs (185% faster)

def test_idempotent_calls_return_same_result():
    # Calling the function multiple times with the same Path yields the same boolean result.
    p = Path("libs/mylib")
    codeflash_output = is_build_output_dir(p); first = codeflash_output # 4.52μs -> 2.35μs (92.0% faster)
    codeflash_output = is_build_output_dir(p); second = codeflash_output # 1.48μs -> 511ns (190% faster)

def test_many_paths_large_scale_sample():
    # Large-scale-ish test: generate a collection of varied paths (well under 1000 elements).
    # We keep the sample size moderate (200) to test scalability and correctness across many inputs.
    paths = []
    expected = []
    for i in range(200):  # well below the 1000-step limit
        if i % 7 == 0:
            # make some entries contain an exact 'build' segment
            p = Path(f"project_{i}/build/bin")
            paths.append(p)
            expected.append(True)
        elif i % 11 == 0:
            # some entries contain '.next'
            p = Path(f"site_{i}/.next/static")
            paths.append(p)
            expected.append(True)
        else:
            # many non-build-like paths, including names that contain build as substring
            p = Path(f"project_{i}/source/building")  # 'building' should NOT match
            paths.append(p)
            expected.append(False)
    # Now assert detection matches expectation for every generated path
    results = [is_build_output_dir(p) for p in paths]
    assert results == expected

def test_edge_case_components_with_punctuation():
    # Parts that include punctuation either side should not accidentally match.
    codeflash_output = is_build_output_dir(Path("pre-build/post")) # 4.82μs -> 2.42μs (99.5% faster)
    codeflash_output = is_build_output_dir(Path("build-")) # 2.27μs -> 1.22μs (85.2% faster)
    codeflash_output = is_build_output_dir(Path("release_build")) # 1.93μs -> 901ns (115% faster)

def test_path_with_multiple_build_parts_one_true_is_enough():
    # If any one part equals a build dir, the result should be True even if other parts are not.
    codeflash_output = is_build_output_dir(Path("build/out/dist")) # 4.88μs -> 2.22μs (119% faster)
    codeflash_output = is_build_output_dir(Path("out/build/something")) # 2.90μs -> 1.16μs (149% faster)

def test_unicode_and_special_characters_in_parts():
    # Non-ascii characters in other parts should not interfere with detection.
    codeflash_output = is_build_output_dir(Path("projéct/αβγ/build")) # 5.92μs -> 2.31μs (156% faster)
    codeflash_output = is_build_output_dir(Path("projéct/αβγ/buïld")) # 3.03μs -> 1.30μs (132% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from pathlib import Path

# imports
import pytest
from codeflash.setup.detector import is_build_output_dir

class TestBasicFunctionality:
    """Test the fundamental behavior of is_build_output_dir with normal inputs."""

    def test_single_build_directory_at_root(self):
        """Test that 'build' directory at root level is detected as build output."""
        path = Path("build")
        codeflash_output = is_build_output_dir(path) # 4.88μs -> 2.21μs (120% faster)

    def test_single_dist_directory_at_root(self):
        """Test that 'dist' directory at root level is detected as build output."""
        path = Path("dist")
        codeflash_output = is_build_output_dir(path) # 4.69μs -> 2.14μs (119% faster)

    def test_single_out_directory_at_root(self):
        """Test that 'out' directory at root level is detected as build output."""
        path = Path("out")
        codeflash_output = is_build_output_dir(path) # 4.46μs -> 2.18μs (104% faster)

    def test_next_js_directory_at_root(self):
        """Test that '.next' directory at root level is detected as build output."""
        path = Path(".next")
        codeflash_output = is_build_output_dir(path) # 4.46μs -> 2.22μs (100% faster)

    def test_nuxt_directory_at_root(self):
        """Test that '.nuxt' directory at root level is detected as build output."""
        path = Path(".nuxt")
        codeflash_output = is_build_output_dir(path) # 4.51μs -> 2.17μs (107% faster)

    def test_build_directory_nested_one_level(self):
        """Test that 'build' directory nested one level deep is detected."""
        path = Path("project/build")
        codeflash_output = is_build_output_dir(path) # 4.92μs -> 2.23μs (120% faster)

    def test_build_directory_nested_multiple_levels(self):
        """Test that 'build' directory nested multiple levels deep is detected."""
        path = Path("src/project/subfolder/build")
        codeflash_output = is_build_output_dir(path) # 5.23μs -> 2.36μs (121% faster)

    def test_build_directory_in_middle_of_path(self):
        """Test that 'build' directory in the middle of a path is detected."""
        path = Path("project/build/output")
        codeflash_output = is_build_output_dir(path) # 4.77μs -> 2.22μs (114% faster)

    def test_dist_directory_nested_multiple_levels(self):
        """Test that 'dist' directory nested multiple levels is detected."""
        path = Path("my/project/structure/dist")
        codeflash_output = is_build_output_dir(path) # 5.09μs -> 2.30μs (121% faster)

    def test_out_directory_with_deep_nesting(self):
        """Test that 'out' directory deep in the hierarchy is detected."""
        path = Path("a/b/c/d/out/e/f")
        codeflash_output = is_build_output_dir(path) # 5.23μs -> 2.39μs (118% faster)

    def test_non_build_directory_single_level(self):
        """Test that non-build directories at root are not detected."""
        path = Path("src")
        codeflash_output = is_build_output_dir(path) # 4.15μs -> 2.13μs (94.3% faster)

    def test_non_build_directory_nested(self):
        """Test that non-build directories nested are not detected."""
        path = Path("project/source/main")
        codeflash_output = is_build_output_dir(path) # 4.64μs -> 2.33μs (98.7% faster)

    def test_non_build_directory_deeply_nested(self):
        """Test that non-build directories deeply nested are not detected."""
        path = Path("a/b/c/d/e/f/source")
        codeflash_output = is_build_output_dir(path) # 4.85μs -> 2.44μs (99.1% faster)

class TestEdgeCases:
    """Test edge cases and unusual conditions."""

    def test_empty_path(self):
        """Test that an empty path (current directory) does not match build directories."""
        path = Path("")
        codeflash_output = is_build_output_dir(path) # 4.15μs -> 1.94μs (113% faster)

    def test_dot_path_current_directory(self):
        """Test that a '.' path (current directory) does not match build directories."""
        path = Path(".")
        codeflash_output = is_build_output_dir(path) # 4.10μs -> 1.97μs (108% faster)

    def test_case_sensitivity_build_uppercase(self):
        """Test that 'BUILD' (uppercase) is not matched - function is case-sensitive."""
        path = Path("BUILD")
        codeflash_output = is_build_output_dir(path) # 4.12μs -> 2.13μs (93.0% faster)

    def test_case_sensitivity_dist_uppercase(self):
        """Test that 'DIST' (uppercase) is not matched - function is case-sensitive."""
        path = Path("DIST")
        codeflash_output = is_build_output_dir(path) # 4.16μs -> 2.10μs (97.6% faster)

    def test_case_sensitivity_build_mixed_case(self):
        """Test that 'Build' (mixed case) is not matched - function is case-sensitive."""
        path = Path("Build")
        codeflash_output = is_build_output_dir(path) # 3.98μs -> 2.08μs (90.8% faster)

    def test_partial_match_build_prefix(self):
        """Test that 'buildx' (starts with build but not exact) is not matched."""
        path = Path("buildx")
        codeflash_output = is_build_output_dir(path) # 4.09μs -> 2.07μs (97.1% faster)

    def test_partial_match_build_suffix(self):
        """Test that 'mybuild' (ends with build but not exact) is not matched."""
        path = Path("mybuild")
        codeflash_output = is_build_output_dir(path) # 4.00μs -> 2.08μs (91.8% faster)

    def test_partial_match_dist_prefix(self):
        """Test that 'distx' (starts with dist but not exact) is not matched."""
        path = Path("distx")
        codeflash_output = is_build_output_dir(path) # 4.08μs -> 2.07μs (96.6% faster)

    def test_partial_match_dist_suffix(self):
        """Test that 'mydist' (ends with dist but not exact) is not matched."""
        path = Path("mydist")
        codeflash_output = is_build_output_dir(path) # 4.07μs -> 1.99μs (104% faster)

    def test_next_missing_dot_prefix(self):
        """Test that 'next' without leading dot is not matched."""
        path = Path("next")
        codeflash_output = is_build_output_dir(path) # 4.12μs -> 2.10μs (95.7% faster)

    def test_nuxt_missing_dot_prefix(self):
        """Test that 'nuxt' without leading dot is not matched."""
        path = Path("nuxt")
        codeflash_output = is_build_output_dir(path) # 4.08μs -> 2.10μs (93.9% faster)

    def test_build_with_trailing_slash(self):
        """Test that 'build/' with trailing slash is detected correctly."""
        # Path normalizes trailing slashes
        path = Path("build/")
        codeflash_output = is_build_output_dir(path) # 4.49μs -> 2.05μs (119% faster)

    def test_build_with_multiple_trailing_slashes(self):
        """Test that 'build//' is handled correctly."""
        path = Path("build//")
        codeflash_output = is_build_output_dir(path) # 4.50μs -> 2.06μs (118% faster)

    def test_windows_style_path_with_backslashes(self):
        """Test that Windows-style paths with backslashes are handled consistently."""
        # On Windows, backslashes are separators, so 'build' is its own component.
        # On POSIX, the whole string is a single component and does not match.
        path = Path("project\\build\\output")
        codeflash_output = is_build_output_dir(path) # 4.04μs -> 2.06μs (95.6% faster)

    def test_path_with_dots_in_directory_name(self):
        """Test that directories with dots in their names (not .next or .nuxt) are handled."""
        path = Path("my.project/src")
        codeflash_output = is_build_output_dir(path) # 4.49μs -> 2.09μs (114% faster)

    def test_next_js_with_nested_path(self):
        """Test that .next/server/pages is detected as build output."""
        path = Path(".next/server/pages")
        codeflash_output = is_build_output_dir(path) # 5.00μs -> 2.17μs (130% faster)

    def test_nuxt_with_nested_path(self):
        """Test that .nuxt/dist/app is detected as build output."""
        path = Path(".nuxt/dist/app")
        codeflash_output = is_build_output_dir(path) # 4.82μs -> 2.19μs (120% faster)

    def test_path_similar_to_next(self):
        """Test that '.nextjs' (similar to .next) is not matched."""
        path = Path(".nextjs")
        codeflash_output = is_build_output_dir(path) # 4.15μs -> 2.08μs (99.0% faster)

    def test_path_similar_to_nuxt(self):
        """Test that '.nuxtjs' (similar to .nuxt) is not matched."""
        path = Path(".nuxtjs")
        codeflash_output = is_build_output_dir(path) # 4.03μs -> 2.02μs (99.1% faster)

    def test_out_as_directory_name(self):
        """Test that 'out' is detected as a build directory."""
        path = Path("out")
        codeflash_output = is_build_output_dir(path) # 4.38μs -> 2.09μs (109% faster)

    def test_output_not_matching_out(self):
        """Test that 'output' (not exact 'out') is not matched."""
        path = Path("output")
        codeflash_output = is_build_output_dir(path) # 4.10μs -> 2.11μs (93.9% faster)

    def test_build_as_file_extension(self):
        """Test that 'something.build' is not matched - only directory names."""
        path = Path("project/file.build")
        codeflash_output = is_build_output_dir(path) # 4.55μs -> 2.17μs (109% faster)

    def test_dist_as_file_extension(self):
        """Test that 'something.dist' is not matched - only directory names."""
        path = Path("project/archive.dist")
        codeflash_output = is_build_output_dir(path) # 4.46μs -> 2.22μs (100% faster)

    def test_absolute_path_with_build_directory(self):
        """Test that absolute paths with build directories are detected."""
        path = Path("/home/user/project/build")
        codeflash_output = is_build_output_dir(path) # 5.81μs -> 2.45μs (137% faster)

    def test_absolute_path_without_build_directory(self):
        """Test that absolute paths without build directories are not detected."""
        path = Path("/home/user/project/src")
        codeflash_output = is_build_output_dir(path) # 5.20μs -> 2.22μs (134% faster)

    def test_single_slash_path(self):
        """Test that a single slash path (root directory) does not match build directories."""
        path = Path("/")
        codeflash_output = is_build_output_dir(path) # 4.72μs -> 2.08μs (126% faster)

    def test_path_with_numbers_similar_to_build_dirs(self):
        """Test that 'build1' or 'dist2' are not matched."""
        path = Path("build1")
        codeflash_output = is_build_output_dir(path) # 4.24μs -> 2.08μs (103% faster)

    def test_path_with_numbers_similar_to_build_dirs_dist(self):
        """Test that 'dist2' is not matched."""
        path = Path("dist2")
        codeflash_output = is_build_output_dir(path) # 4.17μs -> 2.09μs (99.0% faster)

    def test_hyphenated_directory_similar_to_build(self):
        """Test that 'build-output' is not matched."""
        path = Path("build-output")
        codeflash_output = is_build_output_dir(path) # 4.12μs -> 2.13μs (93.0% faster)

    def test_hyphenated_directory_similar_to_dist(self):
        """Test that 'dist-files' is not matched."""
        path = Path("dist-files")
        codeflash_output = is_build_output_dir(path) # 3.98μs -> 1.98μs (101% faster)

    def test_underscore_prefix_next(self):
        """Test that '_next' is not matched - needs leading dot."""
        path = Path("_next")
        codeflash_output = is_build_output_dir(path) # 4.10μs -> 2.04μs (100% faster)

    def test_underscore_prefix_nuxt(self):
        """Test that '_nuxt' is not matched - needs leading dot."""
        path = Path("_nuxt")
        codeflash_output = is_build_output_dir(path) # 4.09μs -> 2.00μs (104% faster)

    def test_build_directory_appears_twice_in_path(self):
        """Test that 'build/build' has build detected (first occurrence)."""
        path = Path("build/build")
        codeflash_output = is_build_output_dir(path) # 4.81μs -> 2.11μs (127% faster)

    def test_multiple_different_build_dirs_in_path(self):
        """Test that 'dist/build' is detected (contains build directory)."""
        path = Path("dist/build")
        codeflash_output = is_build_output_dir(path) # 4.73μs -> 2.06μs (129% faster)

    def test_multiple_different_build_dirs_in_path_reverse(self):
        """Test that 'build/dist' is detected (contains both build directories)."""
        path = Path("build/dist")
        codeflash_output = is_build_output_dir(path) # 4.65μs -> 2.05μs (126% faster)

    def test_path_with_space_in_directory_name(self):
        """Test that spaces in directory names are handled correctly."""
        path = Path("my project/src")
        codeflash_output = is_build_output_dir(path) # 4.67μs -> 2.19μs (113% faster)

    def test_build_directory_with_space_prefix(self):
        """Test that ' build' (space prefix) is not matched."""
        path = Path(" build")
        codeflash_output = is_build_output_dir(path) # 4.13μs -> 2.05μs (101% faster)

    def test_build_directory_with_space_suffix(self):
        """Test that 'build ' (space suffix) is not matched."""
        path = Path("build ")
        codeflash_output = is_build_output_dir(path) # 4.02μs -> 2.00μs (101% faster)

    def test_path_with_unicode_characters(self):
        """Test that paths with unicode characters are handled correctly."""
        path = Path("projet/src")  # projet with accent would be unicode
        codeflash_output = is_build_output_dir(path) # 4.41μs -> 2.17μs (103% faster)

class TestLargeScale:
    """Test the function's performance and scalability with large inputs."""

    def test_very_deeply_nested_path_without_build_dirs(self):
        """Test performance with a very deep path hierarchy without build directories."""
        # Create a path with 100 nested levels
        parts = ["level" + str(i) for i in range(100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.9μs -> 5.82μs (155% faster)

    def test_very_deeply_nested_path_with_build_at_end(self):
        """Test performance with a very deep path with build directory at the end."""
        # Create a path with 100 nested levels, build at the end
        parts = ["level" + str(i) for i in range(99)] + ["build"]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.4μs -> 5.54μs (161% faster)

    def test_very_deeply_nested_path_with_build_at_start(self):
        """Test performance with a very deep path with build directory at the start."""
        # Create a path with build at the start and 100 nested levels after
        parts = ["build"] + ["level" + str(i) for i in range(99)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 9.26μs -> 2.56μs (261% faster)

    def test_very_deeply_nested_path_with_build_in_middle(self):
        """Test performance with a very deep path with build directory in the middle."""
        # Create a path with build in the middle of 100 nested levels
        parts = ["level" + str(i) for i in range(50)] + ["build"] + ["level" + str(i) for i in range(50, 100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 12.2μs -> 4.22μs (190% faster)

    def test_very_deeply_nested_path_with_dist_at_end(self):
        """Test performance with a very deep path with dist directory at the end."""
        # Create a path with 100 nested levels, dist at the end
        parts = ["folder" + str(i) for i in range(99)] + ["dist"]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.7μs -> 5.43μs (172% faster)

    def test_very_deeply_nested_path_with_out_in_middle(self):
        """Test performance with a very deep path with out directory in the middle."""
        # Create a path with out in the middle of 100 nested levels
        parts = ["dir" + str(i) for i in range(50)] + ["out"] + ["dir" + str(i) for i in range(50, 100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 12.3μs -> 4.27μs (187% faster)

    def test_very_deeply_nested_path_with_next_at_end(self):
        """Test performance with a very deep path with .next directory at the end."""
        # Create a path with 100 nested levels, .next at the end
        parts = ["folder" + str(i) for i in range(99)] + [".next"]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.8μs -> 5.57μs (165% faster)

    def test_very_deeply_nested_path_with_nuxt_in_middle(self):
        """Test performance with a very deep path with .nuxt directory in the middle."""
        # Create a path with .nuxt in the middle of 100 nested levels
        parts = ["component" + str(i) for i in range(50)] + [".nuxt"] + ["component" + str(i) for i in range(50, 100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 12.4μs -> 4.04μs (208% faster)

    def test_all_build_directories_in_single_path(self):
        """Test a path containing all build directory types sequentially."""
        # Path: "build/dist/out/.next/.nuxt"
        path = Path("build/dist/out/.next/.nuxt")
        codeflash_output = is_build_output_dir(path) # 5.03μs -> 2.13μs (136% faster)

    def test_many_similar_but_non_matching_directories(self):
        """Test performance with many directories similar to build directories."""
        # Create a path with many similar-but-not-matching directory names
        parts = ["buildx", "dista", "outb", "nextx", "nuxtb"] * 20  # 100 directories total
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 13.6μs -> 4.72μs (188% faster)

    def test_mix_of_matching_and_non_matching_directories_first_match_early(self):
        """Test path with matching directory appearing early among many directories."""
        # Create a path with build as 5th directory among many non-matching directories
        parts = ["src", "lib", "util", "helper", "build"] + ["folder" + str(i) for i in range(95)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 10.1μs -> 2.88μs (250% faster)

    def test_mix_of_matching_and_non_matching_directories_first_match_late(self):
        """Test path with matching directory appearing late among many directories."""
        # Create a path with dist as 96th directory among many non-matching directories
        parts = ["folder" + str(i) for i in range(95)] + ["dist"] + ["file"]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.3μs -> 5.57μs (157% faster)

    def test_large_path_with_long_directory_names(self):
        """Test performance with very long directory names."""
        # Create paths with very long directory names
        long_name = "a" * 1000
        path = Path(long_name + "/src/lib")
        codeflash_output = is_build_output_dir(path) # 5.65μs -> 2.31μs (144% faster)

    def test_large_path_with_long_directory_names_containing_build(self):
        """Test performance with very long directory names that contain build keyword."""
        # Create a path with 'build' as a very long directory name
        long_name = "build" + "x" * 1000
        path = Path(long_name)
        codeflash_output = is_build_output_dir(path) # 4.57μs -> 2.14μs (113% faster)

    def test_special_characters_in_deeply_nested_path(self):
        """Test performance with special characters in a deeply nested path."""
        # Create a path with special characters in directory names
        parts = ["dir-" + str(i) for i in range(100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 14.3μs -> 5.78μs (148% faster)

    def test_special_characters_with_build_directory(self):
        """Test performance with special characters and build directory."""
        # Create a path with special characters and a build directory
        parts = ["dir-" + str(i) for i in range(50)] + ["build"] + ["dir_" + str(i) for i in range(50, 100)]
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 12.1μs -> 4.19μs (189% faster)

    def test_repeated_non_build_directories_very_deep(self):
        """Test performance with a very long path of repeated non-build directories."""
        # Create a path with 500 repeated 'src' directories
        parts = ["src"] * 500
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 43.5μs -> 13.3μs (228% faster)

    def test_repeated_build_directory_very_deep(self):
        """Test performance with a very long path of repeated build directories."""
        # Create a path with 500 repeated 'build' directories - early exit optimization
        parts = ["build"] * 500
        path = Path("/".join(parts))
        codeflash_output = is_build_output_dir(path) # 22.8μs -> 2.85μs (703% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
🔎 Click to see Concolic Coverage Tests
from codeflash.setup.detector import is_build_output_dir
from pathlib import Path

def test_is_build_output_dir():
    is_build_output_dir(Path())

To edit these changes git checkout codeflash/optimize-pr1288-2026-02-03T06.26.27 and push.

codeflash-ai bot added the labels "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) on Feb 3, 2026
KRRT7 closed this on Feb 3, 2026