Skip to content

Comments

Initial winnow implementation#871

Open
lu-zero wants to merge 3 commits intoreubeno:mainfrom
lu-zero:winnow
Open

Initial winnow implementation#871
lu-zero wants to merge 3 commits intoreubeno:mainfrom
lu-zero:winnow

Conversation

@lu-zero
Copy link
Contributor

@lu-zero lu-zero commented Dec 25, 2025

It has two implementations, one still using the tokenizer the other directly parsing a str input.

Most of the code had been machine-ported from the peg implementation and then refined so it is a bit faster than the original, there is still potentially more to refactor but we can start reasoning about it.

@github-actions
Copy link

github-actions bot commented Dec 25, 2025

Performance Benchmark Report

Benchmark name Baseline (μs) Test/PR (μs) Delta (μs) Delta %
clone_shell_object 18.07 μs 17.70 μs -0.37 μs 🟢 -2.04%
eval_arithmetic 0.13 μs 0.13 μs -0.00 μs ⚪ Unchanged
expand_one_string 1.60 μs 1.57 μs -0.02 μs ⚪ Unchanged
for_loop 22.47 μs 22.44 μs -0.03 μs ⚪ Unchanged
full_peg_complex 55.92 μs 56.05 μs 0.13 μs ⚪ Unchanged
full_peg_for_loop 6.37 μs 6.47 μs 0.10 μs 🟠 +1.55%
full_peg_nested_expansions 16.08 μs 16.24 μs 0.16 μs ⚪ Unchanged
full_peg_pipeline 4.25 μs 4.25 μs 0.00 μs ⚪ Unchanged
full_peg_simple 1.77 μs 1.76 μs -0.02 μs 🟢 -0.96%
function_call 2.37 μs 2.30 μs -0.08 μs ⚪ Unchanged
instantiate_shell 52.67 μs 52.95 μs 0.28 μs ⚪ Unchanged
instantiate_shell_with_init_scripts 24469.83 μs 24163.61 μs -306.22 μs ⚪ Unchanged
parse_peg_bash_completion 2188.82 μs 2231.00 μs 42.18 μs 🟠 +1.93%
parse_peg_complex 18.27 μs 18.79 μs 0.52 μs 🟠 +2.87%
parse_peg_for_loop 1.89 μs 1.89 μs -0.00 μs ⚪ Unchanged
parse_peg_pipeline 1.98 μs 1.99 μs 0.01 μs ⚪ Unchanged
parse_peg_simple 1.06 μs 1.06 μs -0.00 μs ⚪ Unchanged
run_echo_builtin_command 11.71 μs 11.33 μs -0.38 μs 🟢 -3.26%
tokenize_sample_script 3.47 μs 3.48 μs 0.00 μs ⚪ Unchanged

Code Coverage Report: Only Changed Files listed

Package Base Coverage New Coverage Difference
Overall Coverage 🟢 74.56% 🟢 74.56% ⚪ 0%

Minimum allowed coverage is 70%, this run produced 74.56%

Test Summary: bash-completion test suite

Outcome Count Percentage
✅ Pass 1559 73.92
❗️ Error 18 0.85
❌ Fail 178 8.44
⏩ Skip 339 16.07
❎ Expected Fail 13 0.62
✔️ Unexpected Pass 2 0.09
📊 Total 2109 100.00

@lu-zero lu-zero force-pushed the winnow branch 8 times, most recently from 95731cd to 39dbcaa Compare December 27, 2025 07:15
@reubeno
Copy link
Owner

reubeno commented Dec 27, 2025

Thanks for cleaning up the changes. I still need to actually read through the winnow-based parsers, but what do you think about arranging a runtime switch to select the parser we want to use? If you're open to it, I'm happy to help with some of the glue code for this too.

I'm thinking we could extract a Parser trait (or similar name) that must implement a fn that takes text as input and yields an AST. We could implement this for the PEG-based impl as well as the winnow-based impl(s). An option on Shell could select the desired impl, with default going to the (legacy) PEG one.

We could then arrange to run all of our YAML-based integration tests against all parsers without needing to recompile brush, and present results. This would give us a way to track functional parity and help with side-by-side comparison. (And ditto for the benchmarks you've set up.) I'd want to update the PR checks to run tests against multiple parsers, but only block if we fail tests on the PEG one. This would give us a clear way to track progress and evaluate when we'd be ready to switch over the default.

@lu-zero
Copy link
Contributor Author

lu-zero commented Dec 28, 2025

Thanks for cleaning up the changes. I still need to actually read through the winnow-based parsers, but what do you think about arranging a runtime switch to select the parser we want to use? If you're open to it, I'm happy to help with some of the glue code for this too.

Sounds feasible and shouldn't be too hard to achieve given what we currently have. Just we could clean up a little the options.

I'm thinking we could extract a Parser trait (or similar name) that must implement a fn that takes text as input and yields an AST. We could implement this for the PEG-based impl as well as the winnow-based impl(s). An option on Shell could select the desired impl, with default going to the (legacy) PEG one.

Yes, I have the two winnow impls because I gradually did the translation and benchmarked and cleaned up till I was confident winnow can be faster and overall nicer than the status quo. If we can ditch the tokenization from the public API we should spare quite a bit of allocations and we can focus on cleaning up even more the winnow_str impl.

We could then arrange to run all of our YAML-based integration tests against all parsers without needing to recompile brush, and present results. This would give us a way to track functional parity and help with side-by-side comparison. (And ditto for the benchmarks you've set up.) I'd want to update the PR checks to run tests against multiple parsers, but only block if we fail tests on the PEG one. This would give us a clear way to track progress and evaluate when we'd be ready to switch over the default.

Yes, we might want to do a bit of refactor so that we can make the location a no-op for the tests since that part will have discrepancies.

@lu-zero lu-zero marked this pull request as ready for review January 11, 2026 12:32
@reubeno reubeno added the area: parsing Issues or PRs related to tokenization or parsing label Jan 17, 2026
@lu-zero lu-zero force-pushed the winnow branch 5 times, most recently from 57f48f3 to 1e54749 Compare January 21, 2026 20:54
@lu-zero
Copy link
Contributor Author

lu-zero commented Jan 23, 2026

@reubeno this weekend we could try to do a first review pass to make sure all the not too beautiful bits are needed for performance, but it seems that we are getting closer to an acceptable version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: parsing Issues or PRs related to tokenization or parsing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants