Left recursion new #335

Merged
yhirose merged 10 commits into master from left-recursion-new
Mar 7, 2026

yhirose commented Mar 7, 2026

No description provided.

yhirose and others added 10 commits March 6, 2026 23:33
The char8_t overloads were passing *path instead of path, which would
dereference a null pointer when path=nullptr (the default).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test file and grammar definitions for left recursion support.
Tests are currently non-compiling (parser constructor changes needed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Definition::is_left_recursive flag
- Add enable_left_recursion parameter (default: true) to parser/ParserGenerator
- When enabled, mark LR rules instead of reporting errors
- Fix operator bool() const for parser class
- Update test2.cc existing LR tests to explicitly disable LR support
- Add test_left_recursive.cc to CMakeLists

Note: LR grammars are now accepted but parsing them will hang until
Phase 2 (seed growing) is implemented.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add seed growing algorithm for left-recursive rules:
- LRMemo in Context for mutable memoization during growth
- Cycle detection via lr_refs_hit with self-insertion for transitive cycles
- lr_active_seeds to protect outer growers from inner growers' memo erasure
- Shared do_parse lambda eliminates duplication between LR and non-LR paths
- Extract write_packrat_cache/clear_packrat_cache helpers from Context

Handles direct, indirect, mutual, and cascading left recursion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
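
The seed growing idea named in this commit can be illustrated with a minimal standalone sketch. It is not the code from this PR: `SeedGrowingParser`, `Match`, and `evaluate` are hypothetical names, the grammar is hard-coded as `Expr <- Expr '-' Num / Num`, and only direct left recursion is handled. The memo entry for `Expr` starts as a failure, so the recursive call fails at first and the `Num` alternative produces the initial seed. The body is then re-evaluated for as long as the match gets longer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <string_view>

// Result of a successful match: end position and computed value.
struct Match {
  std::size_t end;
  long value;
};

// Hypothetical parser for: Expr <- Expr '-' Num / Num ; Num <- [0-9]+
struct SeedGrowingParser {
  std::string_view input;
  // Memo for the left-recursive rule Expr, keyed by start position.
  // std::nullopt means "currently known to fail here" (the initial seed).
  std::map<std::size_t, std::optional<Match>> expr_memo;

  // Num <- [0-9]+
  std::optional<Match> num(std::size_t pos) const {
    std::size_t i = pos;
    long v = 0;
    while (i < input.size() &&
           std::isdigit(static_cast<unsigned char>(input[i]))) {
      v = v * 10 + (input[i] - '0');
      ++i;
    }
    if (i == pos) return std::nullopt;
    return Match{i, v};
  }

  // One evaluation of the rule body: Expr '-' Num / Num
  std::optional<Match> expr_body(std::size_t pos) {
    if (auto lhs = expr(pos)) {  // left-recursive call, answered from the memo
      if (lhs->end < input.size() && input[lhs->end] == '-') {
        if (auto rhs = num(lhs->end + 1)) {
          return Match{rhs->end, lhs->value - rhs->value};
        }
      }
    }
    return num(pos);
  }

  std::optional<Match> expr(std::size_t pos) {
    auto it = expr_memo.find(pos);
    if (it != expr_memo.end()) return it->second;  // re-entry: current seed

    expr_memo[pos] = std::nullopt;  // plant a failing seed
    while (true) {
      auto result = expr_body(pos);
      auto& seed = expr_memo[pos];
      bool grew = result && (!seed || result->end > seed->end);
      if (!grew) break;  // no longer match: stop growing
      seed = result;     // keep the longer match and try again
    }
    return expr_memo[pos];
  }
};

// Parses the whole input; returns the value, or nullopt on failure.
std::optional<long> evaluate(std::string_view text) {
  SeedGrowingParser p{text, {}};
  auto m = p.expr(0);
  if (!m || m->end != text.size()) return std::nullopt;
  return m->value;
}
```

With this sketch, `evaluate("1-2-3")` grows the seed from `1` to `1-2` to `1-2-3`, so the value is built left to right. The commit additionally covers indirect, mutual, and cascading recursion, which this sketch does not attempt.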
- Add ComputeCanBeEmpty visitor with fixed-point iteration to determine
  which rules can match the empty string
- Fix DetectLeftRecursion::visit(Reference) to use can_be_empty: when a
  referenced rule can match empty, don't mark the sequence position as
  done, allowing detection of hidden LR through nullable prefixes
- Add Definition::can_be_empty flag

Fixes: A <- B C A / 'a' with B <- C?, C <- 'c'? was not detected as
left-recursive when C was visited twice (once through B, once directly),
causing infinite recursion at parse time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
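
A fixed-point nullability computation of the kind described above can be sketched as follows. This is a standalone illustration, not the `ComputeCanBeEmpty` visitor itself: the `Expr` tree, the helper constructors, and `compute_can_be_empty` are hypothetical stand-ins for peglib's internal types. Every rule starts as "not nullable", and the grammar is rescanned until no rule changes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A tiny PEG expression tree (hypothetical, not peglib's internal types).
struct Expr {
  enum Kind { Literal, Reference, Sequence, Choice, Optional } kind;
  std::string text;            // literal text or referenced rule name
  std::vector<Expr> children;  // operands of Sequence / Choice / Optional
};

Expr make_lit(std::string s) { return {Expr::Literal, std::move(s), {}}; }
Expr make_ref(std::string s) { return {Expr::Reference, std::move(s), {}}; }
Expr make_seq(std::vector<Expr> c) { return {Expr::Sequence, "", std::move(c)}; }
Expr make_alt(std::vector<Expr> c) { return {Expr::Choice, "", std::move(c)}; }
Expr make_opt(Expr e) {
  return {Expr::Optional, "", std::vector<Expr>{std::move(e)}};
}

using Grammar = std::map<std::string, Expr>;
using Nullable = std::map<std::string, bool>;

// Can `e` match the empty string, given what is known about rules so far?
bool can_be_empty(const Expr& e, const Nullable& known) {
  switch (e.kind) {
    case Expr::Literal:
      return e.text.empty();
    case Expr::Reference: {
      auto it = known.find(e.text);
      return it != known.end() && it->second;
    }
    case Expr::Optional:
      return true;
    case Expr::Sequence:  // nullable only if every element is nullable
      for (const auto& c : e.children)
        if (!can_be_empty(c, known)) return false;
      return true;
    case Expr::Choice:  // nullable if any alternative is nullable
      for (const auto& c : e.children)
        if (can_be_empty(c, known)) return true;
      return false;
  }
  return false;
}

// Iterate until no rule flips from "not nullable" to "nullable".
Nullable compute_can_be_empty(const Grammar& g) {
  Nullable known;
  for (const auto& entry : g) known[entry.first] = false;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto& [name, body] : g) {
      if (!known[name] && can_be_empty(body, known)) {
        known[name] = true;
        changed = true;
      }
    }
  }
  return known;
}

// The grammar from the commit message: A <- B C A / 'a', B <- C?, C <- 'c'?
Grammar example_grammar() {
  Grammar g;
  g["A"] = make_alt({make_seq({make_ref("B"), make_ref("C"), make_ref("A")}),
                     make_lit("a")});
  g["B"] = make_opt(make_ref("C"));
  g["C"] = make_opt(make_lit("c"));
  return g;
}
```

For the example grammar, `B` and `C` come out nullable and `A` does not. Since both `B` and `C` can match empty, the `A` in `B C A` can be reached without consuming input, which is the hidden left recursion the commit describes.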
- Fix typo: LeftRecursionText -> LeftRecursionTest
- Enable 12 epsilon-hidden LR tests (previously DISABLED), with corrected
  expectations (EXPECT_TRUE for parser compilation with LR enabled)
- Consolidate 20 DISABLED tests down to 12 (remove redundant variants)
- Consolidate 3 packrat tests into 1 (PackratWithLeftRecursion)
- Delete Phase1_DefIdMapping (tested nothing LR-specific)
- Remove 4 non-LR regression tests redundant with existing test suite
  (JSON, CDeclarations, DeclarationSyntax, PythonExpressions)
- Fix indirect LR TODO: parser.parse("dbacba") now works
- Add LeftAssociativity test (1-2-3=-4, 8/4/2=1)
- Extract setup_arithmetic_actions helper to reduce duplication
- Keep 2 non-LR regression tests (right-recursive expr, TypeScript types)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
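
The expected values in the LeftAssociativity test follow from grouping operands from the left. The sketch below is independent of peglib and only checks the arithmetic: `fold_left` and `fold_right` are hypothetical helpers that apply a binary operator with left and right grouping respectively, and both assume a non-empty operand list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// ((a op b) op c) ...: the grouping a left-recursive rule produces.
template <class Op>
long fold_left(const std::vector<long>& xs, Op op) {
  long acc = xs.front();
  for (std::size_t i = 1; i < xs.size(); ++i) acc = op(acc, xs[i]);
  return acc;
}

// (a op (b op c)) ...: the grouping a right-recursive rule produces.
template <class Op>
long fold_right(const std::vector<long>& xs, Op op) {
  long acc = xs.back();
  for (std::size_t i = xs.size() - 1; i-- > 0;) acc = op(xs[i], acc);
  return acc;
}
```

Left grouping gives `(1-2)-3 = -4` and `(8/4)/2 = 1`, matching the test. Right grouping of the same operands would give `1-(2-3) = 2` and `8/(4/2) = 4`.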
- DetectLeftRecursion now traverses macro parameter references via
  macro_args_stack_, detecting LR through parameterized rules (e.g.
  Expr <- Apply(Expr, '+', Number) / Number). Nested macros are
  resolved by walking up the args stack with resolve_macro_arg().

- Runtime re-entry protection prevents stack overflow from any
  undetected left recursion:
  - Packrat mode: pre-register cache entry as failure before parsing,
    so re-entry at the same position returns failure (zero overhead).
  - Non-packrat mode: use lr_memo as temporary re-entry guard.

- Update README.md with left recursion documentation and feature list.
- Update benchmark/README.md with latest results (3.6x YACC ratio).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
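
The packrat-mode re-entry guard described above can be sketched in isolation. This is an assumption-laden illustration rather than the PR's code: `GuardedParser`, `rule_a`, and `match_a` are hypothetical, and the rule `A <- A 'x' / 'y'` is hard-coded. The cache entry is registered as a failure before the body runs, so a recursive call at the same position returns failure immediately instead of recursing without bound.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string_view>

struct GuardedParser {
  std::string_view input;
  // Packrat cache for rule A: start position -> end position.
  // std::nullopt records a failure.
  std::map<std::size_t, std::optional<std::size_t>> cache;
  int body_evaluations = 0;

  // A <- A 'x' / 'y'   (left recursion that was NOT detected up front)
  std::optional<std::size_t> rule_a(std::size_t pos) {
    auto it = cache.find(pos);
    if (it != cache.end()) return it->second;  // includes the guard entry

    cache[pos] = std::nullopt;  // pre-register failure before parsing
    ++body_evaluations;

    std::optional<std::size_t> result;
    if (auto lhs = rule_a(pos);
        lhs && *lhs < input.size() && input[*lhs] == 'x') {
      result = *lhs + 1;  // first alternative: A 'x'
    } else if (pos < input.size() && input[pos] == 'y') {
      result = pos + 1;   // second alternative: 'y'
    }
    cache[pos] = result;  // overwrite the guard with the real result
    return result;
  }
};

// Returns the end position of the match starting at 0, if any.
std::optional<std::size_t> match_a(std::string_view text) {
  GuardedParser p{text, {}, 0};
  return p.rule_a(0);
}
```

In this sketch the guard only prevents runaway recursion. It does not grow the match, so on input `yx` the rule matches just `y`. Growing the match is the job of the seed growing path for rules that were detected as left-recursive.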
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The `parser("...", {}, false)` calls in left recursion tests were
ambiguous on MSVC because `{}` could match Rules, string_view, or
size_t. Switch to explicit `load_grammar()` with `std::string_view{}`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the enable_left_recursion parameter on constructors and
load_grammar with an enable_left_recursion() method, matching the
existing enable_packrat_parsing() pattern. This eliminates MSVC
overload ambiguity and simplifies the API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
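
The design choice in this last commit, a toggle method instead of a positional constructor argument, can be shown with a mock class. `MiniParser` below is hypothetical and is not peglib's `parser`. In particular, the `bool enabled = true` parameter on `enable_left_recursion` is an assumption, since the PR text does not show the real signature.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for a parser class; not peglib's actual API.
class MiniParser {
public:
  explicit MiniParser(std::string_view grammar) : grammar_(grammar) {}

  // Options are set by name after construction, so no call site has to
  // pass placeholder arguments such as `{}` just to reach a trailing bool.
  void enable_packrat_parsing() { packrat_ = true; }
  void enable_left_recursion(bool enabled = true) {  // assumed signature
    left_recursion_ = enabled;
  }

  bool packrat_enabled() const { return packrat_; }
  bool left_recursion_enabled() const { return left_recursion_; }

private:
  std::string grammar_;
  bool packrat_ = false;
  bool left_recursion_ = true;  // on by default, as stated in the PR
};

// Builds a parser that rejects left-recursive grammars, as the older
// tests in test2.cc are described as doing.
MiniParser make_strict_parser(std::string_view grammar) {
  MiniParser p(grammar);
  p.enable_left_recursion(false);
  return p;
}
```

Because each option has its own named method, there is a single constructor overload to resolve, which is the property the commit relies on to avoid the MSVC ambiguity.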
yhirose merged commit 9a81a9c into master Mar 7, 2026
10 checks passed
yhirose deleted the left-recursion-new branch March 11, 2026 00:35