Skip to content

Latest commit

 

History

History
91 lines (74 loc) · 2.56 KB

File metadata and controls

91 lines (74 loc) · 2.56 KB

neumann_parser Benchmarks

The parser is a hand-written recursive descent parser with Pratt expression parsing for operator precedence.

Tokenization

Query Type Time Throughput
simple_select 182 ns 99 MiB/s
select_where 640 ns 88 MiB/s
complex_select 986 ns 95 MiB/s
insert 493 ns 120 MiB/s
update 545 ns 91 MiB/s
node 625 ns 98 MiB/s
edge 585 ns 94 MiB/s
path 486 ns 75 MiB/s
embed 407 ns 138 MiB/s
similar 185 ns 118 MiB/s

Parsing (tokenize + parse)

Query Type Time Throughput
simple_select 235 ns 77 MiB/s
select_where 1.19 us 47 MiB/s
complex_select 1.89 us 50 MiB/s
insert 688 ns 86 MiB/s
update 806 ns 61 MiB/s
delete 464 ns 62 MiB/s
create_table 856 ns 80 MiB/s
node 837 ns 81 MiB/s
edge 750 ns 74 MiB/s
neighbors 520 ns 55 MiB/s
path 380 ns 58 MiB/s
embed_store 650 ns 86 MiB/s
similar 290 ns 76 MiB/s

Expression Complexity

Expression Type Time
simple (a = 1) 350 ns
binary_and 580 ns
binary_or 570 ns
nested_and_or 950 ns
deep_nesting 1.5 us
arithmetic 720 ns
comparison_chain 1.3 us

Batch Parsing Throughput

Batch Size Time Queries/s
10 5.2 us 1.9M/s
100 52 us 1.9M/s
1,000 520 us 1.9M/s

Large Query Parsing

Query Type Time
INSERT 100 rows 45 us
EMBED 768-dim vector 38 us
WHERE 20 conditions 8.5 us

Analysis

  • Zero dependencies: Hand-written lexer and parser with no external crates
  • Consistent throughput: ~75-120 MiB/s across query types
  • Expression complexity: Linear scaling with expression depth
  • Batch performance: Consistent 1.9M queries/second regardless of batch size
  • Large vectors: 768-dim embedding parsing in ~38us (20K dimensions/second)

Complexity

Operation Time Complexity Notes
Tokenization O(n) Linear scan of input
Parsing O(n) Single pass, no backtracking
Expression parsing O(n * d) n = tokens, d = nesting depth
Error recovery O(1) Immediate error on invalid syntax

Parser Design

  • Lexer: Character-by-character tokenization with lookahead
  • Parser: Recursive descent with Pratt parsing for expressions
  • AST: Zero-copy where possible, spans track source locations
  • Errors: Rich error messages with span highlighting