Support for implicit whitespace handling? #22
I couldn't find this mentioned in the documentation, and the only previous discussion on this that I've found was in issue #3, so asking here...
In my experience, writing Citrus grammars is very productive, except for one thing: whitespace handling. Transcribing various standard *BNF grammars (SQL, SPARQL, etc.) into Citrus form is presently more painful than it could be, given that every terminal needs an explicit `space*` appended to it, as these grammars all assume that the input has already been tokenized.
To keep the production rule definitions sane, as well as to keep them consistent with those in the standard grammar being transcribed, this incentivizes workarounds like the following:
grammar Keywords
  rule space [ \t\n\r] end
  rule ALL `ALL` space* end
  rule AS `AS` space* end
  rule BY `BY` space* end
  rule DISTINCT `DISTINCT` space* end
  rule FROM `FROM` space* end
  rule GROUP `GROUP` space* end
  rule HAVING `HAVING` space* end
  rule JOIN `JOIN` space* end
  rule SELECT `SELECT` space* end
  rule UNION `UNION` space* end
  rule WHERE `WHERE` space* end
  ...
end

grammar Tokens
  rule space [ \t\n\r] end
  rule digit [0-9] space* end
  rule double_quote '"' space* end
  rule percent '%' space* end
  rule ampersand '&' space* end
  rule left_paren '(' space* end
  rule right_paren ')' space* end
  rule asterisk '*' space* end
  rule plus_sign '+' space* end
  ...
end

grammar MyGrammar
  include Keywords
  include Tokens

  rule query_specification
    SELECT set_quantifier? select_list table_expression
  end

  rule set_quantifier
    DISTINCT | ALL
  end
  ...
end
The above approach works fine, of course, but seems rather redundant and not a little laborious.
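To give a sense of how mechanical the repetition is, the keyword rules can be generated with a few lines of plain Ruby. This is only a sketch: `keyword_grammar_source` is a hypothetical helper of my own, not part of Citrus, and it merely builds the grammar source as a string (which could then be written out to a grammar file).

```ruby
# Hypothetical helper (not part of Citrus): builds the source text of a
# grammar in which each rule matches one keyword plus trailing whitespace.
def keyword_grammar_source(name, keywords)
  lines = ["grammar #{name}", "  rule space [ \\t\\n\\r] end"]
  keywords.each do |kw|
    # Same shape as the hand-written rules: rule NAME `NAME` space* end
    lines << "  rule #{kw} `#{kw}` space* end"
  end
  lines << "end"
  lines.join("\n") + "\n"
end

source = keyword_grammar_source("Keywords", %w[ALL AS BY DISTINCT FROM])
puts source
```

That removes the typing, but not the underlying need for a `space*` after every terminal.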
Is there by any chance a magic option I've missed somewhere that would automatically consume any trailing whitespace after recognizing a terminal? Alternatively, is there perhaps a way to feed the #parse method with a sequence of tokens (at its simplest, an Enumerable of strings) instead of giving it an input string?
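To make the second idea concrete, here is roughly the kind of token sequence I have in mind, produced with Ruby's standard `StringScanner`. The `tokenize` method and its token classes (identifiers, integers, single punctuation characters) are assumptions made up for illustration; whether `#parse` could consume such a sequence is exactly the open question.

```ruby
require "strscan"

# Hypothetical pre-tokenizer: splits the input into identifiers, integers,
# and single punctuation characters, discarding whitespace between them.
def tokenize(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    # Drop any run of whitespace and re-check for end of input.
    next if scanner.skip(/\s+/)
    tokens << scanner.scan(/[A-Za-z_][A-Za-z0-9_]*|\d+|./m)
  end
  tokens
end

p tokenize("SELECT DISTINCT a, b FROM t")
# => ["SELECT", "DISTINCT", "a", ",", "b", "FROM", "t"]
```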
Thanks for taking the time to read this, and kudos for the awesome job you've done on Citrus so far: the documentation is superb and the source code is a pleasure to read.