Extend the expression parser with support for local blocks and variables #3642
Replies: 2 comments
This isn't too common, I think; usually a "global" declaration (or here we are suggesting "outer") accesses any visible identifier. So the "only one level up" idea means that if you add a layer of block nesting between a variable |
I don't see anything clearly better than the proposal in the "Conclusion", but I just want to also point out that if in actual usage it turns out that blocks are more commonly used in mathjs expressions than plain object notation, we might prefer to use just If I had to guess, in the long run I do suspect that blocks will get more usage than object notation, since there's not all that much call or facility for manipulating objects in mathjs -- very few operations are defined for them, unlike for Arrays. So I would say that I tend to slightly prefer an alternative that uses straight |
Currently, the expression parser of mathjs mostly supports single line expressions. It lacks a way to have a local block with variables and expressions. We want to be able to create blocks with multiple expressions when defining a function and at an arbitrary place in an expression like inside a conditional operator. Adding support for blocks will allow writing more complex functions and code.
1. the syntax of blocks
What we would like to have is a syntax that:
Some ideas for a possible syntax:
Some notes here:
Single curly brackets `{...}` like in JavaScript cannot be used because mathjs supports implicit multiplication and ranges `a:b`. It is not possible to unambiguously distinguish a block `{...}` from an object `{...}` when both are allowed at any location in an expression. You can look ahead to see if a `{` character is followed by a key and value, but then the difficulty is: how to know it is a key and value and not a range? For example: is `{ a : b }` an object with key `a`, or a block with range `a:b`? Maybe parsing as a block can become the default and you have to add round brackets around it in some cases to interpret it as an object (like with JavaScript's arrow functions)? We think that's not going to work: for example, `x = {a:2}` could be either an object or a block.

Using round parentheses `(...)` would be very neat, natural, and compact. Then, "normal" usage of parentheses like in `3 * (4 + 5)` would just mean a block with a single line, returning the result of the last expression. But due to support for implicit multiplication, it is impossible to know whether `f(2)` or `f(x \n + y)` is meant as a function invocation or an implicit multiplication of `f` with a block.

Another difficulty that arises is how to write an expression that continues on the next line, since parentheses `(...)` cannot be used for that anymore. People expect whitespace inside parens `(...)` to be insignificant, but if we use this for blocks, newlines become very important. One option would be to require ending all lines with a semicolon `;`, like in PHP for example. However, this would introduce a big pitfall: when you forget to end the line with a `;`, your lines will be evaluated as a single expression, which may evaluate just fine but isn't at all what you intended. For example `a = 3 \n a + 2` would be evaluated as `a = 3a + 2`. Another option would be to introduce a continuation character for that, which would be a neat solution. Or use single round brackets `(...)` for blocks, and use double round brackets `((...))` for overriding precedence, which would be confusing.

A notation like `block ... end` (similar to Matlab) isn't very suitable for use on a single line: it's not very compact or readable.

2. local vs global variables
The behavior of local vs global variables is inspired by Python:
`outer varname1, varname1` at the start of a block. In case of multiple layers of nested blocks, it accesses variables only one level up.

3. return value
To return a value, we can think of the following options:
`return ...` statement.

We strongly prefer option (1), returning the result of the last expression, because this is a concise notation usable in inline blocks. We prefer not to introduce two different notations for inline blocks and multiline blocks (like in JavaScript, where you have regular function notation and compact arrow functions).
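To make the scoping rules of (2) and the return rule of option (1) concrete, here is a minimal JavaScript sketch. It is not how mathjs implements (or would implement) blocks: `rootScope`, `evalBlock`, and the statement objects are hypothetical names made up for illustration. It also assumes that reads fall through to enclosing scopes, while writes stay local unless the name was declared with `outer`.

```javascript
// Hypothetical root scope: a plain variable table (not a mathjs API).
function rootScope (initial = {}) {
  const vars = new Map(Object.entries(initial))
  return {
    has: (name) => vars.has(name),
    get (name) {
      if (!vars.has(name)) throw new Error('Undefined symbol ' + name)
      return vars.get(name)
    },
    set (name, value) { vars.set(name, value) }
  }
}

// Hypothetical block model: a block is a list of statements, each one of
//   { outer: [names] }          declare names as living one level up
//   { assign: name, expr: fn }  evaluate fn(scope), store under name
//   { expr: fn }                evaluate fn(scope)
// The block returns the value of the last evaluated expression.
function evalBlock (statements, parent) {
  const locals = new Map()
  const outerNames = new Set()
  const scope = {
    get (name) {
      // Assumption: reads see locals first, then the enclosing scopes.
      if (!outerNames.has(name) && locals.has(name)) return locals.get(name)
      return parent.get(name)
    },
    set (name, value) {
      if (outerNames.has(name)) {
        // "outer" writes go exactly one level up, to the direct parent.
        parent.set(name, value)
      } else {
        locals.set(name, value)
      }
    }
  }

  let last
  for (const statement of statements) {
    if (statement.outer) {
      statement.outer.forEach((name) => outerNames.add(name))
      continue
    }
    const value = statement.expr(scope)
    if (statement.assign) scope.set(statement.assign, value)
    last = value
  }
  return last
}

// Roughly: outer total; a = 3; total = total + a; a * 2
const root = rootScope({ total: 10 })
const result = evalBlock([
  { outer: ['total'] },
  { assign: 'a', expr: () => 3 },
  { assign: 'total', expr: (s) => s.get('total') + s.get('a') },
  { expr: (s) => s.get('a') * 2 }
], root)

console.log(result)            // 6: the value of the last expression
console.log(root.get('total')) // 13: changed via the outer declaration
console.log(root.has('a'))     // false: a stayed local to the block
```

Because `set` forwards only to the direct parent, a nested block that declares `outer x` writes into the enclosing block's scope and not into the root, which is the "only one level up" behavior described above.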
4. separating expressions
How to separate expressions? Right now:
`(...)` and matrices `[...]`, newline has no meaning
`[...]`, a semicolon means next matrix row

For context: the current `ResultSet` solution with `\n` and `;` was originally intended just to determine what results are rendered on screen in a workbook. It was not intended to be used like a list/set with results that can be used for further computations.

Inside blocks there must be a clear way to separate expressions. We can use newlines and semicolons for the end of expressions inside blocks too: when used inline, one can use `;`, and when used in a multiline block, one can use newlines. There is one important difference: a block will not return a `ResultSet` like at the root level. Instead, it will return the last expression as discussed under (3).

5. multiline expressions
There must still be a way to spread a single expression over multiple lines. Currently, that can be achieved by using parentheses, like `a = (long + longer \n + longest)`. We can keep this behavior as it is, but an alternative could be to introduce a special continuation character to denote that the expression continues on the next line. This character could be positioned either at the end of the current line or at the start of the next line. Ideas:

The problem with putting the continuation character on the next line is that in an interactive environment you cannot finish and evaluate an expression when the user presses Enter (newline). We could allow the continuation character both at the end of the line and at the start of the next line.
Note that the need for having one expression on two lines becomes smaller when we can create blocks, so we can more easily split a long expression into multiple smaller expressions.
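As a rough illustration of the continuation idea, the sketch below joins physical lines into logical expressions before parsing. The function name `joinContinuedLines` and the marker `...` are assumptions picked for the example; no specific character has been decided on here, and this is not existing mathjs behavior. The marker is accepted both at the end of a line and at the start of the next line.

```javascript
// Join physical lines into logical expressions.
// `marker` is a hypothetical continuation token, passed in as a parameter
// because the proposal does not fix a specific character.
function joinContinuedLines (source, marker = '...') {
  const logical = []
  let current = null // logical line being built, or null if none yet
  let open = false   // true if the previous line ended with the marker

  for (const raw of source.split('\n')) {
    let line = raw.trim()
    if (line === '') continue // blank lines carry no expression

    // Marker at the start: this line continues the previous one.
    const continuesPrevious = line.startsWith(marker)
    if (continuesPrevious) line = line.slice(marker.length).trim()

    // Marker at the end: the next line continues this one.
    const continuesNext = line.endsWith(marker)
    if (continuesNext) line = line.slice(0, -marker.length).trim()

    if (current !== null && (open || continuesPrevious)) {
      current = current + ' ' + line
    } else {
      if (current !== null) logical.push(current)
      current = line
    }
    open = continuesNext
  }

  if (current !== null) logical.push(current)
  return logical
}

const joined = joinContinuedLines(
  'a = long + longer ...\n  + longest\nb = 1\n... + 2\nc = 3'
)
console.log(joined)
// [ 'a = long + longer + longest', 'b = 1 + 2', 'c = 3' ]
```

The second expression shows the interactive problem mentioned above: after reading `b = 1` the joiner cannot know the expression is complete until it has seen the following line, whereas a trailing marker makes that known as soon as Enter is pressed.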
Conclusion
So far, a block notation with double curly braces `{{...}}` or escaped curly braces `\{...}` looks like the best option. This is concise, doesn't conflict with existing syntax, is usable for both inline and multiline blocks, seamlessly extends the existing notation for function definitions, and will not look too alien since it looks similar to `{...}`, which is used in many programming languages for blocks. A block will return the result of the last statement. We just keep using newlines and semicolons to separate expressions. Local variables will have a local scope by default, and changing global variables will require explicit access via an `outer` definition. Optionally, we could introduce a continuation character, but that can be discussed/implemented separately from implementing blocks.

Any thoughts on this?