Extend the expression parser with support for local blocks and variables #3642

josdejong · 2026-02-06T11:06:05Z

josdejong
Feb 6, 2026
Maintainer

Currently, the expression parser of mathjs mostly supports single-line expressions. It lacks a way to write a local block with its own variables and expressions. We want to be able to create blocks with multiple expressions both when defining a function and at arbitrary places in an expression, such as inside a conditional operator. Adding support for blocks will allow writing more complex functions and code.

1. the syntax of blocks

What we would like to have is a syntax that:

- Is concise and can be used inline as well as spread over multiple lines, ideally with the same syntax in both cases.
- Ideally extends the current notation for function definitions seamlessly, instead of needing a separate, second notation for that.
- Has the same syntax no matter where it is used, so no separate syntax for function definitions versus a nested block somewhere else.
- Is feasible given the existing syntax.

Some ideas for a possible syntax:

# current single line function
f(x) = x^3

# single curly braces
f(x) = { y = 3; x^y }
f(x) = {
  y = 3
  x^y 
}
a = condition ? { y = 3; x^y } : 0

# double curly braces
f(x) = {{ y = 3; x^y }}
f(x) = {{ 
  y = 3
  x^y 
}}
a = condition ? {{ y = 3; x^y }} : 0

# round brackets
f(x) = ( y = 3; x^y )
f(x) = ( 
  y = 3
  x^y 
)
a = condition ? ( y = 3; x^y ) : 0

# words
f(x) = block y = 3; x^y end
f(x) = block 
  y = 3
  x^y
end
a = condition ? block y = 3; x^y end : 0

# more ideas:
f(x) = \{ y = x^2; x+y }
f(x) = @{ y = x^2; x+y }
f(x) = @( y = x^2; x+y )
f(x) = @ y = x^2; x+y }

Some notes here:

Single curly brackets {...} like in JavaScript cannot be used because mathjs supports implicit multiplication and ranges a:b. When both are allowed at any location in an expression, the parser cannot unambiguously distinguish a block {...} from an object {...}. You can look ahead to see if a { character is followed by a key and value, but then the difficulty is: how to know it is a key and value and not a range? For example: is { a : b } an object with key a, or a block with range a:b? Maybe parsing as a block can become the default and you have to add round brackets around it in some cases to interpret it as an object (like with JavaScript's arrow functions)? We think that's not going to work, for example x = {a:2} could be either an object or a block.
Using round parentheses (...) would be very neat, natural, and compact. Then, "normal" usage of parentheses like in 3 * (4 + 5) would just mean a block with a single line, returning the result of the last expression. But due to support for implicit multiplication, it is impossible to know whether f(2) or f(x \n + y) is meant as a function invocation or an implicit multiplication of f with a block.

Another difficulty that arises is how to write an expression that continues on the next line, since parentheses (...) cannot be used for that anymore. People expect whitespace inside parens (...) to be insignificant, but if we use parens for blocks, newlines become very important. One option would be to require all lines to end with a semicolon ; like in PHP for example. However, this would introduce a big pitfall: when you forget to end a line with a ;, your lines will be parsed as a single expression, which may evaluate without error but not be at all what you intended. For example a = 3 \n a + 2 would be evaluated as a = 3a + 2. Another option would be to introduce a continuation character for that, which would be a neat solution. Or use single round brackets (...) for blocks, and use double round brackets ((...)) for overriding precedence, which would be confusing.
A notation like block ... end (similar to Matlab) isn't very suitable for use on a single line: it's not very compact or readable.
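
To make the lookahead argument concrete, here is a minimal Python sketch (not mathjs code). It assumes a hypothetical rule in which `{{` or `\{` opens a block and a lone `{` keeps opening an object; the function name `classify_opening` is made up for illustration. Under that assumption the parser can decide from the two characters at the current position, without having to guess whether `a : b` is a key/value pair or a range.

```python
def classify_opening(src, pos):
    # Hypothetical rule: a doubled or escaped brace opens a block,
    # a single brace keeps its current meaning (object literal).
    if src.startswith("{{", pos) or src.startswith("\\{", pos):
        return "block"
    if src.startswith("{", pos):
        return "object"
    return "other"


# The decision needs only the characters at `pos`, not the contents
# of the braces.
kind_block = classify_opening("f(x) = {{ y = 3; x^y }}", 7)
kind_object = classify_opening("x = {a:2}", 4)
```

With plain `{...}` for both constructs, no such local decision exists, which is the problem described in the notes above.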

2. local vs global variables

The behavior of local vs global variables is inspired by Python:

- Variables created inside a block are local by default, and a local variable with the same name as a global variable shadows the global one. This differs from the current behavior, where you can create and alter global variables from within a custom defined function.
- By default, you can read but not change global variables from within a local block.
- Getting write access to an outer variable requires an explicit statement outer varname1, varname2 at the start of a block. In case of multiple layers of nested blocks, it accesses variables only one level up.
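
The three rules above can be modelled in a few lines of Python. This is only a sketch of the proposed semantics, not how mathjs implements scopes; the class `Scope` and its method names are hypothetical.

```python
class Scope:
    """Hypothetical model of the proposed block scoping rules."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}
        self.outer_names = set()

    def declare_outer(self, *names):
        # Models `outer varname1, varname2` at the start of a block.
        self.outer_names.update(names)

    def get(self, name):
        # Reads fall through to enclosing scopes.
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise NameError(name)

    def set(self, name, value):
        if name in self.outer_names and self.parent is not None:
            # Write access reaches exactly one level up; the parent
            # applies its own rules from there.
            self.parent.set(name, value)
        else:
            # Default: create or update a local, shadowing outer names.
            self.vars[name] = value


root = Scope()
root.set("a", 1)
root.set("b", 1)

block = Scope(root)
block.set("a", 10)        # local `a` shadows the global one
block.declare_outer("b")
block.set("b", 5)         # allowed to change the outer `b`
```

After running this, `root` still holds `a = 1`, while `b` was changed through the `outer` declaration.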

3. return value

To return a value, we can think of the following options:

1. Return the result of the last expression (like in Rust).
2. Use an explicit return ... statement.
3. In the function header, specify one or multiple variables that will be returned, like in Matlab.

We strongly prefer option (1), returning the result of the last expression, because this is a concise notation usable in inline blocks. We prefer not to introduce two different notations for inline blocks and multiline blocks (like JavaScript, which has both regular function notation and compact arrow functions).

4. separating expressions

How to separate expressions? Right now:

- at root level, an expression ends at a newline or semicolon
- inside parens (...) and matrices [...], newline has no meaning
- inside a matrix [...], a semicolon means next matrix row

For context: the current ResultSet solution with \n and ; was originally intended just to determine which results are rendered on screen in a workbook. It was not intended to be used as a list/set of results for further computations.

Inside blocks there must be a clear way to separate expressions. We can use newlines and semicolon for the end of expressions inside blocks too: when used inline, one can use ;, and when used in a multiline block, one can use newlines. There is one important difference: a block will not return a ResultSet like at the root level. Instead, it will return the last expression as discussed under (3).
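
As a rough illustration of "separate on newline or semicolon, return the last expression", here is a small Python sketch. It is not the mathjs parser: `eval_block` is a hypothetical helper, Python's `eval` stands in for the expression evaluator, `^` is mapped to Python's `**`, and only plain `name = expression` assignments are recognised.

```python
import re


def eval_block(source, outer):
    # Work on a copy so that assignments inside the block stay local.
    local = dict(outer)
    result = None
    for stmt in re.split(r"[;\n]", source):
        stmt = stmt.strip()
        if not stmt:
            continue  # blank lines and empty statements are skipped
        expr = stmt.replace("^", "**")
        match = re.fullmatch(r"([A-Za-z_]\w*)\s*=(?!=)\s*(.+)", expr)
        if match:
            result = eval(match.group(2), {"__builtins__": {}}, local)
            local[match.group(1)] = result
        else:
            result = eval(expr, {"__builtins__": {}}, local)
    # A single value is returned, not a ResultSet of all statements.
    return result


outer_scope = {"x": 2}
inline_value = eval_block("y = 3; x^y", outer_scope)
multiline_value = eval_block("\n  y = 3\n  x^y \n", outer_scope)
```

Both calls give the same value, and `outer_scope` is left without a `y`, matching the idea that the inline and multiline forms behave identically.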

5. multiline expressions

There must still be a way to spread a single expression over multiple lines. Currently, that can be achieved by using parentheses, like a = (long + longer \n + longest). We can keep this behavior as it is, but an alternative could be to introduce a special continuation character to denote that the expression continues on the next line. This character could be positioned either at the end of the current line or at the start of the next line. Ideas:

- `...` will give conflicts with future spread operator
- `|` normally used as pipe operator, confusing
- `~` used for bitNot
- tab or spaces. Invisible, tricky, we do not want to use indentation for syntax
- `>>>` maybe confusing when using mathjs in a terminal?
- `\`
- `\_`
- `___`
- `---`

The problem with putting the continuation character on the next line is that an interactive environment cannot know whether the expression is finished, and thus whether to evaluate it, when the user presses Enter (newline). We could allow the continuation character both at the end of the line and at the start of the next line.
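
A preprocessing step for such a continuation character could look like the Python sketch below. It assumes `\` as the marker purely as an example (no choice has been made), accepts it at the end of a line or at the start of the next one, and uses a hypothetical helper name `join_continued_lines`.

```python
def join_continued_lines(text, marker="\\"):
    logical = []
    pending = False  # True when the previous line ended with the marker
    for raw in text.split("\n"):
        line = raw.strip()
        if not line:
            continue
        starts = line.startswith(marker)
        if starts:
            line = line[len(marker):].lstrip()
        ends = line.endswith(marker)
        if ends:
            line = line[:-len(marker)].rstrip()
        if (pending or starts) and logical:
            # Glue this physical line onto the previous logical line.
            logical[-1] = logical[-1] + " " + line
        else:
            logical.append(line)
        pending = ends
    return logical


trailing = join_continued_lines("a = long + longer \\\n  + longest\nb = 2")
leading = join_continued_lines("a = long + longer\n\\ + longest")
```

In the `leading` case the first line is already a complete expression on its own, which is exactly why an interactive prompt cannot tell whether to evaluate it when Enter is pressed.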

Note that the need for having one expression on two lines becomes smaller when we can create blocks, so we can more easily split a long expression into multiple smaller expressions.

Conclusion

So far, a block notation with double curly braces {{...}} or escaped curly braces \{...} looks like the best option. It is concise, doesn't conflict with existing syntax, is usable for both inline and multiline blocks, seamlessly extends the existing notation for function definitions, and will not look too alien since it resembles {...}, which is used in many programming languages for blocks. A block will return the result of its last expression. We just keep using newlines and semicolons to separate expressions. Variables will have local scope by default, and changing global variables will require explicit access via an outer declaration. Optionally, we could introduce a continuation character, but that can be discussed and implemented separately from implementing blocks.

Any thoughts on this?

gwhitney · 2026-02-08T12:10:45Z

gwhitney
Feb 8, 2026
Collaborator

> In case of multiple layers of nested blocks, it accesses variables only one level up.

This isn't too common, I think; usually a "global" declaration (or here we are suggesting "outer") accesses any visible identifier. So the "only one level up" idea means that if you add a layer of block nesting between a variable foo and its (already nested) use, you will be obliged to also add outer foo in that new block as well -- every variable would have to be connected to a non-local use by a sequence of outer declarations at every nesting level in between. Not certain that's what we want. We can certainly try it, but good to keep an open mind on this point, thanks.
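
To illustrate the difference, here is a small Python sketch comparing the two policies. It is a toy model with hypothetical names: `foo` is assumed to be defined at the global level (level 0), the write happens in the innermost block, and `outer_decls[i]` says whether the block at nesting level `i + 1` declares `outer foo`.

```python
def write_target(outer_decls, policy):
    # Returns the nesting level that receives the write to `foo`.
    level = len(outer_decls)
    if policy == "one_level_up":
        # Each declaration forwards the write exactly one level, so an
        # unbroken chain of declarations is needed to reach level 0.
        while level > 0 and outer_decls[level - 1]:
            level -= 1
        return level
    if policy == "any_visible":
        # One declaration in the innermost block reaches the scope that
        # defines the name, assumed here to be the global level.
        return 0 if level > 0 and outer_decls[level - 1] else level
    raise ValueError(policy)


# Two nested blocks; only the inner one declares `outer foo`.
stops_at = write_target([False, True], "one_level_up")
reaches = write_target([False, True], "any_visible")
```

With "one level up", the write lands in the intermediate block unless that block also declares `outer foo`; with "any visible", the single inner declaration is enough.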

gwhitney · 2026-02-08T12:31:25Z

gwhitney
Feb 8, 2026
Collaborator

I don't see anything clearly better than the proposal in the "Conclusion", but I just want to also point out that if in actual usage it turns out that blocks are more commonly used in mathjs expressions than plain object notation, we might prefer to use just {...} only for blocks and require \{...} or {{...}} or ({...}) or ob{...} for objects. (Really any alphabetic sign is OK at the beginning to mark objects if we want to go down that path, because we would only be losing the ability to do implicit multiplication between one identifier we choose and the result value of a block. So if you really did have a variable named ob and wanted to do that kind of product, you could still just write ob*{x = 7; x^3 + x^2 +2}.)

If I had to guess, in the long run I do suspect that blocks will get more usage than object notation, since there's not all that much call or facility for manipulating objects in mathjs -- very few operations are defined for them, unlike for Arrays. So I would say that I tend to slightly prefer an alternative that uses straight {...} for blocks and obligatorily decorates objects in some way, although the fact that choice would change the meaning of previously existing formulas, whereas using decorated blocks and leaving objects alone would not, does make the decision very close.
