Implement PPC0030: undef aware equality #24304

Draft
leonerd wants to merge 8 commits into Perl:blead from leonerd:ppc0030-undef-aware-equality
Conversation

Contributor

@leonerd leonerd commented Mar 20, 2026

Now we're in the change freeze ahead of 5.44's release, this is an excellent time to start thinking about and reviewing changes for the 5.45.x cycle. With that in mind, this PR is currently still in draft, as I have a number of things to finish off first.

This adds a set of new operators for comparing values for equality, which consider undef to be a distinct value and not equal to either the empty string, or zero. These are specified in PPC0030.
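As a rough illustration of the semantics described above, the sketch below emulates the string and numeric comparisons with ordinary subs, so it runs on a released perl. The names `equ_sketch` and `numequ_sketch` are hypothetical helpers invented for this example. The PR adds real operators, and these subs make no attempt to model overloading or magic.

```perl
use v5.12;
use warnings;

# Hypothetical helpers that emulate the proposed semantics with plain subs.

# String flavour: undef is equal only to undef.
sub equ_sketch {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;   # both undef: equal
    return 0 if !defined $x || !defined $y;   # exactly one undef: not equal
    return $x eq $y ? 1 : 0;                  # both defined: ordinary eq
}

# Numeric flavour: same shape, built on ==.
sub numequ_sketch {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;
    return 0 if !defined $x || !defined $y;
    return $x == $y ? 1 : 0;
}

say equ_sketch(undef, undef);    # 1
say equ_sketch(undef, "");       # 0
say equ_sketch("abc", "abc");    # 1
say numequ_sketch(undef, 0);     # 0
say numequ_sketch(10, 10.0);     # 1
```

Because the undef checks return before `eq` or `==` is reached, no "uninitialized value" warning is emitted.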

Still TODO:

  • Consider if a feature flag is required, and if so what it should be called. The PPC suggests not
  • Write documentation on the new operators
  • Specifically, document the interaction with operator overloading
  • Write perldelta
  • Squash commits down to one main one + prereq support
  • This set of changes requires a perldelta entry, and it will be included by the time I undraft it

@leonerd leonerd added the labels defer-next-dev (This PR should not be merged yet, but await the next development cycle) and squash-before-merge (Author must squash the commits down before merging to blead) Mar 20, 2026
Contributor

tonycoz commented Mar 23, 2026

> Specifically, document the interaction with operator overloading

I think they would need their own amg values, and I guess should fall back to the original op after the definedness comparison, if fallback is enabled and the original op has an overload.

If fallback isn't enabled (edit) and the overload isn't defined (/edit) the overload should throw an exception like it does for any other op.

If fallback is enabled but neither the undef-aware nor the base op is overloaded, the normal path should be taken (try_amagic_bin() numifies the result for the numeric comparisons and the normal op implementation does the right thing).

@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from 52495ee to 8e8d3b2 Compare March 23, 2026 15:39
Comment thread pp.c Outdated
@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from 8e8d3b2 to e27f610 Compare March 26, 2026 16:05
Contributor Author

leonerd commented Mar 26, 2026

@tonycoz :

Right then. This new approach defines a whole new set of op codes and pp funcs for the undef-aware versions. I'm undecided about the names of them - OP_SEQU is a bit opaque and doesn't have to match the Perl-visible operator name of equ so I might rename it to something more obvious like OP_SEQ_UNDEF or somesuch. That will have knock-on effects on several other internal names.

The newly-added numerical comparison codes are annoyingly no longer in ordered sequence with the other ones, so it breaks the operation of OP_IS_NUMCOMPARE. I think the only fix for that is to disobey the top line of regen/opcodes which tells us to add new ones at the end, and instead add these four new ones directly after the non-undef versions of the same; where arguably people will expect to see them. I know that does upset the numbers of existing ones but I feel that's probably fine between release versions, right?
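The ordering problem can be pictured with a toy model. It assumes, as the paragraph above implies, that `OP_IS_NUMCOMPARE` is a range test over contiguous opcode numbers. The op names below, including `eq_undef`, are placeholders for this sketch and not names taken from the PR or from `regen/opcodes`.

```perl
use v5.12;
use warnings;

# Toy model: number a list of op names the way an enum would.
sub number_ops {
    my @names = @_;
    my %num;
    @num{@names} = 0 .. $#names;
    return \%num;
}

# Range test in the style attributed to OP_IS_NUMCOMPARE: an op counts as a
# numeric comparison if its number lies between the first and last such op.
sub is_numcompare {
    my ($num, $op) = @_;
    return $num->{$op} >= $num->{lt} && $num->{$op} <= $num->{ncmp} ? 1 : 0;
}

# New op appended at the end of the table versus grouped with its siblings.
my $appended = number_ops(qw( lt gt eq ne ncmp concat eq_undef ));
my $grouped  = number_ops(qw( lt gt eq eq_undef ne ncmp concat ));

say is_numcompare($appended, "eq_undef");   # 0: appended op falls outside the range
say is_numcompare($grouped,  "eq_undef");   # 1: grouped op sits inside the range
say is_numcompare($grouped,  "concat");     # 0: unrelated op is still excluded
```

Grouping keeps the range test cheap, at the cost of renumbering every op that follows the insertion point.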

There is however one big snag that I haven't thought of a good solution to, and that's the intended operation of the try_AMAGIC_2 call that would need to be added, in order to support overloaded operators. If the (currently-commented) code were simply activated, it would result in two sets of GETMAGIC() calls for every use of the operator - once while testing for seq_undef_amg and once again while testing for seq_amg. This would also duplicate a bunch of "does this value have amagic and if so where's the overload table?" behaviour. I can imagine two solutions to this:

  1. Don't call the standard try_AMAGIC_2 from these pp funcs and instead inline and expand the required logic so it can have the right semantics re: GETMAGIC and efficiency. I don't like it, because it involves a lot of copy-paste duplication of that logic, and it also means the undef-aware operators can't just chain call to the non-undef versions for the "both sides are defined" case, and have to duplicate their logic here as well.

  2. Implement the entire fallback semantics as part of the huge and sprawling mass of code that is Perl_amagic_call(), taking into account the tests for whether either argument is undef, falling back to regular string case, and so on... I also don't like that because it involves duplication of the "are the arguments undef?" logic, and further hides semantics away in that giant monstrosity known as Perl_amagic_call(). ((Did I ever mention, I don't like the design of Perl_amagic_call()? Can you tell? ;) ))

Aside from those I don't have any other ideas, nor do I have a good feel for which one would be preferable. Something to think on perhaps...

(Oh and also, yes I know the branch fails sanity currently; this is due to the amagic tests I mention above)

Contributor

tonycoz commented Mar 29, 2026

Another thing to consider is use integer;:

tony@venus:.../git/perl6$ ./perl -Ilib -Minteger -E 'say 123 == 123.1'
1
tony@venus:.../git/perl6$ ./perl -Ilib -Minteger -E 'say 123 === 123.1'
 
tony@venus:.../git/perl6$
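The first result in the transcript above can be reproduced on a released perl, since it depends only on the existing `==` operator. A minimal sketch, with `eq_plain` and `eq_integer` as hypothetical helper names: under `use integer` both operands are compared as integers, so the fractional part of 123.1 is ignored.

```perl
use v5.12;
use warnings;

# Ordinary numeric comparison.
sub eq_plain {
    my ($x, $y) = @_;
    return $x == $y ? 1 : 0;
}

# The same comparison compiled in the lexical scope of 'use integer'.
sub eq_integer {
    use integer;
    my ($x, $y) = @_;
    return $x == $y ? 1 : 0;
}

say eq_plain(123, 123.1);     # 0
say eq_integer(123, 123.1);   # 1
```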

Contributor

tonycoz commented Mar 29, 2026

> There is however one big snag...

I don't think the code in amagic_call() would be recursively calling the pp funcs.

I don't see a way to avoid messing with amagic_call() and it doesn't seem too bad:

First is the switch statement that handles finding fallbacks:

https://github.com/leonerd/perl5/blob/e27f6109f8a3cde7a6f67866f7c88b0aaf7ab0ee/gv.c#L3967-L3970

This checks for fallback ops, so we might add:

   case equ_amg:
       /* missing stack stuff in these returns? */
       if (!SvOK(left) && !SvOK(right))
           return &PL_sv_yes;
       if (!SvOK(left) || !SvOK(right))
           return &PL_sv_no;
       /* off is further checked below the switch */
       if (ocvp[eq_amg]) /* == */
           off = eq_amg;
       else
           off = cmp_amg;
       break;

Maybe put the undef checks in a macro.

Another switch further down.

https://github.com/leonerd/perl5/blob/e27f6109f8a3cde7a6f67866f7c88b0aaf7ab0ee/gv.c#L4271-L4275

again, add a case:

    case equ_amg:
        if (off == cmp_amg)
            ans = SvIV(res) == 0;
        else  /* presumably was eq_amg */
            ans = SvTRUE(res); // not sure if we can avoid this
        break;

Thankfully the use integer ops use the base amg values.
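The fallback from `eq` to `cmp` discussed above already exists for the regular operators and can be observed from Perl code. The sketch below uses a hypothetical `CaseFold` class that overloads only `cmp` and leaves `fallback` unspecified. It shows current behaviour on released perls, not the behaviour of the new operators.

```perl
use v5.14;
use warnings;

# Hypothetical class that overloads only 'cmp', comparing case-insensitively.
package CaseFold {
    use overload 'cmp' => sub {
        my ($self, $other, $swapped) = @_;
        my $rhs = ref($other) ? $$other : $other;
        my $r   = lc($$self) cmp lc($rhs);
        return $swapped ? -$r : $r;
    };

    sub new {
        my ($class, $str) = @_;
        return bless \$str, $class;
    }
}

my $name = CaseFold->new("Perl");

# No 'eq' or 'ne' overload exists, so perl derives them from 'cmp'.
say $name eq "PERL" ? "eq: equal" : "eq: different";   # eq: equal
say $name ne "Ruby" ? "ne: different" : "ne: equal";   # ne: different
```

The derived `eq` is true when the `cmp` handler returns zero, which matches the `SvIV(res) == 0` test in the second `case` above.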

@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from e27f610 to ff2c120 Compare April 1, 2026 17:54
Contributor Author

leonerd commented Apr 1, 2026

Right then. A new round of updates. Having followed those notes and read over the way gv.c implements overloading, I guess it's not too bad. I still dislike the "separation of concerns" of having this code make decisions about the undef-ness of arguments here as well as in the PP funcs, but that is likely unavoidable.

This latest update now implements most of the use overload ability on these new ops.

While I was there I did realise that the logic could use an overloaded equ operator to synthesize a missing neu (and likewise the numerics), but it currently doesn't for the reason that the older non-undef versions don't do that either. I raised a discussion about that - Perl/PPCs#84. But that can be looked at separately.

I haven't looked into the use integer case yet but I imagine that won't be much work; and at least it's only for the numerical ones and not the stringy ones.

@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from ff2c120 to 03fb011 Compare April 1, 2026 19:57
Contributor Author

leonerd commented Apr 1, 2026

And now with use integer support:

$ ./perl -E 'use integer; say "OK" if 123.1 === 123'
OK

@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch 2 times, most recently from 78b1b43 to ababe66 Compare April 10, 2026 14:41
@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from ababe66 to daac33e Compare April 15, 2026 15:15
@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch 3 times, most recently from d36c014 to 2a01134 Compare April 16, 2026 22:53
Contributor Author

leonerd commented Apr 16, 2026

Based on Perl/PPCs#87 I've shuffled the implementation here so it doesn't add entire new use overload categories for these new operators.

I added the AMGf_no_GETMAGIC flag because I couldn't think of another way to do this. It seems a potentially useful addition though.

Comment thread pp.c Outdated
Comment thread pp.c
leonerd added 7 commits April 18, 2026 15:46

 * `use VERSION` of v5.42
 * 5.14-style `package NAME VERSION` syntax
 * No need for `no warnings "experimental::builtin"` now that refaddr is
   stable

As per PPC0030.

Intentionally ignores the comment at the top of `regen/opcodes` to add
new ops at the end, because the ops need to be grouped together with the
other ones.

TODO: Currently only parsed and tested for the stringy version, not the
  numerical version, even though numerical is implemented internally.

TODO: Currently lacks any attempt at documentation or perldelta.

TODO: Also lacks any consideration on how an `equ` operator would
  interact with `use overload`. Further thought is required here.

Support 'use integer' with ===/!== operators
@leonerd leonerd force-pushed the ppc0030-undef-aware-equality branch from 2a01134 to bee478b Compare April 18, 2026 14:49
Contributor Author

leonerd commented Apr 18, 2026

I've now also added some basic docs in perlop.pod. I feel maybe some words should be said in lib/overload.pm to explain why these operators aren't separately overloadable, but other than that I don't think there's much else to be written about them.

I won't add a perldelta yet because it will just keep conflicting along rebases around release time. I'll instead add something in a comment here to copy out at the time we get around to thinking about merging this.

Contributor Author

leonerd commented Apr 18, 2026

Proposed perldelta:

=head1 Core Enhancements

=head2 Undef-aware equality operators

Four new operators have been added, which are similar to the regular equality operators except for their handling of C<undef>. Whereas the regular operators treat C<undef> as equal to the empty string or the number zero, these operators consider C<undef> to be a distinct value, equal to itself, but unequal to any defined value.

    if( $x equ $y ) {
      # $x and $y are both undef, or
      # $x and $y are both defined and equal
      ...
    }

This is approximately equal to the following, except that it is more efficient and avoids duplicate evaluation of operands or fetching of tied scalar values:

    if( (!defined $x and !defined $y) or
        (defined $x and defined $y and $x eq $y) ) {
      ...
    }

For more detail, see L<perlop/Equality Operators>.
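The point about duplicate fetching in the proposed perldelta can be observed with a tied scalar on a released perl. This is a minimal sketch: `CountingScalar` is a hypothetical class written for the example, and it measures only the hand-written expression, since the new operators are not available outside this branch.

```perl
use v5.14;
use warnings;

# Hypothetical tied-scalar class that counts how often its value is fetched.
package CountingScalar {
    sub TIESCALAR {
        my ($class, $value) = @_;
        return bless { value => $value, fetches => 0 }, $class;
    }
    sub FETCH {
        my ($self) = @_;
        $self->{fetches}++;
        return $self->{value};
    }
    sub STORE {
        my ($self, $value) = @_;
        $self->{value} = $value;
    }
}

# tie returns the underlying object, which exposes the counter.
my $tx = tie my $x, 'CountingScalar', "abc";
my $ty = tie my $y, 'CountingScalar', "abc";

# The hand-written equivalent from the perldelta text.
my $same = ( (!defined $x and !defined $y) or
             (defined $x and defined $y and $x eq $y) ) ? 1 : 0;

say "same: $same";
say "fetches of \$x: $tx->{fetches}";
say "fetches of \$y: $ty->{fetches}";
```

Each `defined` test and the final `eq` read the tied variable separately, so both counters end up above one.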

Comment thread regen_perly.pl
use 5.006;
sub usage { die "usage: $0 [ -b bison_executable ] [ file.y ]\n" }

use v5.10;
Contributor

it's 5.12 that adds implicit strict (assuming that's the reason for removing the explicit strict).

tony@venus:~$ perl -e 'use v5.10; print $x'
tony@venus:~$ perl -e 'use v5.12; print $x'
Global symbol "$x" requires explicit package name (did you forget to declare "my $x"?) at -e line 1.
Execution of -e aborted due to compilation errors.

Comment thread pod/perlop.pod
its arguments is C<undef>. These operators, called the I<undef-aware equality
operators>, consider that C<undef> is equal to another C<undef> but not equal
to any defined value - even the number zero or the empty string. Furthermore,
these operators will not invoke warnings of undefined values, even when their
Contributor

The warning complains about Use of uninitialized value rather than undefined.

I only bring this up because you say "warnings of undefined values" (warning about "undefined values") rather than "warnings on undefined values" (some warning when an undefined value is seen.)

Feel free to ignore this.

Contributor

tonycoz commented Apr 19, 2026

The overload.pm docs could probably use a mention of the way overloading works for the new ops, possibly an extra =item in "Overloadable Operations" which seems to be where the special cases are discussed.
