
Custom Operator Registry Support #456

Merged
evaleev merged 102 commits into master from ajay/feature/runtime-op-support
Jan 26, 2026

@ajay-mk (Member) commented Dec 24, 2025

Custom Operator Registry Support

This PR refactors Operator logic from depending on a predefined OpType enum to a runtime OpRegistry, enabling users to define custom operators beyond the predefined set.

Major Changes

  • OpRegistry: a registry for operator labels and their classifications
  • mbpt::Context now holds an OpRegistry along with the CSV setting.
  • There are two predefined registries available (minimal and legacy).
  • Some labels are reserved for internal use (antisymmetrizer, symmetrizer, transposition, kronecker, overlap) and cannot be registered.
  • OpType enum is removed, all related logic is gone:
    • OpMaker constructors now use strings.
    • OpConnections now uses strings to specify operator pairs. For example: {{L"f", L"A"}, {L"g", L"A"}, ...}
  • Built-in operators (both tensor and Operator level) check the registry to verify that the operators are registered.
  • Perturbation-related operators now support higher-order perturbations (up to 9).
  • mbpt::Context manipulations are thread-safe; thread safety can be toggled with the SEQUANT_CONTEXT_MANIPULATION_THREADSAFE option, following the same pattern as core::Context.
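
As a rough illustration of the registry idea described above (labels mapped to a classification, chained registration, reserved labels rejected), here is a minimal standalone sketch. It is not SeQuant's OpRegistry: `ToyOpRegistry`, `ToyOpClass`, the `contains`/`to_class` accessors, the duplicate-label check, and the placeholder reserved strings are all assumptions made for illustration.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Toy stand-ins; the real OpClass/OpRegistry in SeQuant may differ.
enum class ToyOpClass { ex, deex, gen };

class ToyOpRegistry {
 public:
  // Registers a label and returns *this so that calls can be chained.
  // Throws if the label is reserved or was already registered.
  ToyOpRegistry& add(const std::wstring& label, ToyOpClass cls) {
    if (reserved().count(label) != 0)
      throw std::invalid_argument("label is reserved");
    if (!ops_.emplace(label, cls).second)
      throw std::invalid_argument("label is already registered");
    return *this;
  }

  // True if the label has been registered.
  bool contains(const std::wstring& label) const {
    return ops_.count(label) != 0;
  }

  // Returns the class of a registered label; throws std::out_of_range
  // if the label was never registered.
  ToyOpClass to_class(const std::wstring& label) const {
    return ops_.at(label);
  }

 private:
  // Hypothetical reserved set (placeholder strings, not SeQuant's labels).
  static const std::set<std::wstring>& reserved() {
    static const std::set<std::wstring> labels{L"A", L"S"};
    return labels;
  }

  std::map<std::wstring, ToyOpClass> ops_;
};
```

Returning a reference from `add` is what makes the chained `.add(...).add(...)` style possible; rejecting reserved labels at registration time keeps the check in one place instead of in every consumer.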

Example Usage

    using namespace sequant;
    using namespace sequant::mbpt;
   
    auto registry = std::make_shared<OpRegistry>();

    registry->add(L"f", OpClass::gen)
        .add(L"g", OpClass::gen)
        .add(L"t", OpClass::ex)
        .add(L"x", OpClass::ex)
        .add(L"y", OpClass::ex);

    // set MBPT context
    auto ctx_resetter = set_scoped_default_mbpt_context(
        {.csv = CSV::No, .op_registry_ptr = registry});

    // use OpMaker to define a custom excitation operator of rank 2
    auto x = OpMaker<Statistics::FermiDirac>(L"x", 2)();

    // particle non-conserving excitation operator with custom IndexSpaces
    const auto& cre_space = get_particle_space(Spin::any);
    const auto& ann_space = get_hole_space(Spin::any);

    auto y = OpMaker<Statistics::FermiDirac>(L"y", ncre(2), nann(1),
                                             cre(cre_space), ann(ann_space))();

Additional Changes

  • The antisymmetrizer and symmetrizer labels have been updated to Â and Ŝ, respectively. As a result, several tests and examples were updated. Most hardcoded strings were replaced with calls to the reserved label methods; however, some reference outputs and expressions are still constructed from plain strings.
  • In the TNC tests, the reference GraphViz output needed to be updated with the new label and the newly assigned color for the tensor. The rest of the graph remains unchanged.
  • The spin-tracing examples in tests/integration were not using mbpt::cardinal_tensor_labels() for expression canonicalization. After the label change, symmetrizer and antisymmetrizer tensors were moved to the end during canonicalization. Since the spin-tracing logic assumes all symmetrizer tensors share the same external indices, they must appear first. This was fixed by canonicalizing with mbpt::cardinal_tensor_labels(), which places symmetrizer tensors first and ensures consistent indexing.
  • Some evaluation logic also requires canonical ordering. Accordingly, the eval_{ta,btas} cases now use mbpt::cardinal_tensor_labels() with TNC.
  • To ensure consistent indexing of antisymmetrizer/symmetrizer tensors across a Sum, they need to appear first in a Product. After the label change they appeared at the end of a Product, which caused the spin-tracing and evaluation problems described in the points above. To prevent this, antisymmetrizer/symmetrizer labels are prepended to the cardinal tensor labels list when set_cardinal_tensor_labels() is called.
  • If no list is set, the default cardinal tensor label list will contain antisymmetrizer/symmetrizer labels.
  • The trace_product spin-tracing logic assumed that the symmetrizer tensor always appears first; this has been generalized. cc: @ABesharat
  • The external interface was updated to use reserved labels instead of hardcoded ones, and the labels in the example inputs were updated accordingly.
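
The prepending behavior described for set_cardinal_tensor_labels() can be pictured with a small standalone sketch. This is only an assumption about the general shape of such logic: `prepend_reserved` is a hypothetical helper, not a SeQuant function, and the duplicate handling shown here is a guess.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns a list that starts with `reserved` and then
// appends each user label in order, skipping any label already present.
std::vector<std::wstring> prepend_reserved(
    const std::vector<std::wstring>& reserved,
    const std::vector<std::wstring>& user_labels) {
  std::vector<std::wstring> result(reserved);
  for (const auto& label : user_labels) {
    if (std::find(result.begin(), result.end(), label) == result.end())
      result.push_back(label);
  }
  return result;
}
```

With an empty user list the result is just the reserved labels, which matches the stated default when no list is set.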

All connectivity info is now encoded using plain strings
…ation order

Constructors now take strings as Operator identifiers
…f perturbation order.

Perturbation order does not make any difference on how the operator acts, so it is only for bookkeeping here.
All methods except A and S will check the registry to make sure the Operator label is registered.
Clarify that lst is a free function and mention mbpt::Context.
…e-op-support

# Conflicts:
#	SeQuant/domain/mbpt/spin.cpp

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 61 out of 61 changed files in this pull request and generated 5 comments.


@evaleev evaleev force-pushed the ajay/feature/runtime-op-support branch from 1bb4986 to 6d5e4f6 Compare January 26, 2026 12:11
evaleev and others added 2 commits January 26, 2026 09:11
cum_to_density should be checking for cumulant label not rdm
@evaleev (Member) commented Jan 26, 2026

@ajay-mk (Member, Author) commented Jan 26, 2026

I added the circumflex/hat character to the word_components object in the parser, and that seems to fix it (44d9076). @Krzmbrzl is that okay to do?

auto word_components = x3::unicode::alnum
                       | x3::char_('_') | x3::unicode::char_(L'') | x3::unicode::char_(L'̃') | x3::unicode::char_(to_char_type(0x0302)) 
                       // Superscript and Subscript block
                       | (x3::unicode::char_(to_char_type(0x2070), to_char_type(0x209F)) - x3::unicode::unassigned)
                       // These are defined in the Latin-1 Supplement block and thus need to be listed explicitly
                       | x3::unicode::char_(L'¹') | x3::unicode::char_(L'²') | x3::unicode::char_(L'³')
                       // Arrow block
                       | (x3::unicode::char_(to_char_type(0x2190), to_char_type(0x21FF)) - x3::unicode::unassigned);
// A name begins with a letter, then can contain letters, digits and
// underscores, but can not end with an underscore (to not confuse the parser
// with tensors á la t_{…}^{…}).

PS: the third case here is combining tilde, but my editor renders it poorly :(

@Krzmbrzl (Collaborator) commented

@ajay-mk yes that's perfectly fine. I would add a comment describing what that character is though. That way, it will be easier to understand the code

@ajay-mk (Member, Author) commented Jan 26, 2026

This is good to go.

  • Remaining usages of hardcoded labels have been updated to use reserved labels.
  • Registry accessors now check for empty registries.
  • Fixed a few docs and comments

@evaleev evaleev merged commit a9cd3a4 into master Jan 26, 2026
20 checks passed
@evaleev evaleev deleted the ajay/feature/runtime-op-support branch January 26, 2026 20:04

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 62 out of 62 changed files in this pull request and generated 2 comments.


        : label_(label),
          order_(order),
          cre_spaces_(cre_list.begin(), cre_list.end()),
          ann_spaces_(ann_list.begin(), ann_list.end()) {

Copilot AI Jan 26, 2026

This OpMaker constructor accepts a raw order but does not validate it. If order > 9 and assertions are disabled, decorate_with_pert_order() will index past pert_superscripts and can cause undefined behavior. Suggest validating order here (or removing this overload in favor of the OpParams-based ctor so all call paths go through OpParams::validate()).

Suggested change

    -       ann_spaces_(ann_list.begin(), ann_list.end()) {
    +       ann_spaces_(ann_list.begin(), ann_list.end()) {
    +   if (order_ > 9) {
    +     throw std::out_of_range(
    +         "OpMaker: perturbation order out of range [0,9]");
    +   }

Comment on lines +62 to +67
        int pert_order = 0) {
      if (pert_order == 0) return std::wstring(base_label);
      SEQUANT_ASSERT(
          pert_order >= 0 && pert_order <= 9,
          "decorate_with_pert_order: perturbation order out of range [0,9]");

Copilot AI Jan 26, 2026

decorate_with_pert_order() indexes pert_superscripts[pert_order] after a SEQUANT_ASSERT range check. If assertions are compiled out (or pert_order is produced via an unchecked conversion from size_t), this can become out-of-bounds UB. Consider making the range check non-optional (e.g., throw/abort on pert_order > 9) and/or taking std::size_t to avoid lossy conversion from OpParams::order/OpMaker::order_.

Suggested change

    -     int pert_order = 0) {
    -   if (pert_order == 0) return std::wstring(base_label);
    -   SEQUANT_ASSERT(
    -       pert_order >= 0 && pert_order <= 9,
    -       "decorate_with_pert_order: perturbation order out of range [0,9]");
    +     std::size_t pert_order = 0) {
    +   if (pert_order == 0) return std::wstring(base_label);
    +   if (pert_order > 9) {
    +     throw std::out_of_range(
    +         "decorate_with_pert_order: perturbation order out of range [0,9]");
    +   }
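
For readers who want to see the suggestion in compilable form, here is a standalone sketch of a label decorator with a non-optional range check. It is not the SeQuant implementation: the `toy_` names, the contents of the superscript table, and the assumption that the superscript digit is appended to the end of the base label are all illustrative guesses.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical lookup table: entry k is the Unicode superscript digit for
// order k (U+2070, U+00B9, U+00B2, U+00B3, U+2074 through U+2079).
const std::array<wchar_t, 10> toy_pert_superscripts{
    L'\u2070', L'\u00B9', L'\u00B2', L'\u00B3', L'\u2074',
    L'\u2075', L'\u2076', L'\u2077', L'\u2078', L'\u2079'};

// Returns base_label unchanged for order 0; otherwise appends the
// superscript digit for pert_order. The bounds check always runs, so an
// out-of-range order throws instead of indexing past the table.
std::wstring toy_decorate_with_pert_order(const std::wstring& base_label,
                                          std::size_t pert_order = 0) {
  if (pert_order == 0) return base_label;
  if (pert_order >= toy_pert_superscripts.size()) {
    throw std::out_of_range(
        "toy_decorate_with_pert_order: perturbation order out of range [0,9]");
  }
  std::wstring result(base_label);
  result.push_back(toy_pert_superscripts[pert_order]);
  return result;
}
```

Taking `std::size_t` removes the negative case entirely, and comparing against the table's own size keeps the check in sync if the table ever grows.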

3 participants