feat(training, models): graph config for skipped noise injector and multiscale loss #855

Open
ssmmnn11 wants to merge 45 commits into main from feat/graph_for_skipped_and_multiscale

Conversation

Member

@ssmmnn11 ssmmnn11 commented Feb 3, 2026

Description
This PR adds new configuration support in the training and models components to enable graph-based settings for skipped noise injection and multiscale loss integration.

  • Main changes

  • Introduces graph configuration options to control:

    • Noise injector behavior when using graph-based models
    • Multiscale loss projections across graph representations
  • Adds associated tests covering:

    • Graph-based multiscale loss scales
    • Truncation and projection behavior
  • Implements support for noise projection in graph-based models

  • Includes additional fixes and refactors to improve graph config handling

  • Why?

These changes enhance flexibility in defining how graph-structured models handle noise injection and multiscale loss terms during training, leveraging existing functionality in anemoi-graphs for sparse matrix generation. This facilitates reproducibility via graph recipes and reduces the dependency on truncation files that are not properly tracked alongside the graph.
In addition, it will enable more involved interpolation workflows in the future.
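To make the sparse-projection idea concrete, here is a minimal hypothetical sketch (not the PR's actual implementation; all names, shapes, and weights are invented) of projecting coarse-grid noise onto fine-grid nodes through a row-normalized sparse interpolation matrix, as a graph recipe could produce:

```python
# Hypothetical sketch: project coarse-node noise onto fine nodes through a
# sparse (n_fine x n_coarse) weight matrix stored as COO triplets.
# Names and shapes are illustrative only, not the PR's actual API.

def project_noise(triplets, noise, n_fine):
    """Apply a row-normalized sparse projection to a coarse noise vector.

    triplets: iterable of (fine_idx, coarse_idx, weight); weights in each
    fine row are normalized so every fine node gets a convex combination.
    """
    # accumulate weights per fine row for normalization
    row_sums = [0.0] * n_fine
    for r, _, w in triplets:
        row_sums[r] += w

    out = [0.0] * n_fine
    for r, c, w in triplets:
        out[r] += (w / row_sums[r]) * noise[c]
    return out


# two fine nodes interpolating from three coarse nodes
triplets = [(0, 0, 1.0), (0, 1, 1.0), (1, 1, 3.0), (1, 2, 1.0)]
fine_noise = project_noise(triplets, [0.2, 0.4, 0.8], n_fine=2)
```

In practice the matrix would come from an anemoi-graphs recipe rather than hand-written triplets, which is what makes the setup reproducible without external truncation files.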


📚 Documentation preview 📚: https://anemoi-training--855.org.readthedocs.build/en/855/


📚 Documentation preview 📚: https://anemoi-graphs--855.org.readthedocs.build/en/855/


📚 Documentation preview 📚: https://anemoi-models--855.org.readthedocs.build/en/855/

@ssmmnn11 ssmmnn11 self-assigned this Feb 3, 2026
@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Feb 3, 2026
@github-actions github-actions bot added graphs and removed graphs labels Feb 3, 2026
@ssmmnn11 ssmmnn11 marked this pull request as draft February 3, 2026 09:40
@github-actions github-actions bot added the graphs label Feb 3, 2026
@ssmmnn11 ssmmnn11 changed the title feature(training, models): graph config for skipped, noise injector and multiscale loss feature(training, models): graph config for skipped noise injector and multiscale loss Feb 3, 2026
@ssmmnn11 ssmmnn11 changed the title feature(training, models): graph config for skipped noise injector and multiscale loss feat(training, models): graph config for skipped noise injector and multiscale loss Feb 3, 2026
@ssmmnn11 ssmmnn11 marked this pull request as ready for review February 6, 2026 11:20
ssmmnn11 added 12 commits March 12, 2026 16:36
…d_and_multiscale

# Conflicts:
#	training/src/anemoi/training/config/graph/encoder_decoder_only.yaml
#	training/src/anemoi/training/config/graph/hierarchical_2level.yaml
#	training/src/anemoi/training/config/graph/hierarchical_3level.yaml
#	training/src/anemoi/training/config/graph/limited_area.yaml
#	training/src/anemoi/training/config/graph/multi_scale.yaml
#	training/src/anemoi/training/config/graph/stretched_grid.yaml
#	training/src/anemoi/training/train/train.py
…/graph_for_skipped_and_multiscale

# Conflicts:
#	training/tests/unit/losses/test_combined_loss.py
…d_and_multiscale

# Conflicts:
#	training/src/anemoi/training/schemas/training.py
…d_and_multiscale

# Conflicts:
#	training/tests/integration/test_training_cycle.py
self,
*,
-    model_config: DotDict,
+    model_config: Any,
Contributor

why is the typing changing?

"Whether to use autocast for the noise projection matrix operations."

@model_validator(mode="after")
def validate_noise_projection(self) -> "NoiseConditioningSchema":
Contributor

you could move this to

def model_post_init(self, _: Any) -> None:

so that the check is then done both when config_validation is on and off
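For illustration, a standalone sketch of the difference (assuming recent pydantic v2, where model_construct invokes model_post_init but skips validators; the schema below is invented, not the PR's NoiseConditioningSchema):

```python
from typing import Any

from pydantic import BaseModel, model_validator


class WithValidator(BaseModel):
    scale: int

    @model_validator(mode="after")
    def check_scale(self) -> "WithValidator":
        if self.scale <= 0:
            raise ValueError("scale must be positive")
        return self


class WithPostInit(BaseModel):
    scale: int

    def model_post_init(self, _: Any) -> None:
        if self.scale <= 0:
            raise ValueError("scale must be positive")


# model_construct bypasses validation (analogous to config validation off)
a = WithValidator.model_construct(scale=-1)  # no error: validator skipped
try:
    WithPostInit.model_construct(scale=-1)  # post-init hook still runs
    post_init_ran = False
except ValueError:
    post_init_ran = True
```

Normal construction (`WithValidator(scale=-1)`) would raise in both variants; only the post-init hook also guards the unvalidated path.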

else:
model_config_local = model_config

self.input_times = list(model_config_local.training.explicit_times.input)
Contributor

is this related to this PR, or left over/not updated with main? These seem unrelated, so it's probably best not to change them as part of this PR (same for the self.latent_skip)

Graph definition
"""
super().__init__()
if type(model_config) is dict and not OmegaConf.is_config(model_config):
Contributor

why are we doing this here (and also on the interpolator)?

from anemoi.models.utils.projection_helpers import residual_projection_truncation_node_name
from anemoi.models.utils.projection_helpers import uses_fused_dataset_graph


Contributor

Why do we need all of these utility functions? How easy is it to maintain this?

Contributor

@anaprietonem anaprietonem Mar 26, 2026

After trying to look at this more, I believe this should be something in anemoi-graphs, and the projection helpers should also be part of anemoi-graphs. Then in graph_config we could just do one name resolution after the graph is built, so that the results are used by the model or the losses when needed. The graph_factory would then return:

graph_data, projection_data = graph_factory.build()

and then that object can be consumed by the losses and model with something like

model = instantiate(config.model, graph=graph_data, projection_data=projection_data)
loss = get_loss_function(config.training_loss, ..., graph_data=graph_data,
                         loss_matrices_graph=projection_data.multiscale_loss_matrices_graph)

Could something like this be considered?
With that, the GraphFactory would live in anemoi-graphs, and so would the projection helpers, which would be significantly simplified since the resolution is done once. And, similarly to the GraphCreator object, we could have a ProjectionCreator that is then called in the factory to resolve the projection_metadata
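A minimal sketch of the proposed split (all class and field names here are hypothetical, not an existing anemoi-graphs API):

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ProjectionData:
    """Projection artifacts resolved once, alongside the graph build."""

    multiscale_loss_matrices_graph: list = field(default_factory=list)
    noise_projection: Any = None


@dataclass
class GraphFactory:
    """Hypothetical factory: resolves projection names a single time at build()."""

    recipe: dict

    def build(self) -> tuple:
        graph_data = {"nodes": self.recipe.get("nodes", [])}
        projection_data = ProjectionData(
            multiscale_loss_matrices_graph=self.recipe.get("loss_matrices", []),
        )
        return graph_data, projection_data


graph_data, projection_data = GraphFactory({"nodes": ["data", "hidden"]}).build()
```

The model and losses would then consume `graph_data` and `projection_data` as plain inputs, with no further name resolution on their side.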

LOGGER = logging.getLogger(__name__)


class TrainerGraphDataFactory:
Contributor

is this something that should live in training? I would say this seems to me very graph-specific functionality that would be better placed in anemoi-graphs, which could simply take the part of the training config related to the dataset names that it needs rather than the whole config

from anemoi.models.layers.graph_provider import ProjectionGraphProvider
from anemoi.models.layers.sparse_projector import SparseProjector
from anemoi.models.utils.projection_helpers import (
multiscale_loss_matrices_graph as derive_multiscale_loss_matrices_graph,
Contributor

why changing the name here?

@@ -0,0 +1,2 @@
---
Contributor

why do we have this here, while truncation/none.yaml is just an empty file?

---
defaults:
- _self_

Contributor

@anaprietonem anaprietonem Mar 26, 2026

could we simplify the definition of these configs by doing something like?

num_scales: 4
base_num_nearest_neighbours: 16
base_sigma: 0.00471
scale_factor: 2          # optional, defaults to 2
edge_weight_attribute: gauss_weight
gaussian_norm: l1

and then just a helper that builds this?

def _expand_geometric_smoothers(projection_cfg: Any) -> dict[str, Any] | None:
    """Build an explicit smoothers dict from a compact geometric progression spec."""
    num_scales = projection_cfg.get("num_scales")
    if num_scales is None:
        return None

    base_neighbours = projection_cfg["base_num_nearest_neighbours"]
    base_sigma = projection_cfg["base_sigma"]
    scale_factor = projection_cfg.get("scale_factor", 2)
    edge_weight_attribute = projection_cfg.get("edge_weight_attribute", "gauss_weight")
    gaussian_norm = projection_cfg.get("gaussian_norm", "l1")

    smoothers = {}
    for i in range(num_scales):
        factor = scale_factor**i
        smoothers[f"smooth_{factor}x"] = {
            "edge_weight_attribute": edge_weight_attribute,
            "gaussian_norm": gaussian_norm,
            "num_nearest_neighbours": base_neighbours * factor,
            "sigma": round(base_sigma * factor, 5),
        }
    return smoothers

If we still want to provide support for other cases that don't follow this pattern, we could still allow for a dictionary of specific smoothers. But do you think we'd need that case?

assert len(v) == len(info.data["loss_matrices"]), "weights must have same length as loss_matrices"
return v
@model_validator(mode="after")
def validate_matrix_source(self) -> Self:
Contributor

Similarly, this could be a function that gets called in the SchemaCommonMixin so that it's checked both when config validation is on and off


smoothing_matrices: list[ProjectionGraphProvider | None] = []
for entry in loss_matrices_graph:
if entry is None or entry is False or entry == "None":
Contributor

could this be simplified with the suggestions of moving some of the validation checks to the SchemaMixin?

NESTED_LOSS_CLASS_NAMES = {
"MultiscaleLossWrapper",
}
GRAPH_DATA_WRAPPER_CLASS_NAMES = {
Contributor

What about an alternative solution here, where we could add a specific needs_graph_data: bool = True to the losses and then when instantiating the loss do:

    if "per_scale_loss" in loss_config:
        per_scale_loss_config = loss_config.pop("per_scale_loss")
        per_scale_loss = get_loss_function(
            OmegaConf.create(per_scale_loss_config),
            scalers,
            data_indices,
            graph_data=graph_data,
            **kwargs,
        )
        loss_config["per_scale_loss"] = per_scale_loss

    target_cls = hydra.utils.get_class(loss_config["_target_"])
    if getattr(target_cls, "needs_graph_data", False) and graph_data is not None:
        kwargs["graph_data"] = graph_data

    loss_function = instantiate(loss_config, _recursive_=False, **kwargs)
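The class-attribute dispatch can be shown in isolation (toy loss classes; only the getattr check mirrors the suggestion above, with no hydra involved):

```python
class BaseLoss:
    # losses opt in to receiving graph_data via this class attribute
    needs_graph_data: bool = False


class MultiscaleLoss(BaseLoss):
    needs_graph_data = True

    def __init__(self, graph_data=None):
        self.graph_data = graph_data


class PlainMSELoss(BaseLoss):
    def __init__(self):
        pass


def instantiate_loss(target_cls, graph_data=None, **kwargs):
    # only forward graph_data to losses that declare they need it
    if getattr(target_cls, "needs_graph_data", False) and graph_data is not None:
        kwargs["graph_data"] = graph_data
    return target_cls(**kwargs)


loss_a = instantiate_loss(MultiscaleLoss, graph_data={"edges": []})
loss_b = instantiate_loss(PlainMSELoss, graph_data={"edges": []})
```

This keeps the loss factory free of hard-coded class-name sets: new losses opt in by setting the attribute rather than being registered in a dictionary.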

from omegaconf import OmegaConf

DEFAULT_DATASET_NAME = "data"
from anemoi.models.utils.projection_helpers import DEFAULT_DATASET_NAME
Contributor

I don't think this default name should be defined by the projection_helpers. Why did you choose this? I might be missing something.

@github-project-automation github-project-automation bot moved this from To be triaged to Under Review in Anemoi-dev Mar 26, 2026

Projects

Status: Under Review


5 participants