Use standardized parameter estimates for metrics instead of unstandardized ones #1337

Open
tsalo wants to merge 17 commits into ME-ICA:main from tsalo:rm-unstandardized-pes

Conversation


@tsalo tsalo commented Jan 29, 2026

Closes #1336.

Changes proposed in this pull request:

  • Add new metrics:
    • map SPE T2 clusterized: Use the standardized parameter estimate maps instead of the unstandardized ones.
    • map SPE S0 clusterized: Use the standardized parameter estimate maps instead of the unstandardized ones.
    • dice_FT2_SPE: Use the standardized parameter estimate maps instead of the unstandardized ones.
    • dice_FS0_SPE: Use the standardized parameter estimate maps instead of the unstandardized ones.
  • Deprecate the following metrics with warnings.
    • map beta T2 clusterized
    • map beta S0 clusterized
    • dice_FT2
    • dice_FS0
    • varex kappa ratio
    • d_table_score

I might need to figure out alternatives to varex kappa ratio (which is calculated within the decision trees) and d_table_score (both calculated outside and within decision trees).

@tsalo tsalo added the TE-dependence (issues related to TE dependence metrics and component selection) and breaking-change (Will make a non-trivial change to outputs) labels on Jan 29, 2026

codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 46.66667% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.06%. Comparing base (41d3fb3) to head (04badab).

Files with missing lines Patch % Lines
tedana/metrics/collect.py 46.66% 10 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1337      +/-   ##
==========================================
- Coverage   90.30%   90.06%   -0.25%     
==========================================
  Files          30       30              
  Lines        4726     4750      +24     
  Branches      790      797       +7     
==========================================
+ Hits         4268     4278      +10     
- Misses        309      318       +9     
- Partials      149      154       +5     



Copilot AI left a comment


Pull request overview

This pull request updates tedana to use standardized parameter estimates (map weight) instead of unstandardized parameter estimates (map optcom betas) for calculating several key metrics. This change addresses issue #1336 which identified that unstandardized parameter estimates are confounded by data variance scaling.

Changes:

  • Modified metric dependency configuration to use standardized parameter estimates for variance explained, map beta T2 clusterized, and map beta S0 clusterized calculations
  • Updated code to ensure both standardized and unstandardized parameter estimate maps are always calculated (the latter is still needed for visualization and percent signal change)
  • Updated documentation terminology to use more precise language about parameter estimates
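The distinction driving these changes can be sketched with synthetic data: unstandardized parameter estimates absorb each voxel's overall scale, while z-scoring each voxel's time series before the fit removes it. A minimal sketch, assuming noiseless toy data; the names (`mixing`, `true_pes`, `voxel_scale`) are illustrative stand-ins, not tedana's internals:

```python
import numpy as np

rng = np.random.default_rng(1)
n_vols, n_voxels, n_comps = 200, 50, 3

# Illustrative stand-ins (not tedana's internals): component time
# series, per-voxel component weights, and a voxel-wise scale factor.
mixing = rng.standard_normal((n_vols, n_comps))
true_pes = rng.standard_normal((n_comps, n_voxels))
voxel_scale = rng.uniform(0.5, 5.0, size=n_voxels)
data = (mixing @ true_pes) * voxel_scale  # noiseless toy data

# Unstandardized PEs: regress the raw data on the component time
# series. These absorb the voxel-wise scale of the data.
pes_unstd, *_ = np.linalg.lstsq(mixing, data, rcond=None)

# Standardized PEs: z-score each voxel's time series first, so the
# estimates no longer depend on the voxel's overall scale.
data_z = (data - data.mean(axis=0)) / data.std(axis=0)
pes_std, *_ = np.linalg.lstsq(mixing, data_z, rcond=None)

print(pes_unstd[0, :3])  # tracks voxel_scale
print(pes_std[0, :3])    # scale-free
```

In this toy setup the unstandardized fit recovers exactly the true weights multiplied by the voxel scale, which is the confound issue #1336 describes.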

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tedana/resources/config/metrics.json Updated three metric dependencies to use map weight (standardized) instead of map optcom betas (unstandardized)
tedana/metrics/collect.py Modified metrics initialization to ensure both map types are calculated; updated function calls and log messages to use standardized estimates
docs/outputs.rst Updated terminology from "beta coefficients" to "parameter estimates" in component visualization description
docs/denoising.rst Changed variable names from "betas" to "pes" for clarity in example code
docs/approach.rst Updated documentation to clarify when unstandardized vs standardized parameter estimates are used
tedana/tests/test_integration.py Updated expected classification from "Likely BOLD" to "Accept borderline" due to metric calculation changes



@eurunuela eurunuela left a comment


These look good to me, thanks. That said, I have not tried running it on data. I wonder how much it changes the classification, and therefore the denoising. Have you seen big changes? If so, it might be worth posting some comparison figures here.


tsalo commented Jan 30, 2026

I did compare metrics on some data I have locally. There were two components out of 88 where the classification changed from rejected to accepted, but that's it for classification changes. The variance explained and Dice metrics did change. I think the change to the variance explained measure makes sense since I don't think variance explained (really "coefficient energy") should be conflated with signal amplitude.

I dug into the Dice metrics as well. As far as I can tell, Dice similarity drops down substantially when using standardized parameter estimates because the unstandardized parameter estimate maps, the F-T2 maps, and the F-S0 maps are all more confounded with the voxel-wise scale (standard deviation) of the data, so they end up having higher values in higher-variance areas.


tsalo commented Jan 30, 2026

The fact that the F-statistic maps are more strongly correlated with the standard deviation of the optimally combined data does kind of worry me. Here is the distribution of correlations between the (1) unstandardized PE maps, (2) standardized PE maps, (3) T2 model F-statistic maps, and (4) S0 model F-statistic maps against the optcom data's SD map. The F-statistic correlations are all higher and not zero-centered, while the unstandardized PE correlations are roughly zero-centered but more extreme than the standardized PE correlations.

[Figure: histograms of voxel-wise correlations between each map type (unstandardized PEs, standardized PEs, F-T2, F-S0) and the optcom SD map]
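As a rough illustration of the scale confound described above (synthetic data, not the maps from this PR): if unstandardized PE maps behave like standardized maps multiplied by the voxel-wise SD, then their magnitudes track the SD map while the standardized maps' magnitudes do not. All names below are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels, n_comps = 5000, 20

# Illustrative stand-ins: a voxel-wise SD map and per-component
# standardized PE maps that are unrelated to it by construction.
sd_map = rng.uniform(0.5, 2.0, size=n_voxels)
spe_maps = rng.standard_normal((n_voxels, n_comps))
# Model the confound: unstandardized PEs scale with the voxel SD.
pe_maps = spe_maps * sd_map[:, None]

def corr_with(maps, ref):
    """Pearson correlation of each column of `maps` with `ref`."""
    maps_c = maps - maps.mean(axis=0)
    ref_c = ref - ref.mean()
    return (maps_c.T @ ref_c) / (
        np.linalg.norm(maps_c, axis=0) * np.linalg.norm(ref_c)
    )

# Magnitudes of the unstandardized maps correlate with the SD map;
# the standardized maps' magnitudes are roughly uncorrelated with it.
r_pe = corr_with(np.abs(pe_maps), sd_map)
r_spe = corr_with(np.abs(spe_maps), sd_map)
print(r_pe.mean(), r_spe.mean())
```

This only demonstrates the direction of the effect under the stated assumption; the actual distributions in the figure come from real data.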

@handwerkerd

I'm a bit hesitant to change a key parameter without giving users a way to compare. I'm fine making this a breaking change and making this change the default, but maybe retain a way to calculate and save the older metrics.

This would also require clear changes and explanations in the documentation.


tsalo commented Feb 5, 2026

I'm a bit hesitant to change a key parameter without giving users a way to compare. I'm fine making this a breaking change and making this change the default, but maybe retain a way to calculate and save the older metrics.

I agree. I don't love having a metric that doesn't make sense, but given how much using a more appropriate metric changes things, it seems like we still need to support the old one. I can have the original ones called <measure>_pe and use them in tedana_orig.

@handwerkerd

I'm a bit hesitant to change a key parameter without giving users a way to compare. I'm fine making this a breaking change and making this change the default, but maybe retain a way to calculate and save the older metrics.

I agree. I don't love having a metric that doesn't make sense, but given how much using a more appropriate metric changes things, it seems like we still need to support the old one. I can have the original ones called <measure>_pe and use them in tedana_orig.

Maybe we can also put implementations of all the current trees with the current metrics onto figshare so that it's easy for someone to replicate older analyses without needing to confuse our main code base with a growing number of built-in trees. For our next release notes and mass email, I'd like to be able to say, if you want to replicate any of the previous trees use --tree X (Note to anyone else reading this, we have code that will pull a tree by name from figshare.)

Once you have the rest of this done, I can do the figshare part. We'll need to update figshare anyway because any change in default trees should be paired with a new version number on our figshare repo.


tsalo commented Feb 6, 2026

I've reduced the changes to (1) add warnings when using the deprecated metrics and (2) add new metrics that use the standardized parameter estimates. The bad metrics are still incorrectly named (i.e., using "betas", which implies standardized parameter estimates instead), but I'm not sure there's a good way around that while still keeping things working as they did in the past.
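A minimal sketch of the warning pattern described here, assuming a hypothetical check at metric-collection time (the function and dict are illustrative, not tedana's actual API; the metric names and their replacements come from this PR):

```python
import warnings

# Deprecated metrics and their standardized-PE replacements, per this
# PR. The lookup function below is a hypothetical sketch, not tedana's
# actual implementation.
DEPRECATED_METRICS = {
    "map beta T2 clusterized": "map SPE T2 clusterized",
    "map beta S0 clusterized": "map SPE S0 clusterized",
    "dice_FT2": "dice_FT2_SPE",
    "dice_FS0": "dice_FS0_SPE",
}

def check_metrics(requested):
    """Warn once per requested metric that is deprecated."""
    for name in requested:
        if name in DEPRECATED_METRICS:
            warnings.warn(
                f"Metric '{name}' is calculated from unstandardized "
                f"parameter estimates and is deprecated; use "
                f"'{DEPRECATED_METRICS[name]}' instead.",
                DeprecationWarning,
                stacklevel=2,
            )

check_metrics(["dice_FT2", "kappa"])
```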


tsalo commented Feb 6, 2026

Now that the PR is streamlined, we can think about how to make breaking changes that don't break things too badly.

@handwerkerd I think stashing decision trees is a good idea. That leaves us with the opportunity to change things in the built-in decision trees in the repo. Which decision trees are you comfortable with me changing? My first thought was to update all of them except tedana_orig.

@handwerkerd

@handwerkerd I think stashing decision trees is a good idea. That leaves us with the opportunity to change things in the built-in decision trees in the repo. Which decision trees are you comfortable with me changing? My first thought was to update all of them except tedana_orig.

I think meica is the one we really don't want to change, because that's the one that's supposed to replicate the original meica code. tedana_orig could theoretically change, except that having "orig" in the name makes it seem like we're keeping it fixed.

The big Q is what we want the default to be. The default is currently tedana_orig. If we want to keep that overall structure as the default but use better-defined metrics, then we might be stuck making a new name and keeping tedana_orig as-is. Maybe just tedana for the default, though tedana would then change over time as we alter the default.

I'm fine with changing the metrics in minimal.

The core issue here, and it will keep coming up, is that it is hard to predict how trees will develop. We don't have a naming scheme, and I think there are just too many variables for how things may or may not change in the future to make a naming scheme we'd be able to use consistently.



Development

Successfully merging this pull request may close these issues.

Remove/replace metrics calculated from unstandardized parameter estimates

3 participants