
Add/Modify ModelResult to Dataset class, for Model fitting and preparing for Multi-Dataset fitting #1015

Open
newville wants to merge 2 commits into master from with_dataset
Conversation

@newville newville commented Aug 6, 2025

Description

This is the start of an effort to support multiple-dataset fitting. As outlined and discussed a bit in #1012, the basic idea is to make a Dataset class that holds data, a model, params, and the appropriate independent variables, and that can do a fit.

As pointed out in #1012, this is essentially what ModelResult is, so the first part of the PR here simply:

a) renames ModelResult to Dataset, but retains an alias and functions such as save_modelresult() (now an alias for save_dataset()).
b) adds a label attribute, defaulting to None.

This is not ready for merging, but to be part of the discussion at #1012.

Still needed:

a) use label as a prefix-prefix for all model component prefixes.
b) add a MultiDataset class (maybe subclassing from Dataset, but probably needing several overwritten methods) that contains a dictionary of Datasets (keyed by label), and can do the prefix-management to organize the Parameters and Datasets.
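To make the "prefix-prefix" and MultiDataset ideas concrete, here is a purely illustrative sketch. These are toy stand-in classes, not lmfit's actual API, and the eventual Dataset/MultiDataset implementation may look nothing like this:

```python
# Toy stand-ins only: NOT lmfit's real classes. This just illustrates
# using label as a "prefix-prefix" and a MultiDataset keyed by label.

class Dataset:
    """Toy Dataset: bundles a label with parameter names."""
    def __init__(self, label, param_names):
        self.label = label
        # prepend the label as a "prefix-prefix" to each parameter name
        self.param_names = [f"{label}_{name}" for name in param_names]

class MultiDataset:
    """Toy container holding Datasets in a dictionary keyed by label."""
    def __init__(self, datasets):
        self.datasets = {d.label: d for d in datasets}

    def all_param_names(self):
        # gather the label-prefixed parameter names from every dataset,
        # so parameters from different datasets cannot collide
        return [p for d in self.datasets.values() for p in d.param_names]
```

With two datasets labelled `run1` and `run2`, each with parameters `amp` and `cen`, `all_param_names()` would yield `run1_amp`, `run1_cen`, `run2_amp`, `run2_cen`, which is the kind of collision-free naming a global fit needs.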

At this point, all tests pass. I did fix one test for load_modelresult that tests an exception message, which now has better grammar. So far, I think we can say "nothing is broken by this" ;).

Type of Changes
  • Bug fix
  • New feature
  • Refactoring / maintenance
  • Documentation / examples
Tested on
Verification

Have you

  • included docstrings that follow PEP 257?
  • referenced existing Issue and/or provided relevant link to mailing list?
  • verified that existing tests pass locally?
  • verified that the documentation builds locally?
  • squashed/minimized your commits and written descriptive commit messages?
  • added or updated existing tests to cover the changes?
  • updated the documentation and/or added an entry to the release notes (doc/whatsnew.rst)?
  • added an example?


codecov Bot commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 96.42857% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.93%. Comparing base (640eb19) to head (fdb3818).
⚠️ Report is 49 commits behind head on master.

Files with missing lines | Patch % | Lines
lmfit/model.py           | 96.42%  | 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1015   +/-   ##
=======================================
  Coverage   92.92%   92.93%           
=======================================
  Files          10       10           
  Lines        3871     3876    +5     
=======================================
+ Hits         3597     3602    +5     
  Misses        274      274           

☔ View full report in Codecov by Sentry.


erik-mansson commented Aug 27, 2025

For thinking about workflows and example code, I feel that the next step could be to consider how one is supposed to construct Dataset instances. Currently you get one as the result of Model.fit() (one Model to one data array). From the #1012 discussion (and my #1018 perspective), the idea is also to find ways in which several Dataset instances can be constructed and their parameters adjusted (values, bounds, expressions) before fitting, then used in a multi-model, multi-dataset fit (a.k.a. a global fit).

Currently, Dataset.__init__() requires model and params while the data itself is optional. Especially if we consider cases where a model has a good guess()-ability, it would seem nice to provide just data and model and then perhaps create a Dataset.guess() that calls self.model.guess() and updates self.init_params (not self.params which are for the fit result). If you wanted a fit_history attribute, perhaps the act of guessing could also be logged there? But, sure, a minimal approach could be that the Parameters need to be prepared completely before constructing the Dataset that just bundles them up with data and model.
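As a rough sketch of the Dataset.guess() idea above (toy classes only; the method and attribute names mirror the comment but are invented, not lmfit's real API):

```python
# Hypothetical sketch of Dataset.guess() delegating to model.guess();
# ToyModel/ToyDataset are invented stand-ins, not lmfit classes.

class ToyModel:
    def guess(self, data):
        # trivial "guess": put the centre at the index of the maximum
        return {"center": data.index(max(data))}

class ToyDataset:
    def __init__(self, model, data):
        self.model = model
        self.data = data
        self.init_params = {}
        self.fit_history = []   # the act of guessing could be logged here

    def guess(self):
        # delegate to the model's guess() and store the result as
        # initial parameters, leaving the fit-result params untouched
        self.init_params = self.model.guess(self.data)
        self.fit_history.append(("guess", dict(self.init_params)))
        return self.init_params
```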

If the idea is that Dataset.label automatically becomes the prefix of the model (to distinguish its parameters in a multi-fit scenario), we should either let Dataset be the one calling the Model constructor (i.e. pass a model type rather than an instance as argument, e.g. as in repeat_model() of PR #1018), or make major changes in Model to allow its prefix to be changed after construction, or introduce a way to wrap an existing Model instance so it appears to have an extra prefix.
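The third option (wrapping an existing Model instance) could look roughly like this. Again, toy stand-ins: `ToyModel`, `PrefixedModel`, and `param_names()` are invented for illustration and are not lmfit's real API:

```python
# Toy sketch of wrapping an existing model so it appears to carry an
# extra outer prefix, without mutating the wrapped model itself.

class ToyModel:
    def __init__(self, prefix=""):
        self.prefix = prefix

    def param_names(self):
        return [self.prefix + n for n in ("amplitude", "center")]

class PrefixedModel:
    """Wrap a model so its parameter names gain an extra outer prefix."""
    def __init__(self, model, extra_prefix):
        self._model = model
        self.extra_prefix = extra_prefix

    def param_names(self):
        # compose the outer (dataset-level) prefix with the model's own
        return [self.extra_prefix + n for n in self._model.param_names()]
```

One attraction of wrapping is that the same Model instance can be shared across several Datasets, each wrapper supplying its own label-derived prefix.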

It's possible that some of the issues I think about are not meant to be solved by Dataset itself, but by an analogous future MultiDataset. I am also not sure whether my comment belongs on the PR (as feedback) or in #1012 (as kind of general discussion). Can move it if desired.
