Ensures source and target IDs are paired up in SGD#296
Merged
weefuzzy merged 5 commits intoflucoma:mainfrom Apr 6, 2025
Merged
Ensures source and target IDs are paired up in SGD#296weefuzzy merged 5 commits intoflucoma:mainfrom
weefuzzy merged 5 commits intoflucoma:mainfrom
Conversation
Member
|
Thanks for this mammoth work! Maybe @lewardo wants to also try this in his spare time. That will have consequences for the WiP of DataSeries for which @weefuzzy has another idea IIRC but for now it would be good to also get him to read it. Also if @AlexHarker wants to try this bespoke objects for his piece from that branch that might be a good idea. I will run the test on Monday as I am in compo/impro/code residency for the next days... |
Member
|
it all works here. Is it possible that it doesn't bounce as much as before, or have I been lucky? |
tremblap
approved these changes
Mar 22, 2025
Member
|
I think I've been lucky. In all cases it works fine. |
Member
|
Working for me! |
2aae3ba to
0454ce0
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fixes #292
Ok, ok, I know this looks like a massive PR, but it's not really 😄
The problem
SGD::trainassumes that its source and target data are aligned, i.e. that each index represents the same ID. This not necessarily the case because we're using unordered containers by design. You can make training silently fail to do anything sensible by adding points to source and target in a different order, for instance. This is baaaaad.But why is this diff so large, you 'orrible man?
The core of this PR is decoupling the batching and sampling of data from
SGD, a bit like pytorch. I like thatSGDdoesn't want to know aboutDataSet, it just needed to go further.So we have
FluidDataSet,FluidTensorViewUser-side, nothing much should change besides a couple of error messages: so, reviewers: please test this proposition by training against your own stuff and making sure it still works as it did.
* Like, it would now be pretty easy to enable quicker n' dirtier MLP training (resp. the rest of the data objects) with CCE buffers alone for those cases where people don't want / care about IDs. I think this would have significant workflow and teaching upsides...