-
Notifications
You must be signed in to change notification settings - Fork 31.9k
Deprecate dtype per sub config #42990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate dtype per sub config #42990
Conversation
|
run-slow: llava, llava_next, llava_next_video, llava_onevision |
|
This comment contains models: ["models/llava", "models/llava_next", "models/llava_next_video", "models/llava_onevision"] |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
CI Results✅ No failing test specific to this PR 🎉 ! |
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
src/transformers/modeling_utils.py
Outdated
| logger.warning_once( | ||
| f"The dtype for {sub_config_key}={sub_dtype} is different from the main dtype={main_dtype}. " | ||
| "Setting different dtypes per backbone model might cause device errors downstream, setting the " | ||
| f"dtype={main_dtype} for all modules. Note, using different dtypes per module is deprecated and will " | ||
| "be removed in future versions." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that in the future it will raise an exception, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, after a few released we can switch to an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that if a user sets dtype="auto", all sub modules will be forcibly cast to the specified main_dtype?
Cyrilvallez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do that, then we need to plainly remove dtype being passed as a dict as well!
Cyrilvallez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, if there is not many models with different dtypes in subconfigs, it's indeed better to only show the warning when explicitly passing a dict, and silently skipping otherwise!
* deprecate dtypes per sub config * fix test * Update tests/utils/test_modeling_utils.py Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com> * address comment --------- Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
What does this PR do?
Fixes #42968, #42641
As discussed, this PR deprecated different dtypes per backbone and falls back to the main dtype if a user is requesting different dtypes.
There is no clean way to support different dtypes when the backbone can have a task head. We either cast hidden states to task dtype in all model files, or we need a workaround to infer task head and force dtype on it when loading. So the better solution is to deprecate dict dtypes, apparently it is not used much