Generation config boolean defaults #43000
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

run-slow: llama, gemma2, qwen2, bart, t5, whisper

This comment contains models: ["models/bart", "models/gemma2", "models/llama", "models/qwen2", "models/t5", "models/whisper"]

CI Results / Model CI Report: ❌ Failed tests
if key == "use_cache":
    continue  # common key for most models
btw, could we remove use_cache as a key from config and instead keep it only in generation_config? Imo it isn't really a model attribute
Hmm unsure. I understand the point but that would be hugely breaking for a lot of users and external libraries - on the other hand we could include this for v5 if you feel motivated 🤔
it will definitely break stuff 😿 Just a thought for now
So, this is a workaround we have so that we don't throw errors when the model config contains generation params. We won't throw errors anyway if use_cache is in the model config's own signature, but unfortunately many configs don't have it documented there (a rough sketch of the check follows this thread).
Gotcha, maybe at some point we can 👀
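For readers skimming this thread, here is a rough, hypothetical sketch of the kind of check being discussed; warn_on_generation_params_in_model_config is a made-up name, and the real validation inside transformers looks different:

```python
import warnings

from transformers import GenerationConfig


def warn_on_generation_params_in_model_config(model_config) -> None:
    """Hypothetical sketch: flag generation parameters stored on the model config."""
    generation_keys = set(GenerationConfig().to_dict())
    misplaced = []
    for key in generation_keys:
        if key == "use_cache":
            continue  # common key for most models, so it is deliberately skipped
        if getattr(model_config, key, None) is not None:
            misplaced.append(key)
    if misplaced:
        warnings.warn(f"Model config also defines generation parameters: {sorted(misplaced)}")
```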
src/transformers/trainer.py
Outdated
if getattr(self.model, "config", None) is not None:
    self.model.config.use_cache = self.args.use_cache
if getattr(self.model, "generation_config", None) is not None:
    self.model.generation_config.use_cache = self.args.use_cache
The model config might not contain use_cache unless it is in the config's own init; we don't assign generation attributes in the base config class anymore. So checking generation_config is more reliable.
What's the current logic here then? kwarg > config > generation config? Seems like another case of bad relations here 😭
It does feel a bit out of place to me, though. I'd like to keep this PR strictly to the default values; this only semi-progresses the goal of removing the reliance on use_cache in the config. On another note, we use the check_model_inputs decorator with the config's use_cache, so it might be hard to disentangle this completely.
We are not removing use_cache yet in this PR. I hacked around it with the check above, which doesn't include use_cache in the generation params.
I will revert it back; I forgot to do so after testing some stuff.
The logic stays as is, so yes, kwarg > config > generation config
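To make that precedence concrete, here is a hedged sketch (resolve_use_cache is a made-up helper, not the Trainer's actual code):

```python
from typing import Optional


def resolve_use_cache(kwarg_value: Optional[bool], model_config, generation_config) -> Optional[bool]:
    """Illustrative only: kwarg > model config > generation config."""
    if kwarg_value is not None:
        return kwarg_value  # an explicit kwarg always wins
    config_value = getattr(model_config, "use_cache", None)
    if config_value is not None:
        return config_value  # then the value on the model config
    return getattr(generation_config, "use_cache", None)  # finally the generation config
```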
vasqu left a comment:
Left a few comments; I'd be pro reverting the use_cache changes (at least for now) - see my last comment.
Also do we still need
transformers/src/transformers/generation/utils.py
Lines 1800 to 1812 in a616b91
# Due to some values being boolean and not `None`, we need additional logic to overwrite
# them explicitly (`defaults_only=False`) on the condition that it's only a previous
# default value
default_generation_config = GenerationConfig()
generation_config.update(
    **{
        k: v
        for k, v in self.generation_config.to_dict().items()
        if isinstance(v, bool)
        and hasattr(default_generation_config, k)
        and getattr(generation_config, k, None) == getattr(default_generation_config, k)
    }
)
I added this only because these values weren't None in their defaults, but now we should be able to update them properly without that workaround, no? (A small sketch of the boolean-vs-None distinction follows this comment.)
Last note, semi-unrelated, let's also keep track of https://github.com/huggingface/transformers/pull/42702/files#r2635806800 in another PR cc @ebezzam
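For intuition on the boolean-vs-None point above, a minimal, hypothetical sketch (ToyConfig and merge are made-up names, not transformers code): with a real boolean default, a merge that only overwrites "default" values cannot tell an explicit False apart from an untouched default, while a None default keeps the two cases distinguishable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    do_sample: Optional[bool] = None  # None means "not set by the user"


def merge(saved: "ToyConfig", user: "ToyConfig") -> "ToyConfig":
    """Only copy over fields the user actually set."""
    merged = ToyConfig(**vars(saved))
    if user.do_sample is not None:
        merged.do_sample = user.do_sample
    return merged


saved = ToyConfig(do_sample=True)  # non-default value stored with a checkpoint
print(merge(saved, ToyConfig(do_sample=False)).do_sample)  # False: the explicit reset wins
print(merge(saved, ToyConfig()).do_sample)                 # True: an unset field keeps the saved value
```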
        generation_mode = GenerationMode.CONSTRAINED_BEAM_SEARCH
    elif self.num_beams is None or self.num_beams == 1:
-       if not self.do_sample:
+       if self.do_sample is not True:
Suggested change:
-       if self.do_sample is not True:
+       if self.do_sample is False:
super nit, seen this multiple times
Actually, it should be if not self.do_sample haha
if not do_sample will result in false positives when do_sample is None 😢 We have to check explicitly for is not True, since a value of None means the same thing as False, i.e. the new default value (illustrated after this thread).
Can we add a small comment then, that's such a weird case lol.
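A tiny runnable illustration of the tri-state flag in this thread: with the new defaults, do_sample can be True, False, or None, and the mode check has to treat both False and None as "no sampling", hence the explicit is not True test.

```python
for do_sample in (True, False, None):
    mode = "sample" if do_sample is True else "greedy/beam search"
    print(f"do_sample={do_sample!r} -> {mode}")
```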
Are these changes really relevant? Seems unrelated to me
Some tests were failing without it because the re-loaded model had an incorrect config type, and we couldn't find the generation keys in it. It was only passing on the main branch by luck.
Ok, then let's keep it; it's not wrong, I was just surprised.
vasqu left a comment:
LGTM, could you just double check if we still need
transformers/src/transformers/generation/utils.py
Lines 1800 to 1812 in a616b91
Now that these values are defaulting to None as well, I suspect we might not need it anymore
my bad, didn't see the comment on it. Not needed!

[For maintainers] Suggested jobs to run (before merge): run-slow: bart, clvp
-   self.num_return_sequences = kwargs.pop("num_return_sequences", 1)
-   self.output_attentions = kwargs.pop("output_attentions", False)
-   self.output_hidden_states = kwargs.pop("output_hidden_states", False)
-   self.output_scores = kwargs.pop("output_scores", False)
-   self.output_logits = kwargs.pop("output_logits", False)
-   self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", False)
+   self.num_return_sequences = kwargs.pop("num_return_sequences", None)
+   self.output_attentions = kwargs.pop("output_attentions", None)
+   self.output_hidden_states = kwargs.pop("output_hidden_states", None)
+   self.output_scores = kwargs.pop("output_scores", None)
+   self.output_logits = kwargs.pop("output_logits", None)
+   self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", None)
Sorry, I'm a bit lost on the motivation for the PR and for making those values None (which is usually not a super good idea if they are boolean by essence) - could you give me a bit of context on the PR?
The motivation was to fix a ton of edge cases when creating a generation config. In the past we had an ugly workaround to allow users to change a generation param back to its default if a different value was saved on the Hub (for ex #42762).
That was fixed in #42702, where I deleted a lot of code and defined a clear priority list. The next step will be to disallow users from passing both a generation config and kwargs, and instead ask them to put all kwargs in the config. The snippet below is accepted right now and is bad practice imo:
conf = GenerationConfig(do_sample=False)
text = model.generate(input_ids, generation_config=conf, max_new_tokens=128, temperature=1.6)

Squashed commit messages from this PR:
* push
* fix a few tests
* why no tests are failing, suspicious
* one tiny fix
* maybe?
* typo
* clvp has incorrect config types
* nit
* docstring and revert unrelated trainer
* forgot to revert this one
* Update src/transformers/generation/configuration_utils.py (Co-authored-by: Anton Vlasjuk <[email protected]>)
* delete the prev fix and add small comment

Co-authored-by: Anton Vlasjuk <[email protected]>
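For contrast with the discouraged call quoted above, a hedged sketch of the encouraged style, with every generation parameter living on the config (model and input_ids are assumed to exist as in that snippet):

```python
from transformers import GenerationConfig

conf = GenerationConfig(do_sample=False, max_new_tokens=128, temperature=1.6)
# text = model.generate(input_ids, generation_config=conf)
```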
What does this PR do?
Continues #42958 and makes sure the opposite case is also covered: when the config has a non-default value saved but the user wants to set it back to a default boolean (a sketch of this scenario follows).
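A hedged sketch of that scenario, assuming a checkpoint whose saved generation config carries a non-default boolean such as do_sample=True; the checkpoint name below is hypothetical:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "some-org/model-with-do_sample-true"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Hello", return_tensors="pt")
# Explicitly asking for the default (do_sample=False) should now be respected instead of
# being silently overridden by the non-default value saved with the checkpoint.
outputs = model.generate(**inputs, do_sample=False, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```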