
Conversation

@zucchini-nlp (Member) commented Dec 22, 2025

What does this PR do?

Continues #42958 and makes sure that the opposite case, where the config has a non-default value saved but the user wants to set it back to a default boolean, is also covered.
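
For context, a minimal sketch of the scenario being covered (the saved value is hypothetical, and GenerationConfig.update is used here purely for illustration rather than the PR's actual code path):

from transformers import GenerationConfig

# Pretend this config was loaded from the Hub with a non-default boolean saved.
saved = GenerationConfig(do_sample=True)

# The user explicitly sets the flag back to the library default; that explicit False
# must not be silently dropped in favour of the saved True.
saved.update(do_sample=False)
assert saved.do_sample is False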

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp (Member Author):

run-slow: llama, gemma2, qwen2, bart, t5, whisper

@github-actions (Contributor):

This comment contains run-slow, running the specified jobs:

models: ["models/bart", "models/gemma2", "models/llama", "models/qwen2", "models/t5", "models/whisper"]
quantizations: []

@github-actions (Contributor):

CI Results

Workflow Run ⚙️

Model CI Report

❌ Failed tests

  • bart:
    tests/models/bart/test_modeling_bart.py::BartModelIntegrationTests::test_xsum_config_generation_params

Comment on lines +1126 to +1127
if key == "use_cache":
continue # common key for most models
@zucchini-nlp (Member Author) commented on Dec 23, 2025:

btw, could we remove use_cache as a key from the config and instead keep it only in generation_config? Imo it isn't really a model attribute.

Contributor:

Hmm, unsure. I understand the point, but that would be hugely breaking for a lot of users and external libraries. On the other hand, we could include this in v5 if you feel motivated 🤔

Member Author:

it will definitely break stuff 😿 Just a thought for now

Member Author:

So, this is a workaround we have so that we don't throw errors about the model config containing generation params. We won't throw errors anyway if use_cache is in the model config's signature, but unfortunately many configs don't have it documented.

Contributor:

Gotcha, maybe at some point we can 👀

Comment on lines 754 to 755
if getattr(self.model, "config", None) is not None:
self.model.config.use_cache = self.args.use_cache
if getattr(self.model, "generation_config", None) is not None:
self.model.generation_config.use_cache = self.args.use_cache
@zucchini-nlp (Member Author) commented on Dec 23, 2025:

The model config might not contain use_cache unless it's in the config's own init; we don't assign generation attributes in the base config class anymore.

So checking generation_config is more reliable.

Contributor:

What's the current logic here then? kwarg > config > generation config? Seems like another case of bad relations here 😭

It does feel a bit out of place to me, though. I'd like to keep this PR strictly to the default values; this semi-progresses the point of removing reliance on use_cache in the config. On another note, we use the check_model_inputs decorator with the config's use_cache, so it might be hard to disentangle this completely.

Member Author:

We are not removing use_cache yet in this PR. I hacked it via the above check, which doesn't include use_cache in the generation params.

I will revert it back, I forgot to do so after testing some stuff.

Member Author:

The logic stays as is, so yes: kwarg > config > generation config.
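
A minimal sketch of that precedence (resolve_use_cache is a hypothetical helper for illustration, not a Transformers API):

def resolve_use_cache(kwarg_value, config, generation_config):
    # An explicit kwarg always wins.
    if kwarg_value is not None:
        return kwarg_value
    # Then the model config, if it defines the key.
    if getattr(config, "use_cache", None) is not None:
        return config.use_cache
    # Finally fall back to the generation config.
    return getattr(generation_config, "use_cache", None)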

@zucchini-nlp zucchini-nlp changed the title [WIP] Generation config boolean defaults Generation config boolean defaults Dec 23, 2025
@zucchini-nlp zucchini-nlp requested review from ArthurZucker and vasqu and removed request for vasqu December 23, 2025 08:27
@vasqu (Contributor) left a comment:

Left a few comments. I'd be pro reverting the use_cache changes (at least for now), see my last comment.

Also do we still need

# Due to some values being boolean and not `None`, we need additional logic to overwrite
# them explicitly (`defaults_only=False`) on the condition that it's only a previous
# default value
default_generation_config = GenerationConfig()
generation_config.update(
    **{
        k: v
        for k, v in self.generation_config.to_dict().items()
        if isinstance(v, bool)
        and hasattr(default_generation_config, k)
        and getattr(generation_config, k, None) == getattr(default_generation_config, k)
    }
)

I added this only because these were not None within their defaults, but now we should update them properly without that workaround, no?

Last note, semi-unrelated, let's also keep track of https://github.com/huggingface/transformers/pull/42702/files#r2635806800 in another PR cc @ebezzam

  generation_mode = GenerationMode.CONSTRAINED_BEAM_SEARCH
  elif self.num_beams is None or self.num_beams == 1:
- if not self.do_sample:
+ if self.do_sample is not True:
Contributor:

Suggested change:
- if self.do_sample is not True:
+ if self.do_sample is False:

super nit, seen this multiple times

Member:

Actually, it should be if not self.do_sample haha

Member Author:

if not do_sample will result in false positives when do_sample is None 😢 We have to explicitly check for is not True, since a value of None is the same as having False, i.e. the new default value.
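
For reference, a small truth table of the two candidate checks under plain Python semantics (an illustration added here, not part of the PR):

# do_sample    do_sample is False    do_sample is not True
# True         False                 False
# False        True                  True
# None         False                 True   <- `is False` would skip the None case,
#                                              even though None should behave like the
#                                              new default (False)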

Contributor:

Can we add a small comment then? That's such a weird case lol.

Contributor:

Are these changes really relevant? Seems unrelated to me

Member Author:

Some tests were failing without it because the re-loaded model has an incorrect config type, so we couldn't find the generation keys in it. It was luckily passing on the main branch.

Contributor:

Ok, then let's keep it; it's not wrong, I was just surprised.

@zucchini-nlp zucchini-nlp requested a review from vasqu January 8, 2026 12:31
@vasqu (Contributor) left a comment:

LGTM, could you just double check if we still need

# Due to some values being boolean and not `None`, we need additional logic to overwrite
# them explicitly (`defaults_only=False`) on the condition that it's only a previous
# default value
default_generation_config = GenerationConfig()
generation_config.update(
    **{
        k: v
        for k, v in self.generation_config.to_dict().items()
        if isinstance(v, bool)
        and hasattr(default_generation_config, k)
        and getattr(generation_config, k, None) == getattr(default_generation_config, k)
    }
)

Now that these values default to None as well, I suspect we might not need it anymore.


@zucchini-nlp (Member Author):

> LGTM, could you just double check if we still need

My bad, I didn't see the comment on it. Not needed!

@github-actions bot commented Jan 8, 2026:

[For maintainers] Suggested jobs to run (before merge)

run-slow: bart, clvp

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) January 8, 2026 13:24
@zucchini-nlp zucchini-nlp merged commit 93d7aff into huggingface:main Jan 8, 2026
25 checks passed
Comment on lines -387 to +392
- self.num_return_sequences = kwargs.pop("num_return_sequences", 1)
- self.output_attentions = kwargs.pop("output_attentions", False)
- self.output_hidden_states = kwargs.pop("output_hidden_states", False)
- self.output_scores = kwargs.pop("output_scores", False)
- self.output_logits = kwargs.pop("output_logits", False)
- self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", False)
+ self.num_return_sequences = kwargs.pop("num_return_sequences", None)
+ self.output_attentions = kwargs.pop("output_attentions", None)
+ self.output_hidden_states = kwargs.pop("output_hidden_states", None)
+ self.output_scores = kwargs.pop("output_scores", None)
+ self.output_logits = kwargs.pop("output_logits", None)
+ self.return_dict_in_generate = kwargs.pop("return_dict_in_generate", None)
Member:

Sorry, I'm a bit lost on the motivation for the PR and for setting those values to None (which is usually not a super good idea if they are boolean by essence). Could you give me a bit of context on the PR?

Member Author:

The motivation was to fix a ton of edge cases when creating a generation config. In the past we had an ugly workaround to allow users to change a generation param back to its default if a different value was saved on the Hub (for example #42762).

That was fixed in #42702, where I deleted a lot of code and defined a clear priority list. The next step will be to disallow users from passing both a generation config and kwargs, and instead ask them to put all kwargs in the config. The below is accepted right now and is bad practice imo:

conf = GenerationConfig(do_sample=False)
text = model.generate(input_ids, generation_config=conf, max_new_tokens=128, temperature=1.6)
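
To illustrate why the defaults moved from booleans to None (a rough sketch of the idea, not the library's actual resolution code):

def merge(saved_value, user_value):
    # None means "the user did not touch this flag", so the saved value survives;
    # an explicit boolean (even one equal to the library default) overrides it.
    return saved_value if user_value is None else user_value

saved_do_sample = True                # non-default value stored on the Hub
print(merge(saved_do_sample, None))   # True  -> user passed nothing, keep the saved value
print(merge(saved_do_sample, False))  # False -> user explicitly reset to the default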

SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
* push

* fix a few tests

* why no tests are failing, suspicious

* one tiny fix

* maybe?

* typo

* clvp has incorrect config types

* nit

* docstring and revert unrelated trainer

* forgot to revert this one

* Update src/transformers/generation/configuration_utils.py

Co-authored-by: Anton Vlasjuk <[email protected]>

* delete the prev fix and add small comment

---------

Co-authored-by: Anton Vlasjuk <[email protected]>