
Conversation

@albertvillanova (Member)

Set dtype default to float32.

Follow-up to:

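For illustration only: the change presumably amounts to a dataclass-field default of "float32" in the model configuration. The sketch below uses a hypothetical ModelConfig with a dtype field; the actual class and field names touched by this PR may differ.

from dataclasses import dataclass
from typing import Optional


# Hypothetical sketch of the kind of default this PR sets; the real class
# and field names in TRL may differ.
@dataclass
class ModelConfig:
    # Dtype used when loading the model weights, e.g. "float32" or "bfloat16";
    # previously unset, now defaulting to "float32".
    dtype: Optional[str] = "float32"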
@albertvillanova marked this pull request as ready for review on January 6, 2026 at 19:07.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec (Member)

For the record, QLoRA with DPO will still force the model dtype to be in fp32:

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import DPOTrainer

model = AutoModelForCausalLM.from_pretrained(
    "trl-internal-testing/tiny-Qwen2ForCausalLM-2.5",
    dtype="float32",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
dataset = load_dataset("trl-internal-testing/zen", "standard_preference", split="train")

trainer = DPOTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=LoraConfig(),
)
trainer.train()

But in my opinion it's fine; it will be fixed by #3906.
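A quick way to check the behavior described above (a sketch, assuming the standard trainer.model attribute exposed by the transformers Trainer) is to inspect the floating-point parameters after the trainer has wrapped the quantized model:

# Collect the dtypes of the non-quantized (floating-point) parameters after
# PEFT's k-bit preparation; per the comment above, they end up in float32.
dtypes = {p.dtype for p in trainer.model.parameters() if p.is_floating_point()}
print(dtypes)  # expected: {torch.float32}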

@albertvillanova (Member, Author)

Thanks for your review, @qgallouedec.

Although the CI was green, I see that you set the float32 dtype in some tests but not in others. I am wondering what criteria you used.

@albertvillanova merged commit 4d52e02 into huggingface:main on January 12, 2026 (9 of 10 checks passed).