[XPU] fix float32 matmul precision: use FC_FLOAT default instead of FC_TF32 on XRE5 #78625

Open
YqGe585 wants to merge 1 commit into PaddlePaddle:develop from YqGe585:xpu-worker1/GEY-25-xpu-precision

YqGe585 (Member) commented Apr 10, 2026

PR Category

Custom Device

PR Types

Bug fixes

Description

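On XPU XRE5 hardware, FCCalcType<float>() previously defaulted to FC_TF32 (TensorFloat-32, which keeps only 10 mantissa bits). When the matmul K dimension is large (e.g. K=4096), the TF32 truncation error accumulates, so the maximum absolute difference between the XPU and GPU results of paddle.matmul exceeds the threshold (measured max_abs_diff=0.0139818 > atol=0.01).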
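The growth of the error with K can be illustrated on a CPU with NumPy. The sketch below is an emulation under stated assumptions, not the XPU FC_TF32 kernel: it rounds the float32 inputs to 10 explicit mantissa bits and accumulates in float64, and the helper names `round_to_tf32` and `max_abs_diff_for_k` are hypothetical. It is not expected to reproduce the exact figure reported above, only the trend.

```python
import numpy as np


def round_to_tf32(x):
    """Emulate TF32 input rounding (assumption: round-to-nearest on the
    13 low mantissa bits of float32, leaving 10 explicit mantissa bits)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    # Add half of the dropped range, then clear the 13 low bits.
    rounded = (bits + np.uint32(0x1000)) & np.uint32(0xFFFFE000)
    return rounded.view(np.float32)


def max_abs_diff_for_k(k, m=8, n=8, seed=0):
    """Return (float32 error, emulated TF32 error) against a float64 reference."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((m, k)).astype(np.float32)
    b = rng.standard_normal((k, n)).astype(np.float32)

    ref = a.astype(np.float64) @ b.astype(np.float64)
    fp32 = (a @ b).astype(np.float64)
    # Only the inputs are reduced in precision; accumulation stays wide.
    tf32 = round_to_tf32(a).astype(np.float64) @ round_to_tf32(b).astype(np.float64)

    return float(np.abs(fp32 - ref).max()), float(np.abs(tf32 - ref).max())


for k in (64, 1024, 4096):
    fp32_err, tf32_err = max_abs_diff_for_k(k)
    print(f"K={k:5d}  float32 err={fp32_err:.3e}  emulated TF32 err={tf32_err:.3e}")
```

With standard-normal inputs the emulated TF32 error should be orders of magnitude larger than the float32 error and should grow as K increases, which is consistent with the failure appearing at K=4096.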

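On the GPU side, TF32 is disabled by default (FLAGS_cublas_allow_tf32=false) and full float32 precision is used. This PR changes the default compute type on XRE5 from FC_TF32 to FC_FLOAT so that XPU and GPU precision behavior match.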

Users who need TF32 performance can still enable it by setting the environment variable XPU_PADDLE_FC_TF32.
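One way to opt back in for a single run is to launch the workload in a child process with the variable set. This is a minimal sketch: the value `"1"` is an assumption (the description only says the variable must be set), and `env_with_tf32` and `run_with_tf32` are hypothetical helper names, not part of Paddle.

```python
import os
import subprocess
import sys


def env_with_tf32(base_env=None, value="1"):
    """Return a copy of the environment with XPU_PADDLE_FC_TF32 set.

    The value "1" is an assumption; the PR text only names the variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["XPU_PADDLE_FC_TF32"] = value
    return env


def run_with_tf32(argv):
    """Run a command (e.g. a Paddle training script) with the variable set."""
    return subprocess.run(
        argv, env=env_with_tf32(), check=True, capture_output=True, text=True
    )


# Demonstration with a trivial child process instead of a real Paddle script.
result = run_with_tf32(
    [sys.executable, "-c", "import os; print(os.environ['XPU_PADDLE_FC_TF32'])"]
)
print(result.stdout.strip())
```

Setting the variable in a child process keeps the opt-in scoped to that run and avoids depending on when the runtime reads the environment.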

Before the fix: max_abs_diff=0.0139818 → accuracy check fails
After the fix: max_abs_diff=0.000133514 → accuracy check passes (error reduced by about 100x)
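The pass/fail outcome and the "about 100x" figure follow directly from the reported numbers. The sketch below assumes the check is a pure absolute-tolerance comparison (`max_abs_diff <= atol`); the actual test harness may also apply a relative tolerance, and the helper names are hypothetical.

```python
import numpy as np

ATOL = 0.01  # threshold quoted in the PR description


def max_abs_diff(actual, expected):
    """Largest element-wise absolute difference between two arrays."""
    a = np.asarray(actual, dtype=np.float64)
    e = np.asarray(expected, dtype=np.float64)
    return float(np.max(np.abs(a - e)))


def passes_accuracy_check(diff, atol=ATOL):
    """Assumed check: the maximum absolute difference must not exceed atol."""
    return diff <= atol


before, after = 0.0139818, 0.000133514  # values reported in the PR
print(passes_accuracy_check(before))  # False
print(passes_accuracy_check(after))   # True
print(round(before / after, 1))       # 104.7
```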

Does this cause a precision change?

Yes. XPU float32 matmul precision is now consistent with GPU; max_abs_diff drops from ~0.014 to ~0.00013.

…atch GPU precision

On XRE5 hardware, FCCalcType<float>() previously defaulted to FC_TF32,
which uses 10-bit mantissa TensorFloat-32 accumulation. For large matmuls
(e.g. [1,4096,4096] @ [4096,32000]), this causes max_abs_diff to exceed
the 0.01 atol threshold vs GPU results.

Change the default to FC_FLOAT (full float32 accumulation), matching the
GPU policy where FLAGS_cublas_allow_tf32=false disables TF32 by default.
Users who prefer TF32 performance can still set env var XPU_PADDLE_FC_TF32.

Verified: max_abs_diff dropped from 0.0139818 to 0.000133514 for the
paddle.matmul(Tensor([1,4096,4096],"float32"), Tensor([4096,32000],"float32"))
configuration that was previously failing the accuracy check.

paddle-bot bot commented Apr 10, 2026

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

YqGe585 (Member, Author) commented Apr 10, 2026

/re-run all-failed
