Open
Description
🐛 Describe the bug
Error info
2026-03-26 09:09:23 [INFO] Torch: 46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8, Triton: 3.7.0+git33f782ef
2026-03-26 09:09:23 [DEBUG] Running: rm -rf ~/.cache/ ~/.triton/ /tmp/torchinductor_*
2026-03-26 09:09:24 [DEBUG] Running: /workspace/myvenv/bin/python /workspace/pytorch-src/benchmarks/dynamo/huggingface.py --accuracy --bfloat16 --training -d xpu -n 10 --only XLNetLMHeadModel --backend=inductor --cold-start-latency --timeout 10800 --disable-cudagraphs --output /workspace/inductor_log/46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8/huggingface/bfloat16/inductor_huggingface_bfloat16_training_xpu_1774516161.061208_accuracy.csv 2>&1 | tee -a /workspace/inductor_log/46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8/huggingface/bfloat16/inductor_huggingface_bfloat16_training_xpu_1774516161.061208_accuracy.log
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
loading model: 0it [00:04, ?it/s]
xpu train XLNetLMHeadModel
W0326 09:10:15.998000 4181 torch/_inductor/utils.py:2757] [2/0_1] DeviceCopy in input program
ERROR:common:
Traceback (most recent call last):
  File "/workspace/pytorch-src/benchmarks/dynamo/common.py", line 2376, in check_accuracy
    new_result = self.run_n_iterations(
  File "/workspace/pytorch-src/benchmarks/dynamo/common.py", line 2074, in run_n_iterations
    model_iter_fn(mod, inputs, collect_outputs=False)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1050, in compile_wrapper
    raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1053, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1037, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1802, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1489, in codegen_and_compile
    graph.run(*example_inputs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1034, in run
    return super().run(*args)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 200, in run
    self.env[node] = self.run_node(node)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1867, in run_node
    result = super().run_node(n)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 297, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1314, in call_function
    make_fallback(target, warn=False, get_decomp_fn=self.get_decomp_fn)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 2444, in make_fallback
    assert op not in check_decomps or override_decomp, (
torch._inductor.exc.InductorError: AssertionError: both a fallback and a decomp for same op: aten.index_add.default
TorchDynamo optimized model failed to run because of following error
fail_to_run
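The condition that trips in the last frame can be illustrated with a toy model. This is a minimal sketch, not Inductor's implementation: `DECOMPS`, `FALLBACKS`, `register_fallback`, and `try_register` are hypothetical stand-ins, and only the asserted condition and the message text mirror the `make_fallback` frame in the traceback. It shows why requesting a fallback for an op that already has a decomposition fails unless the caller explicitly overrides the decomposition.

```python
# Toy model of the check that fails in the traceback above.
# DECOMPS, FALLBACKS, register_fallback and try_register are hypothetical
# names used for illustration; they are not part of torch._inductor.

DECOMPS = {"aten.index_add.default"}  # ops that have a decomposition
FALLBACKS = set()                      # ops routed to a fallback kernel


def register_fallback(op, override_decomp=False):
    """Register `op` as a fallback, refusing ops that also have a decomp."""
    # Same shape as the asserted condition in the make_fallback frame.
    assert op not in DECOMPS or override_decomp, (
        f"both a fallback and a decomp for same op: {op}"
    )
    FALLBACKS.add(op)


def try_register(op, override_decomp=False):
    """Return None on success, or the assertion message on failure."""
    try:
        register_fallback(op, override_decomp)
    except AssertionError as exc:
        return str(exc)
    return None
```

In this model, `try_register("aten.index_add.default")` returns the same message as the log, while passing `override_decomp=True` lets the registration go through.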
Reproducer
docker run -it --device=/dev/mem --device=/dev/dri --group-add video --privileged --shm-size=8g intelgpu/ubuntu-24.04-lts2:2523.40 bash
# Python env
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR="/usr/local/bin" sh
uv venv myvenv --python 3.10 --clear
source myvenv/bin/activate
uv pip install pip numpy 'setuptools<81' wheel
# Get pre-built torch wheel
gh --repo intel/torch-xpu-ops run download 23579688451 -p "Torch-XPU-Wheel-*"
uv pip install Torch-XPU-Wheel-3185-23579688451-1/*.whl
uv pip install pip pandas psutil scipy requests
uv pip install transformers==4.55.2 accelerate
uv pip install -U numpy==1.26.4
pytorch_commit="$(python -c 'import torch; print(torch.version.git_version)')"
git clone https://github.com/pytorch/pytorch
cd pytorch
git checkout "${pytorch_commit}"
python benchmarks/dynamo/huggingface.py --accuracy --bfloat16 --training -d xpu -n 10 --only XLNetLMHeadModel --backend=inductor --cold-start-latency --timeout 10800 --disable-cudagraphs
Versions
pytorch: 2.12.0a0+git46bd540
device: PVC
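When reproducing, it may help to confirm that the installed wheel and the checked-out source point at the same commit. The sketch below assumes the `+git<hash>` suffix convention seen in the version string above; `git_suffix` and `matches_commit` are hypothetical helper names, not part of PyTorch.

```python
import re


def git_suffix(version):
    """Return the short hash from a '...+git<hash>' version string, or None."""
    match = re.search(r"\+git([0-9a-f]+)$", version)
    return match.group(1) if match else None


def matches_commit(version, full_commit):
    """True if the short hash in `version` is a prefix of `full_commit`."""
    short = git_suffix(version)
    return short is not None and full_commit.startswith(short)
```

For the values in this report, `matches_commit("2.12.0a0+git46bd540", "46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8")` evaluates to `True`.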