torch._inductor.exc.InductorError: AssertionError: both a fallback and a decomp for same op: aten.index_add.default #3191

@mengfei25

🐛 Describe the bug

Running the HuggingFace `XLNetLMHeadModel` accuracy benchmark in bfloat16 training mode on XPU with the Inductor backend fails during lowering: `make_fallback` asserts because `aten.index_add.default` has both a fallback and a decomposition.

Error info

2026-03-26 09:09:23 [INFO] Torch: 46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8, Triton: 3.7.0+git33f782ef
2026-03-26 09:09:23 [DEBUG] Running: rm -rf ~/.cache/ ~/.triton/ /tmp/torchinductor_*
2026-03-26 09:09:24 [DEBUG] Running: /workspace/myvenv/bin/python /workspace/pytorch-src/benchmarks/dynamo/huggingface.py --accuracy --bfloat16 --training -d xpu -n 10 --only XLNetLMHeadModel --backend=inductor --cold-start-latency --timeout 10800 --disable-cudagraphs --output /workspace/inductor_log/46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8/huggingface/bfloat16/inductor_huggingface_bfloat16_training_xpu_1774516161.061208_accuracy.csv 2>&1 | tee -a /workspace/inductor_log/46bd5401d44e1b1bc59300e0fbe0dff4afdcffa8/huggingface/bfloat16/inductor_huggingface_bfloat16_training_xpu_1774516161.061208_accuracy.log
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
loading model: 0it [00:04, ?it/s]
xpu  train XLNetLMHeadModel
W0326 09:10:15.998000 4181 torch/_inductor/utils.py:2757] [2/0_1] DeviceCopy in input program
ERROR:common:
Traceback (most recent call last):
  File "/workspace/pytorch-src/benchmarks/dynamo/common.py", line 2376, in check_accuracy
    new_result = self.run_n_iterations(
  File "/workspace/pytorch-src/benchmarks/dynamo/common.py", line 2074, in run_n_iterations
    model_iter_fn(mod, inputs, collect_outputs=False)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 1050, in compile_wrapper
    raise e.remove_dynamo_frames() from None  # see TORCHDYNAMO_VERBOSE=1
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1053, in _compile_fx_inner
    raise InductorError(e, currentframe()).with_traceback(
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1037, in _compile_fx_inner
    mb_compiled_graph = fx_codegen_and_compile(
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1802, in fx_codegen_and_compile
    return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/compile_fx.py", line 1489, in codegen_and_compile
    graph.run(*example_inputs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1034, in run
    return super().run(*args)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 200, in run
    self.env[node] = self.run_node(node)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1867, in run_node
    result = super().run_node(n)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/fx/interpreter.py", line 297, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/graph.py", line 1314, in call_function
    make_fallback(target, warn=False, get_decomp_fn=self.get_decomp_fn)
  File "/workspace/myvenv/lib/python3.10/site-packages/torch/_inductor/lowering.py", line 2444, in make_fallback
    assert op not in check_decomps or override_decomp, (
torch._inductor.exc.InductorError: AssertionError: both a fallback and a decomp for same op: aten.index_add.default
TorchDynamo optimized model failed to run because of following error
fail_to_run
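
For readers unfamiliar with the assertion at the bottom of the traceback, the check can be modeled in a few lines of plain Python. This is only an illustrative sketch, not Inductor's implementation: the `decompositions` and `fallbacks` sets and the `try_fallback` helper are hypothetical, and the sketch assumes the real `check_decomps` behaves like a set of ops that already have a decomposition registered. Only the assertion condition and its message are taken from the traceback.

```python
# Illustrative model only. Names mirror the traceback, but this is NOT Inductor's code.
# Assumption: an op "has a decomp" if it appears in this hypothetical set.
decompositions = {"aten.index_add.default"}
fallbacks = set()


def make_fallback(op, override_decomp=False):
    """Register `op` as a fallback unless a decomposition already covers it."""
    # Same condition and message as the assertion shown in the traceback.
    assert op not in decompositions or override_decomp, (
        f"both a fallback and a decomp for same op: {op}"
    )
    fallbacks.add(op)


def try_fallback(op, **kwargs):
    """Hypothetical helper: report the outcome as a string instead of raising."""
    try:
        make_fallback(op, **kwargs)
        return "ok"
    except AssertionError as exc:
        return f"AssertionError: {exc}"


print(try_fallback("aten.index_add.default"))                        # conflict
print(try_fallback("aten.index_add.default", override_decomp=True))  # allowed
print(try_fallback("aten.some_other_op.default"))                    # no decomp, allowed
```

In this model the first call fails with the same message as the log, which suggests the failure comes from `aten.index_add.default` reaching the fallback path while a decomposition for it is still registered.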

Reproducer

docker run -it --device=/dev/mem --device=/dev/dri --group-add video --privileged --shm-size=8g intelgpu/ubuntu-24.04-lts2:2523.40 bash

# Python env
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR="/usr/local/bin" sh
uv venv myvenv --python 3.10 --clear
source myvenv/bin/activate
uv pip install pip numpy 'setuptools<81' wheel

# Get pre-built torch wheel
gh --repo intel/torch-xpu-ops run download 23579688451 -p "Torch-XPU-Wheel-*"
uv pip install  Torch-XPU-Wheel-3185-23579688451-1/*.whl

uv pip install pip pandas psutil scipy requests
uv pip install transformers==4.55.2 accelerate
uv pip install -U numpy==1.26.4

pytorch_commit="$(python -c 'import torch; print(torch.version.git_version)')"
git clone https://github.com/pytorch/pytorch
cd pytorch
git checkout ${pytorch_commit}

python benchmarks/dynamo/huggingface.py --accuracy --bfloat16 --training -d xpu -n 10 --only XLNetLMHeadModel --backend=inductor --cold-start-latency --timeout 10800 --disable-cudagraphs
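
To check whether other models in the suite hit the same conflict, the saved benchmark output can be scanned for the assertion message. The following is a minimal sketch, assuming the output was captured to a text file (for example via `tee`, as in the logged command); `conflicting_ops` is a hypothetical helper, and the pattern is taken from the error line in the log.

```python
import re

# Pattern copied from the assertion message in the log.
PATTERN = re.compile(r"both a fallback and a decomp for same op: (\S+)")


def conflicting_ops(log_text):
    """Return the op names found in fallback/decomp assertion messages, in first-seen order."""
    seen = []
    for match in PATTERN.finditer(log_text):
        op = match.group(1)
        if op not in seen:
            seen.append(op)
    return seen


# Sample input: the error line from this issue, repeated to show de-duplication.
sample = (
    "torch._inductor.exc.InductorError: AssertionError: "
    "both a fallback and a decomp for same op: aten.index_add.default\n"
) * 2
print(conflicting_ops(sample))
```

For a real run, the contents of the `.log` file would be read and passed in place of `sample`.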

Versions

pytorch: 2.12.0a0+git46bd540
device: PVC
