
[Bug]: Attempts to quantize Granite 4.0-h-small to 4bits fail #2338

@mramendi

Description


βš™οΈ Your current environment

The output of python collect_env.py
### Environment Information ###
Operating System: `Linux-6.8.0-94-generic-x86_64-with-glibc2.39`
Python Version: `3.12.3 (main, Nov  6 2025, 13:44:16) [GCC 13.3.0]`
llm-compressor Version: `0.9.0.1`
compressed-tensors Version: `0.13.0`
transformers Version: `4.57.3`
torch Version: `2.9.1`
CUDA Devices: `['NVIDIA RTX PRO 6000 Blackwell Workstation Edition']`
AMD Devices: `None`
NPU Devices: `None`

πŸ› Describe the bug

Attempts to quantize Granite 4.0-h-small to 4 bits fail in a variety of ways:

  • If I use simple w4a16, implemented following the FP8 example (source attached), I get:
✓ Quantization complete

Attempting 3D conversion...
Traceback (most recent call last):
  File "/root/test_w4a16_no_exclusion.py", line 95, in <module>
    main()
  File "/root/test_w4a16_no_exclusion.py", line 78, in main
    m.to_3d_expert()
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/modeling/granite4.py", line 40, in to_3d_expert
    self.weight.shape == torch.Size((dim0_mul, self.input_size))
AssertionError: Shape mismatch, please check.

test_w4a16_no_exclusion.py
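For what it's worth, the assertion fires in `to_3d_expert`, which expects the 2-D expert weight to still have shape `(dim0_mul, input_size)`. My guess (an assumption on my part, not something I verified in the llmcompressor source) is that after W4A16 compression the weight tensor is packed two 4-bit values per element, halving one dimension so the shape check can no longer match. A minimal illustration of that kind of nibble packing, using a hypothetical helper (the real packing lives in compressed-tensors, not in this function):

```python
def pack_int4_pairs(values):
    """Pack pairs of 4-bit values (0..15) into single bytes.

    Hypothetical illustration only; this is NOT llmcompressor's actual
    packing code, just a sketch of why a packed row shrinks.
    """
    assert len(values) % 2 == 0, "need an even number of nibbles"
    return [(values[i] << 4) | values[i + 1] for i in range(0, len(values), 2)]

row = [1, 2, 3, 4]             # four 4-bit weights
packed = pack_int4_pairs(row)  # two bytes
# The packed row is half as long, so a shape assertion written against the
# original (dim0_mul, input_size) would fail after compression.
print(len(row), len(packed))   # -> 4 2
```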

If I try GPTQ (source also attached), I get:

Traceback (most recent call last):
  File "/root/test_gptq_no_exclusion.py", line 156, in <module>
    main()
  File "/root/test_gptq_no_exclusion.py", line 142, in main
    oneshot(
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/entrypoints/oneshot.py", line 357, in oneshot
    one_shot()
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/entrypoints/oneshot.py", line 172, in __call__
    self.apply_recipe_modifiers(
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/entrypoints/oneshot.py", line 222, in apply_recipe_modifiers
    pipeline(
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/pipelines/independent/pipeline.py", line 45, in __call__
    pipeline(model, dataloader, dataset_args)
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/pipelines/sequential/pipeline.py", line 73, in __call__
    subgraphs = trace_subgraphs(model, sample_input, sequential_targets, ignore)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/pipelines/sequential/helpers.py", line 135, in trace_subgraphs
    tracer.trace(
  File "/opt/venv/datasci/lib/python3.12/site-packages/llmcompressor/pipelines/sequential/transformers_helpers.py", line 1485, in trace
    self.graph.erase_node(user)
  File "/opt/venv/datasci/lib/python3.12/site-packages/torch/fx/graph.py", line 1257, in erase_node
    raise RuntimeError(
RuntimeError: Tried to erase Node getitem_169 but it still had 1 users in the graph: {output: None}!

test_gptq_no_exclusion.py
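For context on the GPTQ failure: `torch.fx.Graph.erase_node` refuses to remove a node while other nodes still consume its output, which is exactly what the tracer hit with `getitem_169` (the graph `output` node still referenced it). A stripped-down model of that invariant in plain Python (toy classes, not the real torch.fx API):

```python
class Node:
    """Toy stand-in for a torch.fx graph node (illustration only)."""
    def __init__(self, name):
        self.name = name
        self.users = {}  # nodes that still consume this node's output

def erase_node(node):
    # Mirrors the guard in torch.fx.Graph.erase_node: erasing a node that
    # is still referenced would leave dangling inputs in the graph.
    if node.users:
        raise RuntimeError(
            f"Tried to erase Node {node.name} but it still had "
            f"{len(node.users)} users in the graph: {node.users}!"
        )

getitem = Node("getitem_169")
output = Node("output")
getitem.users[output] = None  # the graph output still reads getitem_169

try:
    erase_node(getitem)
except RuntimeError as e:
    print(e)  # same failure mode as the traceback above
```

This suggests the sequential tracer's graph cleanup is pruning a node that the Granite 4.0-h graph still routes to its output, rather than a problem in the recipe itself.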

I understand that supporting 4-bit quantization for this model might be too complicated, but if support cannot be offered, I would suggest that the README and the error messages indicate that this model is not supported at 4 bits (or only unsupported in INT4, if that's the case; I didn't really try MXFP4).

πŸ› οΈ Steps to reproduce

$ python test_w4a16_no_exclusion.py --model-name ibm-granite/granite-4.0-h-small --output granite-4.0-h-small-w4a16
$ python test_gptq_no_exclusion.py --model-name ibm-granite/granite-4.0-h-small --output granite-4.0-h-small-gptq-4bit

Labels: bug (Something isn't working)