
gemma3_text support in NormalizedConfig #2393

@lmmx

Description

Feature request

I got the following error while trying to export and optimise this embedding model with ONNX:

KeyError: 'gemma3_text model type is not supported yet in NormalizedConfig. Only albert, bart, bert, big_bird, bigbird_pegasus, blenderbot, blenderbot-small, bloom, falcon, camembert, codegen, cvt, deberta, deberta-v2, deit, dinov2, distilbert, donut-swin, electra, encoder-decoder, gemma, gpt2, gpt_bigcode, gpt_neo, gpt_neox, gptj, imagegpt, internlm2, llama, longt5, marian, markuplm, mbart, mistral, mixtral, modernbert, mpnet, mpt, mt5, m2m_100, nystromformer, olmo, olmo2, opt, pegasus, pix2struct, phi, phi3, poolformer, regnet, resnet, roberta, segformer, speech_to_text, splinter, t5, trocr, vision-encoder-decoder, vit, whisper, xlm-roberta, yolos, qwen2, qwen3, qwen3_moe, smollm3, granite, clip are supported. If you want to support gemma3_text please propose a PR or open up an issue.'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/louis/lab/emb/tarka/.venv/bin/optimum-cli", line 10, in <module>
    sys.exit(main())
             ~~~~^^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/commands/optimum_cli.py", line 219, in main
    service.run()
    ~~~~~~~~~~~^^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/commands/export/onnx.py", line 264, in run
    main_export(
    ~~~~~~~~~~~^
        model_name_or_path=self.args.model,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<20 lines>...
        **input_shapes,
        ^^^^^^^^^^^^^^^
    )
    ^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/exporters/onnx/__main__.py", line 399, in main_export
    onnx_export_from_model(
    ~~~~~~~~~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<18 lines>...
        **kwargs_shapes,
        ^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/exporters/onnx/convert.py", line 1096, in onnx_export_from_model
    optimizer = ORTOptimizer.from_pretrained(output, file_names=onnx_files_subpaths)
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/onnxruntime/optimization.py", line 126, in from_pretrained
    return cls(onnx_model_path, config=config, from_ortmodel=from_ortmodel)
  File "/home/louis/lab/emb/tarka/.venv/lib/python3.13/site-packages/optimum/onnxruntime/optimization.py", line 75, in __init__
    raise NotImplementedError(
    ...<2 lines>...
    )
NotImplementedError: Tried to use ORTOptimizer for the model type gemma3_text, but it is not available yet. Please open an issue or submit a PR at https://github.com/huggingface/optimum.

Motivation

The exported model is very fast on GPU but about 20x slower on CPU. This is presumably due to the lack of graph optimisation, since the -O2 and -O3 optimisation levels fail with the error above.

Your contribution

I can look into it, but the error message said to report it here first!

To reproduce:

# Try exporting with specific CPU optimization
uv pip install sentence-transformers optimum[onnxruntime-gpu]

optimum-cli export onnx \
  --model Tarka-AIR/Tarka-Embedding-150M-V1 \
  --task feature-extraction \
  --optimize O3 \
  tarka-150m-v1-onnx-o3/
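For reference, the NormalizedConfig side of the fix is probably just registering the new model type in optimum's model-type lookup table, reusing the existing gemma entry (Gemma 3's text backbone exposes the same config attributes). A minimal sketch of that registry pattern (plain Python; the class and function names here are illustrative stand-ins, not the actual optimum code):

```python
class NormalizedTextConfig:
    """Stand-in for optimum's NormalizedTextConfig wrapper class."""
    def __init__(self, model_type: str):
        self.model_type = model_type

# The registry: model_type string -> config class, mirroring the lookup
# that produced the KeyError above.
_conf = {
    "gemma": NormalizedTextConfig,
    "llama": NormalizedTextConfig,
}

def get_normalized_config_class(model_type: str):
    if model_type not in _conf:
        raise KeyError(
            f"{model_type} model type is not supported yet in NormalizedConfig."
        )
    return _conf[model_type]

# The proposed fix: register gemma3_text by reusing the gemma entry.
_conf["gemma3_text"] = _conf["gemma"]
```

A similar entry would presumably also be needed in ORTOptimizer's supported-model-type table, since the traceback shows that check failing too.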
