Description
Unfortunately, transformers-CFG fails for Hugging Face models that use an XLMRobertaTokenizerFast tokenizer:
Traceback (most recent call last):
[...]
AssertionError: Tokenizer not supported: XLMRobertaTokenizerFast, supported tokenizers: {<class 'transformers.models.t5.tokenization_t5_fast.T5TokenizerFast'>, <class 'transformers.models.qwen2.tokenization_qwen2_fast.Qwen2TokenizerFast'>, <class 'transformers.tokenization_utils_fast.PreTrainedTokenizerFast'>, <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>, <class 'transformers.models.codegen.tokenization_codegen_fast.CodeGenTokenizerFast'>, <class 'transformers.models.gemma.tokenization_gemma_fast.GemmaTokenizerFast'>, <class 'transformers.models.bart.tokenization_bart_fast.BartTokenizerFast'>, <class 'transformers.models.gpt2.tokenization_gpt2_fast.GPT2TokenizerFast'>}
What would it take to add support for XLMRobertaTokenizerFast? If it makes sense conceptually, I am interested in contributing support for this tokenizer.
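One observation on the error message: the supported set already contains PreTrainedTokenizerFast, which XLMRobertaTokenizerFast inherits from in transformers, so the check appears to compare the exact tokenizer class rather than use isinstance. The sketch below illustrates that difference with stand-in classes. It is not the library's actual code, and the names `SUPPORTED`, `is_supported_exact`, and `is_supported_subclass` are hypothetical.

```python
# Stand-in classes so the sketch runs without transformers installed.
# The real classes live in the transformers package.
class PreTrainedTokenizerFast:
    pass


class GPT2TokenizerFast(PreTrainedTokenizerFast):
    pass


class XLMRobertaTokenizerFast(PreTrainedTokenizerFast):
    pass


# Hypothetical registry, modeled on the set printed in the error message.
SUPPORTED = {PreTrainedTokenizerFast, GPT2TokenizerFast}


def is_supported_exact(tokenizer) -> bool:
    """Exact-class lookup: a subclass that is not listed is rejected."""
    return type(tokenizer) in SUPPORTED


def is_supported_subclass(tokenizer) -> bool:
    """isinstance-based lookup: any subclass of a listed class is accepted."""
    return isinstance(tokenizer, tuple(SUPPORTED))


tok = XLMRobertaTokenizerFast()
exact = is_supported_exact(tok)        # rejected, as in the traceback
relaxed = is_supported_subclass(tok)   # accepted via the base class
```

Whether accepting the tokenizer through its base class would be enough for correct constrained decoding is a separate question that this sketch does not address.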