When I train a tokenizer from scratch on my own dataset using the VP2-16384.config, the codebook usage is much lower than the 100% reported in the XQ-GAN paper. Low codebook usage typically results in suboptimal tokenizer reconstruction and negatively impacts downstream tasks. Do you have any suggestions on how I could address this issue?
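For context, here is how I measure codebook usage: the fraction of codebook entries selected at least once across the encoded token indices of a validation set. This is a minimal sketch of my measurement, assuming the indices are collected into a NumPy array of code IDs; `codebook_usage` is my own helper, not part of the XQ-GAN codebase.

```python
import numpy as np

def codebook_usage(indices: np.ndarray, codebook_size: int) -> float:
    """Fraction of codebook entries hit at least once by `indices`."""
    used = np.unique(indices)  # distinct code IDs actually selected
    return used.size / codebook_size

# Toy example: only 3 of 8 codes ever appear -> 37.5% usage
idx = np.array([0, 1, 1, 2, 2, 2])
print(codebook_usage(idx, 8))  # 0.375
```

On my dataset this number stays well below 1.0 even late in training.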