Install the required packages in your virtual environment:
pip install -r requirements.txt
Set the API keys for OpenAI (paid GPT-4 Vision API) and Google (free Gemini Vision API) in your terminal or in .bashrc:
export OPENAI_API_KEY=<your key>
export GOOGLE_API_KEY=<your key>
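Before launching an evaluation, it can help to confirm that the keys are actually visible to Python. The following is a minimal sketch, not part of the repository: the helper name `missing_api_keys` is hypothetical, and it only checks the two variable names exported above. If you evaluate only one of the two APIs, only the corresponding key needs to be set.

```python
import os

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GOOGLE_API_KEY")


def missing_api_keys(env=None):
    """Return the names of the keys that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the
    helper easy to test. (Hypothetical helper, not part of the repo.)
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```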
- Download link: MVTecAD-Website
- Place the dataset in the folder datasets/MVTecAD.
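A quick way to check the dataset location is sketched below. It assumes that the index file vlm_for_ad_dataset.json, which the evaluation commands pass via --dataset, sits directly inside datasets/MVTecAD; whether that file ships with the download or has to be prepared separately is not stated here. The helper names are hypothetical.

```python
from pathlib import Path


def dataset_index_path(root="datasets/MVTecAD"):
    """Build the path of the JSON index used by the --dataset flag."""
    return Path(root) / "vlm_for_ad_dataset.json"


def dataset_ready(root="datasets/MVTecAD"):
    """Return True if the JSON index exists under `root`."""
    return dataset_index_path(root).is_file()


if __name__ == "__main__":
    if dataset_ready():
        print("Found", dataset_index_path())
    else:
        print("Not found:", dataset_index_path())
```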
Eval on Gemini
python main_gemini.py --dataset "datasets/MVTecAD/vlm_for_ad_dataset.json" --cache "./output/answer_genmini.json" --output "./output/answer_5.json" --google_api_key 'ADD_YOUR_GOOGLE_API_HERE' --prompt_template "./prompt_template/ad_prompt.txt"
Eval on GPT4-Vision
python main_gpt.py --dataset "datasets/MVTecAD/vlm_for_ad_dataset.json" --cache "./output/answer_gpt4v.json" --output "./output/answer_gpt4v.json" --openai_api_key 'ADD_YOUR_OPENAI_API_HERE' --prompt_template "./prompt_template/ad_prompt.txt"
Eval on InternVL2
- Follow the official guidance to set up the environment for InternVL2 and download the checkpoints. (By default, we used InternVL2-8B.)
python main_internvl2.py --model "~/path/to/InternVL2-8B" --dataset "datasets/MVTecAD/vlm_for_ad_dataset.json" --cache "./output/answer_internvl2_8b.json" --output "./output/answer_internvl2_8b.json"
Eval on Qwen2VL
- Follow the official repo to set up the environment for Qwen2VL and download the checkpoints. (By default, we used Qwen2-VL-7B-Instruct.)
python main_qwenvl2.py --model "~/path/to/Qwen2-VL-7B-Instruct" --dataset "datasets/MVTecAD/vlm_for_ad_dataset.json" --cache "./output/answer_qwenvl2_7b.json" --output "./output/answer_qwenvl2_7b.json"
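One caveat about the --model paths in the InternVL2 and Qwen2VL commands: in bash, a tilde inside double quotes is not expanded, so the script receives the literal string "~/path/to/...". Whether the scripts expand it themselves is not stated here. The sketch below shows how such a path can be resolved in Python; the helper name `resolve_model_path` is hypothetical and not part of the repository. Alternatively, pass "$HOME/path/to/..." or an absolute path on the command line.

```python
import os


def resolve_model_path(path):
    """Expand a leading '~' and return an absolute path.

    Hypothetical helper: shows what a script would need to do if it
    receives an unexpanded '~' from a quoted command-line argument.
    """
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    print(resolve_model_path("~/path/to/InternVL2-8B"))
```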
Please cite our paper if you find this repo useful! 💛 💙 💛 💙
@INPROCEEDINGS{9776026,
author={Xu, Xiaohao and Cao, Yunkang and Zhang, Huaxin and Sang, Nong and Huang, Xiaonan},
booktitle={2025 IEEE 28th International Conference on Computer Supported Cooperative Work in Design (CSCWD)},
title={Customizing Visual-Language Foundation Models for Multi-modal Anomaly Detection and Reasoning},
year={2025},
}
If you have any questions about this project, please feel free to contact [email protected]
