This software is a research prototype, solely developed for and published as part of the publication DAPO: Defect-aware Prompt Optimization for Multi-type Anomaly Detection and Segmentation.
We introduce DAPO, a novel approach for Defect-aware Prompt Optimization based on progressive tuning for zero-shot multi-type and binary anomaly detection and segmentation under distribution shifts. Our approach aligns anomaly-relevant image features with their corresponding text semantics by learning defect-aware prompts and introducing additional learnable tokens, without modifying the pre-trained model's parameters.
- Clone this repo
git clone https://github.com/<username>/<project>.git
cd <project>
- Create & activate a conda environment
conda create -n <env> python=3.10
conda activate <env>
- Install packages from the requirements file
pip install -r requirements.txt --index-url https://pypi.org/simple --extra-index-url https://download.pytorch.org/whl/cu118
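After installation, it can be worth checking whether PyTorch sees a GPU, since the extra index above points at CUDA 11.8 wheels. The following is a minimal sketch, assuming that requirements.txt installs `torch` (not confirmed here); the helper name `cuda_status` is hypothetical and not part of this repository.

```python
import importlib.util


def cuda_status():
    """Report whether torch is importable and whether it can see a CUDA device."""
    # Assumption: the requirements file installs the `torch` package.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    return "CUDA available" if torch.cuda.is_available() else "CPU only"


print(cuda_status())
```

If this reports "CPU only" on a GPU machine, the installed wheel may not match the local CUDA driver.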
Please download the datasets used in the experiments:
- MVTec-AD
- VisA
- MPDD
- Real-IAD
- MAD
Organize them under a ./data/ directory as follows:
data/
  mvtec/
  visa/
  mpdd/
  real_iad/
  mad/
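Before training or testing, it can help to confirm that this layout is in place. The following is a minimal sketch, assuming the five folder names listed above sit directly under ./data/; the helper name `missing_datasets` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Folder names taken from the layout above.
EXPECTED_DATASETS = ["mvtec", "visa", "mpdd", "real_iad", "mad"]


def missing_datasets(root="./data", names=EXPECTED_DATASETS):
    """Return the expected dataset folders that do not exist under root."""
    root_path = Path(root)
    return [name for name in names if not (root_path / name).is_dir()]


missing = missing_datasets()
if missing:
    print("Missing dataset folders:", ", ".join(missing))
else:
    print("All dataset folders found under ./data/")
```

The check only looks at top-level folder names; it does not validate the contents of each dataset.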
python test.py \
--dataset <dataset_name> \
--test_data_path <your_test_data_path>

python train.py \
--dataset mvtec \
--train_data_path ./data/mvtec \
--checkpoint_dir ./checkpoint/ \
--epoch 5 --batch_size 8 --learning_rate 0.001 \
--image_size 518 --model_name "ViT-L-14-336-quickgelu" \
--depth 24 --prefix_token_cnt 4 \
--text_depth 12 --normal_token_cnt 5 --abnormal_token_cnt 5 --prompt_count 10 --layer_token_cnt 4

When testing on a new dataset not included above, add its defect types (e.g., hole, scratch, contamination) through the --defect_types argument, as below:

--defect_types crack hole damaged contamination \

python test.py \
--dataset visa \
--test_data_path ./data/visa \
--metrics \
--save_path ./results/ \
--checkpoint_path ./checkpoint/ \
--epoch 1 --batch_size 1 --learning_rate 0.001 \
--image_size 518 --model_name "ViT-L-14-336-quickgelu" \
--depth 24 --prefix_token_cnt 4 \
--text_depth 12 --normal_token_cnt 5 --abnormal_token_cnt 5 --prompt_count 10 --layer_token_cnt 4

python test.py \
--inference_mode \
--image_path ./data/image.png \
--checkpoint_path ./checkpoint/ \
--epoch 0 \
--save_path ./results/

python test_multi_defect.py \
--dataset real_iad \
--data_path ./data/real_iad \
--save_path ./results/rea_iad_multi_type_seg/zero_shot/ \
--checkpoint_path ./checkpoint/ \
--model_name "ViT-L-14-336-quickgelu" \
--image_size 518
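
The test.py commands in this README share most of their flags, so evaluating several datasets in a row can be scripted. The following is a minimal sketch that only assembles and prints the commands (a dry run); the helper name `build_test_command` is hypothetical, only a subset of the flags shown above is included, and the assumption that each dataset lives at ./data/<dataset> follows the directory layout described earlier.

```python
import shlex


def build_test_command(dataset, data_root="./data", save_path="./results/",
                       checkpoint_path="./checkpoint/", defect_types=None):
    """Assemble a test.py invocation from flags that appear in this README."""
    cmd = [
        "python", "test.py",
        "--dataset", dataset,
        "--test_data_path", f"{data_root}/{dataset}",
        "--save_path", save_path,
        "--checkpoint_path", checkpoint_path,
        "--model_name", "ViT-L-14-336-quickgelu",
        "--image_size", "518",
    ]
    # For datasets not listed in this README, pass the defect types explicitly.
    if defect_types:
        cmd += ["--defect_types", *defect_types]
    return cmd


# Dry run: print one command per dataset instead of executing it.
for name in ["mvtec", "visa", "mpdd"]:
    print(shlex.join(build_test_command(name)))

# "my_dataset" is a placeholder name for a dataset not covered above.
print(shlex.join(build_test_command("my_dataset", defect_types=["crack", "hole"])))
```

To actually run a command, the returned list can be passed to `subprocess.run(cmd, check=True)`; further flags from the commands above (for example `--metrics` or the token counts) would need to be appended as required.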