SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM
Ming Nie, Dan Ding, Chunwei Wang, Yuanfan Guo, Jianhua Han, Hang Xu, Li Zhang
NeurIPS 2024
This is the official implementation of the NeurIPS 2024 paper SlowFocus: Enhancing Fine-grained Temporal Understanding in Video LLM.
For the video-based datasets, please download the ActivityNet dataset from here. To run evaluation, also download the corresponding files from here. MSVD-QA can be downloaded from here and MSRVTT-QA from here.