Inquiry regarding dataset preparation and preprocessing for reproduction #3

@hunblingbling

Thank you for sharing your insightful research. I am currently working on reproducing your study and would like to request some clarification regarding the dataset preparation.

Specifically, I have the following questions:

Dataset Acquisition: What is the recommended procedure for downloading the OpenVid-1M dataset?

File Preparation: How should I generate the openvid-1m.parquet file required by the build_rag_database.py script? If specific preprocessing steps or conversion scripts are needed to convert the raw data into this parquet file, could you please share them or describe the process?

I look forward to your guidance. Thank you for your time and for contributing to the community.

Best regards,
