Add support for configuring CSV fields during import #287
de9ccb9 to 23a5c04
ehofesmann left a comment
Approving since it works as intended for this PR, but I do have some usability feedback for the future.
What I was doing to test was:
1. Export a CSV from an existing dataset (using FOE app)
2. Create a new empty dataset
3a) Try to use the I/O plugin and just provide the CSV (ran into some usability issues noted below)
3b) First import the media, then load in the labels from this CSV (this worked successfully)
Some of the issues I faced:
- Not related to this plugin, but with the CSV export: in step 1, the export doesn't contain absolute paths, so I needed to dig up the data_path for the image parent dir. This also means all images need to be in the same dir
- Issues with 3a with a local CSV
  - I can't use the `media and labels` import type on an empty dataset with a local CSV. It doesn't provide an option to upload the labels CSV from my local machine (just from the cloud). This is why I had to switch to 3b instead
  - Even if I could upload the CSV, there is friction with the workflow of "I have a CSV with absolute filepaths and other fields that I want to import into an empty dataset". When running `media and labels` on an empty dataset, I need to provide the data directory, which I might not know, or which might vary across samples in the CSV
- If I tried importing with `Labels only` on an empty dataset, then I get stuck here where it needs to verify existing filepaths to use as the data_path: https://github.com/voxel51/fiftyone-plugins/blob/23a5c04484c673a489cc73070f865db6139d8d73/plugins/io/__init__.py#L999
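The relative-path issue from step 1 can be worked around before import by rewriting the CSV's filepath column. The following is a minimal sketch using only the Python standard library, not the plugin's code; the helper name `absolutize_filepaths`, the `filepath` column name, and the example paths are assumptions made for illustration. Rows that already hold an absolute path are left alone, which also covers the case where samples live in different directories.

```python
import csv
import io
import posixpath


def absolutize_filepaths(rows, data_path, key="filepath"):
    """Return copies of `rows` with `key` resolved against `data_path`.

    Hypothetical helper: paths that are already absolute are kept as-is.
    """
    fixed = []
    for row in rows:
        path = row[key]
        if not posixpath.isabs(path):
            path = posixpath.join(data_path, path)
        # Copy the row so the input list is not mutated
        fixed.append({**row, key: posixpath.normpath(path)})
    return fixed


# Assumed shape of an exported CSV: one relative and one absolute path
exported = "filepath,label\nimg1.jpg,cat\n/other/dir/img2.jpg,dog\n"
rows = list(csv.DictReader(io.StringIO(exported)))
fixed = absolutize_filepaths(rows, "/data/images")
```

`posixpath` is used instead of `os.path` so the sketch behaves the same on any platform; swap in `os.path` when the paths are native to the machine running the import.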
FYI the
I need to clarify that the "Labels only" option does in fact allow for adding new samples, not just adding labels to existing samples
23a5c04 to 3a53fd0
Release Notes
In `@voxel51/io/import_samples`, users can now choose which column(s) of data to import and specify the name of the column that contains the filepaths (the join key)
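To illustrate what "column selection" and "join key" mean here, the following is a rough sketch in plain Python. It is not the plugin's implementation: the function names `select_fields` and `apply_labels`, the dict-based samples, and the column names are all assumptions chosen for the example.

```python
import csv
import io


def select_fields(csv_text, join_key, fields):
    """Map each join-key value to a dict holding only the selected columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [c for c in [join_key, *fields] if c not in header]
    if missing:
        raise KeyError(f"columns not in CSV: {missing}")
    return {row[join_key]: {f: row[f] for f in fields} for row in reader}


def apply_labels(samples, labels_by_path):
    """Update samples whose 'filepath' matches; return the unmatched paths."""
    known = {s["filepath"]: s for s in samples}
    unmatched = []
    for path, values in labels_by_path.items():
        if path in known:
            known[path].update(values)
        else:
            unmatched.append(path)
    return unmatched


# Hypothetical CSV: "path" is the join key, "notes" is deliberately skipped
csv_text = (
    "path,label,score,notes\n"
    "/data/a.jpg,cat,0.9,x\n"
    "/data/b.jpg,dog,0.4,y\n"
)
labels = select_fields(csv_text, "path", ["label", "score"])
samples = [{"filepath": "/data/a.jpg"}]
unmatched = apply_labels(samples, labels)
```

In this sketch the unmatched paths are only reported; an importer could instead treat them as new samples to add.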