Closed
Labels: enhancement
Description
Summary
Following review feedback on PR #1311, we should refactor the storage layer to use consistent URL representation for all data sources, including local files.
Current Behavior
- Remote paths use URLs: `s3://bucket/path`, `gs://bucket/path`
- Local paths use raw filesystem paths: `/path/to/file`
- The `is_remote_url()` function distinguishes between the two
Proposed Change
- Accept both formats from users: `/path/to/file` and `file:///path/to/file`
- Normalize to URLs internally: Convert all paths to URL format (`file://` for local)
- Store URLs consistently in the database
- Leverage fsspec uniformity: fsspec already treats all backends (including local) uniformly via URLs
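
To illustrate the normalization step, here is a minimal sketch using only the standard library. The function name `normalize_to_url` and the scheme-detection regex are assumptions made for illustration; they are not existing DataJoint code, and the real helper may handle more cases (for example Windows drive letters).

```python
import os
import re
from pathlib import Path

# Anything that starts with "<scheme>://" is treated as already being a URL.
# This pattern is an illustrative assumption, not the project's actual check.
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def normalize_to_url(path_or_url: str) -> str:
    """Return a URL for either a raw filesystem path or an existing URL."""
    if _SCHEME_RE.match(path_or_url):
        # Already a URL (s3://, gs://, file://, ...): pass through unchanged.
        return path_or_url
    # Raw local path: expand "~", make it absolute, then convert to file://.
    absolute = os.path.abspath(os.path.expanduser(path_or_url))
    return Path(absolute).as_uri()


if __name__ == "__main__":
    for value in ("/path/to/file", "file:///path/to/file", "s3://bucket/path"):
        print(value, "->", normalize_to_url(value))
```

With this shape, both `/path/to/file` and `file:///path/to/file` end up as the same `file://` URL, so downstream code only ever sees one representation.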
Benefits
- Coherent internal representation
- Simpler codebase - no special-casing for local vs remote
- Better alignment with fsspec's design philosophy
- Avoids potential bugs from inconsistent handling
Implementation Notes
- Add `file://` to `REMOTE_PROTOCOLS` (or rename to `URL_PROTOCOLS`)
- Create helper to normalize user paths to URLs
- Update `StorageBackend` to work with URLs consistently
- Ensure backward compatibility for existing stored paths
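
The notes above could look roughly like the following sketch. Only `URL_PROTOCOLS` is a name taken from this issue; `is_url`, `upgrade_stored_path`, and `split_url` are hypothetical helpers, and the protocol list is limited to the schemes mentioned here. It assumes legacy rows hold absolute POSIX paths.

```python
from typing import Tuple
from urllib.parse import quote, unquote, urlsplit

# Proposed rename of REMOTE_PROTOCOLS, now including file:// (assumed contents).
URL_PROTOCOLS = ("file://", "s3://", "gs://")


def is_url(value: str) -> bool:
    """True if the value starts with one of the known URL protocols."""
    return value.startswith(URL_PROTOCOLS)


def upgrade_stored_path(stored: str) -> str:
    """Backward compatibility: turn a legacy raw path read from the database
    into a file:// URL, leaving values that are already URLs untouched."""
    if is_url(stored):
        return stored
    if not stored.startswith("/"):
        raise ValueError(f"Expected an absolute path or URL, got {stored!r}")
    # Percent-encode characters such as spaces; "/" is kept as-is.
    return "file://" + quote(stored)


def split_url(url: str) -> Tuple[str, str]:
    """Split a URL into (protocol, path) so a backend can treat every
    source the same way, local or remote."""
    parts = urlsplit(url)
    if parts.scheme == "file":
        return "file", unquote(parts.path)
    # For object stores the bucket is the netloc part of the URL.
    return parts.scheme, parts.netloc + parts.path
```

Upgrading legacy values at read time is only one option; a one-off migration that rewrites stored paths would also satisfy the backward-compatibility note.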
References
- PR #1311 (DataJoint 2.0), review discussion: #1311 (comment)
- Related to fsspec's unified approach to filesystems