Conversation
Automated Review URLs |
Co-authored-by: Johannes Soltwedel <[email protected]>
There was a problem hiding this comment.
Hi @jluethi,
thanks for the update! Maybe as a follow-up here, could you add/change the following (some of this is non-obvious and didn't occur to me before either ^^"):
- Add some test.json with valid/invalid plate/well metadata under
tests/attributes/spec/{valid/invalid}/{well/plate}to make sure these are correctly flagged as valid or invalid by the CI? - Do you have some example json to upload somewhere on S3 that one could point the validator to? We'd need a copy of ome/ome-ngff-validator#48 from @will-moore to point at this branch for this to work, though.
- Small mention of the proposed change in
version_history.md:)
You can just commit JSON to git, and use the github url to the raw JSON object. edit: or use a gist. |
In that case, adding valid/invalid examples for the validation CI is perfectly enough. Probably even better 👍 |
|
@jo-mueller Thanks for those follow-ups! I added relevant test jsons for wells, both valid and invalid. I think we can cover all the new option in 1 valid part, but should test the 4 ways to be invalid separately (such that we would catch each issue, not just a combination thereof). This does not affect the plate schemas, because they only point to where the well is, not where the paths in the well are. I also added something to |
Sounds good to me 👍
Jup! |
|
i agree! missed this and was just proposing similar in #433... 100% agree, and my use case also centered around using Some quick questions/thoughts about this proposal though: "path": {
"description": "The path for this field of view subgroup",
"type": "string",
"minLength": 1,
"not": {
"anyOf": [
{ "const": "." },
{ "const": ".." },
{ "pattern": "^__" },
{ "pattern": "/" }
]
}This might be too permissive? The current
It's worth pointing out that |
|
Funny how we converge on things here!
Personally, I'd be totally fine with this being more restrictive. My motivation for going as broad as suggested above was to follow the Zarr recommended node names (https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/core/index.rst#node-names) and to have the same level of OME-Zarr spec restrictions on these image names as for other, standalone images. The Zarr rules also include the should for alphanumeric and .-_. Thus I added:
I initially drafted schemas that would enforce this, but wasn't sure if someone had important use cases for some extra characters that are allowed (even if not recommended) in Zarr. And in some discussions, I wasn't sure we'd want to restrict image in well much differently than another standalone images. |
that's a good motivation! And for zarr (a generic, totally agnostic storage layer) i can see why they set their minimum constraints extremely broadly (you don't want to impose constraints that would artificially limit valid use cases right off the bat). It does, however, definitely leave open the possibility for fundamental incompatibility on certain file systems. Zarr is basically saying "look, if you want to make something that can't be used on windows or in bash scripts, we're not going to stop you". but I think the point of a domain-specific subschema like ngff is to add constraints that enable interoperability. and the zarr spec acknowledges this sort of thing:
And the zarr node name spec itself offers a recommendation:
and I think that's the "right" constraint for something like ngff. 👍 |
|
Agreed on all of that. And personally, I'd be fine to limit the image in well spec to just that. The discussion on how all OME-Zarrs can be named across the whole spec seems to be a broader discussion though that I didn't intend to open for this PR. While I'm also not super happy about only having a should that stops people from having |
|
Agreed. And I definitely wasn't suggesting broadening the discussion to the whole spec (agree we should only be discussing well fields at the moment). But given that well field paths are already super strictly defined, it seems less breaking on the longer term to just slightly relax the spec now, and accept the asymmetry with dataset.paths, rather than fully open it up (for symmetry with dataset paths) and then break both of them in the future to enforce file system compatibility. In other words: it's always easier to be more permissive, but it's a breaking change to retract permissions. Am I understanding your point correctly? Or were you talking about something different there? In other words: what's your current preference for well image paths at the moment: go with your permissive spec that uses the general zarr node spec? Or just slightly relax, going with the zarr node recommendation (allow just under score, dash, period) |
|
You understood my point well.
This seems very convincing to me. I'll update the PR accordingly. |
|
@tlambert03 @jo-mueller I've updated the proposal according to the discussion above, limiting the allowed characters to alphanumeric . _ -. @imagejan @thewtex @will-moore Given that you reviewed this in an earlier form, pinging you again for this. Instead of allowing almost any characters in well paths, we decided it would be more reasonable to relax the requirements less drastically to allow "only" ._- (dot, underline, dash) in the image paths in wells. The main motivation is that many of the other potential characters cause too many issues across different storage systems / operating systems and we don't want to overly broaden the allowed spec (from only alphanumeric) if it would have to be restricted again later due to such issues. See detailed discussion above |
will-moore
left a comment
There was a problem hiding this comment.
Looks good - makes sense 👍
This PR suggests relaxing the allowed names of images in wells, following the discussion here: ome/ngff#380
While I understand restrictions on names for rows and columns of HCS plates, the images in HCS plates can be many things. I don't think it's reasonable to restrict it more than general Zarr node naming restrictions. As proposed by @d-v-b , this PR proposes following the Zarr spec recommendations for characters in node names.
I don't know what the rule on linking to other pages like the Zarr spec page is / should be, so we can also remove that link if needed.
@thewtex Could you also have a look at this and whether this would interfere with anything ngff-zarr related?
We already implemented this in ngio (we actually had always been using
_in path names and only noticed last fall that this clashes with the strict path definitions in HCS wells.cc @imagejan @lorenzocerrone