[FIX] Clarify case collision intolerance as a file naming principle #858
That sounds better.
Few people read the spec linearly; most will read this potential new section after hitting an error in the validator (once we add support there). So I think a "great placing" of the section is less important than having it as a section, because that allows us, as you say, to generalize the rule more easily.
I like case collision intolerant |
I think we're asking for trouble by making this a Users can promote some options to promote warnings to errors (for example, fMRIPrep considers missing T1w images an error and OpenNeuro considers missing author lists an error), and some flags appear as check boxes on the web interface. I would support doing that for this to raise awareness of the potential problem, without going so far as to automatically invalidate legacy datasets. |
re @effigies
warnings often get ignored. I checked existing public OpenNeuro datasets, and there seem to be no hits. Those which are yet to be released had better have it fixed first (and the likelihood of encountering such cases is small):
(git)smaug:/mnt/datasets/datalad/crawl/openneuro[master]
$> for ds in ds*; do files=$(/usr/bin/find $ds/*); nf=$(echo $files | wc -l); fl=$(echo $files | tr 'A-Z' 'a-z' | sort | uniq -c); nfl=$(echo $fl | wc -l); echo -n "$ds $nf $nfl "; [ "$nf" == "$nfl" ] && echo "ok" || echo "PROBLEM"; done
resulted in none with PROBLEM. A sample run on HBN one mentioned
Situation is IMHO of the same caliber of enforcement as " |
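For readers who want to run a similar check themselves, here is a minimal Python sketch of the idea behind the shell loop above: fold every path to a single case and flag any folded form that corresponds to more than one distinct spelling. The function names `find_case_collisions` and `dataset_paths` are hypothetical and are not part of the BIDS validator. The sketch also assumes Unicode case folding via `str.casefold()`, whereas the shell loop only maps ASCII `A-Z` to `a-z`, and real filesystems may apply yet other folding rules.

```python
from collections import defaultdict
from pathlib import Path


def find_case_collisions(paths):
    """Return {folded_path: [spellings]} for paths that differ only by case.

    `paths` is any iterable of path strings relative to the dataset root.
    Exact duplicates are ignored; only distinct spellings count as a collision.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[str(path).casefold()].add(str(path))
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


def dataset_paths(root):
    """List every file and directory under `root`, relative to it (hypothetical helper)."""
    root = Path(root)
    return [p.relative_to(root).as_posix() for p in root.rglob("*")]
```

A dataset directory could then be checked with `find_case_collisions(dataset_paths("ds000001"))`, where the dataset name is only a placeholder; an empty result means no two paths collide after case folding.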
Thanks everyone for the feedback! So I am to
|
As I said, OpenNeuro could promote the warning to an error and prevent this from arising in our data. The question is whether the standard should break backwards compatibility in this way. |
Without this "fix" to the standard, such a BIDS dataset might not be compatible with major OSes. So, to me, it is not "breaking backward compatibility" (as if a problematic dataset could be used "correctly" before) but rather a "bugfix" to the standard (as @sappelhoff noted in #857 (comment)), one that avoids a BIDS-compliant dataset being incompatible with the OSes/filesystems. |
That was the argument that convinced me to consider this change a bugfix ... that and the fact that we - apparently - wouldn't break a lot of datasets if any. I would be interested in further opinions though. |
I am usually pretty conservative about these things, but I would be tempted to bite the bullet on this one and go for a |
I don't disagree with the arguments for breaking compatibility, but so far it seems like we have a small group approving it, mostly from a developer perspective. For me to sign off, I'd at least want steering group input or a ping to the discussion list. |
Good point. That's more than fair. @yarikoptic would you mind hailing the Google group on this issue? |
roger that, I will RF this PR into adding a short section, and then hail the list |
PR updated, RFC is posted: https://groups.google.com/g/bids-discussion/c/ixm2aFgAJy0 |
The RFC has (so far) yielded two comments that I interpret to be in favor of this (breaking) change. I think input from @bids-standard/steering would be nice to have here. |
This change sounds very reasonable to me. |
Thinking about this now, this is consistent with the previous decision that
It's mixing rules for dataset curators and BEP leads a little bit. |
Sounds great to me -- feel welcome @effigies to make it into a recommendation or just a follow up commit if placement should be different (section) |
It also sounds good to me. Understanding it as a bug fix and indeed preventing future errors. It's nice there isn't any known public dataset affected. And that the same applies for both labels and suffixes. |
@yarikoptic Pushed a commit directly to your branch. Please have a look. |
Thanks!! |
Thank you @effigies, looks good to me, but I guess I shouldn't approve my own PR ;) |
Well, between the two of us, we can count as one. |
Looks good to me, thank you!
Filed https://github.com/bids-standard/bids-validator/issues/1330, and was surprised that current master is at a whopping v1.6.0-203-g811e17e2, so there have been lots of changes/development since the last release without any bugfix or minor release. |
It reminds me of an earlier discussion on units, where we identified that Greek characters used in units (ohm) and prefixes (micro) are ambiguous. See https://bids-specification.readthedocs.io/en/stable/99-appendices/05-units.html. An important raison d'être for the specification is to disambiguate and to make choices where individual researchers themselves might go for opposite choices. In this case "a" and "A" are ambiguous under some conditions. Hence I also approve of this "breaking change" as a bug fix to the specification. |
Given the general agreement and the postings on the mailing list to reach a wider audience, I feel like we can merge this PR in 3 days, thus observing our 5-day rule. It will then be part of the next release unless anything changes in the meantime. |
Thanks for identifying this issue and pushing for a solution @yarikoptic and all reviewers. |
Closes bids-standard/legacy-validator#857
An alternative could be to establish a new section, like "Case sensitivity", in the same document, which would allow us to generalize the rule to cover the entire file path, and then just refer to it from <label>. But I was not sure whether it would just get lost among the already good number of sections there, or where to place it. It sounds like a foundational aspect to me and should be close to the top, but "what is not as important" there? ;)
edit 1: is there a term to describe such desired behavior? It is neither "case sensitive" nor "case insensitive". "Case collision intolerant"?