Implement optional dos2unix feature #5246
Closed
I have followed the Contributing to DVC checklist.
If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Hi again =)
Related to #4658, I've implemented an option to disable dos2unix from the dvc configuration file.
We are starting to use dvc internally, but we are facing issues with PDF and image/TIFF files because they are accidentally identified as text. This means that every \r\n is replaced with \n, so the md5 hash that dvc computes no longer matches the actual file content.
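To illustrate the problem, here is a minimal sketch of how CRLF normalization changes the hash of binary data. The dos2unix helper and the payload below are written for illustration only and are assumptions; this is not a copy of dvc's code.

```python
import hashlib


def dos2unix(data: bytes) -> bytes:
    """Replace every CRLF pair with LF, as a text normalizer would."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical binary payload that happens to contain the bytes 0x0D 0x0A.
payload = b"%PDF-1.4\r\n\x00\x01binary\r\nstream"

raw_md5 = hashlib.md5(payload).hexdigest()
normalized_md5 = hashlib.md5(dos2unix(payload)).hexdigest()

# The normalized data is two bytes shorter, so the two digests differ.
print(raw_md5 == normalized_md5)  # False
```

If a binary file is misdetected as text, the hash is computed over the normalized bytes, so it does not correspond to what is actually on disk.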
We rolled out our own version of dvc internally with this patch, but it is hard to explain to people that they have to install it from another source. For that reason, I've invested a little bit more time to properly add the parameter and submit this PR. Hopefully, someone else will be able to enjoy this feature.
By default, the dos2unix parameter under the core configuration section will be True, so dvc will behave as usual for everyone. If someone notices errors caused by the dos2unix feature, they can simply set dos2unix to False.
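The intended behavior can be sketched roughly as follows. The file_md5 function and its dos2unix argument are hypothetical names for this example, not the actual signatures in the patch, and the sketch skips text detection entirely (it normalizes whenever the flag is on).

```python
import hashlib
from pathlib import Path


def file_md5(path, dos2unix=True):
    """Return the md5 hex digest of a file.

    With dos2unix=True (the assumed default), CRLF pairs are replaced
    with LF before hashing. With dos2unix=False, the raw bytes are hashed.
    """
    data = Path(path).read_bytes()
    if dos2unix:
        data = data.replace(b"\r\n", b"\n")
    return hashlib.md5(data).hexdigest()
```

With the flag off, the digest matches what a plain md5 tool reports for the file, which is what binary formats need.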
I know that changing this value in an existing repo may cause side effects, such as dvc reporting changes in files that have not changed, because their hashes would be computed differently. I don't think this will be a common case: either you run into problems and disable the option, or you never notice that it exists. Usually, this value will be set at project creation and never modified again.
I will write the documentation page, but first, I would like to have your feedback.
Thanks for your support!