Make TSV finally true TSV #923
Very proud of it (this after the JSON Lines work). I will test it in the next few hours. Thank you very much.
I prefer to use tsvlite because of its simplicity. My definition of tsvlite is: fields can contain anything except TAB or newline characters. TAB is the field separator. Everything in between is part of the field (i.e., no quoting, no escape characters, nothing special). This makes the file format easy to write and read across different systems. The limitation that fields cannot contain tabs or newlines was never a big issue. I have some issues with the new TSV implementation: The bigger issue is the new handling of \t \n \ . Is there a way to disable it? For example, I might have files which contain Windows file paths with backslashes. This causes some quite unexpected behaviour.
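To make the concern above concrete, here is a minimal Python sketch of the two behaviours being contrasted. It is not Miller's implementation: the function names are hypothetical, and the escape table (only `\t` and `\n`) is an assumption made for illustration.

```python
# Illustrative sketch only; names and escape table are assumptions,
# not taken from Miller's source code.

def split_tsvlite(line: str) -> list[str]:
    """tsvlite as defined above: TAB separates fields, nothing else is special."""
    return line.rstrip("\n").split("\t")

# Assumed escape sequences that a "true TSV" reader might decode.
_ESCAPES = {"t": "\t", "n": "\n"}

def decode_escapes(field: str) -> str:
    """Replace backslash-t and backslash-n with real TAB and newline characters."""
    out = []
    i = 0
    while i < len(field):
        if field[i] == "\\" and i + 1 < len(field) and field[i + 1] in _ESCAPES:
            out.append(_ESCAPES[field[i + 1]])
            i += 2
        else:
            out.append(field[i])
            i += 1
    return "".join(out)

def split_tsv_with_escapes(line: str) -> list[str]:
    """Split on TAB, then decode escape sequences inside each field."""
    return [decode_escapes(f) for f in split_tsvlite(line)]

# A record whose second field is the Windows path C:\temp\new
line = "id1\tC:\\temp\\new\n"

literal = split_tsvlite(line)           # path is preserved byte for byte
decoded = split_tsv_with_escapes(line)  # path now contains a TAB and a newline
```

With escape decoding enabled, the `\t` and `\n` inside the path are rewritten even though the author of the file meant them as literal backslashes, which is the unexpected behaviour described in the comment.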
@masgo ok -- i will restore
Thank you. When I read the changes, I also did not think of it. I just did a search and replace for tsvlite -> tsv and ran my script. Everything looked normal. Turns out there a handful Overall I think it's a great idea to have the \x replacement, as long as it is optional. Maybe a separate function like
@masgo for the moment (as a temporary workaround)
Since this PR, original file
tsv filetype
csv filetype with tab separator
version 6.0.0
Even if my file is somehow invalid, this probably should throw an error instead of silently truncating the file. Is there something I should be looking for in the file so I can make a reprex for you?
Re-using CSV code, with comma replaced by tab, has never been a good idea due to issues with handling double quotes.
Here we finally make TSV its own format. See also https://miller.readthedocs.io/en/latest/file-formats/#csvtsvasvusvetc.
This is for #922 and #438.
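As a rough illustration of the double-quote problem mentioned in the description, the sketch below uses Python's standard `csv` module as a stand-in for "CSV code with the comma replaced by a tab". It only demonstrates the general pitfall and does not reflect Miller's internals.

```python
import csv

# One line with four TAB-separated fields; two of them contain a literal
# double quote that the author did not intend as CSV quoting.
line = 'x\t"y\tz"\tw'

# CSV-style parsing with TAB as the delimiter: the quotes are treated as
# field quoting, so the TAB between them is absorbed into a single field.
csv_style = next(csv.reader([line], delimiter="\t"))

# Plain TSV-style parsing: every TAB is a separator and quotes are ordinary data.
plain_tsv = line.split("\t")
```

Here `csv_style` has three fields with the quotes stripped, while `plain_tsv` has four fields with the quotes kept, so the same bytes yield different records depending on which rules are applied.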